more FAQ docs

fjy 2014-10-21 16:08:56 -07:00
parent 5df92aff2c
commit 2d96bc5f1f
4 changed files with 64 additions and 9 deletions

View File

@ -21,3 +21,8 @@ SSDs are highly recommended for historical and real-time nodes if you are not ru
Although Druid supports schemaless ingestion of dimensions, because of https://github.com/metamx/druid/issues/658 you may sometimes get bigger segments than necessary. To ensure segments are as compact as possible, it is recommended that you provide dimension names in lexicographic order. This may, however, require some ETL processing on your data.
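For example (a purely illustrative fragment; the dimension names are hypothetical, and where exactly the list lives depends on your spec format), the dimension list in your ingestion spec would simply be provided in sorted order:
```json
{
  "dimensions": ["browser", "city", "country", "user_id"]
}
```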
# Read FAQs
You should read about common problems people have run into here:
1) [Ingestion-FAQ](Ingestion-FAQ.html)
2) [Performance-FAQ](Performance-FAQ.html)

View File

@ -20,9 +20,7 @@ io.druid.cli.Main server coordinator
Rules
-----
Segments are loaded and dropped from the cluster based on a set of rules. Rules indicate how segments should be assigned to different historical node tiers and how many replicants of a segment should exist in each tier. Rules may also indicate when segments should be dropped entirely from the cluster. The coordinator loads a set of rules from the database. Rules may be specific to a certain datasource and/or a default set of rules can be configured. Rules are read in order and hence the ordering of rules is important. The coordinator will cycle through all available segments and match each segment with the first rule that applies. Each segment may only match a single rule.
For more information on rules, see [Rule Configuration](Rule-Configuration.html).
Segments can be automatically loaded and dropped from the cluster based on a set of rules. For more information on rules, see [Rule Configuration](Rule-Configuration.html).
Cleaning Up Segments
--------------------

View File

@ -21,6 +21,12 @@ druid.storage.bucket=druid
druid.storage.baseKey=sample
```
Other common reasons that hand-off fails are as follows:
1) Historical nodes are out of capacity and cannot download any more segments. You'll see exceptions in the coordinator logs if this occurs.
2) Segments are corrupt and cannot be downloaded. You'll see exceptions in your historical node logs if this occurs.
3) Deep storage is improperly configured. Make sure that your segment actually exists in deep storage and that the coordinator logs have no errors.
## How do I get HDFS to work?
Make sure to include the `druid-hdfs-storage` module as one of your extensions and set `druid.storage.type=hdfs`.
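As a rough sketch (the extension coordinate, version, and HDFS path below are placeholders to be replaced with values for your deployment), the relevant runtime properties look something like:
```
# Pull in the HDFS deep storage extension (replace <version> with your Druid version)
druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:<version>"]

# Store segments in HDFS; the storageDirectory path is only an example
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://namenode:9000/druid/segments
```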
@ -50,6 +56,9 @@ To do this use the IngestSegmentFirehose and run an indexer task. The IngestSegm
Typically, the above will be run as a batch job that, say, once a day feeds in a chunk of data and aggregates it.
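As a minimal sketch (only the firehose portion of the indexer task is shown; the data source name and interval are hypothetical), the IngestSegmentFirehose is selected with the `ingestSegment` type:
```json
{
  "type": "ingestSegment",
  "dataSource": "wikipedia",
  "interval": "2014-10-01/2014-10-02"
}
```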
## Real-time ingestion seems to be stuck
There are a few ways this can occur. Druid will throttle ingestion to prevent out-of-memory problems if intermediate persists are taking too long or if hand-off is taking too long. If your node logs indicate that certain columns are taking a very long time to build (for example, if your segment granularity is hourly but creating a single column takes 30 minutes), you should re-evaluate your configuration or scale up your real-time ingestion.
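As an illustrative fragment (the values are arbitrary and where these properties live depends on your spec format), the settings that most directly control persist behavior are the in-memory row limit and the persist period:
```json
{
  "maxRowsInMemory": 500000,
  "intermediatePersistPeriod": "PT10m"
}
```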
## More information

View File

@ -2,12 +2,34 @@
layout: doc_page
---
# Configuring Rules for Coordinator Nodes
Rules indicate how segments should be assigned to different historical node tiers and how many replicas of a segment should exist in each tier. Rules may also indicate when segments should be dropped entirely from the cluster. The coordinator loads a set of rules from the metadata storage. Rules may be specific to a certain datasource and/or a default set of rules can be configured. Rules are read in order and hence the ordering of rules is important. The coordinator will cycle through all available segments and match each segment with the first rule that applies. Each segment may only match a single rule.
Note: It is recommended that the coordinator console be used to configure rules. However, the coordinator node does have HTTP endpoints to programmatically configure rules.
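For example (a purely illustrative rule list; the tier name and replicant counts are hypothetical, and the individual rule types are described below), a datasource could keep the most recent month of segments loaded in a "hot" tier and drop everything older. Because each segment only matches the first applicable rule, recent segments never reach the drop rule:
```json
[
  {
    "type" : "loadByPeriod",
    "period" : "P1M",
    "tieredReplicants": { "hot": 2 }
  },
  {
    "type" : "dropForever"
  }
]
```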
Load Rules
----------
Load rules indicate how many replicants of a segment should exist in a server tier.
Load rules indicate how many replicas of a segment should exist in a server tier.
### Forever Load Rule
Forever load rules are of the form:
```json
{
  "type" : "loadForever",
  "tieredReplicants": {
    "hot": 1,
    "_default_tier" : 1
  }
}
```
* `type` - this should always be "loadForever"
* `tieredReplicants` - A JSON Object where the keys are the tier names and values are the number of replicas for that tier.
### Interval Load Rule
@ -17,13 +39,16 @@ Interval load rules are of the form:
{
  "type" : "loadByInterval",
  "interval": "2012-01-01/2013-01-01",
  "tier" : "hot"
  "tieredReplicants": {
    "hot": 1,
    "_default_tier" : 1
  }
}
```
* `type` - this should always be "loadByInterval"
* `interval` - an ISO-8601 interval string
* `tier` - the configured historical node tier
* `tieredReplicants` - A JSON Object where the keys are the tier names and values are the number of replicas for that tier.
### Period Load Rule
@ -33,13 +58,16 @@ Period load rules are of the form:
{
  "type" : "loadByPeriod",
  "period" : "P1M",
  "tier" : "hot"
  "tieredReplicants": {
    "hot": 1,
    "_default_tier" : 1
  }
}
```
* `type` - this should always be "loadByPeriod"
* `period` - an ISO-8601 period string
* `tier` - the configured historical node tier
* `tieredReplicants` - A JSON Object where the keys are the tier names and values are the number of replicas for that tier.
The interval of a segment will be compared against the specified period. The rule matches if the period overlaps the interval. For example, if the period is "P1M" and the rule is evaluated on 2013-01-01, the period covers 2012-12-01/2013-01-01, so a segment for 2012-12-15/2012-12-16 matches while a segment for 2012-10-15/2012-10-16 does not.
@ -48,6 +76,21 @@ Drop Rules
Drop rules indicate when segments should be dropped from the cluster.
### Forever Drop Rule
Forever drop rules are of the form:
```json
{
  "type" : "dropForever"
}
```
* `type` - this should always be "dropForever"
All segments that match this rule are dropped from the cluster.
### Interval Drop Rule
Interval drop rules are of the form: