diff --git a/docs/content/Configuration.md b/docs/content/Configuration.md index 76098522594..72076db1a1c 100644 --- a/docs/content/Configuration.md +++ b/docs/content/Configuration.md @@ -32,7 +32,44 @@ We recommend just setting the base ZK path and the ZK service host, but all ZK p |`druid.zk.paths.base`|Base Zookeeper path.|`/druid`| |`druid.zk.service.host`|The ZooKeeper hosts to connect to. This is a REQUIRED property and therefore a host address must be supplied.|none| -See the [Zookeeper](ZooKeeper.html) page for more information on configuration options for ZK integration. +#### Zookeeper Behavior + +|Property|Description|Default| +|--------|-----------|-------| +|`druid.zk.service.sessionTimeoutMs`|ZooKeeper session timeout, in milliseconds.|`30000`| +|`druid.curator.compress`|Boolean flag for whether or not created Znodes should be compressed.|`true`| + +#### Path Configuration +Druid interacts with ZK through a set of standard path configurations. We recommend just setting the base ZK path, but all ZK paths that Druid uses can be overwritten to absolute paths. + +|Property|Description|Default| +|--------|-----------|-------| +|`druid.zk.paths.base`|Base Zookeeper path.|`/druid`| +|`druid.zk.paths.propertiesPath`|Zookeeper properties path.|`${druid.zk.paths.base}/properties`| +|`druid.zk.paths.announcementsPath`|Druid node announcement path.|`${druid.zk.paths.base}/announcements`| +|`druid.zk.paths.liveSegmentsPath`|Current path for where Druid nodes announce their segments.|`${druid.zk.paths.base}/segments`| +|`druid.zk.paths.loadQueuePath`|Entries here cause historical nodes to load and drop segments.|`${druid.zk.paths.base}/loadQueue`| +|`druid.zk.paths.coordinatorPath`|Used by the coordinator for leader election.|`${druid.zk.paths.base}/coordinator`| +|`druid.zk.paths.servedSegmentsPath`|@Deprecated. Legacy path for where Druid nodes announce their segments.|`${druid.zk.paths.base}/servedSegments`| + +The indexing service also uses its own set of paths. These configs can be included in the common configuration. + +|Property|Description|Default| +|--------|-----------|-------| +|`druid.zk.paths.indexer.base`|Base zookeeper path for |`${druid.zk.paths.base}/indexer`| +|`druid.zk.paths.indexer.announcementsPath`|Middle managers announce themselves here.|`${druid.zk.paths.indexer.base}/announcements`| +|`druid.zk.paths.indexer.tasksPath`|Used to assign tasks to middle managers.|`${druid.zk.paths.indexer.base}/tasks`| +|`druid.zk.paths.indexer.statusPath`|Parent path for announcement of task statuses.|`${druid.zk.paths.indexer.base}/status`| +|`druid.zk.paths.indexer.leaderLatchPath`|Used for Overlord leader election.|`${druid.zk.paths.indexer.base}/leaderLatchPath`| + +If `druid.zk.paths.base` and `druid.zk.paths.indexer.base` are both set, and none of the other `druid.zk.paths.*` or `druid.zk.paths.indexer.*` values are set, then the other properties will be evaluated relative to their respective `base`. +For example, if `druid.zk.paths.base` is set to `/druid1` and `druid.zk.paths.indexer.base` is set to `/druid2` then `druid.zk.paths.announcementsPath` will default to `/druid1/announcements` while `druid.zk.paths.indexer.announcementsPath` will default to `/druid2/announcements`. + +The following path is used service discovery and are **not** affected by `druid.zk.paths.base` and **must** be specified separately. + +|Property|Description|Default| +|--------|-----------|-------| +|`druid.discovery.curator.path`|Services announce themselves under this ZooKeeper path.|`/druid/discovery`| ### Request Logging diff --git a/docs/content/Deep-Storage.md b/docs/content/Deep-Storage.md index 165b7906a4d..aa0e8730c75 100644 --- a/docs/content/Deep-Storage.md +++ b/docs/content/Deep-Storage.md @@ -4,40 +4,13 @@ layout: doc_page # Deep Storage Deep storage is where segments are stored. It is a storage mechanism that Druid does not provide. This deep storage infrastructure defines the level of durability of your data, as long as Druid nodes can see this storage infrastructure and get at the segments stored on it, you will not lose data no matter how many Druid nodes you lose. If segments disappear from this storage layer, then you will lose whatever data those segments represented. -The currently supported types of deep storage follow. Other deep-storage options, such as [Cassandra](http://planetcassandra.org/blog/post/cassandra-as-a-deep-storage-mechanism-for-druid-real-time-analytics-engine/), have been developed by members of the community. +## Production Tested Deep Stores -## S3-compatible +### Local Mount -S3-compatible deep storage is basically either S3 or something like riak-cs which exposes the same API as S3. This is the default deep storage implementation. +A local mount can be used for storage of segments as well. This allows you to use just your local file system or anything else that can be mount locally like NFS, Ceph, etc. This is the default deep storage implementation. -S3 configuration parameters are - -``` -druid.s3.accessKey= -druid.s3.secretKey= -druid.storage.bucket= -druid.storage.baseKey= -``` - -## HDFS - -As of 0.4.0, HDFS can be used for storage of segments as well. - -In order to use hdfs for deep storage, you need to set the following configuration on your real-time nodes. - -``` -druid.storage.type=hdfs -druid.storage.storageDirectory= -``` - -If you are using the Hadoop indexer, set your output directory to be a location on Hadoop and it will work - - -## Local Mount - -A local mount can be used for storage of segments as well. This allows you to use just your local file system or anything else that can be mount locally like NFS, Ceph, etc. - -In order to use a local mount for deep storage, you need to set the following configuration on your real-time nodes. +In order to use a local mount for deep storage, you need to set the following configuration in your common configs. ``` druid.storage.type=local @@ -49,9 +22,35 @@ Note that you should generally set `druid.storage.storageDirectory` to something If you are using the Hadoop indexer in local mode, then just give it a local file as your output directory and it will work. -## Cassandra +### S3-compatible + +S3-compatible deep storage is basically either S3 or something like Google Storage which exposes the same API as S3. + +S3 configuration parameters are + +``` +druid.s3.accessKey= +druid.s3.secretKey= +druid.storage.bucket= +druid.storage.baseKey= +``` + +### HDFS + +In order to use hdfs for deep storage, you need to set the following configuration in your common configs. + +``` +druid.storage.type=hdfs +druid.storage.storageDirectory= +``` + +If you are using the Hadoop indexer, set your output directory to be a location on Hadoop and it will work + +## Community Contributed Deep Stores + +### Cassandra [Apache Cassandra](http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra) can also be leveraged for deep storage. This requires some additional druid configuration as well as setting up the necessary schema within a Cassandra keystore. -For more information on using Cassandra as deep storage, see [Cassandra Deep Storage](Cassandra-Deep-Storage.html). +Please note that this is a community contributed module and does not support Cassandra 2.x or hadoop-based batch indexing. For more information on using Cassandra as deep storage, see [Cassandra Deep Storage](Cassandra-Deep-Storage.html). diff --git a/docs/content/Design.md b/docs/content/Design.md index abacd8af362..32847b0f9d8 100644 --- a/docs/content/Design.md +++ b/docs/content/Design.md @@ -7,7 +7,7 @@ For a comprehensive look at the architecture of Druid, read the [White Paper](ht What is Druid? ============== -Druid is a system built to allow fast ("real-time") access to large sets of seldom-changing data. It was designed with the intent of being a service and maintaining 100% uptime in the face of code deployments, machine failures and other eventualities of a production system. It can be useful for back-office use cases as well, but design decisions were made explicitly targetting an always-up service. +Druid is a system built to allow fast ("real-time") access to large sets of seldom-changing data. It was designed with the intent of being a service and maintaining 100% uptime in the face of code deployments, machine failures and other eventualities of a production system. It can be useful for back-office use cases as well, but design decisions were made explicitly targeting an always-up service. Druid currently allows for single-table queries in a similar manner to [Dremel](http://research.google.com/pubs/pub36632.html) and [PowerDrill](http://www.vldb.org/pvldb/vol5/p1436_alexanderhall_vldb2012.pdf). It adds to the mix @@ -19,7 +19,7 @@ Druid currently allows for single-table queries in a similar manner to [Dremel]( As far as a comparison of systems is concerned, Druid sits in between PowerDrill and Dremel on the spectrum of functionality. It implements almost everything Dremel offers (Dremel handles arbitrary nested data structures while Druid only allows for a single level of array-based nesting) and gets into some of the interesting data layout and compression methods from PowerDrill. -Druid is a good fit for products that require real-time data ingestion of a single, large data stream. Especially if you are targetting no-downtime operation and are building your product on top of a time-oriented summarization of the incoming data stream. Druid is probably not the right solution if you care more about query flexibility and raw data access than query speed and no-downtime operation. When talking about query speed it is important to clarify what "fast" means: with Druid it is entirely within the realm of possibility (we have done it) to achieve queries that run in single-digit seconds across a 6TB data set. +Druid is a good fit for products that require real-time data ingestion of a single, large data stream. Especially if you are targeting no-downtime operation and are building your product on top of a time-oriented summarization of the incoming data stream. Druid is probably not the right solution if you care more about query flexibility and raw data access than query speed and no-downtime operation. When talking about query speed it is important to clarify what "fast" means: with Druid it is entirely within the realm of possibility (we have done it) to achieve queries that run in single-digit seconds across a 6TB data set. ### Architecture @@ -54,7 +54,7 @@ The following diagram illustrates the cluster's management layer, showing how ce -### Data Storage +### Segments and Data Storage Getting data into the Druid system requires an indexing process, as shown in the diagrams above. This gives the system a chance to analyze the data, add indexing structures, compress and adjust the layout in an attempt to optimize query speed. A quick list of what happens to the data follows. @@ -66,7 +66,9 @@ Getting data into the Druid system requires an indexing process, as shown in the - Bitmap compression - RLE (on the roadmap, but not yet implemented) -The output of the indexing process is stored in a "deep storage" LOB store/file system (see [Deep Storage](Deep-Storage.html) for information about potential options). Data is then loaded by Historical nodes by first downloading the data to their local disk and then memory-mapping it before serving queries. +The output of the indexing process is called a "segment". Segments are the fundamental structure to store data in Druid. Segments contain the various dimensions and metrics in a data set, stored in a column orientation, as well as the indexes for those columns. + +Segments are stored in a "deep storage" LOB store/file system (see [Deep Storage](Deep-Storage.html) for information about potential options). Data is then loaded by Historical nodes by first downloading the data to their local disk and then memory-mapping it before serving queries. If a Historical node dies, it will no longer serve its segments, but given that the segments are still available on the "deep storage", any other node can simply download the segment and start serving it. This means that it is possible to actually remove all historical nodes from the cluster and then re-provision them without any data loss. It also means that if the "deep storage" is not available, the nodes can continue to serve the segments they have already pulled down (i.e. the cluster goes stale, not down). diff --git a/docs/content/Historical.md b/docs/content/Historical.md index d29218df3ef..be270e25b04 100644 --- a/docs/content/Historical.md +++ b/docs/content/Historical.md @@ -35,7 +35,7 @@ Querying Segments Please see [Querying](Querying.html) for more information on querying historical nodes. -For every query that a historical node services, it will log the query and report metrics on the time taken to run the query. +A historical can be configured to log and report metrics for every query it services. HTTP Endpoints -------------- diff --git a/docs/content/Indexing-Service-Config.md b/docs/content/Indexing-Service-Config.md index e71e2863c89..116b227463c 100644 --- a/docs/content/Indexing-Service-Config.md +++ b/docs/content/Indexing-Service-Config.md @@ -190,13 +190,40 @@ Middle managers pass their configurations down to their child peons. The middle |Property|Description|Default| |--------|-----------|-------| +|`druid.indexer.runner.allowedPrefixes`|Whitelist of prefixes for configs that can be passed down to child peons.|"com.metamx", "druid", "io.druid", "user.timezone","file.encoding"| +|`druid.indexer.runner.compressZnodes`|Indicates whether or not the middle managers should compress Znodes.|true| +|`druid.indexer.runner.classpath`|Java classpath for the peon.|System.getProperty("java.class.path")| +|`druid.indexer.runner.javaCommand`|Command required to execute java.|java| +|`druid.indexer.runner.javaOpts`|-X Java options to run the peon in its own JVM.|""| +|`druid.indexer.runner.maxZnodeBytes`|The maximum size Znode in bytes that can be created in Zookeeper.|524288| +|`druid.indexer.runner.startPort`|The port that peons begin running on.|8100| |`druid.worker.ip`|The IP of the worker.|localhost| |`druid.worker.version`|Version identifier for the middle manager.|0| |`druid.worker.capacity`|Maximum number of tasks the middle manager can accept.|Number of available processors - 1| -|`druid.indexer.runner.compressZnodes`|Indicates whether or not the middle managers should compress Znodes.|false| -|`druid.indexer.runner.maxZnodeBytes`|The maximum size Znode in bytes that can be created in Zookeeper.|524288| -|`druid.indexer.runner.javaCommand`|Command required to execute java.|java| -|`druid.indexer.runner.javaOpts`|-X Java options to run the peon in its own JVM.|""| -|`druid.indexer.runner.classpath`|Java classpath for the peon.|System.getProperty("java.class.path")| -|`druid.indexer.runner.startPort`|The port that peons begin running on.|8100| -|`druid.indexer.runner.allowedPrefixes`|Whitelist of prefixes for configs that can be passed down to child peons.|"com.metamx", "druid", "io.druid", "user.timezone","file.encoding"| + + +#### Peon Configs +Although peons inherit the configurations of their parent middle managers, explicit child peon configs in middlemanager can be set by prefixing them with: + +``` +druid.indexer.fork.property +``` +Additional peon configs include: + +|Property|Description|Default| +|--------|-----------|-------| +|`druid.peon.mode`|Choices are "local" and "remote". Setting this to local means you intend to run the peon as a standalone node (Not recommended).|remote| +|`druid.indexer.task.baseDir`|Base temporary working directory.|/tmp| +|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|/tmp/persistent/tasks| +|`druid.indexer.task.hadoopWorkingPath`|Temporary working directory for Hadoop tasks.|/tmp/druid-indexing| +|`druid.indexer.task.defaultRowFlushBoundary`|Highest row count before persisting to disk. Used for indexing generating tasks.|50000| +|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client:2.3.0| +|`druid.indexer.task.chathandler.type`|Choices are "noop" and "announce". Certain tasks will use service discovery to announce an HTTP endpoint that events can be posted to.|noop| + +If the peon is running in remote mode, there must be an overlord up and running. Peons in remote mode can set the following configurations: + +|Property|Description|Default| +|--------|-----------|-------| +|`druid.peon.taskActionClient.retry.minWait`|The minimum retry time to communicate with overlord.|PT1M| +|`druid.peon.taskActionClient.retry.maxWait`|The maximum retry time to communicate with overlord.|PT10M| +|`druid.peon.taskActionClient.retry.maxRetryCount`|The maximum number of retries to communicate with overlord.|10| diff --git a/docs/content/Indexing-Service.md b/docs/content/Indexing-Service.md index f7d65d78242..a7e38c3dbbb 100644 --- a/docs/content/Indexing-Service.md +++ b/docs/content/Indexing-Service.md @@ -89,3 +89,13 @@ Tasks ----- See [Tasks](Tasks.html). + +HTTP Endpoints +-------------- + +### GET + +* `/status` + +Returns the Druid version, loaded extensions, memory used, total memory and other useful information about the node. + diff --git a/docs/content/Ingestion-FAQ.md b/docs/content/Ingestion-FAQ.md index 982b3123658..5d53d64c17c 100644 --- a/docs/content/Ingestion-FAQ.md +++ b/docs/content/Ingestion-FAQ.md @@ -12,34 +12,24 @@ Depending on what `druid.storage.type` is set to, Druid will upload segments to ## My realtime node is not handing segments off -Make sure that the `druid.publish.type` on your real-time nodes is set to "metadata". Also make sure that `druid.storage.type` is set to a deep storage that makes sense. Some example configs: - -``` -druid.publish.type=db - -druid.db.connector.connectURI=jdbc\:mysql\://localhost\:3306/druid -druid.db.connector.user=druid -druid.db.connector.password=diurd - -druid.storage.type=s3 -druid.storage.bucket=druid -druid.storage.baseKey=sample -``` +First, make sure there are no exceptions in the logs of your node. Also make sure that `druid.storage.type` is set to a deep storage that makes sense. Other common reasons that hand-off fails are as follows: -1) Historical nodes are out of capacity and cannot download any more segments. You'll see exceptions in the coordinator logs if this occurs. +1) Druid is unable to write to the metadata storage. Make sure your configuration is correct. -2) Segments are corrupt and cannot download. You'll see exceptions in your historical nodes if this occurs. +2) Historical nodes are out of capacity and cannot download any more segments. You'll see exceptions in the coordinator logs if this occurs. -3) Deep storage is improperly configured. Make sure that your segment actually exists in deep storage and that the coordinator logs have no errors. +3) Segments are corrupt and cannot download. You'll see exceptions in your historical nodes if this occurs. + +4) Deep storage is improperly configured. Make sure that your segment actually exists in deep storage and that the coordinator logs have no errors. ## How do I get HDFS to work? -Make sure to include the `druid-hdfs-storage` module as one of your extensions and set `druid.storage.type=hdfs`. +Make sure to include the `druid-hdfs-storage` module as one of your extensions and set `druid.storage.type=hdfs`. You may also need to include hadoop configs on the classpath. ## I don't see my Druid segments on my historical nodes -You can check the coordinator console located at `:/cluster.html`. Make sure that your segments have actually loaded on [historical nodes](Historical.html). If your segments are not present, check the coordinator logs for messages about capacity of replication errors. One reason that segments are not downloaded is because historical nodes have maxSizes that are too small, making them incapable of downloading more data. You can change that with (for example): +You can check the coordinator console located at `:`. Make sure that your segments have actually loaded on [historical nodes](Historical.html). If your segments are not present, check the coordinator logs for messages about capacity of replication errors. One reason that segments are not downloaded is because historical nodes have maxSizes that are too small, making them incapable of downloading more data. You can change that with (for example): ``` -Ddruid.segmentCache.locations=[{"path":"/tmp/druid/storageLocation","maxSize":"500000000000"}] @@ -69,4 +59,4 @@ There are a few ways this can occur. Druid will throttle ingestion to prevent ou ## More information -Getting data into Druid can definitely be difficult for first time users. Please don't hesitate to ask questions in our IRC channel or on our [google groups page](https://groups.google.com/forum/#!forum/druid-development). +Getting data into Druid can definitely be difficult for first time users. Please don't hesitate to ask questions in our IRC channel or on our [google groups page](https://groups.google.com/forum/#!forum/druid-user). diff --git a/docs/content/Libraries.md b/docs/content/Libraries.md index a0b5d7d55af..d85326ce32f 100644 --- a/docs/content/Libraries.md +++ b/docs/content/Libraries.md @@ -2,14 +2,14 @@ layout: doc_page --- -Community Libraries -------------------- +Community Query Libraries +------------------------- Some great folks have written their own libraries to interact with Druid -#### Ruby +#### Node.js -* [madvertise/ruby-druid](https://github.com/madvertise/ruby-druid) - A ruby client for Druid +* [7eggs/node-druid-query](https://github.com/7eggs/node-druid-query) - A Node.js client for Druid #### Python @@ -17,13 +17,19 @@ Some great folks have written their own libraries to interact with Druid #### R -- [RDruid](https://github.com/metamx/RDruid) - Druid connector for R +* [metamx/RDruid](https://github.com/metamx/RDruid) - Druid connector for R -#### Node.js +#### Ruby -- [7eggs/node-druid-query](https://github.com/7eggs/node-druid-query) - A Node.js client for Druid +* [madvertise/ruby-druid](https://github.com/madvertise/ruby-druid) - A ruby client for Druid -#### Helper Libraries +#### SQL + +* [srikalyc/Sql4D](https://github.com/srikalyc/Sql4D) - A SQL client for Druid. Used in production at Yahoo. + + +Community Helper Libraries +-------------------------- * [madvertise/druid-dumbo](https://github.com/madvertise/druid-dumbo) - Scripts to help generate batch configs for the ingestion of data into Druid * [housejester/druid-test-harness](https://github.com/housejester/druid-test-harness) - A set of scripts to simplify standing up some servers and seeing how things work diff --git a/docs/content/Metrics.md b/docs/content/Metrics.md new file mode 100644 index 00000000000..4f35587eb23 --- /dev/null +++ b/docs/content/Metrics.md @@ -0,0 +1,13 @@ +--- +layout: doc_page +--- +# Druid Metrics + +Druid emits a variety of query, ingestion, and coordination metrics. + +Metrics can be displayed in the runtime log file or over http (to a service such as Apache Kafka). Metric emission is disabled by default. + +Available Metrics +----------------- + +Please refer to the following [spreadsheet](https://docs.google.com/spreadsheets/d/15XxGrGv2ggdt4SnCoIsckqUNJBZ9ludyhC9hNQa3l-M/edit#gid=0) for available Druid metrics. In the near future, we will be reworking Druid metrics and fully listing out important metrics to monitor and their expected values. diff --git a/docs/content/Middlemanager.md b/docs/content/Middlemanager.md index b3347d87f2d..0212867a69f 100644 --- a/docs/content/Middlemanager.md +++ b/docs/content/Middlemanager.md @@ -5,52 +5,23 @@ layout: doc_page Middle Manager Node ------------------ +For Middlemanager Node Configuration, see [Indexing Service Configuration](Indexing-Service-Config.html). + The middle manager node is a worker node that executes submitted tasks. Middle Managers forward tasks to peons that run in separate JVMs. -The reason we have separate JVMs for tasks is for log isolation. Each [Peon](Peons.html) is capable of running only one task at a time, however, a middle manager may have multiple peons. +The reason we have separate JVMs for tasks is for resource and log isolation. Each [Peon](Peons.html) is capable of running only one task at a time, however, a middle manager may have multiple peons. -Quick Start ------------------- - -#### Running +Running +------- ``` io.druid.cli.Main server middleManager ``` -With the following JVM configuration: +HTTP Endpoints +-------------- -``` --Duser.timezone=UTC --Dfile.encoding=UTF-8 +### GET --Ddruid.host=localhost --Ddruid.port=8091 --Ddruid.service=middleManager - --Ddruid.zk.service.host=localhost - --Ddruid.metadata.storage.connector.connectURI=jdbc:mysql://localhost:3306/druid --Ddruid.metadata.storage.connector.user=druid --Ddruid.metadata.storage.connector.password=diurd --Ddruid.selectors.indexing.serviceName=overlord --Ddruid.indexer.runner.startPort=8092 --Ddruid.indexer.fork.property.druid.computation.buffer.size=268435456 -``` - -#### JVM Configuration - -Middle managers pass their configurations down to their child peons. The middle manager module requires the following configs: - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.worker.ip`|The IP of the worker.|localhost| -|`druid.worker.version`|Version identifier for the middle manager.|0| -|`druid.worker.capacity`|Maximum number of tasks the middle manager can accept.|Number of available processors - 1| -|`druid.indexer.runner.compressZnodes`|Indicates whether or not the middle managers should compress Znodes.|false| -|`druid.indexer.runner.maxZnodeBytes`|The maximum size Znode in bytes that can be created in Zookeeper.|524288| -|`druid.indexer.runner.javaCommand`|Command required to execute java.|java| -|`druid.indexer.runner.javaOpts`|-X Java options to run the peon in its own JVM.|""| -|`druid.indexer.runner.classpath`|Java classpath for the peon.|System.getProperty("java.class.path")| -|`druid.indexer.runner.startPort`|The port that peons begin running on.|8100| -|`druid.indexer.runner.allowedPrefixes`|Whitelist of prefixes for configs that can be passed down to child peons.|"com.metamx", "druid", "io.druid", "user.timezone","file.encoding"| +* `/status` +Returns the Druid version, loaded extensions, memory used, total memory and other useful information about the node. diff --git a/docs/content/Peons.md b/docs/content/Peons.md index 87b99fad362..6d024bed978 100644 --- a/docs/content/Peons.md +++ b/docs/content/Peons.md @@ -4,42 +4,20 @@ layout: doc_page Peons ----- + +For Peon Configuration, see [Peon Configuration](Indexing-Service-Config.html). + Peons run a single task in a single JVM. MiddleManager is responsible for creating Peons for running tasks. Peons should rarely (if ever for testing purposes) be run on their own. -#### JVM Configuration -Although peons inherit the configurations of their parent middle managers, explicit child peon configs in middlemanager can be set by prefixing them with: +Running +------- -``` -druid.indexer.fork.property -``` -Additional peon configs include: - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.peon.mode`|Choices are "local" and "remote". Setting this to local means you intend to run the peon as a standalone node (Not recommended).|remote| -|`druid.indexer.task.baseDir`|Base temporary working directory.|/tmp| -|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|/tmp/persistent/tasks| -|`druid.indexer.task.hadoopWorkingPath`|Temporary working directory for Hadoop tasks.|/tmp/druid-indexing| -|`druid.indexer.task.defaultRowFlushBoundary`|Highest row count before persisting to disk. Used for indexing generating tasks.|50000| -|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client:2.3.0| -|`druid.indexer.task.chathandler.type`|Choices are "noop" and "announce". Certain tasks will use service discovery to announce an HTTP endpoint that events can be posted to.|noop| - -If the peon is running in remote mode, there must be an overlord up and running. Running peons in remote mode require the following configurations: - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.peon.taskActionClient.retry.minWait`|The minimum retry time to communicate with overlord.|PT1M| -|`druid.peon.taskActionClient.retry.maxWait`|The maximum retry time to communicate with overlord.|PT10M| -|`druid.peon.taskActionClient.retry.maxRetryCount`|The maximum number of retries to communicate with overlord.|10| - -#### Running - -The peon should very rarely ever be run independent of the middle manager. +The peon should very rarely ever be run independent of the middle manager unless for development purposes. ``` io.druid.cli.Main internal peon ``` The task file contains the task JSON object. -The status file indicates where the task status will be output. \ No newline at end of file +The status file indicates where the task status will be output. diff --git a/docs/content/Performance-FAQ.md b/docs/content/Performance-FAQ.md index 3a574c21777..8860f75cb1f 100644 --- a/docs/content/Performance-FAQ.md +++ b/docs/content/Performance-FAQ.md @@ -2,13 +2,18 @@ layout: doc_page --- +## I can't match your benchmarked results +Improper configuration is by far the largest problem we see people trying to deploy Druid. The example configurations listed in the tutorials are designed for a small volume of data where all nodes are on a single machine. The configs are extremely poor for actual production use. + ## What should I set my JVM heap? The size of the JVM heap really depends on the type of Druid node you are running. Below are a few considerations. -[Broker nodes](Broker.html) uses the JVM heap mainly to merge results from historicals and real-times. Brokers also use off-heap memory and processing threads for groupBy queries. +[Broker nodes](Broker.html) uses the JVM heap mainly to merge results from historicals and real-times. Brokers also use off-heap memory and processing threads for groupBy queries. We recommend 20G-30G of heap here. [Historical nodes](Historical.html) use off-heap memory to store intermediate results, and by default, all segments are memory mapped before they can be queried. Typically, the more memory is available on a historical node, the more segments can be served without the possibility of data being paged on to disk. On historicals, the JVM heap is used for [GroupBy queries](GroupByQuery.html), some data structures used for intermediate computation, and general processing. One way to calculate how much space there is for segments is: memory_for_segments = total_memory - heap - direct_memory - jvm_overhead. +We recommend 250mb * (processing.numThreads) for the heap. + [Coordinator nodes](Coordinator nodes) do not require off-heap memory and the heap is used for loading information about all segments to determine what segments need to be loaded, dropped, moved, or replicated. ## What is the intermediate computation buffer? @@ -19,6 +24,7 @@ Server maxSize sets the maximum cumulative segment size (in bytes) that a node c ## My logs are really chatty, can I set them to asynchronously write? Yes, using a `log4j2.xml` similar to the following causes some of the more chatty classes to write asynchronously: + ``` @@ -46,4 +52,4 @@ Yes, using a `log4j2.xml` similar to the following causes some of the more chatt -``` \ No newline at end of file +``` diff --git a/docs/content/Realtime.md b/docs/content/Realtime.md index 5085122f426..acfdaea265d 100644 --- a/docs/content/Realtime.md +++ b/docs/content/Realtime.md @@ -21,7 +21,7 @@ The segment propagation diagram for real-time data ingestion can be seen below: ![Segment Propagation](../img/segmentPropagation.png "Segment Propagation") -You can read about the various components shown in this diagram under the Architecture section (see the menu on the left). +You can read about the various components shown in this diagram under the Architecture section (see the menu on the left). Note that some of the names are now outdated. ### Firehose diff --git a/docs/content/Segments.md b/docs/content/Segments.md index eeefd5b80d3..509b69ca7cb 100644 --- a/docs/content/Segments.md +++ b/docs/content/Segments.md @@ -4,8 +4,6 @@ layout: doc_page Segments ======== -Segments are the fundamental structure to store data in Druid. [Historical](Historical.html) and [Realtime](Realtime.html) nodes load and serve segments for querying. To construct segments, Druid will always shard data by a time partition. Data may be further sharded based on dimension cardinality and row count. - The latest Druid segment version is `v9`. Naming Convention diff --git a/docs/content/Tutorial:-A-First-Look-at-Druid.md b/docs/content/Tutorial:-A-First-Look-at-Druid.md index 0928b23d1bd..86073070ba0 100644 --- a/docs/content/Tutorial:-A-First-Look-at-Druid.md +++ b/docs/content/Tutorial:-A-First-Look-at-Druid.md @@ -278,4 +278,4 @@ Additional Information This tutorial is merely showcasing a small fraction of what Druid can do. If you are interested in more information about Druid, including setting up a more sophisticated Druid cluster, read more of the Druid documentation and blogs found on druid.io. -Hopefully you learned a thing or two about Druid real-time ingestion, querying Druid, and how Druid can be used to solve problems. If you have additional questions, feel free to post in our [google groups page](https://groups.google.com/forum/#!forum/druid-development). +Hopefully you learned a thing or two about Druid real-time ingestion, querying Druid, and how Druid can be used to solve problems. If you have additional questions, feel free to post in our [google groups page](https://groups.google.com/forum/#!forum/druid-user). diff --git a/docs/content/Tutorial:-Loading-Batch-Data.md b/docs/content/Tutorial:-Loading-Batch-Data.md index 0425c1c4d1d..0819f245bf5 100644 --- a/docs/content/Tutorial:-Loading-Batch-Data.md +++ b/docs/content/Tutorial:-Loading-Batch-Data.md @@ -334,4 +334,4 @@ We demonstrated using the indexing service as a way to ingest data into Druid. P Additional Information ---------------------- -Getting data into Druid can definitely be difficult for first time users. Please don't hesitate to ask questions in our IRC channel or on our [google groups page](https://groups.google.com/forum/#!forum/druid-development). +Getting data into Druid can definitely be difficult for first time users. Please don't hesitate to ask questions in our IRC channel or on our [google groups page](https://groups.google.com/forum/#!forum/druid-user). diff --git a/docs/content/Tutorial:-Loading-Streaming-Data.md b/docs/content/Tutorial:-Loading-Streaming-Data.md index 0af0e6c9dec..f5e6d803ef1 100644 --- a/docs/content/Tutorial:-Loading-Streaming-Data.md +++ b/docs/content/Tutorial:-Loading-Streaming-Data.md @@ -149,5 +149,5 @@ Druid solved the availability problem by switching from a pull-based model to a Additional Information ---------------------- -Getting data into Druid can definitely be difficult for first time users. Please don't hesitate to ask questions in our IRC channel or on our [google groups page](https://groups.google.com/forum/#!forum/druid-development). +Getting data into Druid can definitely be difficult for first time users. Please don't hesitate to ask questions in our IRC channel or on our [google groups page](https://groups.google.com/forum/#!forum/druid-user). diff --git a/docs/content/ZooKeeper.md b/docs/content/ZooKeeper.md index d9cb8e560cc..20ef93b1f91 100644 --- a/docs/content/ZooKeeper.md +++ b/docs/content/ZooKeeper.md @@ -10,49 +10,6 @@ Druid uses [ZooKeeper](http://zookeeper.apache.org/) (ZK) for management of curr 4. [Overlord](Indexing-Service.html) leader election 5. [Indexing Service](Indexing-Service.html) task management -### Property Configuration - -ZooKeeper paths are set via the `runtime.properties` configuration file. Druid will automatically create paths that do not exist, so typos in config files is a very easy way to become split-brained. - - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.zk.service.host`|The ZooKeeper hosts to connect to. This is a REQUIRED property and therefore a host address must be supplied.|none| -|`druid.zk.service.sessionTimeoutMs`|ZooKeeper session timeout, in milliseconds.|`30000`| -|`druid.curator.compress`|Boolean flag for whether or not created Znodes should be compressed.|`false`| - -### Path Configuration -Druid interacts with ZK through a set of standard path configurations. We recommend just setting the base ZK path, but all ZK paths that Druid uses can be overwritten to absolute paths. - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.zk.paths.base`|Base Zookeeper path.|`/druid`| -|`druid.zk.paths.propertiesPath`|Zookeeper properties path.|`${druid.zk.paths.base}/properties`| -|`druid.zk.paths.announcementsPath`|Druid node announcement path.|`${druid.zk.paths.base}/announcements`| -|`druid.zk.paths.liveSegmentsPath`|Current path for where Druid nodes announce their segments.|`${druid.zk.paths.base}/segments`| -|`druid.zk.paths.loadQueuePath`|Entries here cause historical nodes to load and drop segments.|`${druid.zk.paths.base}/loadQueue`| -|`druid.zk.paths.coordinatorPath`|Used by the coordinator for leader election.|`${druid.zk.paths.base}/coordinator`| -|`druid.zk.paths.servedSegmentsPath`|@Deprecated. Legacy path for where Druid nodes announce their segments.|`${druid.zk.paths.base}/servedSegments`| - -The indexing service also uses its own set of paths. These configs can be included in the common configuration. - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.zk.paths.indexer.base`|Base zookeeper path for |`${druid.zk.paths.base}/indexer`| -|`druid.zk.paths.indexer.announcementsPath`|Middle managers announce themselves here.|`${druid.zk.paths.indexer.base}/announcements`| -|`druid.zk.paths.indexer.tasksPath`|Used to assign tasks to middle managers.|`${druid.zk.paths.indexer.base}/tasks`| -|`druid.zk.paths.indexer.statusPath`|Parent path for announcement of task statuses.|`${druid.zk.paths.indexer.base}/status`| -|`druid.zk.paths.indexer.leaderLatchPath`|Used for Overlord leader election.|`${druid.zk.paths.indexer.base}/leaderLatchPath`| - -If `druid.zk.paths.base` and `druid.zk.paths.indexer.base` are both set, and none of the other `druid.zk.paths.*` or `druid.zk.paths.indexer.*` values are set, then the other properties will be evaluated relative to their respective `base`. -For example, if `druid.zk.paths.base` is set to `/druid1` and `druid.zk.paths.indexer.base` is set to `/druid2` then `druid.zk.paths.announcementsPath` will default to `/druid1/announcements` while `druid.zk.paths.indexer.announcementsPath` will default to `/druid2/announcements`. - -The following path is used service discovery and are **not** affected by `druid.zk.paths.base` and **must** be specified separately. - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.discovery.curator.path`|Services announce themselves under this ZooKeeper path.|`/druid/discovery`| - ### Coordinator Leader Election We use the Curator LeadershipLatch recipe to do leader election at path diff --git a/docs/content/index.md b/docs/content/index.md index 811ca3f6bd1..d4d3496e133 100644 --- a/docs/content/index.md +++ b/docs/content/index.md @@ -4,7 +4,7 @@ layout: doc_page # About Druid -Druid is an open-source analytics data store designed for real-time exploratory queries on large-scale data sets (trillions of events, petabytes of data). Druid provides cost-effective and always-on realtime data ingestion and arbitrary data exploration. +Druid is an open-source analytics data store designed for OLAP queries on timeseries data (trillions of events, petabytes of data). Druid provides cost-effective and always-on real-time data ingestion, arbitrary data exploration, and fast data aggregation. - Try out Druid with our Getting Started [Tutorial](./Tutorial%3A-A-First-Look-at-Druid.html) - Learn more by reading the [White Paper](http://static.druid.io/docs/druid.pdf) @@ -13,7 +13,7 @@ Key Features ------------ - **Designed for Analytics** - Druid is built for exploratory analytics for OLAP workflows. It supports a variety of filters, aggregators and query types and provides a framework for plugging in new functionality. Users have leveraged Druid’s infrastructure to develop features such as top K queries and histograms. -- **Interactive Queries** - Druid’s low-latency data ingestion architecture allows events to be queried milliseconds after they are created. Druid’s query latency is optimized by reading and scanning only exactly what is needed. Aggregate and filter on data without sitting around waiting for results. +- **Interactive Queries** - Druid’s low-latency data ingestion architecture allows events to be queried milliseconds after they are created. Druid’s query latency is optimized by reading and scanning only exactly what is needed. Aggregate and filter on data without sitting around waiting for results. Druid is ideal for powering analytic dashboards. - **Highly Available** - Druid is used to back SaaS implementations that need to be up all the time. Your data is still available and queryable during system updates. Scale up or down without data loss. - **Scalable** - Existing Druid deployments handle billions of events and terabytes of data per day. Druid is designed to be petabyte scale. @@ -21,7 +21,7 @@ Key Features Why Druid? ---------- -Druid was originally created to resolve query latency issues seen with trying to use Hadoop to power an interactive service. It's especially useful if you are summarizing your data sets and then querying the summarizations. Put your summarizations into Druid and get quick queryability out of a system that you can be confident will scale up as your data volumes increase. Deployments have scaled up to 2TB of data per hour at peak ingested and aggregated in real-time. +Druid was originally created to resolve query latency issues seen with trying to use Hadoop to power an interactive service. It's especially useful if you are summarizing your data sets and then querying the summarizations. Put your summarizations into Druid and get quick queryability out of a system that you can be confident will scale up as your data volumes increase. Deployments have scaled up to many TBs of data per hour at peak ingested and aggregated in real-time. Druid is a system that you can set up in your organization next to Hadoop. It provides the ability to access your data in an interactive slice-and-dice fashion. It trades off some query flexibility and takes over the storage format in order to provide the speed. @@ -31,11 +31,16 @@ We have more details about the general design of the system and why you might wa When Druid? ---------- -* You need to do interactive, fast, exploration on large amounts of data -* You need analytics (not a key-value store) +* You need to do interactive aggregations and fast exploration on large amounts of data +* You need ad-hoc analytic queries (not a key-value store) * You have a lot of data (10s of billions of events added per day, 10s of TB of data added per day) * You want to do your analysis on data as it’s happening (in real-time) -* You need a data store that is always available, 24x7x365, and years into the future. +* You need a data store that is always available with no single point of failure. + +Architecture Overview +--------------------- + +Druid is partially inspired by search infrastructure and creates mostly immutable views of data and stores the data in data structures highly optimized for aggregations and filters. A Druid cluster is composed of various types of nodes, each designed to do a small set of things very well. Druid vs… ---------- @@ -48,7 +53,6 @@ Druid vs… * [Druid-vs-Spark](Druid-vs-Spark.html) * [Druid-vs-Elasticsearch](Druid-vs-Elasticsearch.html) - About This Page ---------- The data infrastructure world is vast, confusing and constantly in flux. This page is meant to help potential evaluators decide whether Druid is a good fit for the problem one needs to solve. If anything about it is incorrect please provide that feedback on the mailing list or via some other means so we can fix it. diff --git a/docs/content/toc.textile b/docs/content/toc.textile index 496a4148064..eefa1aa3a16 100644 --- a/docs/content/toc.textile +++ b/docs/content/toc.textile @@ -5,6 +5,7 @@ h2. Introduction * "About Druid":./ +* "Design":./Design.html * "Concepts and Terminology":./Concepts-and-Terminology.html h2. Getting Started @@ -41,7 +42,7 @@ h2. Data Ingestion h2. Operations * "Performance FAQ":./Performance-FAQ.html * "Extending Druid":./Modules.html -* "Booting a Production Cluster":./Booting-a-production-cluster.html +* "Druid Metrics":./Metrics.html h2. Querying * "Querying":./Querying.html diff --git a/examples/config/_common/common.runtime.properties b/examples/config/_common/common.runtime.properties index 6beed3fa7d3..98d84f6656f 100644 --- a/examples/config/_common/common.runtime.properties +++ b/examples/config/_common/common.runtime.properties @@ -15,7 +15,7 @@ # limitations under the License. # -# Extensions +# Extensions (no deep storage model is listed - using local fs for deep storage - not recommended for production) druid.extensions.coordinates=["io.druid.extensions:druid-examples","io.druid.extensions:druid-kafka-eight","io.druid.extensions:mysql-metadata-storage"] # Zookeeper @@ -38,8 +38,8 @@ druid.cache.sizeInBytes=10000000 # Indexing service discovery druid.selectors.indexing.serviceName=overlord -# Monitoring (disabled for examples) +# Monitoring (disabled for examples, if you enable SysMonitor, make sure to include sigar jar in your cp) # druid.monitoring.monitors=["com.metamx.metrics.SysMonitor","com.metamx.metrics.JvmMonitor"] -# Metrics logging (disabled for examples) +# Metrics logging (disabled for examples - change this to logging or http in production) druid.emitter=noop diff --git a/examples/config/broker/runtime.properties b/examples/config/broker/runtime.properties index 3d7126c1824..068e4577759 100644 --- a/examples/config/broker/runtime.properties +++ b/examples/config/broker/runtime.properties @@ -15,6 +15,7 @@ # limitations under the License. # +# Default host: localhost. Default port: 8082. If you run each node type on its own node in production, you should override these values to be IP:8080 #druid.host=localhost #druid.port=8082 druid.service=broker @@ -23,6 +24,6 @@ druid.service=broker druid.broker.cache.useCache=true druid.broker.cache.populateCache=true -# Bump these up only for faster nested groupBy +# For prod: set numThreads = # cores - 1, and sizeBytes to 512mb druid.processing.buffer.sizeBytes=100000000 druid.processing.numThreads=1 diff --git a/examples/config/coordinator/runtime.properties b/examples/config/coordinator/runtime.properties index f5ffaea185f..6a3734c9959 100644 --- a/examples/config/coordinator/runtime.properties +++ b/examples/config/coordinator/runtime.properties @@ -15,10 +15,12 @@ # limitations under the License. # +# Default host: localhost. Default port: 8081. If you run each node type on its own node in production, you should override these values to be IP:8080 #druid.host=localhost #druid.port=8081 druid.service=coordinator # The coordinator begins assignment operations after the start delay. # We override the default here to start things up faster for examples. +# In production you should use PT5M or PT10M druid.coordinator.startDelay=PT70s diff --git a/examples/config/historical/runtime.properties b/examples/config/historical/runtime.properties index 29c50d0e88f..97625b046ec 100644 --- a/examples/config/historical/runtime.properties +++ b/examples/config/historical/runtime.properties @@ -15,14 +15,22 @@ # limitations under the License. # +# Default host: localhost. Default port: 8083. If you run each node type on its own node in production, you should override these values to be IP:8080 #druid.host=localhost #druid.port=8083 druid.service=historical -# We can only 1 scan segment in parallel with these configs. + # Our intermediate buffer is also very small so longer topNs will be slow. +# In prod: set sizeBytes = 512mb druid.processing.buffer.sizeBytes=100000000 +# We can only 1 scan segment in parallel with these configs. +# In prod: set numThreads = # cores - 1 druid.processing.numThreads=1 +# maxSize should reflect the performance you want. +# Druid memory maps segments. +# memory_for_segments = total_memory - heap_size - (processing.buffer.sizeBytes * (processing.numThreads+1)) - JVM overhead (~1G) +# The greater the memory/disk ratio, the better performance you should see druid.segmentCache.locations=[{"path": "/tmp/druid/indexCache", "maxSize"\: 10000000000}] druid.server.maxSize=10000000000 diff --git a/examples/config/overlord/runtime.properties b/examples/config/overlord/runtime.properties index 15229459b6f..38be3244cd7 100644 --- a/examples/config/overlord/runtime.properties +++ b/examples/config/overlord/runtime.properties @@ -15,12 +15,17 @@ # limitations under the License. # +# Default host: localhost. Default port: 8090. If you run each node type on its own node in production, you should override these values to be IP:8080 #druid.host=localhost #druid.port=8090 druid.service=overlord # Run the overlord in local mode with a single peon to execute tasks +# This is not recommended for production. druid.indexer.queue.startDelay=PT0M +# This setting is too small for real production workloads druid.indexer.runner.javaOpts="-server -Xmx256m" +# These settings are also too small for real production workloads +# Please see our recommended production settings in the docs (http://druid.io/docs/latest/Production-Cluster-Configuration.html) druid.indexer.fork.property.druid.processing.numThreads=1 druid.indexer.fork.property.druid.computation.buffer.size=100000000 diff --git a/examples/config/realtime/runtime.properties b/examples/config/realtime/runtime.properties index fc4365ccf1a..2e89c391478 100644 --- a/examples/config/realtime/runtime.properties +++ b/examples/config/realtime/runtime.properties @@ -15,14 +15,16 @@ # limitations under the License. # +# Default host: localhost. Default port: 8084. If you run each node type on its own node in production, you should override these values to be IP:8080 #druid.host=localhost #druid.port=8084 druid.service=realtime # We can only 1 scan segment in parallel with these configs. # Our intermediate buffer is also very small so longer topNs will be slow. +# In production sizeBytes should be 512mb, and numThreads should be # cores - 1 druid.processing.buffer.sizeBytes=100000000 -druid.processing.numThreads=2 +druid.processing.numThreads=1 # Enable Real monitoring # druid.monitoring.monitors=["com.metamx.metrics.SysMonitor","com.metamx.metrics.JvmMonitor","io.druid.segment.realtime.RealtimeMetricsMonitor"] diff --git a/server/src/main/java/io/druid/server/initialization/ZkPathsConfig.java b/server/src/main/java/io/druid/server/initialization/ZkPathsConfig.java index e3d75c788d1..4b0ae76a8d6 100644 --- a/server/src/main/java/io/druid/server/initialization/ZkPathsConfig.java +++ b/server/src/main/java/io/druid/server/initialization/ZkPathsConfig.java @@ -31,7 +31,7 @@ public class ZkPathsConfig @JsonProperty private String announcementsPath; - @JsonProperty + @JsonProperty @Deprecated private String servedSegmentsPath; @JsonProperty @@ -62,6 +62,7 @@ public class ZkPathsConfig return (null == announcementsPath) ? defaultPath("announcements") : announcementsPath; } + @Deprecated public String getServedSegmentsPath() { return (null == servedSegmentsPath) ? defaultPath("servedSegments") : servedSegmentsPath;