---
layout: doc_page
---
Broker Node Configuration
=========================
For general Broker Node information, see [here](../design/broker.html).

Runtime Configuration
---------------------

The broker node uses several of the global configs in [Configuration](../configuration/index.html) and has the following set of configurations as well:

### Node Configs

|Property|Description|Default|
|--------|-----------|-------|
|`druid.host`|The host for the current node. This is used to advertise the current process's location as reachable from another node and should generally be specified such that `http://${druid.host}/` could actually talk to this process.|InetAddress.getLocalHost().getCanonicalHostName()|
|`druid.port`|This is the port to actually listen on; unless port mapping is used, this will be the same port as the one on `druid.host`.|8082|
|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services|druid/broker|

### Query Configs

#### Query Prioritization

|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.broker.balancer.type`|`random`, `connectionCount`|Determines how the broker balances connections to historical nodes. `random` chooses a node at random; `connectionCount` picks the node with the fewest active connections.|`random`|
|`druid.broker.select.tier`|`highestPriority`, `lowestPriority`, `custom`|If segments are cross-replicated across tiers in a cluster, you can tell the broker to prefer to select segments in a tier with a certain priority.|`highestPriority`|
|`druid.broker.select.tier.custom.priorities`|An array of integer priorities.|Select servers in tiers with a custom priority list.|None|
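
The two `druid.broker.balancer.type` strategies can be illustrated with a minimal sketch. This is not the broker's actual implementation; the function `pick_server` and the server-name-to-connection-count mapping are hypothetical names used only for illustration.

```python
import random


def pick_server(active_connections, balancer_type="random", rng=random):
    """Pick one server name from a mapping of name -> active connection count.

    Hypothetical helper: it only mirrors the documented behavior of the
    `random` and `connectionCount` balancer types.
    """
    if not active_connections:
        raise ValueError("no servers available")
    if balancer_type == "random":
        # Ignore the connection counts and choose uniformly at random.
        return rng.choice(list(active_connections))
    if balancer_type == "connectionCount":
        # Choose the server with the fewest active connections.
        return min(active_connections, key=active_connections.get)
    raise ValueError("unknown balancer type: " + balancer_type)


# Made-up connection counts for three historical nodes.
servers = {"historical-a": 3, "historical-b": 1, "historical-c": 2}
least_loaded = pick_server(servers, "connectionCount")
random_choice = pick_server(servers, "random")
```

With `connectionCount`, a node that already holds many open connections is avoided, whereas `random` spreads queries without looking at current load.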

#### Concurrent Requests

Druid uses Jetty to serve HTTP requests.

|Property|Description|Default|
|--------|-----------|-------|
|`druid.server.http.numThreads`|Number of threads for HTTP requests.|10|
|`druid.server.http.maxIdleTime`|The Jetty max idle time for a connection.|PT5m|
|`druid.server.http.defaultQueryTimeout`|Query timeout in milliseconds, beyond which unfinished queries will be cancelled.|300000|
|`druid.broker.http.numConnections`|Size of connection pool for the Broker to connect to historical and real-time processes. If there are more queries than this number that all need to speak to the same node, then they will queue up.|20|
|`druid.broker.http.compressionCodec`|Compression codec the Broker uses to communicate with historical and real-time processes. May be "gzip" or "identity".|gzip|
|`druid.broker.http.readTimeout`|The timeout for data reads from historical and real-time processes.|PT15M|

#### Retry Policy

The Druid broker can optionally retry queries internally for transient errors.

|Property|Description|Default|
|--------|-----------|-------|
|`druid.broker.retryPolicy.numTries`|Number of tries.|1|

#### Processing

The broker uses processing configs for nested groupBy queries. Optionally, long-interval queries (of any type) can also be broken into shorter-interval queries and processed in parallel inside this thread pool. For more details, see "chunkPeriod" in the [Query Context](../querying/query-context.html) doc.

|Property|Description|Default|
|--------|-----------|-------|
|`druid.processing.buffer.sizeBytes`|This specifies a buffer size for the storage of intermediate results. The computation engine in both the Historical and Realtime nodes will use a scratch buffer of this size to do all of their intermediate computations off-heap. Larger values allow for more aggregations in a single pass over the data while smaller values can require more passes depending on the query that is being executed.|1073741824 (1GB)|
|`druid.processing.buffer.poolCacheMaxCount`|The processing buffer pool caches buffers for later use; this is the maximum number of buffers the cache will grow to. Note that the pool can create more buffers than it can cache if necessary.|Integer.MAX_VALUE|
|`druid.processing.formatString`|Realtime and historical nodes use this format string to name their processing threads.|processing-%s|
|`druid.processing.numMergeBuffers`|The number of direct memory buffers available for merging query results. The buffers are sized by `druid.processing.buffer.sizeBytes`. This property is effectively a concurrency limit for queries that require merging buffers. If you are using any queries that require merge buffers (currently, just groupBy v2) then you should have at least two of these.|`max(2, druid.processing.numThreads / 4)`|
|`druid.processing.numThreads`|The number of processing threads to have available for parallel processing of segments. Our rule of thumb is `num_cores - 1`, which means that even under heavy load there will still be one core available to do background tasks like talking with ZooKeeper and pulling down segments. If only one core is available, this property defaults to the value `1`.|Number of cores - 1 (or 1)|
|`druid.processing.columnCache.sizeBytes`|Maximum size in bytes for the dimension value lookup cache. Any value greater than `0` enables the cache. It is currently disabled by default. Enabling the lookup cache can significantly improve the performance of aggregators operating on dimension values, such as the JavaScript aggregator, or cardinality aggregator, but can slow things down if the cache hit rate is low (i.e. dimensions with few repeating values). Enabling it may also require additional garbage collection tuning to avoid long GC pauses.|`0` (disabled)|
|`druid.processing.fifo`|Whether the processing queue should treat tasks of equal priority in a FIFO manner.|`false`|
|`druid.processing.tmpDir`|Path where temporary files created while processing a query should be stored. If specified, this configuration takes priority over the default `java.io.tmpdir` path.|path represented by `java.io.tmpdir`|

The amount of direct memory needed by Druid is at least
`druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)`. You can
ensure at least this amount of direct memory is available by providing `-XX:MaxDirectMemorySize=<VALUE>` at the command
line.
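
As a worked example of the formula above, the following sketch computes the minimum direct memory for a hypothetical 8-core machine using the documented defaults. The helper names are illustrative, and the sketch assumes the `/` in the default `max(2, druid.processing.numThreads / 4)` is integer division.

```python
def default_num_merge_buffers(num_threads):
    # Documented default: max(2, druid.processing.numThreads / 4).
    # Integer division is an assumption made for this sketch.
    return max(2, num_threads // 4)


def min_direct_memory_bytes(buffer_size_bytes, num_merge_buffers, num_threads):
    # sizeBytes * (numMergeBuffers + numThreads + 1), as stated above.
    return buffer_size_bytes * (num_merge_buffers + num_threads + 1)


# Hypothetical 8-core machine: numThreads defaults to cores - 1.
num_threads = 8 - 1
buffer_size_bytes = 1073741824  # default druid.processing.buffer.sizeBytes (1GB)
num_merge_buffers = default_num_merge_buffers(num_threads)

required_bytes = min_direct_memory_bytes(
    buffer_size_bytes, num_merge_buffers, num_threads
)
required_gib = required_bytes // (1024 ** 3)
print(required_bytes, required_gib)
```

Under these assumptions the result is 10 buffers of 1GB each, so a setting such as `-XX:MaxDirectMemorySize=10g` would be the minimum for this configuration.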

#### General Query Configuration

##### GroupBy Query Config

See [groupBy server configuration](../querying/groupbyquery.html#server-configuration).

##### Search Query Config

|Property|Description|Default|
|--------|-----------|-------|
|`druid.query.search.maxSearchLimit`|Maximum number of search results to return.|1000|

##### Segment Metadata Query Config

|Property|Description|Default|
|--------|-----------|-------|
|`druid.query.segmentMetadata.defaultHistory`|When no interval is specified in the query, use a default interval of defaultHistory before the end time of the most recent segment, specified in ISO8601 format. This property also controls the duration of the default interval used by GET /druid/v2/datasources/{dataSourceName} interactions for retrieving datasource dimensions/metrics.|P1W|

### SQL

See [SQL server configuration](../querying/sql.html#configuration).

### Caching

You can optionally enable caching on the broker only by setting the caching configs here.

|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.broker.cache.useCache`|true, false|Enable the cache on the broker.|false|
|`druid.broker.cache.populateCache`|true, false|Populate the cache on the broker.|false|
|`druid.broker.cache.unCacheable`|All druid query types|Query types that should not be cached.|`["groupBy", "select"]`|
|`druid.broker.cache.cacheBulkMergeLimit`|positive integer or 0|Queries with more segments than this number will not attempt to fetch from cache at the broker level, leaving potential caching fetches (and cache result merging) to the historicals.|`Integer.MAX_VALUE`|

See [cache configuration](caching.html) for how to configure cache settings.

### Others

|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.broker.segment.watchedTiers`|List of strings|The broker watches the segment announcements from nodes serving segments to build a cache of which node is serving which segments. This configuration restricts the broker to segments served from a whitelist of tiers. By default, the broker considers all tiers. This can be used to partition your dataSources in specific historical tiers and configure brokers in partitions so that they are only queryable for specific dataSources.|none|
|`druid.broker.segment.watchedDataSources`|List of strings|The broker watches the segment announcements from nodes serving segments to build a cache of which node is serving which segments. This configuration restricts the broker to segments served from a whitelist of dataSources. By default, the broker considers all dataSources. This can be used to configure brokers in partitions so that they are only queryable for specific dataSources.|none|