diff --git a/docs/content/configuration/zookeeper.md b/docs/content/configuration/zookeeper.md new file mode 100644 index 00000000000..0a69c44900c --- /dev/null +++ b/docs/content/configuration/zookeeper.md @@ -0,0 +1,7 @@ +--- +layout: doc_page +--- +Production Zookeeper Configuration +================================== + +You can find great information about configuring and running Zookeeper [here](https://cwiki.apache.org/confluence/display/KAFKA/Operations#Operations-Zookeeper). diff --git a/docs/content/development/libraries.md b/docs/content/development/libraries.md index 7337179fb72..c3d7d188a3f 100644 --- a/docs/content/development/libraries.md +++ b/docs/content/development/libraries.md @@ -53,3 +53,4 @@ UIs --- * [mistercrunch/panoramix](https://github.com/mistercrunch/panoramix) - A web application to slice, dice and visualize data out of Druid +* [grafana](https://github.com/Quantiply/grafana-plugins/tree/master/features/druid) - A plugin for [Grafana](http://grafana.org/) diff --git a/docs/content/ingestion/schema-changes.md b/docs/content/ingestion/schema-changes.md new file mode 100644 index 00000000000..23c83f2bbf8 --- /dev/null +++ b/docs/content/ingestion/schema-changes.md @@ -0,0 +1,61 @@ +--- +layout: doc_page +--- +# Schema Changes + +Schemas for datasources can change at any time and Druid supports different schemas among segments. + +## Replacing Segments + +Druid uniquely +identifies segments using the datasource, interval, version, and partition number. The partition number is only visible in the segment id if +there are multiple segments created for some granularity of time. For example, if you have hourly segments, but you +have more data in an hour than a single segment can hold, you can create multiple segments for the same hour. These segments will share +the same datasource, interval, and version, but have linearly increasing partition numbers. + +``` +foo_2015-01-01/2015-01-02_v1_0 +foo_2015-01-01/2015-01-02_v1_1 +foo_2015-01-01/2015-01-02_v1_2 +``` + +In the example segments above, the dataSource = foo, interval = 2015-01-01/2015-01-02, version = v1, partitionNum = 0. +If at some later point in time, you reindex the data with a new schema, the newly created segments will have a higher version id. + +``` +foo_2015-01-01/2015-01-02_v2_0 +foo_2015-01-01/2015-01-02_v2_1 +foo_2015-01-01/2015-01-02_v2_2 +``` + +Druid batch indexing (either Hadoop-based or IndexTask-based) guarantees atomic updates on an interval-by-interval basis. +In our example, until all `v2` segments for `2015-01-01/2015-01-02` are loaded in a Druid cluster, queries exclusively use `v1` segments. +Once all `v2` segments are loaded and queryable, all queries ignore `v1` segments and switch to the `v2` segments. +Shortly afterwards, the `v1` segments are unloaded from the cluster. + +Note that updates that span multiple segment intervals are only atomic within each interval. They are not atomic across the entire update. +For example, you have segments such as the following: + +``` +foo_2015-01-01/2015-01-02_v1_0 +foo_2015-01-02/2015-01-03_v1_1 +foo_2015-01-03/2015-01-04_v1_2 +``` + +`v2` segments will be loaded into the cluster as soon as they are built and replace `v1` segments for the period of time the +segments overlap. Before v2 segments are completely loaded, your cluster may have a mixture of `v1` and `v2` segments. + +``` +foo_2015-01-01/2015-01-02_v1_0 +foo_2015-01-02/2015-01-03_v2_1 +foo_2015-01-03/2015-01-04_v1_2 +``` + +In this case, queries may hit a mixture of `v1` and `v2` segments. + +## Different Schemas Among Segments + +Druid segments for the same datasource may have different schemas. If a string column (dimension) exists in one segment but not +another, queries that involve both segments still work. Queries for the segment missing the dimension will behave as if the dimension has only null values. +Similarly, if one segment has a numeric column (metric) but another does not, queries on the segment missing the +metric will generally "do the right thing". Aggregations over this missing metric behave as if the metric were missing. diff --git a/docs/content/operations/multitenancy.md b/docs/content/operations/multitenancy.md new file mode 100644 index 00000000000..22185951bc0 --- /dev/null +++ b/docs/content/operations/multitenancy.md @@ -0,0 +1,38 @@ +--- +layout: doc_page +--- +# Multitenancy Considerations + +Druid is often used to power user-facing data applications and has several features built in to better support high +volumes of concurrent queries. + +## Parallelization Model + +Druid's fundamental unit of computation is a [segment](../design/segments.html). Nodes scan segments in parallel and a +given node can scan `druid.processing.numThreads` concurrently. To +process more data in parallel and increase performance, more cores can be added to a cluster. Druid segments +should be sized such that any computation over any given segment should complete in at most 500ms. + +Druid internally stores requests to scan segments in a priority queue. If a given query requires scanning +more segments than the total number of available processors in a cluster, and many similarly expensive queries are concurrently +running, we don't want any query to be starved out. Druid's internal processing logic will scan a set of segments from one query and release resources as soon as the scans complete. +This allows for a second set of segments from another query to be scanned. By keeping segment computation time very small, we ensure +that resources are constantly being yielded, and segments pertaining to different queries are all being processed. + +## Data Distribution + +Druid additionally supports multitenancy by providing configurable means of distributing data. Druid's historical nodes +can be configured into [tiers](../operations/rule-configuration.html), and [rules](../operations/rule-configuration.html) +can be set that determines which segments go into which tiers. One use case of this is that recent data tends to be accessed +more frequently than older data. Tiering enables more recent segments to be hosted on more powerful hardware for better performance. +A second copy of recent segments can be replicated on cheaper hardware (a different tier), and older segments can also be +stored on this tier. + +## Query Distribution + +Druid queries can optionally set a `priority` flag in the [query context](../querying/query-context.html). Queries known to be +slow (download or reporting style queries) can be de-prioritized and more interactive queries can have higher priority. + +Broker nodes can also be dedicated to a given tier. For example, one set of broker nodes can be dedicated to fast interactive queries, +and a second set of broker nodes can be dedicated to slower reporting queries. Druid also provides a [router](../development/router.html) +node that can route queries to different brokers based on various query parameters (datasource, interval, etc.). diff --git a/docs/content/operations/recommendations.md b/docs/content/operations/recommendations.md index 670266fea10..60f2591202e 100644 --- a/docs/content/operations/recommendations.md +++ b/docs/content/operations/recommendations.md @@ -17,11 +17,15 @@ SSDs are highly recommended for historical and real-time nodes if you are not ru Although Druid supports schema-less ingestion of dimensions, because of [https://github.com/druid-io/druid/issues/658](https://github.com/druid-io/druid/issues/658), you may sometimes get bigger segments than necessary. To ensure segments are as compact as possible, providing dimension names in lexicographic order is recommended. - # Use Timeseries and TopN Queries Instead of GroupBy Where Possible Timeseries and TopN queries are much more optimized and significantly faster than groupBy queries for their designed use cases. Issuing multiple topN or timeseries queries from your application can potentially be more efficient than a single groupBy query. +# Segment sizes matter + +Segments should generally be between 300MB-700MB in size. Too many small segments results in inefficient CPU utilizations and +too many large segments impacts query performance, most notably with TopN queries. + # Read FAQs You should read common problems people have here: diff --git a/docs/content/querying/dimensionspecs.md b/docs/content/querying/dimensionspecs.md index ec7ce09388f..94fdddba0cf 100644 --- a/docs/content/querying/dimensionspecs.md +++ b/docs/content/querying/dimensionspecs.md @@ -167,6 +167,7 @@ Example for the `__time` dimension: ### Lookup lookup extraction function Explicit lookups allow you to specify a set of keys and values to use when performing the extraction + ```json { "type":"lookup", diff --git a/docs/content/querying/joins.md b/docs/content/querying/joins.md new file mode 100644 index 00000000000..ec4ffe24ca8 --- /dev/null +++ b/docs/content/querying/joins.md @@ -0,0 +1,35 @@ +--- +layout: doc_page +--- +# Joins + +Druid has limited support for joins through [query-time lookups](../querying/lookups.html). The common use case of +query-time lookups is to replace one dimension value that (e.g. a String ID) with another value (e.g. a human-readable +String value). + +Druid does not yet have full support for joins. Although Druid’s storage format would allow for the implementation +of joins (there is no loss of fidelity for columns included as dimensions), full support for joins have not yet been implemented yet +for the following reasons: + +1. Scaling join queries has been, in our professional experience, +a constant bottleneck of working with distributed databases. +2. The incremental gains in functionality are perceived to be +of less value than the anticipated problems with managing +highly concurrent, join-heavy workloads. + +A join query is essentially the merging of two or more streams of data based on a shared set of keys. The primary +high-level strategies for join queries we are aware of are a hash-based strategy or a +sorted-merge strategy. The hash-based strategy requires that all but +one data set be available as something that looks like a hash table, +a lookup operation is then performed on this hash table for every +row in the “primary” stream. The sorted-merge strategy assumes +that each stream is sorted by the join key and thus allows for the incremental +joining of the streams. Each of these strategies, however, +requires the materialization of some number of the streams either in +sorted order or in a hash table form. + +When all sides of the join are significantly large tables (> 1 billion +records), materializing the pre-join streams requires complex +distributed memory management. The complexity of the memory +management is only amplified by the fact that we are targeting highly +concurrent, multi-tenant workloads. diff --git a/docs/content/querying/lookups.md b/docs/content/querying/lookups.md index 821faf02e19..719b66757d2 100644 --- a/docs/content/querying/lookups.md +++ b/docs/content/querying/lookups.md @@ -1,3 +1,6 @@ +--- +layout: doc_page +--- # Lookups Lookups are a concept in Druid where dimension values are (optionally) replaced with a new value. See [dimension specs](../querying/dimensionspecs.html) for more information. For the purpose of these documents, a "key" refers to a dimension value to match, and a "value" refers to its replacement. So if you wanted to rename `appid-12345` to `Super Mega Awesome App` then the key would be `appid-12345` and the value would be `Super Mega Awesome App`. @@ -36,6 +39,7 @@ The cache is populated in different ways depending on the settings below. In gen ## URI namespace update The remapping values for each namespaced lookup can be specified by json as per + ```json { "type":"uri", @@ -73,6 +77,7 @@ The `namespaceParseSpec` can be one of a number of values. Each of the examples |`valueColumn`|The name of the column containing the value|no|The second column| *example input* + ``` bar,something,foo bat,something2,baz @@ -80,6 +85,7 @@ truck,something3,buck ``` *example namespaceParseSpec* + ```json "namespaceParseSpec": { "format": "csv", @@ -100,6 +106,7 @@ truck,something3,buck *example input* + ``` bar|something,1|foo bat|something,2|baz @@ -107,6 +114,7 @@ truck|something,3|buck ``` *example namespaceParseSpec* + ```json "namespaceParseSpec": { "format": "tsv", @@ -125,6 +133,7 @@ truck|something,3|buck |`valueFieldName`|The field name of the value|yes|null| *example input* + ```json {"key": "foo", "value": "bar", "somethingElse" : "something"} {"key": "baz", "value": "bat", "somethingElse" : "something"} @@ -132,6 +141,7 @@ truck|something,3|buck ``` *example namespaceParseSpec* + ```json "namespaceParseSpec": { "format": "customJson", @@ -153,13 +163,13 @@ The `simpleJson` lookupParseSpec does not take any parameters. It is simply a li ``` *example namespaceParseSpec* + ```json "namespaceParseSpec":{ "type": "simpleJson" } ``` - ## JDBC namespaced lookup The JDBC lookups will poll a database to populate its local cache. If the `tsColumn` is set it must be able to accept comparisons in the format `'2015-01-01 00:00:00'`. For example, the following must be valid sql for the table `SELECT * FROM some_lookup_table WHERE timestamp_column > '2015-01-01 00:00:00'`. If `tsColumn` is set, the caching service will attempt to only poll values that were written *after* the last sync. If `tsColumn` is not set, the entire table is pulled every time. @@ -173,6 +183,7 @@ The JDBC lookups will poll a database to populate its local cache. If the `tsCol |`valueColumn`|The column in `table` which contains the values|Yes|| |`tsColumn`| The column in `table` which contains when the key was updated|No|Not used| |`pollPeriod`|How often to poll the DB|No|0 (only once)| + ```json { "type":"jdbc", @@ -193,6 +204,7 @@ The JDBC lookups will poll a database to populate its local cache. If the `tsCol # Kafka namespaced lookup If you need updates to populate as promptly as possible, it is possible to plug into a kafka topic whose key is the old value and message is the desired new value (both in UTF-8). This requires the following extension: "io.druid.extensions:kafka-extraction-namespace" + ```json { "type":"kafka", diff --git a/docs/content/toc.textile b/docs/content/toc.textile index 2ab4baaa1b8..8f75bd28da2 100644 --- a/docs/content/toc.textile +++ b/docs/content/toc.textile @@ -14,6 +14,7 @@ h2. Data Ingestion * "Data Formats":../ingestion/data-formats.html * "Data Schema":../ingestion/index.html * "Schema Design":../ingestion/schema-design.html +* "Schema Changes":../ingestion/schema-changes.html * "Realtime Ingestion":../ingestion/realtime-ingestion.html * "Batch Ingestion":../ingestion/batch-ingestion.html * "FAQ":../ingestion/faq.html @@ -36,6 +37,7 @@ h2. Querying ** "DimensionSpecs":../querying/dimensionspecs.html ** "Context":../querying/query-context.html * "SQL":../querying/sql.html +* "Joins":../querying/joins.html h2. Design * "Overview":../design/design.html @@ -60,6 +62,7 @@ h2. Operations * "Alerts":../operations/alerts.html * "Updating the Cluster":../operations/rolling-updates.html * "Different Hadoop Versions":../operations/other-hadoop.html +* "Multitenancy Considerations":../operations/multitenancy.html * "Performance FAQ":../operations/performance-faq.html h2. Configuration @@ -73,6 +76,7 @@ h2. Configuration * "Simple Cluster Configuration":../configuration/simple-cluster.html * "Production Cluster Configuration":../configuration/production-cluster.html * "Production Hadoop Configuration":../configuration/hadoop.html +* "Production Zookeeper Configuration":../configuration/zookeeper.html h2. Development * "Libraries":../development/libraries.html