mirror of https://github.com/apache/druid.git
commit 53b6392dce (parent 6e880f8b24)
added titles; fixed a few typos
@@ -1,4 +1,5 @@
 ---
 layout: doc_page
 ---
-Experimental features are features we have developed but have not fully tested in a production environment. If you choose to try them out, there will likely to edge cases that we have not covered. We would love feedback on any of these features, whether they are bug reports, suggestions for improvement, or letting us know they work as intended.
+# About Experimental Features
+Experimental features are features we have developed but have not fully tested in a production environment. If you choose to try them out, there will likely be edge cases that we have not covered. We would love feedback on any of these features, whether they are bug reports, suggestions for improvement, or letting us know they work as intended.
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Aggregations
 Aggregations are specifications of processing over metrics available in Druid.
 Available aggregations are:
 
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Deep Storage
 Deep storage is where segments are stored. It is a storage mechanism that Druid itself does not provide. This deep storage infrastructure defines the level of durability of your data: as long as Druid nodes can see this storage infrastructure and get at the segments stored on it, you will not lose data no matter how many Druid nodes you lose. If segments disappear from this storage layer, then you will lose whatever data those segments represented.
 
 The currently supported types of deep storage follow.
@@ -1,6 +1,8 @@
 ---
 layout: doc_page
 ---
+# Transforming Dimension Values
+
 The following JSON fields can be used in a query to operate on dimension values.
 
 ## DimensionSpec
@@ -8,7 +10,7 @@ layout: doc_page
 
 ### DefaultDimensionSpec
 
-Returns dimension values as is and optionally renames renames the dimension.
+Returns dimension values as is and optionally renames the dimension.
 
 ```json
 { "type" : "default", "dimension" : <dimension>, "outputName": <output_name> }
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Query Filters
 A filter is a JSON object indicating which rows of data should be included in the computation for a query. It’s essentially the equivalent of the WHERE clause in SQL. Druid supports the following types of filters.
 
 ### Selector filter
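For context on the hunk above: a selector filter matches a single dimension against a single value. A minimal sketch (the dimension and value names are illustrative, not from the commit):

```json
{ "type": "selector", "dimension": "color", "value": "blue" }
```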
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Geographic Queries
 Druid supports filtering on spatially indexed columns based on an origin and a bound.
 
 # Spatial Indexing
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Aggregation Granularity
 The granularity field determines how data gets bucketed across the time dimension, i.e., how it gets aggregated by hour, day, minute, etc.
 
 It can be specified either as a string for simple granularities or as an object for arbitrary granularities.
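To illustrate the string-vs-object distinction from the hunk above: a simple granularity is just a string such as `"day"`, while an arbitrary granularity is an object. A sketch of a duration-based granularity object (the two-hour duration is an illustrative value):

```json
{ "type": "duration", "duration": 7200000 }
```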
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# groupBy Queries
 These types of queries take a groupBy query object and return an array of JSON objects where each object represents a grouping asked for by the query. Note: If you only want to do straight aggregates for some time range, we highly recommend using [TimeseriesQueries](TimeseriesQuery.html) instead. The performance will be substantially better.
 An example groupBy query object is shown below:
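The example the page refers to is not part of this hunk; a minimal sketch of the shape of a groupBy query object (data source, dimension, and aggregator names are hypothetical):

```json
{
  "queryType": "groupBy",
  "dataSource": "sample_datasource",
  "granularity": "day",
  "dimensions": ["country", "device"],
  "aggregations": [
    { "type": "longSum", "name": "total_usage", "fieldName": "user_count" }
  ],
  "intervals": ["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000"]
}
```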
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Filter groupBy Query Results
 A having clause is a JSON object identifying which rows from a groupBy query should be returned, by specifying conditions on aggregated values.
 
 It is essentially the equivalent of the HAVING clause in SQL.
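As a sketch of the having clause described above, a numeric condition on an aggregated value looks like this (the aggregator name and threshold are illustrative):

```json
{ "type": "greaterThan", "aggregation": "total_usage", "value": 100 }
```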
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Druid Indexing Service
 The indexing service is a highly available, distributed service that runs indexing-related tasks. Indexing service [tasks](Tasks.html) create (and sometimes destroy) Druid [segments](Segments.html). The indexing service has a master/slave-like architecture.
 
 The indexing service is composed of three main components: a peon component that can run a single task, a [Middle Manager](Middlemanager.html) component that manages peons, and an overlord component that manages task distribution to middle managers.
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# MySQL Database
 MySQL is an external dependency of Druid. We use it to store various metadata about the system, but not to store the actual data. There are a number of tables used for various purposes described below.
 
 Segments Table
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Sort groupBy Query Results
 The orderBy field provides the functionality to sort and limit the set of results from a groupBy query. If you group by a single dimension and are ordering by a single metric, we highly recommend using [TopN Queries](TopNQuery.html) instead. The performance will be substantially better. Available options are:
 
 ### DefaultLimitSpec
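A sketch of what a DefaultLimitSpec of the kind named above might look like (the column name, direction, and limit are illustrative assumptions, not taken from the commit):

```json
{
  "type": "default",
  "limit": 5,
  "columns": [ { "dimension": "country", "direction": "descending" } ]
}
```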
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Post-Aggregations
 Post-aggregations are specifications of processing that should happen on aggregated values as they come out of Druid. If you include a post aggregation as part of a query, make sure to include all aggregators the post-aggregator requires.
 
 There are several post-aggregators available.
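As an example of the dependency rule stated above: an arithmetic post-aggregator that divides one aggregated value by another requires both referenced aggregators (`total` and `rows` here are hypothetical names) to appear in the query's aggregations list:

```json
{
  "type": "arithmetic",
  "name": "average",
  "fn": "/",
  "fields": [
    { "type": "fieldAccess", "fieldName": "total" },
    { "type": "fieldAccess", "fieldName": "rows" }
  ]
}
```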
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Configuring Rules for Coordinator Nodes
 Note: It is recommended that the coordinator console is used to configure rules. However, the coordinator node does have HTTP endpoints to programmatically configure rules.
 
 Load Rules
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Search Queries
 A search query returns dimension values that match the search specification.
 
 ```json
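The hunk above truncates before the search query body; a sketch of the overall shape (data source, dimensions, and search value are hypothetical):

```json
{
  "queryType": "search",
  "dataSource": "sample_datasource",
  "granularity": "all",
  "searchDimensions": ["dim1", "dim2"],
  "query": { "type": "insensitive_contains", "value": "ke" },
  "intervals": ["2013-01-01T00:00:00.000/2013-01-03T00:00:00.000"]
}
```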
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Refining Search Queries
 Search query specs define how a "match" is defined between a search value and a dimension value. The available search query specs are:
 
 InsensitiveContainsSearchQuerySpec
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Segment Metadata Queries
 Segment metadata queries return per-segment information about:
 
 * Cardinality of all columns in the segment
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Tasks
 Tasks are run on middle managers and always operate on a single data source.
 
 There are several different types of tasks.
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Time Boundary Queries
 Time boundary queries return the earliest and latest data points of a data set. The grammar is:
 
 ```json
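The hunk above cuts off before the grammar itself; a time boundary query needs only a query type and a data source (the data source name here is illustrative):

```json
{ "queryType": "timeBoundary", "dataSource": "sample_datasource" }
```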
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# Versioning Druid
 This page discusses how we do versioning and provides information on our stable releases.
 
 Versioning Strategy
@@ -1,6 +1,7 @@
 ---
 layout: doc_page
 ---
+# ZooKeeper
 Druid uses [ZooKeeper](http://zookeeper.apache.org/) (ZK) for management of current cluster state. The operations that happen over ZK are
 
 1. [Coordinator](Coordinator.html) leader election