Reorganize the documentation and copy over part of the official English content

This commit is contained in:
YuCheng Hu 2021-07-19 16:14:08 -04:00
parent b03e7ccc8b
commit a71a0228bf
123 changed files with 11072 additions and 147 deletions

View File

@@ -132,10 +132,10 @@ The TSV `inputFormat` has the following components:
#### ORC
> [!WARNING]
> To use the ORC input format, you first need to include [druid-orc-extensions](../development/orc-extensions.md)
> [!WARNING]
> If you are considering upgrading from a version earlier than 0.15.0 to 0.15.0 or higher, please carefully read [Migration from the contrib extension](../development/orc-extensions.md#从contrib扩展迁移).
An example `inputFormat` for loading ORC-formatted data:
```json
@@ -169,7 +169,7 @@ The ORC `inputFormat` has the following components:
#### Parquet
> [!WARNING]
> To use the Parquet input format, you first need to include [druid-parquet-extensions](../development/parquet-extensions.md)
An example `inputFormat` for loading Parquet-formatted data:
```json
@@ -277,7 +277,7 @@ The Parquet `inputFormat` has the following components:
> [!WARNING]
> The parser is deprecated for [native batch tasks](native.md), [Kafka indexing tasks](kafka.md), and [Kinesis indexing tasks](kinesis.md); consider using the [inputFormat](#数据格式) for these ingestion methods instead
This section lists all default and core extension parsers. For community extension parsers, see the [community extensions list](../development/extensions.md#社区扩展)
#### String Parser
@@ -291,7 +291,7 @@ The Parquet `inputFormat` has the following components:
#### Avro Hadoop Parser
> [!WARNING]
> You need to include [druid-avro-extensions](../development/avro-extensions.md) to use the Avro Hadoop parser
This parser is used for [Hadoop batch ingestion](hadoopbased.md). The `inputFormat` of the `inputSpec` in the `ioConfig` must be set to `org.apache.druid.data.input.avro.AvroValueInputFormat`. You may want to set the Avro reader's schema in the `jobProperties` option of the `tuningConfig`, for example `"avro.schema.input.value.path": "/path/to/your/schema.avsc"` or `"avro.schema.input.value": "your_schema_JSON_object"`. If the Avro reader's schema is not set, the schema of the Avro object container file will be used; see the [Avro specification](http://avro.apache.org/docs/1.7.7/spec.html#Schema+Resolution) for details.
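For orientation, a minimal sketch of where such a reader-schema setting sits inside a Hadoop task's `tuningConfig`; the schema path is the placeholder used in the paragraph above, not a real file:

```json
{
  "tuningConfig": {
    "type": "hadoop",
    "jobProperties": {
      "avro.schema.input.value.path": "/path/to/your/schema.avsc"
    }
  }
}
```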
@@ -339,10 +339,10 @@ An Avro parseSpec using the "root" or "path" field types can contain a [flattenSpec](#fl
#### ORC Hadoop Parser
> [!WARNING]
> You need to include [druid-orc-extensions](../development/orc-extensions.md) to use the ORC Hadoop parser
> [!WARNING]
> If you are considering upgrading from a version earlier than 0.15.0 to 0.15.0 or higher, please carefully read [Migration from the contrib extension](../development/orc-extensions.md#从contrib扩展迁移).
This parser is used for [Hadoop batch ingestion](hadoopbased.md). The `inputFormat` of the `inputSpec` in the `ioConfig` must be set to `org.apache.orc.mapreduce.OrcInputFormat`.
@@ -564,7 +564,7 @@ An Avro parseSpec using the "root" or "path" field types can contain a [flattenSpec](#fl
#### Parquet Hadoop Parser
> [!WARNING]
> You need to include [druid-parquet-extensions](../development/parquet-extensions.md) to use the Parquet Hadoop parser
This parser is used for [Hadoop batch ingestion](hadoopbased.md). The `inputFormat` of the `inputSpec` in the `ioConfig` must be set to `org.apache.druid.data.input.parquet.DruidParquetInputFormat`.
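As a rough illustration (a sketch, not taken from this page), the class name above would typically appear in a static Hadoop `inputSpec`; the `paths` value is a hypothetical placeholder:

```json
{
  "ioConfig": {
    "type": "hadoop",
    "inputSpec": {
      "type": "static",
      "inputFormat": "org.apache.druid.data.input.parquet.DruidParquetInputFormat",
      "paths": "/path/to/example.parquet"
    }
  }
}
```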
@@ -690,7 +690,7 @@ The Parquet Hadoop parser supports auto field discovery; if it is provided with a `
> Consider using the [Parquet Hadoop Parser](#parquet-hadoop-parser) over this parser to ingest Parquet files. See the [Parquet Hadoop Parser vs Parquet Avro Hadoop Parser]() section for the differences between the two.
> [!WARNING]
> To use the Parquet Avro Hadoop Parser, you need to include both [druid-parquet-extensions](../development/parquet-extensions.md) and [druid-avro-extensions](../development/avro-extensions.md)
This parser is used for [Hadoop batch ingestion](hadoopbased.md). It first converts the Parquet data into Avro records and then parses them for ingestion into Druid. The `inputFormat` of the `inputSpec` in the `ioConfig` must be set to `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat`.
@@ -763,7 +763,7 @@ The Parquet Avro Hadoop parser supports auto field discovery; if it is provided with a
#### Avro Stream Parser
> [!WARNING]
> You need to include [druid-avro-extensions](../development/avro-extensions.md) to use the Avro Stream parser
This parser is used for [streaming ingestion](streamingest.md) and reads data directly from a stream.
@@ -909,7 +909,7 @@ The Avro Bytes Decoder first extracts the `subject` and `id` of the input message and then uses
#### Protobuf Parser
> [!WARNING]
> You need to include [druid-protobuf-extensions](../development/protobuf-extensions.md) to use the Protobuf parser
This parser is used for [stream ingestion](streamingest.md) and reads protocol buffer data directly from a stream.
@@ -949,7 +949,7 @@ The Avro Bytes Decoder first extracts the `subject` and `id` of the input message and then uses
}
}
```
For more details and examples, see the [extension documentation](../development/protobuf-extensions.md).
### ParseSpec
@@ -1117,7 +1117,7 @@ JSON data can also contain multi-value dimensions. The multiple values of a dimension must be
Note: the JavaScript parser must fully parse the data and return it in `{key:value}` format in the JS logic. This means any flattening or parsing of multi-value fields must be done here.
> [!WARNING]
> JavaScript-based functionality is disabled by default. For guidelines about using Druid's JavaScript functionality, including instructions on how to enable it, see the [Druid JavaScript programming guide](../development/JavaScript.md).
#### Time and dimensions parseSpec

View File

@@ -163,7 +163,7 @@ Druid uses the `inputSpec` in the `ioConfig` to know where the data to be ingested is located and
### Deleting data
Druid supports permanently deleting segments that have been marked as "unused" (see [segment lifecycle](../design/Design.md#段生命周期) in the architecture design for details).
The kill task deletes unused segments within the specified interval from the metadata store and from deep storage.
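A minimal sketch of a kill task spec, assuming a hypothetical datasource name and interval; it is submitted to the Overlord like any other task:

```json
{
  "type": "kill",
  "dataSource": "wikipedia",
  "interval": "2020-01-01/2020-02-01"
}
```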

View File

@@ -34,7 +34,7 @@ Druid rejects events outside of the time window. To confirm whether events were rejected
### Where do segments go after ingestion?
The segment storage location is determined by the `druid.storage.type` configuration; Druid uploads segments to [deep storage](../design/Deepstorage.md). Local disk is the default deep storage location.
### My stream ingestion tasks are not handing segments off
@@ -49,11 +49,11 @@ Druid rejects events outside of the time window. To confirm whether events were rejected
### How do I get HDFS to work?
Make sure to include `druid-hdfs-storage` on the classpath, along with all the Hadoop configuration and dependencies (which can be obtained by running the `hadoop classpath` command on a machine where Hadoop is installed), and provide the necessary HDFS settings as described in [deep storage](../design/Deepstorage.md).
### I don't see my Druid segments on my Historical processes
You can check the Coordinator console located at `<Coordinator_IP>:<PORT>` to make sure your segments have actually been loaded onto [Historical processes](../design/Historical.md). If the segments are not there, check the Coordinator logs for messages about replication errors or capacity. One reason segments are not downloaded is that the Historical processes have a `maxSize` that is too small for them to download more data. You can change it with, for example:
```json
-Ddruid.segmentCache.locations=[{"path":"/tmp/druid/storageLocation","maxSize":"500000000000"}]

View File

@@ -13,7 +13,7 @@
## Hadoop-based ingestion
Apache Druid currently supports Apache Hadoop-based batch indexing tasks via a Hadoop ingestion task. These tasks are submitted to a running instance of the [Druid Overlord](../design/Overlord.md). See [Hadoop-based ingestion vs. native batch ingestion](ingestion.md#批量摄取) for a comparison between Hadoop-based ingestion, native simple batch ingestion, and native parallel ingestion.
To run a Hadoop-based batch ingestion task, write an ingestion spec as shown below, and then submit it to the Overlord's [`druid/indexer/v1/task`](../Operations/api.md#overlord) endpoint, or use the `bin/post-index-task` script that ships with the Druid package.
@@ -388,7 +388,7 @@ Hadoop's [MapReduce documentation](https://hadoop.apache.org/docs/stable/hadoop-mapredu
```json
classification=yarn-site,properties=[mapreduce.reduce.memory.mb=6144,mapreduce.reduce.java.opts=-server -Xms2g -Xmx2g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps,mapreduce.map.java.opts=758,mapreduce.map.java.opts=-server -Xms512m -Xmx512m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps,mapreduce.task.timeout=1800000]
```
* Follow the [Hadoop connection configuration](../tutorials/img/chapter-4.md#Hadoop连接配置) instructions to use the XML files from `/etc/hadoop/conf` on the EMR master.
### Kerberized Hadoop clusters
@@ -472,7 +472,7 @@ The spec file needs to contain a JSON object whose contents match the Hadoop indexing task's
| `password` | String | password for the DB | yes |
| `segmentTable` | String | table to use in the DB | yes |
These properties should mimic what you have configured for your [Coordinator](../design/Coordinator.md).
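For orientation, a minimal sketch of what such a metadata update block might look like; the connection details below are hypothetical placeholders, not values from this page:

```json
{
  "metadataUpdateSpec": {
    "type": "mysql",
    "connectURI": "jdbc:mysql://localhost:3306/druid",
    "user": "druid",
    "password": "diurd",
    "segmentTable": "druid_segments"
  }
}
```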
**segmentOutputPath configuration**

View File

@@ -16,7 +16,7 @@
All data in Druid is organized into *segments*, which are data files that generally have up to a few million rows each. Loading data in Druid is called *ingestion or indexing* and consists of reading data from a source system and creating segments based on that data.
In most ingestion methods, the work of loading data is done by Druid [MiddleManager](../design/MiddleManager.md) processes (or [Indexer](../design/Indexer.md) processes). One exception is Hadoop-based ingestion, where this work is instead done using a Hadoop MapReduce job on YARN (although MiddleManager or Indexer processes are still involved in starting and monitoring the Hadoop jobs). Once segments have been generated and stored in [deep storage](../design/Deepstorage.md), they are loaded by Historical processes. For more details on how this works under the hood, see the [storage design](../design/Design.md) section of Druid's design documentation.
### How to use this documentation
@@ -394,7 +394,7 @@ Druid interprets the `dimensionsSpec` in two possible ways: *normal* and *schemale
##### `granularitySpec`
The `granularitySpec` is located in `dataSchema` -> `granularitySpec` and is used to configure the following operations (a minimal example is shown after the table below):
1. Partitioning a datasource into [time chunks](../design/Design.md#数据源和段) via `segmentGranularity`
2. Truncating the timestamp, if desired, via `queryGranularity`
3. Specifying which time chunks of segments should be created, for batch ingestion, via `intervals`
4. Specifying whether rollup should be applied at ingestion time, via `rollup`
@@ -418,7 +418,7 @@ Druid interprets the `dimensionsSpec` in two possible ways: *normal* and *schemale
| Field | Description | Default |
|-|-|-|
| type | Either `uniform` or `arbitrary`; in most cases you want to use `uniform` | `uniform` |
| segmentGranularity | [Time chunking](../design/Design.md#数据源和段) granularity of the datasource. Multiple segments can be created per time chunk; for example, when set to `day`, events of the same day fall into the same time chunk, which can be further partitioned into multiple segments based on other configurations and input size. Any granularity can be provided here. Note that all segments in the same time chunk should have the same segment granularity. <br><br> Ignored if the `type` field is set to `arbitrary`. | `day` |
| queryGranularity | The resolution of timestamp storage within each segment. This must be equal to, or finer than, `segmentGranularity`. This is the finest granularity that you can query at and still receive sensible results; note, however, that you can still query at anything coarser than this granularity. For example, a value of "`minute`" means that records will be stored at minute granularity and can be queried at any multiple of minutes (including minutes, 5 minutes, hours, etc.).<br><br> Any [granularity](../Querying/AggregationGranularity.md) can be provided here. Use `none` to store timestamps as-is, without any truncation. Note that `rollup` is still applied even when `queryGranularity` is set to `none`. | `none` |
| rollup | Whether to use [rollup](#rollup) at ingestion time. Note: rollup is still effective even when `queryGranularity` is set to `none`; data will be rolled up when it has exactly the same timestamp. | `true` |
| intervals | A list of intervals describing the time chunks in which segments should be created. If `type` is set to `uniform`, this list will be broken up and rounded according to the `segmentGranularity`. If `type` is set to `arbitrary`, this list will be used as-is.<br><br> If this value is not provided or is empty, batch ingestion tasks will generally determine the time chunks to output based on the timestamps found in the input data.<br><br> If specified, batch ingestion tasks may be able to skip the determine-partitions phase, which can result in faster ingestion. Batch ingestion tasks may also be able to request all of their locks up front instead of one by one. Batch ingestion tasks will throw away any records with timestamps outside of the specified intervals.<br><br> Ignored in any form of streaming ingestion. | `null` |
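A minimal `granularitySpec` sketch using the fields from the table above; the interval shown is a hypothetical example:

```json
{
  "granularitySpec": {
    "type": "uniform",
    "segmentGranularity": "day",
    "queryGranularity": "none",
    "rollup": true,
    "intervals": ["2020-01-01/2020-02-01"]
  }
}
```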

View File

@@ -186,7 +186,7 @@ curl -X POST -H 'Content-Type: application/json' -d @supervisor-spec.json http:/
The Kafka indexing service supports both [`inputFormat`](dataformats.md#inputformat) and [`parser`](dataformats.md#parser) to specify the data format. The `inputFormat` is the new and recommended way to specify the data format for the Kafka indexing service, but unfortunately it does not yet support all of the formats that the legacy `parser` supports (these will be supported in the future).
The formats supported by `inputFormat` include [`csv`](dataformats.md#csv), [`delimited`](dataformats.md#TSV(Delimited)), and [`json`](dataformats.md#json). You can use `parser` to read data in the [`avro_stream`](dataformats.md#AvroStreamParser), [`protobuf`](dataformats.md#ProtobufParser), and [`thrift`](../development/overview.md) formats.
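As a rough sketch (the topic name and broker address are hypothetical), an `inputFormat` is placed in the supervisor's `ioConfig` like this:

```json
{
  "ioConfig": {
    "topic": "metrics",
    "inputFormat": {
      "type": "json"
    },
    "consumerProperties": {
      "bootstrap.servers": "localhost:9092"
    }
  }
}
```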
### Operations

View File

@@ -39,7 +39,7 @@ Apache Druid currently supports two types of native batch indexing tasks, `index_parall
The legacy [`firehose`](#firehoses%e5%b7%b2%e5%ba%9f%e5%bc%83) supports some other cloud storage types. The `firehose` types below are also splittable. Note that the `firehose` only supports text formats.
* [`static-cloudfiles`](../development/rackspacecloudfiles.md)
You may want to consider the following:
* You may want to control the amount of input data each worker process reads. This can be controlled using different configurations depending on the phase of parallel ingestion (see [`partitionsSpec`](#partitionsspec) for more details). For tasks that read data from the `inputSource`, you can set the [split hint spec](#分割提示规范) in the `tuningConfig`. For tasks that merge shuffled segments, you can set `totalNumMergeTasks` in the `tuningConfig`.
@@ -235,7 +235,7 @@ PartitionsSpec is used to describe the secondary partitioning method. You should use a different partitionsSpec depending on the rollup mode
A parallel task with hash-based partitioning is similar to [MapReduce](https://en.wikipedia.org/wiki/MapReduce). The task runs in two phases: `partial segment generation` and `partial segment merge`.
* In the `partial segment generation` phase, just like the Map phase in MapReduce, the parallel task splits the input data based on the split hint spec and assigns each split to a worker. Each worker (of type `partial_index_generate`) reads its assigned split, partitions rows by the time chunk from `segmentGranularity` (the primary partition key) in the `granularitySpec`, and then partitions them by the hash value of `partitionDimensions` (the secondary partition key) in the `partitionsSpec`. The partitioned data is stored in the local storage of the [MiddleManager](../design/MiddleManager.md) or [Indexer](../design/Indexer.md).
* The `partial segment merge` phase is similar to the Reduce phase in MapReduce. The parallel task spawns a new set of workers (of type `partial_index_merge`) to merge the partitioned data created in the previous phase. Here, the partitioned data is shuffled based on the time chunk and the hash value of the partition dimensions to be merged; each worker reads the data that falls in the same time chunk and the same hash value from multiple MiddleManager/Indexer processes and merges it to create the final segments. Finally, they push the final segments to deep storage at once.
**Single-dimension range partitioning**
@@ -254,7 +254,7 @@ PartitionsSpec is used to describe the secondary partitioning method. You should use a different partitionsSpec depending on the rollup mode
With `single-dim` partitioning, the parallel task runs in 3 phases: `partial dimension distribution`, `partial segment generation`, and `partial segment merge`. The first phase collects some statistics to find the best partitioning; the other two phases create partial segments and merge them, respectively, just as in hash-based partitioning. (A `partitionsSpec` sketch is shown after this list.)
* In the `partial dimension distribution` phase, the parallel task splits the input data and assigns the splits to workers based on the split hint spec. Each worker task (of type `partial_dimension_distribution`) reads its assigned split and builds a histogram for `partitionDimension`. The parallel task collects these histograms from the worker tasks and finds the best range partitioning based on `partitionDimension` so that rows are evenly distributed across partitions. Note that either `targetRowsPerSegment` or `maxRowsPerSegment` will be used to find the best partitioning.
* In the `partial segment generation` phase, the parallel task spawns new worker tasks (of type `partial_range_index_generate`) to create the partitioned data. Each worker task reads a split created in the previous phase, partitions rows by the time chunk of `segmentGranularity` (the primary partition key) in the `granularitySpec`, and then partitions them by the range partitioning found in the previous phase. The partitioned data is stored in the local storage of the [MiddleManager](../design/MiddleManager.md) or [Indexer](../design/Indexer.md).
* In the `partial segment merge` phase, the parallel index task spawns a new set of worker tasks (of type `partial_index_generic_merge`) to merge the partitioned data created in the previous phase. Here, the partitioned data is shuffled based on the time chunk and the value of `partitionDimension`; each worker task reads the segments that fall in the same partition of the same range from multiple MiddleManager/Indexer processes and merges them to create the final segments. Finally, they push the final segments to deep storage.
> [!WARNING]
@@ -654,7 +654,7 @@ PartitionsSpec is used to describe the secondary partitioning method. You should use a different partitionsSpec depending on the rollup mode
#### S3 input source
> [!WARNING]
> You need to include the [`druid-s3-extensions`](../development/S3-compatible.md) extension in order to use the S3 input source.
The S3 input source supports reading objects directly from S3. Objects can be specified either via a list of S3 URI strings or via a list of S3 location prefixes, which will attempt to list the contents and ingest all objects contained in the locations. The S3 input source is splittable and can be used by the [parallel task](#并行任务), where each worker task of `index_parallel` will read one or multiple objects.
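A minimal sketch of an S3 input source specified by URIs (the bucket and object names are hypothetical placeholders); a list of `prefixes` can be used instead of `uris`:

```json
{
  "inputSource": {
    "type": "s3",
    "uris": [
      "s3://your-bucket/path/file1.json",
      "s3://your-bucket/path/file2.json"
    ]
  }
}
```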
@@ -734,7 +734,7 @@ S3 Object
| `accessKeyId` | The [Password Provider](../Operations/passwordproviders.md) or plain text string of the S3 input source access key | None | required if `secretAccessKey` is provided |
| `secretAccessKey` | The [Password Provider](../Operations/passwordproviders.md) or plain text string of the S3 input source secret key | None | required if `accessKeyId` is provided |
**Note**: *If `accessKeyId` and `secretAccessKey` are not specified, the default [S3 authentication](../development/S3-compatible.md#S3认证方式) is used.*
#### Google Cloud Storage input source

View File

@@ -21,7 +21,7 @@
The task APIs are available in two main places:
* The [Overlord](../design/Overlord.md) process offers HTTP APIs to submit tasks, cancel tasks, check task status, review task logs and reports, and more. See the [Tasks API reference](../Operations/api.md) for the full list.
* Druid SQL includes a [`sys.tasks`](../Querying/druidsql.md#系统Schema) table that holds information about currently running tasks. This table is read-only and exposes a limited subset of the full information available through the Overlord APIs.
### Task reports

View File

@@ -1 +0,0 @@
## Development guide

View File

@@ -1 +0,0 @@
<!-- toc -->

View File

@@ -306,7 +306,7 @@ The ANY aggregators for Double/Float/Long/String cannot be used in an ingestion spec
```
> [!WARNING]
> JavaScript-based functionality is disabled by default. See the [JavaScript programming guide](../development/JavaScript.md) for how to enable it and how to use Druid's JavaScript functionality.
### Approximate Aggregations
#### Count distinct

View File

@@ -60,7 +60,7 @@
This feature is only useful for multi-value dimensions. If you have a row in Apache Druid with the value ["v1","v2","v3"] and you send a GroupBy/TopN query with a [query filter](filters.md) for the dimension value "v1", then in the response you will get all three rows containing "v1", "v2", and "v3". This behavior is unsuitable in most scenarios.
This happens because the "query filter" is used internally on the bitmaps and is only used to match the rows to be included in the query result processing. With multi-value dimensions, the "query filter" behaves like a contains check, and will match the row with the dimension values ["v1", "v2", "v3"]. See the "Multi-value columns" section of [Segments](../design/Segments.md) for more details. The GroupBy/TopN processing pipeline then "explodes" all multi-value dimensions, resulting in 3 rows for "v1", "v2", and "v3".
In addition to the "query filter", which efficiently selects the rows to be processed, you can use a filtered DimensionSpec to filter for specific values within the values of a multi-value dimension. These dimension specs take a delegate dimension spec and a filtering criterion. From the "exploded" rows, only rows matching the given filtering criterion are returned in the query result.
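A minimal sketch of such a filtered dimension spec (a `listFiltered` spec that keeps only "v1"); the dimension name `tags` is a hypothetical example:

```json
{
  "type": "listFiltered",
  "delegate": {
    "type": "default",
    "dimension": "tags",
    "outputName": "tags"
  },
  "values": ["v1"]
}
```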
@@ -87,7 +87,7 @@
#### DimensionSpec with Lookup
> [!WARNING]
> Lookups are an [experimental feature](../development/experimental.md)
A DimensionSpec with a lookup can be used to define a lookup implementation directly as a dimension spec. Generally speaking, there are two different kinds of lookup implementations. The first kind is passed at query time, like a map implementation.
@@ -296,7 +296,7 @@ A null string is considered to have a length of 0
```
> [!WARNING]
> JavaScript-based functionality is disabled by default. See the [JavaScript programming guide](../development/JavaScript.md) for how to enable it and how to use Druid's JavaScript functionality.
#### Registered lookup extraction function

View File

@@ -131,7 +131,7 @@ Druid's native type system allows strings to potentially have multiple values. These [multi-value dimens
In the default mode (`true`), Druid treats NULLs and empty strings interchangeably, rather than according to the SQL standard. In this mode, Druid SQL only has partial support for NULLs. For example, the expressions `col IS NULL` and `col = ''` are equivalent, and both will evaluate to true if `col` contains an empty string. Similarly, the expression `COALESCE(col1, col2)` will return `col2` if `col1` is an empty string. While the `COUNT(*)` aggregator counts all rows, the `COUNT(expr)` aggregator will count the number of rows where expr is neither null nor the empty string. Numeric columns in this mode are not nullable; any null or missing values will be treated as zeroes.
In SQL-compatible mode (`false`), NULLs are treated more closely to the SQL standard. This property affects both storage and querying, so for the best behavior it should be set both at ingestion time and at query time. There is some overhead associated with the ability to handle null values; see the [segment documentation](../design/Segments.md#SQL兼容的空值处理) for more details.
### Aggregation functions
@@ -148,14 +148,14 @@ Druid's native type system allows strings to potentially have multiple values. These [multi-value dimens
| `MAX(expr)` | Takes the maximum of numbers |
| `AVG(expr)` | Averages numbers |
| `APPROX_COUNT_DISTINCT(expr)` | Counts distinct values of expr, which can be a regular column or a hyperUnique column. This is always approximate, regardless of the value of "useApproximateCountDistinct". This uses Druid's built-in "cardinality" or "hyperUnique" aggregators. See also `COUNT(DISTINCT expr)`. |
| `APPROX_COUNT_DISTINCT_DS_HLL(expr, [lgK, tgtHllType])` | Counts distinct values of expr, which can be a regular column or an [HLL sketch](../Configuration/core-ext/datasketches-hll.md) column. The `lgK` and `tgtHllType` parameters are described in the HLL sketch documentation. This is always approximate, regardless of the value of "useApproximateCountDistinct". See also `COUNT(DISTINCT expr)`. The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use this function. |
| `APPROX_COUNT_DISTINCT_DS_THETA(expr, [size])` | Counts distinct values of expr, which can be a regular column or a [Theta sketch](../Configuration/core-ext/datasketches-theta.md) column. The `size` parameter is described in the Theta sketch documentation. This is always approximate, regardless of the value of "useApproximateCountDistinct". See also `COUNT(DISTINCT expr)`. The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use this function. |
| `DS_HLL(expr, [lgK, tgtHllType])` | Creates an [`HLL sketch`](../Configuration/core-ext/datasketches-hll.md) on the values of expr, which can be a regular column or a column containing HLL sketches. The `lgK` and `tgtHllType` parameters are described in the HLL sketch documentation. The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use this function. |
| `DS_THETA(expr, [size])` | Creates a [`Theta sketch`](../Configuration/core-ext/datasketches-theta.md) on the values of expr, which can be a regular column or a column containing Theta sketches. The `size` parameter is described in the Theta sketch documentation. The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use this function. |
| `APPROX_QUANTILE(expr, probability, [resolution])` | Computes approximate quantiles on numeric or [approxHistogram](../Configuration/core-ext/approximate-histograms.md) expressions. The "probability" should be between 0 and 1 (exclusive). The "resolution" is the number of centroids to use for the computation; a higher resolution gives more precise results. Default 50. The [approximate histogram extension](../Configuration/core-ext/approximate-histograms.md) must be loaded to use this function. |
| `APPROX_QUANTILE_DS(expr, probability, [k])` | Computes approximate quantiles on numeric or [Quantiles sketch](../Configuration/core-ext/datasketches-quantiles.md) expressions. The "probability" should be between 0 and 1 (exclusive). The `k` parameter is described in the Quantiles sketch documentation. The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use this function. |
| `APPROX_QUANTILE_FIXED_BUCKETS(expr, probability, numBuckets, lowerLimit, upperLimit, [outlierHandlingMode])` | Computes approximate quantiles on numeric or [fixed buckets histogram](../Configuration/core-ext/approximate-histograms.md) expressions. The "probability" should be between 0 and 1 (exclusive). The `numBuckets`, `lowerLimit`, `upperLimit`, and `outlierHandlingMode` parameters are described in the fixed buckets histogram documentation. The [approximate histogram extension](../Configuration/core-ext/approximate-histograms.md) must be loaded to use this function. |
| `DS_QUANTILES_SKETCH(expr, [k])` | Creates a [`Quantiles sketch`](../Configuration/core-ext/datasketches-quantiles.md) on the values of expr, which can be a regular column or a column containing Quantiles sketches. The `k` parameter is described in the Quantiles sketch documentation. The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use this function. |
| `BLOOM_FILTER(expr, numEntries)` | Computes a Bloom filter from the values produced by `expr`, where `numEntries` is the maximum number of distinct values before the false positive rate increases. See the [Bloom filter extension](../Configuration/core-ext/bloom-filter.md) for details. |
| `TDIGEST_QUANTILE(expr, quantileFraction, [compression])` | Builds a T-Digest sketch on the values produced by `expr` and returns the value for the quantile. "compression" (default 100) determines the accuracy and size of the sketch. Higher compression means higher accuracy but more space to store the sketch. See the [t-digest extension documentation](../Configuration/core-ext/tdigestsketch-quantiles.md) for more details. |
| `TDIGEST_GENERATE_SKETCH(expr, [compression])` | Builds a T-Digest sketch on the values produced by `expr`. "compression" (default 100) determines the accuracy and size of the sketch. Higher compression means higher accuracy but more space to store the sketch. See the [t-digest extension documentation](../Configuration/core-ext/tdigestsketch-quantiles.md) for more details. |
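As a hedged illustration of using one of the aggregations above, the following is a JSON payload of the kind that can be POSTed to Druid's SQL HTTP endpoint (`/druid/v2/sql`); the datasource and column names are hypothetical:

```json
{
  "query": "SELECT APPROX_COUNT_DISTINCT_DS_HLL(\"user\") AS unique_users FROM wikipedia WHERE __time >= TIMESTAMP '2020-01-01 00:00:00'"
}
```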
@@ -326,7 +326,7 @@ Druid's native type system allows strings to potentially have multiple values. These [multi-value dimens
**HLL Sketch functions**
The following functions operate on [DataSketches HLL sketches](../Configuration/core-ext/datasketches-hll.md). The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use these functions.
| Function | Description |
|-|-|
@@ -337,7 +337,7 @@ Druid's native type system allows strings to potentially have multiple values. These [multi-value dimens
**Theta Sketch functions**
The following functions operate on [theta sketches](../Configuration/core-ext/datasketches-theta.md). The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use these functions.
| Function | Description |
|-|-|
@@ -349,7 +349,7 @@ Druid's native type system allows strings to potentially have multiple values. These [multi-value dimens
**Quantiles Sketch functions**
The following functions operate on [quantiles sketches](../Configuration/core-ext/datasketches-quantiles.md). The [DataSketches extension](../development/datasketches-extension.md) must be loaded to use these functions.
| Function | Description |
|-|-|
@@ -647,7 +647,7 @@ try (Connection connection = DriverManager.getConnection(url, connectionProperti
**Connection stickiness**
Druid's JDBC server does not share connection state between Brokers. This means that if you are using JDBC and have multiple Druid Brokers, you should either connect to a specific Broker or use a load balancer with sticky sessions enabled. The Druid Router process provides connection stickiness when balancing JDBC requests, and it can be used to achieve the necessary stickiness even with a normal, non-sticky load balancer. See the [Router documentation](../design/Router.md) for more details.
Note that the non-JDBC [HTTP POST](#http-post) is stateless and does not require stickiness.
@@ -759,10 +759,10 @@ The segments table provides details on all Druid segments, whether or not they are published
| `partition_num` | LONG | Partition number (an integer, unique within a datasource+interval+version; not necessarily contiguous) |
| `num_replicas` | LONG | Number of replicas of this segment currently being served |
| `num_rows` | LONG | Number of rows in this segment; this value can be null if it is unknown to the Broker at query time |
| `is_published` | LONG | Boolean represented as a long type, where 1 = true and 0 = false. 1 means this segment has been published to the metadata store and `used=1`. See the [Architecture page](../design/Design.md) for details. |
| `is_available` | LONG | Boolean represented as a long type, where 1 = true and 0 = false. 1 means this segment is currently being served by any process (Historical or realtime). See the [Architecture page](../design/Design.md) for details. |
| `is_realtime` | LONG | Boolean represented as a long type, where 1 = true and 0 = false. 1 if this segment is only served by realtime tasks, and 0 if any Historical process is serving this segment. |
| `is_overshadowed` | LONG | Boolean represented as a long type, where 1 = true and 0 = false. 1 if this segment is published and is fully overshadowed by other published segments. Currently, `is_overshadowed` is always false for unpublished segments, although this may change in the future. You can filter for segments that "should be published" by filtering on `is_published = 1` and `is_overshadowed = 0`. Segments can briefly be both published and overshadowed if they were recently replaced but have not yet been unpublished. See the [Architecture page](../design/Design.md) for details. |
| `payload` | STRING | JSON-serialized data segment payload |
For example, to retrieve all segments for the datasource "wikipedia", use the query:
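The exact query text that followed here is not shown in this excerpt; as a hedged sketch, such a query could be sent to the SQL HTTP endpoint as the following JSON payload:

```json
{
  "query": "SELECT * FROM sys.segments WHERE datasource = 'wikipedia'"
}
```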

View File

@@ -116,7 +116,7 @@ The JavaScript function takes a single argument, the dimension value, and returns either true or fal
The JavaScript filter supports the use of extraction functions; see [Filtering with extraction functions](#带提取函数的过滤器) for details.
> [!WARNING]
> JavaScript-based functionality is disabled by default. See the [JavaScript programming guide](../development/JavaScript.md) for how to enable it and how to use Druid's JavaScript functionality.
### **Extraction Filter**

View File

@@ -2,7 +2,7 @@
## Lookups
> [!WARNING]
> Lookups are an [experimental feature](../development/experimental.md)
Lookups are a concept in Apache Druid where dimension values are (optionally) replaced with new values, allowing join-like functionality. Applying lookups in Druid is similar to joining a dimension table in a data warehouse. See [dimension specs](querydimensions.md) for details. In these documents, a "key" refers to the dimension value to match, and a "value" refers to its replacement target. So if you wanted to map `appid-12345` to `Super Mega Awesome App`, the key would be `appid-12345` and the value would be `Super Mega Awesome App`.
@@ -85,7 +85,7 @@ GROUP BY 1
### Dynamic configuration
> [!WARNING]
> Dynamic lookup configuration is an [experimental feature](../development/experimental.md); static configuration is no longer supported. The documentation below describes the cluster-wide configuration, which is accessed through the Coordinator. The configuration is propagated through the concept of a server "tier". A "tier" is defined as a group of services that should receive a given set of lookups. For example, you might have all Historicals be in `_default`, while Peons are each part of the tier for the datasource they are responsible for. Lookup tiers are completely independent of Historical tiers.
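For orientation only, a hedged sketch of what a per-tier dynamic lookup definition can look like when posted to the Coordinator; the tier name, lookup name, and map contents are hypothetical:

```json
{
  "_default": {
    "country_lookup": {
      "version": "v1",
      "lookupExtractorFactory": {
        "type": "map",
        "map": {
          "US": "United States",
          "DE": "Germany"
        }
      }
    }
  }
}
```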
These configurations can all be retrieved as JSON through the following URI template:

View File

@@ -36,7 +36,7 @@ curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:applica
Druid's native queries are relatively low level, mapping closely to how computations are performed internally. Druid queries are designed to be lightweight and complete very quickly. This means that for more complex analysis, or to build more complex visualizations, multiple Druid queries may be required.
Even though queries are typically made to Brokers or Routers, they can also be accepted by [Historical processes](../design/Historical.md) and by [Peons (task JVMs)](../design/Peons.md) that are running stream ingestion tasks. This can be valuable if you want to query the results of a specific segment that is served by a specific process.
### Available queries

View File

@@ -3,7 +3,7 @@
Apache Druid supports multi-value string dimensions. A multi-value dimension is generated when an input field contains an array of values rather than a single value (for example, a JSON array, or a TSV field containing multiple `listDelimiter`-separated values).
This document describes the behavior of GroupBy queries on multi-value dimensions when grouping on one of them (TopN is very similar). For internal details about multi-value dimensions, see the multi-value columns section of the [Segments](../design/Segments.md) documentation. The examples in this document are in [native Druid query](makeNativeQueries.md) format; for the use of multi-value dimensions in SQL, see the [Druid SQL documentation](druidsql.md).

View File

@@ -46,10 +46,10 @@ Druid also supports multitenancy by providing configurable ways of distributing data. Druid's H
### Supporting high query concurrency
Druid's fundamental unit of computation is a [segment](../design/Segments.md). Processes scan segments in parallel, and a given process can scan up to `druid.processing.numThreads` concurrently. To process more data in parallel and increase performance, more cores can be added to a cluster. Druid segments should be sized such that any computation over any given segment completes in at most 500 ms.
Druid internally stores requests to scan segments in a priority queue. If a given query requires scanning more segments than the total number of processors available in the cluster, and many similarly expensive queries are running at the same time, we don't want any query to be starved out. Druid's internal processing logic scans a set of segments from one query and releases resources as soon as the scans complete, allowing a second set of segments from another query to be scanned. By keeping segment computation time very small, we ensure that resources are constantly being yielded and that segments pertaining to different queries are all being processed.
Druid queries can optionally set a `priority` flag in the [query context](query-context.md). Queries known to be slow (download or reporting style queries) can be de-prioritized, while more interactive queries can be given a higher priority.
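A minimal sketch of setting that flag in a native query's context; the datasource, interval, and priority value are hypothetical:

```json
{
  "queryType": "timeseries",
  "dataSource": "wikipedia",
  "intervals": ["2020-01-01/2020-01-02"],
  "granularity": "all",
  "aggregations": [
    { "type": "count", "name": "rows" }
  ],
  "context": {
    "priority": -1
  }
}
```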
Broker processes can also be dedicated to a given tier. For example, one set of Broker processes can be dedicated to fast interactive queries, and another set can be dedicated to slower reporting queries. Druid also provides a [Router](../design/Router.md) process that can route queries to different Brokers based on various query parameters (datasource, interval, etc.).

View File

@@ -113,7 +113,7 @@ postAggregation : {
```
> [!WARNING]
> JavaScript-based functionality is disabled by default. See the [JavaScript programming guide](../development/JavaScript.md) for how to enable it and how to use Druid's JavaScript functionality.
### HyperUnique Cardinality post-aggregator

View File

@@ -23,7 +23,7 @@ Druid's approach to query execution varies depending on the [datasource type](#数据源类型) of the query
Queries that operate directly on [table datasources](datasource.md#table) are executed using a **scatter-gather** approach led by the Broker process. The process is as follows:
1. The Broker determines which [segments](../design/Segments.md) are relevant to the query based on the `"interval"` parameter. Segments are always partitioned by time, so any segment whose interval overlaps the query interval is potentially relevant.
2. The Broker may additionally further prune the segment list based on the `"filter"`, if the input data was partitioned by range using the [`single_dim` partitionsSpec](../DataIngestion/native.md#partitionsSpec) and the filter matches the dimension used for partitioning.
3. The Broker, having pruned the list of segments for the query, forwards the query to data servers (such as Historicals or tasks running on MiddleManagers) that are currently serving those segments.
4. For all query types except [Scan](scan.md), data servers process each segment in parallel and generate partial results for each segment. The specific processing that is done depends on the query type. These partial results may be cached if [query caching](querycached.md) is enabled. For Scan queries, segments are processed in order by a single thread, and results are not cached.

View File

@@ -16,7 +16,7 @@
* [Getting started]()
* [Introduction to Druid](GettingStarted/chapter-1.md)
* [Quickstart](GettingStarted/chapter-2.md)
* [Docker](tutorials/docker.md)
* [Single-server deployment](GettingStarted/chapter-3.md)
* [Clustered deployment](GettingStarted/chapter-4.md)
@@ -35,20 +35,20 @@
* [Kerberized HDFS storage](tutorials/chapter-12.md)
* [Architecture design]()
* [Overall design](design/Design.md)
* [Segments](design/Segments.md)
* [Processes and services](design/Processes.md)
* [Coordinator](design/Coordinator.md)
* [Overlord](design/Overlord.md)
* [Historical](design/Historical.md)
* [MiddleManager](design/MiddleManager.md)
* [Broker](design/Broker.md)
* [Router](design/Router.md)
* [Indexer](design/Indexer.md)
* [Peon](design/Peons.md)
* [Deep storage](design/Deepstorage.md)
* [Metadata storage](design/Metadata.md)
* [Zookeeper](design/Zookeeper.md)
* [Data ingestion]()
* [Ingestion overview](DataIngestion/ingestion.md)
@@ -107,7 +107,7 @@
* [Operations guide](Operations/index.md)
* [Development guide]()
* [Development guide](development/index.md)
* [Miscellaneous]()
* [Miscellaneous](misc/index.md)

View File

@@ -3,9 +3,9 @@
- [Official accounts](CONTACT.md)
- Getting started
- [Introduction to Druid](design/index.md)
- [Quickstart](tutorials/index.md)
- [Docker container](tutorials/docker.md)
- Design
- [JWT](jwt/README.md)
@@ -16,7 +16,11 @@
- [Interview questions and experience](interview/index.md)
- [Algorithm problems](algorithm/index.md)
- Querying
- Development
- [Developing on Druid](development/index.md)
- [Creating extensions](development/modules.md)
- Miscellaneous
- [Druid resources quick reference](misc/index.md)

View File

@@ -139,7 +139,7 @@ clarity-cloud0_2018-05-21T16:00:00.000Z_2018-05-21T17:00:00.000Z_2018-05-21T15:5
### Query processing
Queries first enter the [Broker](/Broker.md), where the Broker identifies which segments may pertain to the query. The list of segments is always pruned by time, and may also be pruned by other attributes, depending on how the datasource is partitioned. The Broker then determines which [Historicals](/Historical.md) and [MiddleManagers](/MiddleManager.md) are serving those segments and sends a subquery to each of those processes. The Historical and MiddleManager processes receive the queries, process them, and return results; the Broker merges the received results together to form the final result set, which is returned to the caller.
Broker pruning is an important way that Druid limits the amount of data that must be scanned for each query, but it is not the only way. For filters at a more granular level than what the Broker can use for pruning, the indexing structures inside each segment allow Druid to figure out which (if any) rows match the filter set before looking at any row of data. Once Druid knows which rows match a particular query, it only accesses the specific columns it needs for that query. Within those columns, Druid can skip from row to row, avoiding reading data that does not match the query filter.

View File

@@ -14,7 +14,7 @@
## Indexer
> [!WARNING]
> The Indexer is an optional and [experimental](../development/experimental.md) feature. Its memory management system is still under development and will be significantly enhanced in later releases.
The Apache Druid Indexer process is an alternative to the MiddleManager + Peon task execution system. Instead of forking a separate JVM process per task, the Indexer runs tasks as individual threads within a single JVM process.

View File

@@ -86,7 +86,7 @@ Data servers execute ingestion jobs and store queryable data.
The [Indexer](./Indexer.md) process is an alternative to the MiddleManager and Peon. Instead of forking a separate JVM process per task, the Indexer runs tasks as individual threads within a single JVM process.
Compared to the MiddleManager + Peon system, the Indexer is designed to be easier to configure and deploy and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated as an [experimental feature](../development/experimental.md) because its memory management system is still under development. It will continue to mature in future versions of Druid.
Typically, you would deploy either MiddleManagers or Indexers, but not both.

View File

@@ -98,7 +98,7 @@ The Router has a configurable list of strategies for choosing which Broker to route queries to
```
> [!WARNING]
> JavaScript-based functionality is disabled by default. For guidelines about using Druid's JavaScript functionality, including instructions on how to enable it, see the [Druid JavaScript programming guide](../development/JavaScript.md).
### Avatica query balancing

View File

(Image diff: three image files were moved without content changes; sizes 131 KiB, 91 KiB, and 24 KiB.)

design/index.md (new file, 100 lines)
View File

@@ -0,0 +1,100 @@
---
id: index
title: "Introduction to Apache Druid"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
## What is Druid?
Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics
("[OLAP](http://en.wikipedia.org/wiki/Online_analytical_processing)" queries) on large data sets. Druid is most often
used as a database for powering use cases where real-time ingest, fast query performance, and high uptime are important.
As such, Druid is commonly used for powering GUIs of analytical applications, or as a backend for highly-concurrent APIs
that need fast aggregations. Druid works best with event-oriented data.
Common application areas for Druid include:
- Clickstream analytics (web and mobile analytics)
- Network telemetry analytics (network performance monitoring)
- Server metrics storage
- Supply chain analytics (manufacturing metrics)
- Application performance metrics
- Digital marketing/advertising analytics
- Business intelligence / OLAP
Druid's core architecture combines ideas from data warehouses, timeseries databases, and logsearch systems. Some of
Druid's key features are:
1. **Columnar storage format.** Druid uses column-oriented storage, meaning it only needs to load the exact columns
needed for a particular query. This gives a huge speed boost to queries that only hit a few columns. In addition, each
column is stored optimized for its particular data type, which supports fast scans and aggregations.
2. **Scalable distributed system.** Druid is typically deployed in clusters of tens to hundreds of servers, and can
offer ingest rates of millions of records/sec, retention of trillions of records, and query latencies of sub-second to a
few seconds.
3. **Massively parallel processing.** Druid can process a query in parallel across the entire cluster.
4. **Realtime or batch ingestion.** Druid can ingest data either real-time (ingested data is immediately available for
querying) or in batches.
5. **Self-healing, self-balancing, easy to operate.** As an operator, to scale the cluster out or in, simply add or
remove servers and the cluster will rebalance itself automatically, in the background, without any downtime. If any
Druid servers fail, the system will automatically route around the damage until those servers can be replaced. Druid
is designed to run 24/7 with no need for planned downtimes for any reason, including configuration changes and software
updates.
6. **Cloud-native, fault-tolerant architecture that won't lose data.** Once Druid has ingested your data, a copy is
stored safely in [deep storage](architecture.html#deep-storage) (typically cloud storage, HDFS, or a shared filesystem).
Your data can be recovered from deep storage even if every single Druid server fails. For more limited failures affecting
just a few Druid servers, replication ensures that queries are still possible while the system recovers.
7. **Indexes for quick filtering.** Druid uses [Roaring](https://roaringbitmap.org/) or
[CONCISE](https://arxiv.org/pdf/1004.0403) compressed bitmap indexes to create indexes that power fast filtering and
searching across multiple columns.
8. **Time-based partitioning.** Druid first partitions data by time, and can additionally partition based on other fields.
This means time-based queries will only access the partitions that match the time range of the query. This leads to
significant performance improvements for time-based data.
9. **Approximate algorithms.** Druid includes algorithms for approximate count-distinct, approximate ranking, and
computation of approximate histograms and quantiles. These algorithms offer bounded memory usage and are often
substantially faster than exact computations. For situations where accuracy is more important than speed, Druid also
offers exact count-distinct and exact ranking.
10. **Automatic summarization at ingest time.** Druid optionally supports data summarization at ingestion time. This
summarization partially pre-aggregates your data, and can lead to big costs savings and performance boosts.
## When should I use Druid?
Druid is used by many companies of various sizes for many different use cases. Check out the
[Powered by Apache Druid](/druid-powered) page.
Druid is likely a good choice if your use case fits a few of the following descriptors:
- Insert rates are very high, but updates are less common.
- Most of your queries are aggregation and reporting queries ("group by" queries). You may also have searching and
scanning queries.
- You are targeting query latencies of 100ms to a few seconds.
- Your data has a time component (Druid includes optimizations and design choices specifically related to time).
- You may have more than one table, but each query hits just one big distributed table. Queries may potentially hit more
than one smaller "lookup" table.
- You have high cardinality data columns (e.g. URLs, user IDs) and need fast counting and ranking over them.
- You want to load data from Kafka, HDFS, flat files, or object storage like Amazon S3.
Situations where you would likely _not_ want to use Druid include:
- You need low-latency updates of _existing_ records using a primary key. Druid supports streaming inserts, but not streaming updates (updates are done using
background batch jobs).
- You are building an offline reporting system where query latency is not very important.
- You want to do "big" joins (joining one big fact table to another big fact table) and you are okay with these queries
taking a long time to complete.

View File

@@ -0,0 +1,54 @@
---
id: aliyun-oss
title: "Aliyun OSS"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `aliyun-oss-extensions` extension.
## Deep Storage
[Aliyun](https://www.aliyun.com) is the 3rd largest cloud infrastructure provider in the world. It provides its own storage solution known as OSS, [Object Storage Service](https://www.aliyun.com/product/oss).
To use aliyun OSS as deep storage, first configure it as below:
|Property|Description|Possible Values|Default|
|--------|---------------|-----------|-------|
|`druid.oss.accessKey`|the `AccessKey ID` of your account which can be used to access the bucket| |Must be set.|
|`druid.oss.secretKey`|the `AccessKey Secret` of your account which can be used to access the bucket| |Must be set. |
|`druid.oss.endpoint`|the endpoint url of your OSS storage| |Must be set.|
If you want to use OSS as deep storage, use the configurations below:
|Property|Description|Possible Values|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`| Global deep storage provider. Must be set to `oss` to make use of this extension. | oss |Must be set.|
|`druid.storage.oss.bucket`|storage bucket name.| | Must be set.|
|`druid.storage.oss.prefix`|a prefix string prepended to the file names for the segments published to aliyun OSS deep storage| druid/segments | |
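Putting the properties from the two tables above together, a hypothetical configuration sketch for OSS deep storage might look like the following; the endpoint, bucket, and keys are placeholders, not real values:

```
# load the extension named at the top of this page (assumed to be on the extensions path)
druid.extensions.loadList=["aliyun-oss-extensions"]

# OSS account and endpoint (placeholders)
druid.oss.accessKey=your-access-key-id
druid.oss.secretKey=your-access-key-secret
druid.oss.endpoint=http://oss-cn-hangzhou.aliyuncs.com

# use OSS as deep storage
druid.storage.type=oss
druid.storage.oss.bucket=your-bucket
druid.storage.oss.prefix=druid/segments
```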
To save index logs to OSS, apply the configurations below:
|Property|Description|Possible Values|Default|
|--------|---------------|-----------|-------|
|`druid.indexer.logs.type`| Global deep storage provider. Must be set to `oss` to make use of this extension. | oss |Must be set.|
|`druid.indexer.logs.oss.bucket`|the bucket used to keep logs| |Must be set.|
|`druid.indexer.logs.oss.prefix`|a prefix string prepended to the log files.| | |

View File

@@ -0,0 +1,99 @@
---
id: ambari-metrics-emitter
title: "Ambari Metrics Emitter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `ambari-metrics-emitter` extension.
## Introduction
This extension emits Druid metrics to an ambari-metrics carbon server.
Events are sent after being [pickled](http://ambari-metrics.readthedocs.org/en/latest/feeding-carbon.html#the-pickle-protocol); the size of the batch is configurable.
## Configuration
All the configuration parameters for ambari-metrics emitter are under `druid.emitter.ambari-metrics`.
|property|description|required?|default|
|--------|-----------|---------|-------|
|`druid.emitter.ambari-metrics.hostname`|The hostname of the ambari-metrics server.|yes|none|
|`druid.emitter.ambari-metrics.port`|The port of the ambari-metrics server.|yes|none|
|`druid.emitter.ambari-metrics.protocol`|The protocol used to send metrics to ambari metrics collector. One of http/https|no|http|
|`druid.emitter.ambari-metrics.trustStorePath`|Path to trustStore to be used for https|no|none|
|`druid.emitter.ambari-metrics.trustStoreType`|trustStore type to be used for https|no|none|
|`druid.emitter.ambari-metrics.trustStorePassword`|trustStore password to be used for https|no|none|
|`druid.emitter.ambari-metrics.batchSize`|Number of events to send as one batch.|no|100|
|`druid.emitter.ambari-metrics.eventConverter`| Filter and converter of druid events to ambari-metrics timeline event(please see next section). |yes|none|
|`druid.emitter.ambari-metrics.flushPeriod` | Queue flushing period in milliseconds. |no|1 minute|
|`druid.emitter.ambari-metrics.maxQueueSize`| Maximum size of the queue used to buffer events. |no|`MAX_INT`|
|`druid.emitter.ambari-metrics.alertEmitters`| List of emitters where alerts will be forwarded to. |no| empty list (no forwarding)|
|`druid.emitter.ambari-metrics.emitWaitTime` | Wait time in milliseconds to try to send the event; if it cannot be sent within this time, the event is dropped. |no|0|
|`druid.emitter.ambari-metrics.waitForEventTime` | waiting time in milliseconds if necessary for an event to become available. |no|1000 (1 sec)|
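As a sketch, the emitter could be enabled in `runtime.properties` along the lines below, assuming the emitter is selected with `druid.emitter=ambari-metrics`; the collector hostname and port are placeholders for your own deployment:
```properties
# Placeholder host/port; adjust for your Ambari Metrics Collector
druid.extensions.loadList=["ambari-metrics-emitter"]
druid.emitter=ambari-metrics
druid.emitter.ambari-metrics.hostname=ambari-collector.example.com
druid.emitter.ambari-metrics.port=6188
druid.emitter.ambari-metrics.eventConverter={"type":"all", "namespacePrefix": "druid.test", "appName":"druid"}
```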
### Druid to Ambari Metrics Timeline Event Converter
Ambari Metrics Timeline Event Converter defines a mapping between druid metrics name plus dimensions to a timeline event metricName.
ambari-metrics metric path is organized using the following schema:
`<namespacePrefix>.[<druid service name>].[<druid hostname>].<druid metrics dimensions>.<druid metrics name>`
Properly naming the metrics is critical to avoid conflicts, confusing data and potentially wrong interpretation later on.
Example `druid.historical.hist-host1:8080.MyDataSourceName.GroupBy.query/time`:
* `druid` -> namespace prefix
* `historical` -> service name
* `hist-host1:8080` -> druid hostname
* `MyDataSourceName` -> dimension value
* `GroupBy` -> dimension value
* `query/time` -> metric name
We have two different implementations of event converter:
#### Send-All converter
The first implementation, called `all`, sends all the Druid service metric events.
The path will be in the form `<namespacePrefix>.[<druid service name>].[<druid hostname>].<dimensions values ordered by dimension's name>.<metric>`
The user has control over `<namespacePrefix>.[<druid service name>].[<druid hostname>].`
```json
druid.emitter.ambari-metrics.eventConverter={"type":"all", "namespacePrefix": "druid.test", "appName":"druid"}
```
#### White-list based converter
The second implementation, called `whiteList`, sends only the white-listed metrics and dimensions.
As with the `all` converter, the user has control over `<namespacePrefix>.[<druid service name>].[<druid hostname>].`
The white-list based converter comes with a default white list map located under resources in `./src/main/resources/defaultWhiteListMap.json`.
The user can override the default white list map by supplying a property called `mapPath`.
This property is a String containing the path of the file containing the **white list map JSON object**.
For example, the following converter will read the map from the file `/pathPrefix/fileName.json`:
```json
druid.emitter.ambari-metrics.eventConverter={"type":"whiteList", "namespacePrefix": "druid.test", "ignoreHostname":true, "appName":"druid", "mapPath":"/pathPrefix/fileName.json"}
```
**Druid emits a huge number of metrics; we highly recommend using the `whiteList` converter.**

View File

@ -0,0 +1,30 @@
---
id: cassandra
title: "Apache Cassandra"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-cassandra-storage` extension.
[Apache Cassandra](http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra) can also
be leveraged for deep storage. This requires some additional Druid configuration as well as setting up the necessary
schema within a Cassandra keyspace.

View File

@ -0,0 +1,98 @@
---
id: cloudfiles
title: "Rackspace Cloud Files"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-cloudfiles-extensions` extension.
## Deep Storage
[Rackspace Cloud Files](http://www.rackspace.com/cloud/files/) is another option for deep storage. This requires some additional Druid configuration.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|cloudfiles||Must be set.|
|`druid.storage.region`||Rackspace Cloud Files region.|Must be set.|
|`druid.storage.container`||Rackspace Cloud Files container name.|Must be set.|
|`druid.storage.basePath`||Rackspace Cloud Files base path to use in the container.|Must be set.|
|`druid.storage.operationMaxRetries`||Number of tries before canceling a Rackspace operation.|10|
|`druid.cloudfiles.userName`||Rackspace Cloud username|Must be set.|
|`druid.cloudfiles.apiKey`||Rackspace Cloud API key.|Must be set.|
|`druid.cloudfiles.provider`|rackspace-cloudfiles-us,rackspace-cloudfiles-uk|Name of the provider depending on the region.|Must be set.|
|`druid.cloudfiles.useServiceNet`|true,false|Whether to use the internal service net.|true|
## Firehose
<a name="firehose"></a>
#### StaticCloudFilesFirehose
This firehose ingests events, similar to the StaticAzureBlobStoreFirehose, but from Rackspace's Cloud Files.
Data is newline delimited, with one JSON object per line and parsed as per the `InputRowParser` configuration.
The storage account is shared with the one used for Rackspace's Cloud Files deep storage functionality, but blobs can be in a different region and container.
As with the Azure blob store, the data is assumed to be gzipped if the file name ends in `.gz`.
This firehose is _splittable_ and can be used by [native parallel index tasks](../../ingestion/native-batch.md#parallel-task).
Since each split represents an object in this firehose, each worker task of `index_parallel` will read an object.
Sample spec:
```json
"firehose" : {
"type" : "static-cloudfiles",
"blobs": [
{
"region": "DFW"
"container": "container",
"path": "/path/to/your/file.json"
},
{
"region": "ORD"
"container": "anothercontainer",
"path": "/another/path.json"
}
]
}
```
This firehose provides caching and prefetching features. In IndexTask, a firehose can be read twice if intervals or
shardSpecs are not specified, and, in this case, caching can be useful. Prefetching is preferred when direct scan of objects is slow.
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be `static-cloudfiles`.|N/A|yes|
|blobs|JSON array of Cloud Files blobs.|N/A|yes|
|maxCacheCapacityBytes|Maximum size of the cache space in bytes. 0 means disabling cache. Cached files are not removed until the ingestion task completes.|1073741824|no|
|maxFetchCapacityBytes|Maximum size of the fetch space in bytes. 0 means disabling prefetch. Prefetched files are removed immediately once they are read.|1073741824|no|
|fetchTimeout|Timeout for fetching a Cloud Files object.|60000|no|
|maxFetchRetry|Maximum retry for fetching a Cloud Files object.|3|no|
Cloud Files Blobs:
|property|description|default|required?|
|--------|-----------|-------|---------|
|container|Name of the Cloud Files container|N/A|yes|
|path|The path where data is located.|N/A|yes|

View File

@ -0,0 +1,99 @@
---
id: distinctcount
title: "DistinctCount Aggregator"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) the `druid-distinctcount` extension.
Additionally, follow these steps:
1. First, use a single-dimension, hash-based partition spec to partition data by a single dimension, for example `visitor_id`. This makes sure all rows with a particular value for that dimension go into the same segment; otherwise the aggregator might over count. A sketch of such a partition spec is shown after this list.
2. Second, use distinctCount to calculate the distinct count, and make sure `queryGranularity` divides `segmentGranularity` exactly (that is, `segmentGranularity` is a multiple of `queryGranularity`), or else the result will be wrong.
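As a reference for step 1, a hash-based `partitionsSpec` that partitions on `visitor_id` might look like the sketch below. The field names such as `targetRowsPerSegment` are illustrative assumptions; check the partitioning options of your ingestion method before using them.
```json
"partitionsSpec": {
  "type": "hashed",
  "targetRowsPerSegment": 5000000,
  "partitionDimensions": ["visitor_id"]
}
```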
There are some limitations. When used with groupBy, the number of groupBy keys in every segment should not exceed `maxIntermediateRows`; if it does, the result will be wrong. When used with topN, `numValuesPerPass` should not be too big; if it is, distinctCount will use a lot of memory and might cause the JVM to run out of memory.
Example:
## Timeseries query
```json
{
"queryType": "timeseries",
"dataSource": "sample_datasource",
"granularity": "day",
"aggregations": [
{
"type": "distinctCount",
"name": "uv",
"fieldName": "visitor_id"
}
],
"intervals": [
"2016-03-01T00:00:00.000/2013-03-20T00:00:00.000"
]
}
```
## TopN query
```json
{
"queryType": "topN",
"dataSource": "sample_datasource",
"dimension": "sample_dim",
"threshold": 5,
"metric": "uv",
"granularity": "all",
"aggregations": [
{
"type": "distinctCount",
"name": "uv",
"fieldName": "visitor_id"
}
],
"intervals": [
"2016-03-06T00:00:00/2016-03-06T23:59:59"
]
}
```
## GroupBy query
```json
{
"queryType": "groupBy",
"dataSource": "sample_datasource",
"dimensions": ["sample_dim"],
"granularity": "all",
"aggregations": [
{
"type": "distinctCount",
"name": "uv",
"fieldName": "visitor_id"
}
],
"intervals": [
"2016-03-06T00:00:00/2016-03-06T23:59:59"
]
}
```

View File

@ -0,0 +1,103 @@
---
id: gce-extensions
title: "GCE Extensions"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `gce-extensions`.
At the moment, this extension only enables Druid to autoscale instances in GCE.
The extension manages the instances to be scaled up and down through the use of the [Managed Instance Groups](https://cloud.google.com/compute/docs/instance-groups/creating-groups-of-managed-instances#resize_managed_group)
of GCE (MIG from now on). This choice has been made to ease the configuration of the machines and simplify their
management.
For this reason, in order to use this extension, the user must have created
1. An instance template with the right machine type and image to be used to run the MiddleManager
2. A MIG that has been configured to use the instance template created in the point above
Moreover, in order to be able to rescale the machines in the MIG, the Overlord must run with a service account
guaranteeing the following two scopes from the [Compute Engine API](https://developers.google.com/identity/protocols/googlescopes#computev1)
- `https://www.googleapis.com/auth/cloud-platform`
- `https://www.googleapis.com/auth/compute`
## Overlord Dynamic Configuration
The Overlord can dynamically change worker behavior.
The JSON object can be submitted to the Overlord via a POST request at:
```
http://<OVERLORD_IP>:<port>/druid/indexer/v1/worker
```
Optional Header Parameters for auditing the config change can also be specified.
|Header Param Name| Description | Default |
|----------|-------------|---------|
|`X-Druid-Author`| author making the config change|""|
|`X-Druid-Comment`| comment describing the change being done|""|
A sample worker config spec is shown below:
```json
{
"autoScaler": {
"envConfig" : {
"numInstances" : 1,
"projectId" : "super-project",
"zoneName" : "us-central-1",
"managedInstanceGroupName" : "druid-middlemanagers"
},
"maxNumWorkers" : 4,
"minNumWorkers" : 2,
"type" : "gce"
}
}
```
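For instance, a spec like the one above could be submitted with any HTTP client; the following is a sketch using `curl`, where the file name and audit header values are placeholders:
```
# gce-worker-config.json holds the JSON spec above (file name is illustrative)
curl -X POST \
  -H 'Content-Type: application/json' \
  -H 'X-Druid-Author: ops' \
  -H 'X-Druid-Comment: enable GCE autoscaling' \
  --data @gce-worker-config.json \
  http://<OVERLORD_IP>:<port>/druid/indexer/v1/worker
```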
The configuration of the autoscaler is quite simple and it is made of two levels only.
The external level specifies the `type` (always `gce` in this case) and two numeric values,
`maxNumWorkers` and `minNumWorkers`, used to define the boundaries between which the
number of instances must stay at any time.
The internal level is the `envConfig` and it is used to specify
- The `numInstances` used to specify how many workers will be spawned at each
 request to provision more workers. This is safe to leave at `1`
- The `projectId` used to specify the name of the project in which the MIG resides
- The `zoneName` used to identify in which zone of the world the MIG is
- The `managedInstanceGroupName` used to specify the MIG containing the instances created or
removed
Please refer to the Overlord Dynamic Configuration section in the main [documentation](../../configuration/index.md)
for parameters other than the ones specified here, such as `selectStrategy` etc.
## Known limitations
- The module internally uses the [ListManagedInstances](https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/listManagedInstances)
call from the API and, while the documentation of the API states that the call can be paged through using the
`pageToken` argument, the responses to such call do not provide any `nextPageToken` to set such parameter. This means
that the extension can operate safely with a maximum of 500 MiddleManager instances at any time (the maximum number
of instances to be returned for each call).

View File

@ -0,0 +1,117 @@
---
id: graphite
title: "Graphite Emitter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `graphite-emitter` extension.
## Introduction
This extension emits Druid metrics to a Graphite carbon server.
Metrics can be sent by using the [plaintext](http://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol) or [pickle](http://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol) protocol.
The pickle protocol is more efficient and supports sending batches of metrics in one request (the plaintext protocol sends only one metric per request); the batch size is configurable.
## Configuration
All the configuration parameters for graphite emitter are under `druid.emitter.graphite`.
|property|description|required?|default|
|--------|-----------|---------|-------|
|`druid.emitter.graphite.hostname`|The hostname of the graphite server.|yes|none|
|`druid.emitter.graphite.port`|The port of the graphite server.|yes|none|
|`druid.emitter.graphite.batchSize`|Number of events to send as one batch (only for pickle protocol)|no|100|
|`druid.emitter.graphite.protocol`|Graphite protocol; available protocols: pickle, plaintext.|no|pickle|
|`druid.emitter.graphite.eventConverter`| Filter and converter of druid events to graphite event (please see next section).|yes|none|
|`druid.emitter.graphite.flushPeriod` | Queue flushing period in milliseconds. |no|1 minute|
|`druid.emitter.graphite.maxQueueSize`| Maximum size of the queue used to buffer events. |no|`MAX_INT`|
|`druid.emitter.graphite.alertEmitters`| List of emitters where alerts will be forwarded to. This is a JSON list of emitter names, e.g. `["logging", "http"]`|no| empty list (no forwarding)|
|`druid.emitter.graphite.requestLogEmitters`| List of emitters where request logs (i.e., query logging events sent to emitters when `druid.request.logging.type` is set to `emitter`) will be forwarded to. This is a JSON list of emitter names, e.g. `["logging", "http"]`|no| empty list (no forwarding)|
|`druid.emitter.graphite.emitWaitTime` | Wait time in milliseconds to try to send the event; if it cannot be sent within this time, the event is dropped. |no|0|
|`druid.emitter.graphite.waitForEventTime` | waiting time in milliseconds if necessary for an event to become available. |no|1000 (1 sec)|
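A minimal `runtime.properties` sketch, assuming a carbon pickle receiver reachable at the placeholder host below on port 2004 (pickle is the default protocol):
```properties
# Placeholder host/port; adjust for your Graphite installation
druid.extensions.loadList=["graphite-emitter"]
druid.emitter=graphite
druid.emitter.graphite.hostname=graphite.example.com
druid.emitter.graphite.port=2004
druid.emitter.graphite.eventConverter={"type":"all", "namespacePrefix": "druid"}
```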
### Supported event types
The graphite emitter only emits service metric events to graphite (See [Druid Metrics](../../operations/metrics.md) for a list of metrics).
Alerts and request logs are not sent to graphite. These event types are not well represented in Graphite, which is more suited for timeseries views on numeric metrics, vs. storing non-numeric log events.
Instead, alerts and request logs are optionally forwarded to other emitter implementations, specified by `druid.emitter.graphite.alertEmitters` and `druid.emitter.graphite.requestLogEmitters` respectively.
### Druid to Graphite Event Converter
Graphite Event Converter defines a mapping between druid metrics name plus dimensions to a Graphite metric path.
Graphite metric path is organized using the following schema:
`<namespacePrefix>.[<druid service name>].[<druid hostname>].<druid metrics dimensions>.<druid metrics name>`
Properly naming the metrics is critical to avoid conflicts, confusing data and potentially wrong interpretation later on.
Example `druid.historical.hist-host1_yahoo_com:8080.MyDataSourceName.GroupBy.query/time`:
* `druid` -> namespace prefix
* `historical` -> service name
* `hist-host1.yahoo.com:8080` -> druid hostname
* `MyDataSourceName` -> dimension value
* `GroupBy` -> dimension value
* `query/time` -> metric name
We have two different implementations of event converter:
#### Send-All converter
The first implementation, called `all`, sends all the Druid service metric events.
The path will be in the form `<namespacePrefix>.[<druid service name>].[<druid hostname>].<dimensions values ordered by dimension's name>.<metric>`
The user has control over `<namespacePrefix>.[<druid service name>].[<druid hostname>].`
You can omit the hostname by setting `ignoreHostname=true`
`druid.SERVICE_NAME.dataSourceName.queryType.query/time`
You can omit the service name by setting `ignoreServiceName=true`
`druid.HOSTNAME.dataSourceName.queryType.query/time`
Elements in the metric name are separated by "/" by default, so Graphite will create all metrics on one level. If you want metrics in a tree structure, you have to set `replaceSlashWithDot=true`
Original: `druid.HOSTNAME.dataSourceName.queryType.query/time`
Changed: `druid.HOSTNAME.dataSourceName.queryType.query.time`
```json
druid.emitter.graphite.eventConverter={"type":"all", "namespacePrefix": "druid.test", "ignoreHostname":true, "ignoreServiceName":true}
```
#### White-list based converter
The second implementation, called `whiteList`, sends only the white-listed metrics and dimensions.
As with the `all` converter, the user has control over `<namespacePrefix>.[<druid service name>].[<druid hostname>].`
The white-list based converter comes with a default white list map located under resources in `./src/main/resources/defaultWhiteListMap.json`.
The user can override the default white list map by supplying a property called `mapPath`.
This property is a String containing the path of the file containing the **white list map JSON object**.
For example, the following converter will read the map from the file `/pathPrefix/fileName.json`:
```json
druid.emitter.graphite.eventConverter={"type":"whiteList", "namespacePrefix": "druid.test", "ignoreHostname":true, "ignoreServiceName":true, "mapPath":"/pathPrefix/fileName.json"}
```
**Druid emits a huge number of metrics; we highly recommend using the `whiteList` converter.**

View File

@ -0,0 +1,67 @@
---
id: influx
title: "InfluxDB Line Protocol Parser"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-influx-extensions`.
This extension enables Druid to parse the [InfluxDB Line Protocol](https://docs.influxdata.com/influxdb/v1.5/write_protocols/line_protocol_tutorial/), a popular text-based timeseries metric serialization format.
## Line Protocol
A typical line looks like this:
```cpu,application=db,host=prdb123,region=us-east-1 usage_idle=99.24,usage_user=0.55 1520722030000000000```
which contains four parts:
- measurement: A string indicating the name of the measurement represented (e.g. cpu, network, web_requests)
- tags: zero or more key-value pairs (i.e. dimensions)
- measurements: one or more key-value pairs; values can be numeric, boolean, or string
- timestamp: nanoseconds since Unix epoch (the parser truncates it to milliseconds)
The parser extracts these fields into a map, giving the measurement the key `measurement` and the timestamp the key `__ts`. The tag and measurement keys are copied verbatim, so users should take care to avoid name collisions. It is up to the ingestion spec to decide which fields should be treated as dimensions and which should be treated as metrics (typically tags correspond to dimensions and measurements correspond to metrics).
The parser is configured like so:
```json
"parser": {
"type": "string",
"parseSpec": {
"format": "influx",
"timestampSpec": {
"column": "__ts",
"format": "millis"
},
"dimensionsSpec": {
"dimensionExclusions": [
"__ts"
]
},
"whitelistMeasurements": [
"cpu"
]
}
```
The `whitelistMeasurements` field is an optional list of strings. If present, measurements that do not match one of the strings in the list will be ignored.

View File

@ -0,0 +1,74 @@
---
id: influxdb-emitter
title: "InfluxDB Emitter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-influxdb-emitter` extension.
## Introduction
This extension emits druid metrics to [InfluxDB](https://www.influxdata.com/time-series-platform/influxdb/) over HTTP. Currently this emitter only emits service metric events to InfluxDB (See [Druid metrics](../../operations/metrics.md) for a list of metrics).
When a metric event is fired it is added to a queue of events. After a configurable amount of time, the events on the queue are transformed to InfluxDB's line protocol
and POSTed to the InfluxDB HTTP API. The entire queue is flushed at this point. The queue is also flushed as the emitter is shutdown.
Note that authentication and authorization must be [enabled](https://docs.influxdata.com/influxdb/v1.7/administration/authentication_and_authorization/) on the InfluxDB server.
## Configuration
All the configuration parameters for the influxdb emitter are under `druid.emitter.influxdb`.
|Property|Description|Required?|Default|
|--------|-----------|---------|-------|
|`druid.emitter.influxdb.hostname`|The hostname of the InfluxDB server.|Yes|N/A|
|`druid.emitter.influxdb.port`|The port of the InfluxDB server.|No|8086|
|`druid.emitter.influxdb.databaseName`|The name of the database in InfluxDB.|Yes|N/A|
|`druid.emitter.influxdb.maxQueueSize`|The size of the queue that holds events.|No|Integer.MAX_VALUE(=2^31-1)|
|`druid.emitter.influxdb.flushPeriod`|How often (in milliseconds) the events queue is parsed into Line Protocol and POSTed to InfluxDB.|No|60000|
|`druid.emitter.influxdb.flushDelay`|How long (in milliseconds) the scheduled method will wait until it first runs.|No|60000|
|`druid.emitter.influxdb.influxdbUserName`|The username for authenticating with the InfluxDB database.|Yes|N/A|
|`druid.emitter.influxdb.influxdbPassword`|The password of the database authorized user|Yes|N/A|
|`druid.emitter.influxdb.dimensionWhitelist`|A whitelist of metric dimensions to include as tags|No|`["dataSource","type","numMetrics","numDimensions","threshold","dimension","taskType","taskStatus","tier"]`|
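For reference, a minimal `runtime.properties` sketch; the host, database name, and credentials are placeholders, and the emitter is assumed to be selected with `druid.emitter=influxdb`:
```properties
# Placeholder values; the database must already exist with auth enabled (see above)
druid.extensions.loadList=["druid-influxdb-emitter"]
druid.emitter=influxdb
druid.emitter.influxdb.hostname=influxdb.example.com
druid.emitter.influxdb.databaseName=druid_metrics
druid.emitter.influxdb.influxdbUserName=druid
druid.emitter.influxdb.influxdbPassword=changeme
```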
## InfluxDB Line Protocol
An example of how this emitter parses a Druid metric event into InfluxDB's [line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_reference/) is given here:
The syntax of the line protocol is:
`<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>]`
where timestamp is in nanoseconds since epoch.
A typical service metric event as recorded by Druid's logging emitter is: `Event [{"feed":"metrics","timestamp":"2017-10-31T09:09:06.857Z","service":"druid/historical","host":"historical001:8083","version":"0.11.0-SNAPSHOT","metric":"query/cache/total/hits","value":34787256}]`.
This event is parsed into line protocol according to these rules:
* The measurement becomes druid_query since query is the first part of the metric.
* The tags are service=druid/historical, hostname=historical001, metric=druid_cache_total. (The metric tag is the middle part of the druid metric separated with _ and preceded by druid_. Another example would be if an event has metric=query/time then there is no middle part and hence no metric tag)
* The field is druid_hits since this is the last part of the metric.
This gives the following String which can be POSTed to InfluxDB: `"druid_query,service=druid/historical,hostname=historical001,metric=druid_cache_total druid_hits=34787256 1509440946857000000"`
The InfluxDB emitter has a white list of dimensions
which will be added as tags to the line protocol string if the metric has a dimension from the white list.
The value of the dimension is sanitized such that every occurrence of a dot or whitespace is replaced with a `_`.

View File

@ -0,0 +1,54 @@
---
id: kafka-emitter
title: "Kafka Emitter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `kafka-emitter` extension.
## Introduction
This extension emits Druid metrics to [Apache Kafka](https://kafka.apache.org) directly in JSON format.<br>
Kafka has not only a rich ecosystem but also a readily available consumer API.
So, if you already use Kafka, it is easy to integrate various tools or UIs
to monitor the status of your Druid cluster with this extension.
## Configuration
All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
|property|description|required?|default|
|--------|-----------|---------|-------|
|`druid.emitter.kafka.bootstrap.servers`|Comma-separated list of Kafka brokers (`[hostname:port],[hostname:port]...`).|yes|none|
|`druid.emitter.kafka.metric.topic`|Kafka topic name to which service metrics are emitted.|yes|none|
|`druid.emitter.kafka.alert.topic`|Kafka topic name to which alerts are emitted.|yes|none|
|`druid.emitter.kafka.producer.config`|JSON-formatted configuration of additional properties to set on the Kafka producer.|no|none|
|`druid.emitter.kafka.clusterName`|Optional name of your Druid cluster. It can help group metrics in your monitoring environment.|no|none|
### Example
```
druid.emitter.kafka.bootstrap.servers=hostname1:9092,hostname2:9092
druid.emitter.kafka.metric.topic=druid-metric
druid.emitter.kafka.alert.topic=druid-alert
druid.emitter.kafka.producer.config={"max.block.ms":10000}
```
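Note that the extension also has to be loaded and the emitter selected; a sketch, assuming `kafka` is the emitter name registered by this extension:
```
# In addition to the properties above
druid.extensions.loadList=["kafka-emitter"]
druid.emitter=kafka
```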

View File

@ -0,0 +1,136 @@
---
id: materialized-view
title: "Materialized View"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid feature, make sure to only load `materialized-view-selection` on Broker and load `materialized-view-maintenance` on Overlord. In addition, this feature currently requires a Hadoop cluster.
This feature enables Druid to greatly improve query performance, especially when the query dataSource has a very large number of dimensions but the query only requires a few of them. This feature includes two parts: `materialized-view-maintenance` and `materialized-view-selection`.
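For example, the load lists could look like the following sketch, one per process type's properties file (alongside whatever other extensions those processes already load):
```properties
# Broker runtime.properties
druid.extensions.loadList=["materialized-view-selection"]

# Overlord runtime.properties
druid.extensions.loadList=["materialized-view-maintenance"]
```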
## Materialized-view-maintenance
In materialized-view-maintenance, the dataSources that users ingest are called "base-dataSources". For each base-dataSource, we can submit `derivativeDataSource` supervisors to create and maintain other dataSources, which we call "derived-dataSources". The dimensions and metrics of a derived-dataSource are a subset of the base-dataSource's.
The `derivativeDataSource` supervisor is used to keep the timeline of derived-dataSource consistent with base-dataSource. Each `derivativeDataSource` supervisor is responsible for one derived-dataSource.
A sample derivativeDataSource supervisor spec is shown below:
```json
{
"type": "derivativeDataSource",
"baseDataSource": "wikiticker",
"dimensionsSpec": {
"dimensions": [
"isUnpatrolled",
"metroCode",
"namespace",
"page",
"regionIsoCode",
"regionName",
"user"
]
},
"metricsSpec": [
{
"name": "count",
"type": "count"
},
{
"name": "added",
"type": "longSum",
"fieldName": "added"
}
],
"tuningConfig": {
"type": "hadoop"
}
}
```
**Supervisor Configuration**
|Field|Description|Required|
|--------|-----------|---------|
|type |The supervisor type. This should always be `derivativeDataSource`.|yes|
|baseDataSource |The name of base dataSource. This dataSource data should be already stored inside Druid, and the dataSource will be used as input data.|yes|
|dimensionsSpec |Specifies the dimensions of the data. These dimensions must be the subset of baseDataSource's dimensions.|yes|
|metricsSpec |A list of aggregators. These metrics must be the subset of baseDataSource's metrics. See [aggregations](../../querying/aggregations.md).|yes|
|tuningConfig |TuningConfig must be HadoopTuningConfig. See [Hadoop tuning config](../../ingestion/hadoop.html#tuningconfig).|yes|
|dataSource |The name of this derived dataSource. |no(default=baseDataSource-hashCode of supervisor)|
|hadoopDependencyCoordinates |A JSON array of Hadoop dependency coordinates that Druid will use, this property will override the default Hadoop coordinates. Once specified, Druid will look for those Hadoop dependencies from the location specified by druid.extensions.hadoopDependenciesDir |no|
|classpathPrefix |Classpath that will be prepended for the Peon process. |no|
|context |See below. |no|
**Context**
|Field|Description|Required|
|--------|-----------|---------|
|maxTaskCount |The max number of tasks the supervisor can submit simultaneously. |no(default=1)|
## Materialized-view-selection
In materialized-view-selection, we implement a new query type `view`. When we request a view query, Druid will try its best to optimize the query based on query dataSource and intervals.
A sample view query spec is shown below:
```json
{
"queryType": "view",
"query": {
"queryType": "groupBy",
"dataSource": "wikiticker",
"granularity": "all",
"dimensions": [
"user"
],
"limitSpec": {
"type": "default",
"limit": 1,
"columns": [
{
"dimension": "added",
"direction": "descending",
"dimensionOrder": "numeric"
}
]
},
"aggregations": [
{
"type": "longSum",
"name": "added",
"fieldName": "added"
}
],
"intervals": [
"2015-09-12/2015-09-13"
]
}
}
```
There are 2 parts in a view query:
|Field|Description|Required|
|--------|-----------|---------|
|queryType |The query type. This should always be `view`. |yes|
|query |The real query of this `view` query. The real query must be [groupBy](../../querying/groupbyquery.md), [topN](../../querying/topnquery.md), or [timeseries](../../querying/timeseriesquery.md) type.|yes|
**Note that Materialized View is currently designated as experimental. Please make sure the clocks of all processes are synchronized and increase monotonically. Otherwise, unexpected errors may appear in query results.**

View File

@ -0,0 +1,125 @@
---
id: momentsketch-quantiles
title: "Moment Sketches for Approximate Quantiles module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This module provides aggregators for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library.
The momentsketch provides coarse quantile estimates with less space and aggregation time overheads than traditional sketches, approaching the performance of counts and sums by reconstructing distributions from computed statistics.
To use this Apache Druid extension, make sure you [include](../../development/extensions.md#loading-extensions) the extension in your config file:
```
druid.extensions.loadList=["druid-momentsketch"]
```
### Aggregator
The result of the aggregation is a momentsketch that is the union of all sketches either built from raw data or read from the segments.
The `momentSketch` aggregator operates over raw data while the `momentSketchMerge` aggregator should be used when aggregating precomputed sketches.
```json
{
"type" : <aggregator_type>,
"name" : <output_name>,
"fieldName" : <input_name>,
"k" : <int>,
"compress" : <boolean>
}
```
|property|description|required?|
|--------|-----------|---------|
|type|Type of aggregator desired. Either "momentSketch" or "momentSketchMerge" |yes|
|name|A String for the output (result) name of the calculation.|yes|
|fieldName|A String for the name of the input field (can contain sketches or raw numeric values).|yes|
|k|Parameter that determines the accuracy and size of the sketch. Higher k means higher accuracy but more space to store sketches. Usable range is generally [3,15] |no, defaults to 13.|
|compress|Flag for whether the aggregator compresses numeric values using arcsinh. Can improve robustness to skewed and long-tailed distributions, but reduces accuracy slightly on more uniform distributions.|no, defaults to true|
### Post Aggregators
Users can query for a set of quantiles using the `momentSketchSolveQuantiles` post-aggregator on the sketches created by the `momentSketch` or `momentSketchMerge` aggregators.
```json
{
"type" : "momentSketchSolveQuantiles",
"name" : <output_name>,
"field" : <reference to moment sketch>,
"fractions" : <array of doubles in [0,1]>
}
```
Users can also query for the min/max of a distribution:
```json
{
  "type" : "momentSketchMin" | "momentSketchMax",
  "name" : <output_name>,
  "field" : <reference to moment sketch>
}
```
### Example
As an example of a query with sketches pre-aggregated at ingestion time, one could set up the following aggregator at ingest:
```json
{
  "type": "momentSketch",
  "name": "sketch",
  "fieldName": "value",
  "k": 10,
  "compress": true
}
```
and make queries using the following aggregator + post-aggregator:
```json
{
"aggregations": [{
"type": "momentSketchMerge",
"name": "sketch",
"fieldName": "sketch",
"k": 10,
"compress": true
}],
"postAggregations": [
{
"type": "momentSketchSolveQuantiles",
"name": "quantiles",
"fractions": [0.1, 0.5, 0.9],
"field": {
"type": "fieldAccess",
"fieldName": "sketch"
}
},
{
"type": "momentSketchMin",
"name": "min",
"field": {
"type": "fieldAccess",
"fieldName": "sketch"
}
}]
}
```

View File

@ -0,0 +1,349 @@
---
id: moving-average-query
title: "Moving Average Query"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
## Overview
**Moving Average Query** is an extension which provides support for [Moving Average](https://en.wikipedia.org/wiki/Moving_average) and other Aggregate [Window Functions](https://en.wikibooks.org/wiki/Structured_Query_Language/Window_functions) in Druid queries.
These Aggregate Window Functions consume standard Druid Aggregators and output additional windowed aggregates called [Averagers](#averagers).
#### High level algorithm
Moving Average encapsulates the [groupBy query](../../querying/groupbyquery.md) (Or [timeseries](../../querying/timeseriesquery.md) in case of no dimensions) in order to rely on the maturity of these query types.
It runs the query in two main phases:
1. Runs an inner [groupBy](../../querying/groupbyquery.html) or [timeseries](../../querying/timeseriesquery.html) query to compute Aggregators (i.e. daily count of events).
2. Passes over aggregated results in Broker, in order to compute Averagers (i.e. moving 7 day average of the daily count).
#### Main enhancements provided by this extension:
1. Functionality: Extending druid query functionality (i.e. initial introduction of Window Functions).
2. Performance: Improving performance of such moving aggregations by eliminating multiple segment scans.
#### Further reading
[Moving Average](https://en.wikipedia.org/wiki/Moving_average)
[Window Functions](https://en.wikibooks.org/wiki/Structured_Query_Language/Window_functions)
[Analytic Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts)
## Operations
To use this extension, make sure to [load](../../development/extensions.md#loading-extensions) `druid-moving-average-query` only to the Broker.
## Configuration
There are currently no configuration properties specific to Moving Average.
## Limitations
* movingAverage is missing support for the following groupBy properties: `subtotalsSpec`, `virtualColumns`.
* movingAverage is missing support for the following timeseries properties: `descending`.
* movingAverage is missing support for [SQL-compatible null handling](https://github.com/apache/druid/issues/4349) (So setting druid.generic.useDefaultValueForNull in configuration will give an error).
## Query spec
* Most properties in the query spec are derived from [groupBy query](../../querying/groupbyquery.md) / [timeseries](../../querying/timeseriesquery.md); see the documentation for these query types.
|property|description|required?|
|--------|-----------|---------|
|queryType|This String should always be "movingAverage"; this is the first thing Druid looks at to figure out how to interpret the query.|yes|
|dataSource|A String or Object defining the data source to query, very similar to a table in a relational database. See [DataSource](../../querying/datasource.md) for more information.|yes|
|dimensions|A JSON list of [DimensionSpec](../../querying/dimensionspecs.md) (Notice that property is optional)|no|
|limitSpec|See [LimitSpec](../../querying/limitspec.md)|no|
|having|See [Having](../../querying/having.md)|no|
|granularity|A period granularity; See [Period Granularities](../../querying/granularities.html#period-granularities)|yes|
|filter|See [Filters](../../querying/filters.md)|no|
|aggregations|Aggregations forms the input to Averagers; See [Aggregations](../../querying/aggregations.md)|yes|
|postAggregations|Supports only aggregations as input; See [Post Aggregations](../../querying/post-aggregations.md)|no|
|intervals|A JSON list of ISO-8601 intervals. This defines the time ranges to run the query over.|yes|
|context|An additional JSON Object which can be used to specify certain flags.|no|
|averagers|Defines the moving average function; See [Averagers](#averagers)|yes|
|postAveragers|Support input of both averagers and aggregations; Syntax is identical to postAggregations (See [Post Aggregations](../../querying/post-aggregations.md))|no|
## Averagers
Averagers are used to define the Moving-Average function. Averagers are not limited to an average - they can also provide other types of window functions such as MAX()/MIN().
### Properties
These are properties which are common to all Averagers:
|property|description|required?|
|--------|-----------|---------|
|type|Averager type; See [Averager types](#averager-types)|yes|
|name|Averager name|yes|
|fieldName|Input name (An aggregation name)|yes|
|buckets|Number of lookback buckets (time periods), including current one. Must be >0|yes|
|cycleSize|Cycle size; Used to calculate day-of-week option; See [Cycle size (Day of Week)](#cycle-size-day-of-week)|no, defaults to 1|
### Averager types:
* [Standard averagers](#standard-averagers):
* doubleMean
* doubleMeanNoNulls
* doubleSum
* doubleMax
* doubleMin
* longMean
* longMeanNoNulls
* longSum
* longMax
* longMin
#### Standard averagers
These averagers offer five functions:
* Mean (Average)
* MeanNoNulls (Ignores empty buckets).
* Sum
* Max
* Min
**Ignoring nulls**:
Using a MeanNoNulls averager is useful when the interval starts at the dataset beginning time.
In that case, the first records will ignore missing buckets and the average won't be artificially low.
However, this also means that empty days in a sparse dataset will also be ignored.
Example of usage:
```json
{ "type" : "doubleMean", "name" : <output_name>, "fieldName": <input_name> }
```
### Cycle size (Day of Week)
This optional parameter is used to calculate over a single bucket within each cycle instead of all buckets.
A prime example would be weekly buckets, resulting in a Day of Week calculation. (Other examples: Month of year, Hour of day).
I.e. when using these parameters:
* *granularity*: period=P1D (daily)
* *buckets*: 28
* *cycleSize*: 7
Within each output record, the averager will compute the result over the following buckets: current (#0), #7, #14, #21.
Whereas without specifying cycleSize it would have computed over all 28 buckets.
## Examples
All examples are based on the Wikipedia dataset provided in the Druid [tutorials](../../tutorials/index.md).
### Basic example
Calculating a 7-buckets moving average for Wikipedia edit deltas.
Query syntax:
```json
{
"queryType": "movingAverage",
"dataSource": "wikipedia",
"granularity": {
"type": "period",
"period": "PT30M"
},
"intervals": [
"2015-09-12T00:00:00Z/2015-09-13T00:00:00Z"
],
"aggregations": [
{
"name": "delta30Min",
"fieldName": "delta",
"type": "longSum"
}
],
"averagers": [
{
"name": "trailing30MinChanges",
"fieldName": "delta30Min",
"type": "longMean",
"buckets": 7
}
]
}
```
Result:
```json
[ {
"version" : "v1",
"timestamp" : "2015-09-12T00:30:00.000Z",
"event" : {
"delta30Min" : 30490,
"trailing30MinChanges" : 4355.714285714285
}
}, {
"version" : "v1",
"timestamp" : "2015-09-12T01:00:00.000Z",
"event" : {
"delta30Min" : 96526,
"trailing30MinChanges" : 18145.14285714286
}
}, {
...
...
...
}, {
"version" : "v1",
"timestamp" : "2015-09-12T23:00:00.000Z",
"event" : {
"delta30Min" : 119100,
"trailing30MinChanges" : 198697.2857142857
}
}, {
"version" : "v1",
"timestamp" : "2015-09-12T23:30:00.000Z",
"event" : {
"delta30Min" : 177882,
"trailing30MinChanges" : 193890.0
}
} ]
```
### Post averager example
Calculating a 7-buckets moving average for Wikipedia edit deltas, plus a ratio between the current period and the moving average.
Query syntax:
```json
{
"queryType": "movingAverage",
"dataSource": "wikipedia",
"granularity": {
"type": "period",
"period": "PT30M"
},
"intervals": [
"2015-09-12T22:00:00Z/2015-09-13T00:00:00Z"
],
"aggregations": [
{
"name": "delta30Min",
"fieldName": "delta",
"type": "longSum"
}
],
"averagers": [
{
"name": "trailing30MinChanges",
"fieldName": "delta30Min",
"type": "longMean",
"buckets": 7
}
],
"postAveragers" : [
{
"name": "ratioTrailing30MinChanges",
"type": "arithmetic",
"fn": "/",
"fields": [
{
"type": "fieldAccess",
"fieldName": "delta30Min"
},
{
"type": "fieldAccess",
"fieldName": "trailing30MinChanges"
}
]
}
]
}
```
Result:
```json
[ {
"version" : "v1",
"timestamp" : "2015-09-12T22:00:00.000Z",
"event" : {
"delta30Min" : 144269,
"trailing30MinChanges" : 204088.14285714287,
"ratioTrailing30MinChanges" : 0.7068955500319539
}
}, {
"version" : "v1",
"timestamp" : "2015-09-12T22:30:00.000Z",
"event" : {
"delta30Min" : 242860,
"trailing30MinChanges" : 214031.57142857142,
"ratioTrailing30MinChanges" : 1.134692411867141
}
}, {
"version" : "v1",
"timestamp" : "2015-09-12T23:00:00.000Z",
"event" : {
"delta30Min" : 119100,
"trailing30MinChanges" : 198697.2857142857,
"ratioTrailing30MinChanges" : 0.5994042624782422
}
}, {
"version" : "v1",
"timestamp" : "2015-09-12T23:30:00.000Z",
"event" : {
"delta30Min" : 177882,
"trailing30MinChanges" : 193890.0,
"ratioTrailing30MinChanges" : 0.9174377224199288
}
} ]
```
### Cycle size example
Calculating an average over the same 10-minute slot within each hour of the last 3 hours:
Query syntax:
```json
{
"queryType": "movingAverage",
"dataSource": "wikipedia",
"granularity": {
"type": "period",
"period": "PT10M"
},
"intervals": [
"2015-09-12T00:00:00Z/2015-09-13T00:00:00Z"
],
"aggregations": [
{
"name": "delta10Min",
"fieldName": "delta",
"type": "doubleSum"
}
],
"averagers": [
{
"name": "trailing10MinPerHourChanges",
"fieldName": "delta10Min",
"type": "doubleMeanNoNulls",
"buckets": 18,
"cycleSize": 6
}
]
}
```

View File

@ -0,0 +1,62 @@
---
id: opentsdb-emitter
title: "OpenTSDB Emitter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `opentsdb-emitter` extension.
## Introduction
This extension emits Druid metrics to [OpenTSDB](https://github.com/OpenTSDB/opentsdb) over HTTP (using the Jersey client). This emitter only emits service metric events to OpenTSDB (see [Druid metrics](../../operations/metrics.md) for a list of metrics).
## Configuration
All the configuration parameters for the OpenTSDB emitter are under `druid.emitter.opentsdb`.
|property|description|required?|default|
|--------|-----------|---------|-------|
|`druid.emitter.opentsdb.host`|The host of the OpenTSDB server.|yes|none|
|`druid.emitter.opentsdb.port`|The port of the OpenTSDB server.|yes|none|
|`druid.emitter.opentsdb.connectionTimeout`|`Jersey client` connection timeout (in milliseconds).|no|2000|
|`druid.emitter.opentsdb.readTimeout`|`Jersey client` read timeout (in milliseconds).|no|2000|
|`druid.emitter.opentsdb.flushThreshold`|Queue flushing threshold. (Events will be sent as one batch.)|no|100|
|`druid.emitter.opentsdb.maxQueueSize`|Maximum size of the queue used to buffer events.|no|1000|
|`druid.emitter.opentsdb.consumeDelay`|Queue consuming delay (in milliseconds). Actually, we use `ScheduledExecutorService` to schedule consuming events, so this `consumeDelay` means the delay between the termination of one execution and the commencement of the next. If your Druid processes produce metric events fast, then you should decrease this `consumeDelay` or increase the `maxQueueSize`.|no|10000|
|`druid.emitter.opentsdb.metricMapPath`|JSON file defining the desired metrics and dimensions for every Druid metric|no|./src/main/resources/defaultMetrics.json|
|`druid.emitter.opentsdb.namespacePrefix`|Optional (string) prefix for metric names, for example the default metric name `query.count` with a namespacePrefix set to `druid` would be emitted as `druid.query.count` |no|null|
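A minimal `runtime.properties` sketch; the host is a placeholder, and 4242 is OpenTSDB's usual HTTP port:
```properties
# Placeholder host/port; adjust for your OpenTSDB installation
druid.extensions.loadList=["opentsdb-emitter"]
druid.emitter=opentsdb
druid.emitter.opentsdb.host=opentsdb.example.com
druid.emitter.opentsdb.port=4242
```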
### Druid to OpenTSDB Event Converter
The OpenTSDB emitter will send only the desired metrics and dimensions which are defined in a JSON file.
If the user does not specify their own JSON file, a default file is used. All metrics are expected to be configured in the JSON file. Metrics which are not configured will be logged.
Desired metrics and dimensions are organized using the following schema: `<druid metric name> : [ <dimension list> ]`<br />
e.g.
```json
"query/time": [
"dataSource",
"type"
]
```
For most use-cases, the default configuration is sufficient.

View File

@ -0,0 +1,58 @@
---
id: redis-cache
title: "Druid Redis Cache"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-redis-cache` extension.
A cache implementation for Druid based on [Redis](https://github.com/antirez/redis).
Below are the configuration options known to this module.
Note that just adding these properties does not enable the cache. You still need to add the `druid.<process-type>.cache.useCache` and `druid.<process-type>.cache.populateCache` properties for the processes you want to enable the cache on as described in the [cache configuration docs](../../configuration/index.html#cache-configuration).
A possible configuration would be to keep the properties below in your `common.runtime.properties` file (present on all processes) and then add `druid.<process-type>.cache.useCache` and `druid.<process-type>.cache.populateCache` in the `runtime.properties` file of the process types you want to enable caching on.
## Configuration
|`common.runtime.properties`|Description|Default|Required|
|--------------------|-----------|-------|--------|
|`druid.cache.host`|Redis server host|None|yes|
|`druid.cache.port`|Redis server port|None|yes|
|`druid.cache.expiration`|Expiration (in milliseconds) for cache entries|24 * 3600 * 1000|no|
|`druid.cache.timeout`|Timeout (in milliseconds) for getting cache entries from Redis|2000|no|
|`druid.cache.maxTotalConnections`|Max total connections to Redis|8|no|
|`druid.cache.maxIdleConnections`|Max idle connections to Redis|8|no|
|`druid.cache.minIdleConnections`|Min idle connections to Redis|0|no|
## Enabling
To enable the redis cache, include this module on the loadList and set `druid.cache.type` to `redis` in your properties.
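As a minimal sketch (the host and port are placeholders, and the load list below shows only this extension; merge it with whatever you already load), the relevant `common.runtime.properties` entries could look like:

```properties
druid.extensions.loadList=["druid-redis-cache"]

# Redis connection used by the cache
druid.cache.type=redis
druid.cache.host=redis.example.com
druid.cache.port=6379
```

You would then add, for example, `druid.historical.cache.useCache=true` and `druid.historical.cache.populateCache=true` to the Historical `runtime.properties` to actually turn the cache on for that process type.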
## Metrics
In addition to the normal cache metrics, the redis cache implementation also reports the following in both `total` and `delta`
|Metric|Description|Normal value|
|------|-----------|------------|
|`query/cache/redis/*/requests`|Count of requests to the Redis cache|Each request to Redis increases the count by 1|
View File

@@ -0,0 +1,56 @@
---
id: sqlserver
title: "Microsoft SQLServer"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `sqlserver-metadata-storage` as an extension.
## Setting up SQLServer
1. Install Microsoft SQLServer
2. Create a druid database and user
Create the druid user
- Microsoft SQL Server Management Studio - Security - Logins - New Login...
- Create a druid user, enter `diurd` when prompted for the password.
Create a druid database owned by the user we just created
- Databases - New Database
- Database Name: druid, Owner: druid
3. Add the Microsoft JDBC library to the Druid classpath
- To ensure the com.microsoft.sqlserver.jdbc.SQLServerDriver class is loaded you will have to add the appropriate Microsoft JDBC library (sqljdbc*.jar) to the Druid classpath.
- For instance, if all jar files in your "druid/lib" directory are automatically added to your Druid classpath, then manually download the Microsoft JDBC drivers from (https://www.microsoft.com/en-ca/download/details.aspx?id=11774) and drop them into your druid/lib directory.
4. Configure your Druid metadata storage extension:
Add the following parameters to your Druid configuration, replacing `<host>`
with the location (host name and port) of the database.
```properties
druid.metadata.storage.type=sqlserver
druid.metadata.storage.connector.connectURI=jdbc:sqlserver://<host>;databaseName=druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=diurd
```
View File

@@ -0,0 +1,71 @@
---
id: statsd
title: "StatsD Emitter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `statsd-emitter` extension.
## Introduction
This extension emits Druid metrics to a StatsD server, such as [statsd](https://github.com/etsy/statsd) or [statsite](https://github.com/armon/statsite).
## Configuration
All the configuration parameters for the StatsD emitter are under `druid.emitter.statsd`.
|property|description|required?|default|
|--------|-----------|---------|-------|
|`druid.emitter.statsd.hostname`|The hostname of the StatsD server.|yes|none|
|`druid.emitter.statsd.port`|The port of the StatsD server.|yes|none|
|`druid.emitter.statsd.prefix`|Optional metric name prefix.|no|""|
|`druid.emitter.statsd.separator`|Metric name separator|no|.|
|`druid.emitter.statsd.includeHost`|Flag to include the hostname as part of the metric name.|no|false|
|`druid.emitter.statsd.dimensionMapPath`|JSON file defining the StatsD type, and desired dimensions for every Druid metric|no|Default mapping provided. See below.|
|`druid.emitter.statsd.blankHolder`|The blank character replacement as StatsD does not support path with blank character|no|"-"|
|`druid.emitter.statsd.dogstatsd`|Flag to enable [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/) support. Causes dimensions to be included as tags, not as a part of the metric name. `convertRange` fields will be ignored.|no|false|
|`druid.emitter.statsd.dogstatsdConstantTags`|If `druid.emitter.statsd.dogstatsd` is true, the tags in the JSON list of strings will be sent with every event.|no|[]|
|`druid.emitter.statsd.dogstatsdServiceAsTag`|If `druid.emitter.statsd.dogstatsd` and `druid.emitter.statsd.dogstatsdServiceAsTag` are true, druid service (e.g. `druid/broker`, `druid/coordinator`, etc) is reported as a tag (e.g. `druid_service:druid/broker`) instead of being included in metric name (e.g. `druid.broker.query.time`) and `druid` is used as metric prefix (e.g. `druid.query.time`).|no|false|
|`druid.emitter.statsd.dogstatsdEvents`|If `druid.emitter.statsd.dogstatsd` and `druid.emitter.statsd.dogstatsdEvents` are true, [Alert events](../../operations/alerts.html) are reported to DogStatsD.|no|false|
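For illustration, a hypothetical properties fragment enabling the emitter might look like this; the hostname and port are placeholders, and the load list below shows only this extension:

```properties
# Merge with your existing load list
druid.extensions.loadList=["statsd-emitter"]

druid.emitter=statsd
druid.emitter.statsd.hostname=statsd.example.com
druid.emitter.statsd.port=8125
druid.emitter.statsd.prefix=druid
druid.emitter.statsd.includeHost=true
```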
### Druid to StatsD Event Converter
Each metric sent to StatsD must specify a type, one of `[timer, counter, gauge]`. The StatsD emitter expects this mapping to
be provided as a JSON file. Additionally, this mapping specifies which dimensions should be included for each metric.
StatsD expects that metric values be integers. Druid emits some metrics with values between the range 0 and 1. To accommodate these metrics they are converted
into the range 0 to 100. This conversion can be enabled by setting the optional "convertRange" field true in the JSON mapping file.
If the user does not specify their own JSON file, a default mapping is used. All
metrics are expected to be mapped. Metrics which are not mapped will log an error.
StatsD metric path is organized using the following schema:
`<druid metric name> : { "dimensions" : <dimension list>, "type" : <StatsD type>, "convertRange" : true/false}`
e.g.
`"query/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer"}`
For metrics which are emitted from multiple services with different dimensions, the metric name is prefixed with
the service name.
e.g.
`"coordinator-segment/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
"historical-segment/count" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" }`
For most use-cases, the default mapping is sufficient.
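If you do build your own mapping file, it is a JSON file of entries following the schema above. A small hypothetical example (the metric selection, dimension lists, and types here are illustrative rather than the shipped defaults) could look like:

```json
{
  "query/time": { "dimensions": ["dataSource", "type"], "type": "timer" },
  "query/bytes": { "dimensions": ["dataSource", "type"], "type": "counter" },
  "segment/scan/pending": { "dimensions": [], "type": "gauge" }
}
```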
View File

@@ -0,0 +1,151 @@
---
id: tdigestsketch-quantiles
title: "T-Digest Quantiles Sketch module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This module provides Apache Druid approximate sketch aggregators based on T-Digest.
T-Digest (https://github.com/tdunning/t-digest) is a popular data structure for accurate on-line accumulation of
rank-based statistics such as quantiles and trimmed means.
The data structure is also designed for parallel programming use cases like distributed aggregations or map reduce jobs by making it easy and efficient to combine two intermediate t-digests.
The tDigestSketch aggregator is capable of generating sketches from raw numeric values as well as
aggregating/combining pre-generated T-Digest sketches generated using the tDigestSketch aggregator itself.
While one can generate sketches on the fly at query time, it is generally more performant
to generate sketches at ingestion time and then combine them at query time.
The module also provides a postAggregator, quantilesFromTDigestSketch, that can be used to compute approximate
quantiles from T-Digest sketches generated by the tDigestSketch aggregator.
To use this aggregator, make sure you [include](../../development/extensions.md#loading-extensions) the extension in your config file:
```
druid.extensions.loadList=["druid-tdigestsketch"]
```
### Aggregator
The result of the aggregation is a T-Digest sketch that is built ingesting numeric values from the raw data or from
combining pre-generated T-Digest sketches.
```json
{
"type" : "tDigestSketch",
"name" : <output_name>,
"fieldName" : <metric_name>,
"compression": <parameter that controls size and accuracy>
}
```
Example:
```json
{
"type": "tDigestSketch",
"name": "sketch",
"fieldName": "session_duration",
"compression": 200
}
```
```json
{
"type": "tDigestSketch",
"name": "combined_sketch",
"fieldName": <input-column>,
"compression": 200
}
```
|property|description|required?|
|--------|-----------|---------|
|type|This String should always be "tDigestSketch"|yes|
|name|A String for the output (result) name of the calculation.|yes|
|fieldName|A String for the name of the input field containing raw numeric values or pre-generated T-Digest sketches.|yes|
|compression|Parameter that determines the accuracy and size of the sketch. Higher compression means higher accuracy but more space to store sketches.|no, defaults to 100|
### Post Aggregators
#### Quantiles
This returns an array of quantiles corresponding to a given array of fractions.
```json
{
"type" : "quantilesFromTDigestSketch",
"name": <output name>,
"field" : <post aggregator that refers to a TDigestSketch (fieldAccess or another post aggregator)>,
"fractions" : <array of fractions>
}
```
|property|description|required?|
|--------|-----------|---------|
|type|This String should always be "quantilesFromTDigestSketch"|yes|
|name|A String for the output (result) name of the calculation.|yes|
|field|A field reference pointing to the field aggregated/combined T-Digest sketch.|yes|
|fractions|Non-empty array of fractions between 0 and 1|yes|
Example:
```json
{
"queryType": "groupBy",
"dataSource": "test_datasource",
"granularity": "ALL",
"dimensions": [],
"aggregations": [{
"type": "tDigestSketch",
"name": "merged_sketch",
"fieldName": "ingested_sketch",
"compression": 200
}],
"postAggregations": [{
"type": "quantilesFromTDigestSketch",
"name": "quantiles",
"fractions": [0, 0.5, 1],
"field": {
"type": "fieldAccess",
"fieldName": "merged_sketch"
}
}],
"intervals": ["2016-01-01T00:00:00.000Z/2016-01-31T00:00:00.000Z"]
}
```
#### Quantile

Similar to quantilesFromTDigestSketch, except that it takes a single fraction and computes a single quantile.
```json
{
"type" : "quantileFromTDigestSketch",
"name": <output name>,
"field" : <post aggregator that refers to a TDigestSketch (fieldAccess or another post aggregator)>,
"fraction" : <value>
}
```
|property|description|required?|
|--------|-----------|---------|
|type|This String should always be "quantileFromTDigestSketch"|yes|
|name|A String for the output (result) name of the calculation.|yes|
|field|A field reference pointing to the field aggregated/combined T-Digest sketch.|yes|
|fraction|Decimal value between 0 and 1|yes|
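For example, an illustrative post aggregator that computes the median from a hypothetical `merged_sketch` aggregator (as in the query example above) could be written as:

```json
{
  "type": "quantileFromTDigestSketch",
  "name": "median",
  "fraction": 0.5,
  "field": {
    "type": "fieldAccess",
    "fieldName": "merged_sketch"
  }
}
```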
View File

@@ -0,0 +1,87 @@
---
id: thrift
title: "Thrift"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-thrift-extensions`.
This extension enables Druid to ingest thrift compact data online (`ByteBuffer`) and offline (SequenceFile of type `<Writable, BytesWritable>` or LzoThriftBlock File).
If you want to use a different version of Thrift, change the dependency in the pom and compile it yourself.
## LZO Support
If you plan to read LZO-compressed Thrift files, you will need to download version 0.4.19 of the [hadoop-lzo JAR](https://mvnrepository.com/artifact/com.hadoop.gplcompression/hadoop-lzo/0.4.19) and place it in your `extensions/druid-thrift-extensions` directory.
## Thrift Parser
| Field | Type | Description | Required |
| ----------- | ----------- | ---------------------------------------- | -------- |
| type | String | This should say `thrift` | yes |
| parseSpec | JSON Object | Specifies the timestamp and dimensions of the data. Should be a JSON parseSpec. | yes |
| thriftJar | String | path of thrift jar, if not provided, it will try to find the thrift class in classpath. Thrift jar in batch ingestion should be uploaded to HDFS first and configure `jobProperties` with `"tmpjars":"/path/to/your/thrift.jar"` | no |
| thriftClass | String | classname of thrift | yes |
- Batch Ingestion example - `inputFormat` and `tmpjars` should be set.
This is for batch ingestion using the HadoopDruidIndexer. The `inputFormat` of `inputSpec` in `ioConfig` can be either `org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat` or `com.twitter.elephantbird.mapreduce.input.LzoThriftBlockInputFormat`. Be careful: when `LzoThriftBlockInputFormat` is used, the Thrift class must be provided twice.
```json
{
"type": "index_hadoop",
"spec": {
"dataSchema": {
"dataSource": "book",
"parser": {
"type": "thrift",
"jarPath": "book.jar",
"thriftClass": "org.apache.druid.data.input.thrift.Book",
"protocol": "compact",
"parseSpec": {
"format": "json",
...
}
},
"metricsSpec": [],
"granularitySpec": {}
},
"ioConfig": {
"type": "hadoop",
"inputSpec": {
"type": "static",
"inputFormat": "org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat",
// "inputFormat": "com.twitter.elephantbird.mapreduce.input.LzoThriftBlockInputFormat",
"paths": "/user/to/some/book.seq"
}
},
"tuningConfig": {
"type": "hadoop",
"jobProperties": {
"tmpjars":"/user/h_user_profile/du00/druid/test/book.jar",
// "elephantbird.class.for.MultiInputFormat" : "${YOUR_THRIFT_CLASS_NAME}"
}
}
}
}
```
View File

@@ -0,0 +1,104 @@
---
id: time-min-max
title: "Timestamp Min/Max aggregators"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-time-min-max`.
These aggregators enable more precise calculation of the min and max timestamps of events than the `__time` column, whose granularity is coarse (the same as the query granularity).
To use this feature, a "timeMin" or "timeMax" aggregator must be included at indexing time.
They can apply to any columns that can be converted to timestamp, which include Long, DateTime, Timestamp, and String types.
For example, suppose a data set consists of timestamp, dimension, and metric values like the following.
```
2015-07-28T01:00:00.000Z A 1
2015-07-28T02:00:00.000Z A 1
2015-07-28T03:00:00.000Z A 1
2015-07-28T04:00:00.000Z B 1
2015-07-28T05:00:00.000Z A 1
2015-07-28T06:00:00.000Z B 1
2015-07-29T01:00:00.000Z C 1
2015-07-29T02:00:00.000Z C 1
2015-07-29T03:00:00.000Z A 1
2015-07-29T04:00:00.000Z A 1
```
At ingestion time, the timeMin and timeMax aggregators can be included just like other aggregators.
```json
{
"type": "timeMin",
"name": "tmin",
"fieldName": "<field_name, typically column specified in timestamp spec>"
}
```
```json
{
"type": "timeMax",
"name": "tmax",
"fieldName": "<field_name, typically column specified in timestamp spec>"
}
```
`name` is the output name of the aggregator and can be any string. `fieldName` is typically the column specified in the timestamp spec, but can be any column that can be converted to a timestamp.
To query for results, the same "timeMin" and "timeMax" aggregators are used.
```json
{
"queryType": "groupBy",
"dataSource": "timeMinMax",
"granularity": "DAY",
"dimensions": ["product"],
"aggregations": [
{
"type": "count",
"name": "count"
},
{
"type": "timeMin",
"name": "<output_name of timeMin>",
"fieldName": "tmin"
},
{
"type": "timeMax",
"name": "<output_name of timeMax>",
"fieldName": "tmax"
}
],
"intervals": [
"2010-01-01T00:00:00.000Z/2020-01-01T00:00:00.000Z"
]
}
```
The result then contains the min and max timestamps, which are finer-grained than the query granularity.
```
2015-07-28T00:00:00.000Z A 4 2015-07-28T01:00:00.000Z 2015-07-28T05:00:00.000Z
2015-07-28T00:00:00.000Z B 2 2015-07-28T04:00:00.000Z 2015-07-28T06:00:00.000Z
2015-07-29T00:00:00.000Z A 2 2015-07-29T03:00:00.000Z 2015-07-29T04:00:00.000Z
2015-07-29T00:00:00.000Z C 2 2015-07-29T01:00:00.000Z 2015-07-29T02:00:00.000Z
```
View File

@@ -0,0 +1,320 @@
---
id: approximate-histograms
title: "Approximate Histogram aggregators"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-histogram` as an extension.
The `druid-histogram` extension provides an approximate histogram aggregator and a fixed buckets histogram aggregator.
<a name="approximate-histogram-aggregator"></a>
## Approximate Histogram aggregator (Deprecated)
> The Approximate Histogram aggregator is deprecated. Please use [DataSketches Quantiles](../extensions-core/datasketches-quantiles.md) instead which provides a superior distribution-independent algorithm with formal error guarantees.
This aggregator is based on
[http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf](http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf)
to compute approximate histograms, with the following modifications:
- some tradeoffs in accuracy were made in the interest of speed (see below)
- the sketch maintains the exact original data as long as the number of
  distinct data points is fewer than the resolution (number of centroids),
increasing accuracy when there are few data points, or when dealing with
discrete data points. You can find some of the details in [this post](https://metamarkets.com/2013/histograms/).
Approximate histogram sketches are still experimental for a reason, and you
should understand the limitations of the current implementation before using
them. The approximation is heavily data-dependent, which makes it difficult to
give good general guidelines, so you should experiment and see what parameters
work well for your data.
Here are a few things to note before using them:
- As indicated in the original paper, there are no formal error bounds on the
approximation. In practice, the approximation gets worse if the distribution
is skewed.
- The algorithm is order-dependent, so results can vary for the same query, due
to variations in the order in which results are merged.
- In general, the algorithm only works well if the incoming data is randomly
  distributed (i.e. if data points end up sorted in a column, the approximation
  will be very poor)
- We traded accuracy for aggregation speed, taking some shortcuts when adding
histograms together, which can lead to pathological cases if your data is
ordered in some way, or if your distribution has long tails. It should be
cheaper to increase the resolution of the sketch to get the accuracy you need.
That being said, those sketches can be useful to get a first order approximation
when averages are not good enough. Assuming most rows in your segment store
fewer data points than the resolution of histogram, you should be able to use
them for monitoring purposes and detect meaningful variations with a few
hundred centroids. To get good accuracy readings on 95th percentiles with
millions of rows of data, you may want to use several thousand centroids,
especially with long tails, since that's where the approximation will be worse.
### Creating approximate histogram sketches at ingestion time
To use this feature, an "approxHistogram" or "approxHistogramFold" aggregator must be included at
indexing time. The ingestion aggregator can only apply to numeric values. If you use "approxHistogram"
then any input rows missing the value will be considered to have a value of 0, while with "approxHistogramFold"
such rows will be ignored.
To query for results, an "approxHistogramFold" aggregator must be included in the
query.
```json
{
"type" : "approxHistogram or approxHistogramFold (at ingestion time), approxHistogramFold (at query time)",
"name" : <output_name>,
"fieldName" : <metric_name>,
"resolution" : <integer>,
"numBuckets" : <integer>,
"lowerLimit" : <float>,
"upperLimit" : <float>
}
```
|Property |Description |Default |
|-------------------------|------------------------------|----------------------------------|
|`resolution` |Number of centroids (data points) to store. The higher the resolution, the more accurate results are, but the slower the computation will be.|50|
|`numBuckets` |Number of output buckets for the resulting histogram. Bucket intervals are dynamic, based on the range of the underlying data. Use a post-aggregator to have finer control over the bucketing scheme|7|
|`lowerLimit`/`upperLimit`|Restrict the approximation to the given range. The values outside this range will be aggregated into two centroids. Counts of values outside this range are still maintained. |-INF/+INF|
|`finalizeAsBase64Binary` |If true, the finalized aggregator value will be a Base64-encoded byte array containing the serialized form of the histogram. If false, the finalized aggregator value will be a JSON representation of the histogram.|false|
## Fixed Buckets Histogram
The fixed buckets histogram aggregator builds a histogram on a numeric column, with evenly-sized buckets across a specified value range. Values outside of the range are handled based on a user-specified outlier handling mode.
This histogram supports the min/max/quantiles post-aggregators but does not support the bucketing post-aggregators.
### When to use
The accuracy/usefulness of the fixed buckets histogram is extremely data-dependent; it is provided to support special use cases where the user has a great deal of prior information about the data being aggregated and knows that a fixed buckets implementation is suitable.
For general histogram and quantile use cases, the [DataSketches Quantiles Sketch](../extensions-core/datasketches-quantiles.md) extension is recommended.
### Properties
|Property |Description |Default |
|-------------------------|------------------------------|----------------------------------|
|`type`|Type of the aggregator. Must be `fixedBucketsHistogram`.|No default, must be specified|
|`name`|Column name for the aggregator.|No default, must be specified|
|`fieldName`|Column name of the input to the aggregator.|No default, must be specified|
|`lowerLimit`|Lower limit of the histogram. |No default, must be specified|
|`upperLimit`|Upper limit of the histogram. |No default, must be specified|
|`numBuckets`|Number of buckets for the histogram. The range [lowerLimit, upperLimit] will be divided into `numBuckets` intervals of equal size.|10|
|`outlierHandlingMode`|Specifies how values outside of [lowerLimit, upperLimit] will be handled. Supported modes are "ignore", "overflow", and "clip". See [outlier handling modes](#outlier-handling-modes) for more details.|No default, must be specified|
|`finalizeAsBase64Binary`|If true, the finalized aggregator value will be a Base64-encoded byte array containing the [serialized form](#serialization-formats) of the histogram. If false, the finalized aggregator value will be a JSON representation of the histogram.|false|
An example aggregator spec is shown below:
```json
{
"type" : "fixedBucketsHistogram",
"name" : <output_name>,
"fieldName" : <metric_name>,
"numBuckets" : <integer>,
"lowerLimit" : <double>,
"upperLimit" : <double>,
"outlierHandlingMode": <mode>
}
```
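As a concrete, hypothetical illustration, the spec below buckets a `latencyMs` column into 20 equal-width buckets over the range [0, 1000] and tracks values outside that range as overflow:

```json
{
  "type": "fixedBucketsHistogram",
  "name": "latencyHistogram",
  "fieldName": "latencyMs",
  "numBuckets": 20,
  "lowerLimit": 0.0,
  "upperLimit": 1000.0,
  "outlierHandlingMode": "overflow"
}
```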
### Outlier handling modes
The outlier handling mode specifies what should be done with values outside of the histogram's range. There are three supported modes:
- `ignore`: Throw away outlier values.
- `overflow`: A count of outlier values will be tracked by the histogram, available in the `lowerOutlierCount` and `upperOutlierCount` fields.
- `clip`: Outlier values will be clipped to the `lowerLimit` or the `upperLimit` and included in the histogram.
If you don't care about outliers, `ignore` is the cheapest option performance-wise. There is currently no difference in storage size among the modes.
### Output fields
The histogram aggregator's output object has the following fields:
- `lowerLimit`: Lower limit of the histogram
- `upperLimit`: Upper limit of the histogram
- `numBuckets`: Number of histogram buckets
- `outlierHandlingMode`: Outlier handling mode
- `count`: Total number of values contained in the histogram, excluding outliers
- `lowerOutlierCount`: Count of outlier values below `lowerLimit`. Only used if the outlier mode is `overflow`.
- `upperOutlierCount`: Count of outlier values above `upperLimit`. Only used if the outlier mode is `overflow`.
- `missingValueCount`: Count of null values seen by the histogram.
- `max`: Max value seen by the histogram. This does not include outlier values.
- `min`: Min value seen by the histogram. This does not include outlier values.
- `histogram`: An array of longs with size `numBuckets`, containing the bucket counts
### Ingesting existing histograms
It is also possible to ingest existing fixed buckets histograms. The input must be a Base64 string encoding a byte array that contains a serialized histogram object. Both "full" and "sparse" formats can be used. Please see [Serialization formats](#serialization-formats) below for details.
### Serialization formats
#### Full serialization format
This format includes the full histogram bucket count array in the serialization format.
```
byte: serialization version, must be 0x01
byte: encoding mode, 0x01 for full
double: lowerLimit
double: upperLimit
int: numBuckets
byte: outlier handling mode (0x00 for `ignore`, 0x01 for `overflow`, and 0x02 for `clip`)
long: count, total number of values contained in the histogram, excluding outliers
long: lowerOutlierCount
long: upperOutlierCount
long: missingValueCount
double: max
double: min
array of longs: bucket counts for the histogram
```
#### Sparse serialization format
This format represents the histogram bucket counts as (bucketNum, count) pairs. This serialization format is used when less than half of the histogram's buckets have values.
```
byte: serialization version, must be 0x01
byte: encoding mode, 0x02 for sparse
double: lowerLimit
double: upperLimit
int: numBuckets
byte: outlier handling mode (0x00 for `ignore`, 0x01 for `overflow`, and 0x02 for `clip`)
long: count, total number of values contained in the histogram, excluding outliers
long: lowerOutlierCount
long: upperOutlierCount
long: missingValueCount
double: max
double: min
int: number of following (bucketNum, count) pairs
sequence of (int, long) pairs:
int: bucket number
  long: bucket count
```
### Combining histograms with different bucketing schemes
It is possible to combine two histograms with different bucketing schemes (lowerLimit, upperLimit, numBuckets) together.
The bucketing scheme of the "left hand" histogram will be preserved (i.e., when running a query, the bucketing schemes specified in the query's histogram aggregators will be preserved).
When merging, we assume that values are evenly distributed within the buckets of the "right hand" histogram.
When the right-hand histogram contains outliers (when using `overflow` mode), we assume that all of the outliers counted in the right-hand histogram will be outliers in the left-hand histogram as well.
For performance and accuracy reasons, we recommend avoiding aggregation of histograms with different bucketing schemes if possible.
### Null handling
If `druid.generic.useDefaultValueForNull` is false, null values will be tracked in the `missingValueCount` field of the histogram.
If `druid.generic.useDefaultValueForNull` is true, null values will be added to the histogram as the default 0.0 value.
## Histogram post-aggregators
Post-aggregators are used to transform opaque approximate histogram sketches
into bucketed histogram representations, as well as to compute various
distribution metrics such as quantiles, min, and max.
### Equal buckets post-aggregator
Computes a visual representation of the approximate histogram with a given number of equal-sized bins.
Bucket intervals are based on the range of the underlying data. This aggregator is not supported for the fixed buckets histogram.
```json
{
"type": "equalBuckets",
"name": "<output_name>",
"fieldName": "<aggregator_name>",
"numBuckets": <count>
}
```
### Buckets post-aggregator
Computes a visual representation given an initial breakpoint, offset, and a bucket size.
Bucket size determines the width of the binning interval.
Offset determines the value on which those interval bins align.
This aggregator is not supported for the fixed buckets histogram.
```json
{
"type": "buckets",
"name": "<output_name>",
"fieldName": "<aggregator_name>",
"bucketSize": <bucket_size>,
"offset": <offset>
}
```
### Custom buckets post-aggregator
Computes a visual representation of the approximate histogram with bins laid out according to the given breaks.
This aggregator is not supported for the fixed buckets histogram.
```json
{ "type" : "customBuckets", "name" : <output_name>, "fieldName" : <aggregator_name>,
"breaks" : [ <value>, <value>, ... ] }
```
### min post-aggregator
Returns the minimum value of the underlying approximate or fixed buckets histogram aggregator
```json
{ "type" : "min", "name" : <output_name>, "fieldName" : <aggregator_name> }
```
### max post-aggregator
Returns the maximum value of the underlying approximate or fixed buckets histogram aggregator
```json
{ "type" : "max", "name" : <output_name>, "fieldName" : <aggregator_name> }
```
#### quantile post-aggregator
Computes a single quantile based on the underlying approximate or fixed buckets histogram aggregator
```json
{ "type" : "quantile", "name" : <output_name>, "fieldName" : <aggregator_name>,
"probability" : <quantile> }
```
#### quantiles post-aggregator
Computes an array of quantiles based on the underlying approximate or fixed buckets histogram aggregator
```json
{ "type" : "quantiles", "name" : <output_name>, "fieldName" : <aggregator_name>,
"probabilities" : [ <quantile>, <quantile>, ... ] }
```
View File

@@ -0,0 +1,32 @@
---
id: avro
title: "Apache Avro"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
## Avro extension
This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides
two Avro Parsers for stream ingestion and Hadoop batch ingestion.
See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
for more details about how to use these in an ingestion spec.
Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` as an extension.
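For reference, the corresponding load-list entry is sketched below; merge it with whatever extensions you already load:

```properties
druid.extensions.loadList=["druid-avro-extensions"]
```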
View File

@@ -0,0 +1,43 @@
---
id: azure
title: "Microsoft Azure"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-azure-extensions` extension.
## Deep Storage
[Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional Druid configuration.
|Property|Description|Possible Values|Default|
|--------|-----------|---------------|-------|
|`druid.storage.type`|Type of deep storage; must be set to `azure`.|azure|Must be set.|
|`druid.azure.account`|Azure Storage account name.||Must be set.|
|`druid.azure.key`|Azure Storage account key.||Must be set.|
|`druid.azure.container`|Azure Storage container name.||Must be set.|
|`druid.azure.prefix`|A prefix string that will be prepended to the blob names for the segments published to Azure deep storage.||""|
|`druid.azure.protocol`|The protocol to use.|http or https|https|
|`druid.azure.maxTries`|Number of tries before canceling an Azure operation.||3|
|`druid.azure.maxListingLength`|Maximum number of input files matching a given prefix to retrieve at a time.||1024|
See [Azure Services](http://azure.microsoft.com/en-us/pricing/free-trial/) for more information.
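For example, a hypothetical deep storage block in `common.runtime.properties` using these settings could look like the following; the account, key, container, and prefix are placeholders:

```properties
druid.storage.type=azure
druid.azure.account=mystorageaccount
druid.azure.key=<your base64 account key>
druid.azure.container=druid-segments
druid.azure.prefix=prod/segments
```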
View File

@@ -0,0 +1,179 @@
---
id: bloom-filter
title: "Bloom Filter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid extension adds the ability to both construct bloom filters from query results, and filter query results by testing
against a bloom filter. Make sure to [include](../../development/extensions.md#loading-extensions) `druid-bloom-filter` as an
extension.
A Bloom filter is a probabilistic data structure for performing a set membership check. A bloom filter is a good candidate
to use with Druid for cases where an explicit filter is impossible, e.g. filtering a query against a set of millions of
values.
Following are some characteristics of Bloom filters:
- Bloom filters are highly space efficient when compared to using a HashSet.
- Because of the probabilistic nature of bloom filters, false positive results are possible (element was not actually
inserted into a bloom filter during construction, but `test()` says true)
- False negatives are not possible (if element is present then `test()` will never say false).
- The false positive probability of this implementation is currently fixed at 5%, but increasing the number of entries
that the filter can hold can decrease this false positive rate in exchange for overall size.
- Bloom filters are sensitive to the number of elements that will be inserted into them. The expected number of entries must be specified when the filter is created. If the number of insertions exceeds the specified initial number of entries, the false positive probability will increase accordingly.
This extension is currently based on `org.apache.hive.common.util.BloomKFilter` from `hive-storage-api`. Internally,
this implementation uses Murmur3 as the hash algorithm.
To construct a BloomKFilter externally with Java to use as a filter in a Druid query:
```java
import java.io.ByteArrayOutputStream;
import org.apache.commons.codec.binary.Base64;    // assumes Apache Commons Codec's Base64
import org.apache.hive.common.util.BloomKFilter;

// Build a filter sized for the expected number of entries, then add the values to match against.
BloomKFilter bloomFilter = new BloomKFilter(1500);
bloomFilter.addString("value 1");
bloomFilter.addString("value 2");
bloomFilter.addString("value 3");
// Serialize the filter and Base64-encode it so it can be embedded in a query.
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
BloomKFilter.serialize(byteArrayOutputStream, bloomFilter);
String base64Serialized = Base64.encodeBase64String(byteArrayOutputStream.toByteArray());
```
This string can then be used in the native or SQL Druid query.
## Filtering queries with a Bloom Filter
### JSON Specification of Bloom Filter
```json
{
"type" : "bloom",
"dimension" : <dimension_name>,
"bloomKFilter" : <serialized_bytes_for_BloomKFilter>,
"extractionFn" : <extraction_fn>
}
```
|Property |Description |required? |
|-------------------------|------------------------------|----------------------------------|
|`type` |Filter Type. Should always be `bloom`|yes|
|`dimension` |The dimension to filter over. | yes |
|`bloomKFilter` |Base64 encoded Binary representation of `org.apache.hive.common.util.BloomKFilter`| yes |
|`extractionFn`|[Extraction function](../../querying/dimensionspecs.html#extraction-functions) to apply to the dimension values |no|
### Serialized Format for BloomKFilter
Serialized BloomKFilter format:
- 1 byte for the number of hash functions.
- 1 big endian int (that is how `OutputStream` works) for the number of longs in the bitset
- big endian longs in the BloomKFilter bitset
Note: `org.apache.hive.common.util.BloomKFilter` provides a serialize method which can be used to serialize bloom filters to outputStream.
### Filtering SQL Queries
Bloom filters can be used in SQL `WHERE` clauses via the `bloom_filter_test` operator:
```sql
SELECT COUNT(*) FROM druid.foo WHERE bloom_filter_test(<expr>, '<serialized_bytes_for_BloomKFilter>')
```
### Expression and Virtual Column Support
The bloom filter extension also adds a bloom filter [Druid expression](../../misc/math-expr.md) which shares syntax
with the SQL operator.
```sql
bloom_filter_test(<expr>, '<serialized_bytes_for_BloomKFilter>')
```
## Bloom Filter Query Aggregator
Input for a `bloomKFilter` can also be created from a druid query with the `bloom` aggregator. Note that it is very
important to set a reasonable value for the `maxNumEntries` parameter, which is the maximum number of distinct entries
that the bloom filter can represent without increasing the false positive rate. It may be worth performing a query using
one of the unique count sketches to calculate the value for this parameter in order to build a bloom filter appropriate
for the query.
### JSON Specification of Bloom Filter Aggregator
```json
{
  "type": "bloom",
  "name": <output_field_name>,
  "maxNumEntries": <maximum_number_of_elements_for_BloomKFilter>,
  "field": <dimension_spec>
}
```
|Property |Description |required? |
|-------------------------|------------------------------|----------------------------------|
|`type` |Aggregator Type. Should always be `bloom`|yes|
|`name` |Output field name |yes|
|`field` |[DimensionSpec](../../querying/dimensionspecs.md) to add to `org.apache.hive.common.util.BloomKFilter` | yes |
|`maxNumEntries` |Maximum number of distinct values supported by `org.apache.hive.common.util.BloomKFilter`, default `1500`| no |
### Example
```json
{
"queryType": "timeseries",
"dataSource": "wikiticker",
"intervals": [ "2015-09-12T00:00:00.000/2015-09-13T00:00:00.000" ],
"granularity": "day",
"aggregations": [
{
"type": "bloom",
"name": "userBloom",
"maxNumEntries": 100000,
"field": {
"type":"default",
"dimension":"user",
"outputType": "STRING"
}
}
]
}
```
response
```json
[{"timestamp":"2015-09-12T00:00:00.000Z","result":{"userBloom":"BAAAJhAAAA..."}}]
```
These values can then be set in the filter specification described above.
Ordering results by a bloom filter aggregator, for example in a TopN query, will perform a comparatively expensive
linear scan _of the filter itself_ to count the number of set bits as a means of approximating how many items have been
added to the set. As such, ordering by an alternate aggregation is recommended if possible.
### SQL Bloom Filter Aggregator
Bloom filters can be computed in SQL expressions with the `bloom_filter` aggregator:
```sql
SELECT BLOOM_FILTER(<expression>, <max number of entries>) FROM druid.foo WHERE dim2 = 'abc'
```
This requires the setting `druid.sql.planner.serializeComplexValues` to be set to `true`. Bloom filter results in a SQL
response are serialized into a base64 string, which can then be used in subsequent queries as a filter.
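Putting the two SQL pieces together, one illustrative workflow is to build a filter in one query and paste its Base64 result into a later query as a literal; the datasource and column names below are hypothetical:

```sql
-- Build a filter of the user ids seen for dim2 = 'abc'
SELECT BLOOM_FILTER(user_id, 100000) FROM druid.foo WHERE dim2 = 'abc'

-- Later, test another datasource's rows against the returned Base64 string
SELECT COUNT(*) FROM druid.other_table
WHERE bloom_filter_test(user_id, '<Base64 string returned by the first query>')
```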
View File

@@ -0,0 +1,39 @@
---
id: datasketches-extension
title: "DataSketches extension"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
Apache Druid aggregators based on [Apache DataSketches](https://datasketches.apache.org/) library. Sketches are data structures implementing approximate streaming mergeable algorithms. Sketches can be ingested from the outside of Druid or built from raw data at ingestion time. Sketches can be stored in Druid segments as additive metrics.
To use the datasketches aggregators, make sure you [include](../../development/extensions.md#loading-extensions) the extension in your config file:
```
druid.extensions.loadList=["druid-datasketches"]
```
The following modules are available:
* [Theta sketch](datasketches-theta.html) - approximate distinct counting with set operations (union, intersection and set difference).
* [Tuple sketch](datasketches-tuple.html) - extension of Theta sketch to support values associated with distinct keys (arrays of numeric values in this specialized implementation).
* [Quantiles sketch](datasketches-quantiles.html) - approximate distribution of comparable values to obtain ranks, quantiles and histograms. This is a specialized implementation for numeric values.
* [HLL sketch](datasketches-hll.html) - approximate distinct counting using very compact HLL sketch.
View File

@@ -0,0 +1,121 @@
---
id: datasketches-hll
title: "DataSketches HLL Sketch module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This module provides Apache Druid aggregators for distinct counting based on HLL sketch from [Apache DataSketches](https://datasketches.apache.org/) library. At ingestion time, this aggregator creates the HLL sketch objects to be stored in Druid segments. At query time, sketches are read and merged together. In the end, by default, you receive the estimate of the number of distinct values presented to the sketch. Also, you can use post aggregator to produce a union of sketch columns in the same row.
You can use the HLL sketch aggregator on any column containing identifiers; it will return the estimated cardinality of that column.
To use this aggregator, make sure you [include](../../development/extensions.md#loading-extensions) the extension in your config file:
```
druid.extensions.loadList=["druid-datasketches"]
```
### Aggregators
```
{
"type" : "HLLSketchBuild",
"name" : <output name>,
"fieldName" : <metric name>,
"lgK" : <size and accuracy parameter>,
"tgtHllType" : <target HLL type>,
"round": <false | true>
}
```
```
{
"type" : "HLLSketchMerge",
"name" : <output name>,
"fieldName" : <metric name>,
"lgK" : <size and accuracy parameter>,
"tgtHllType" : <target HLL type>,
"round": <false | true>
}
```
|property|description|required?|
|--------|-----------|---------|
|type|This String should be "HLLSketchBuild" or "HLLSketchMerge"|yes|
|name|A String for the output (result) name of the calculation.|yes|
|fieldName|A String for the name of the input field.|yes|
|lgK|log2 of K that is the number of buckets in the sketch, parameter that controls the size and the accuracy. Must be a power of 2 from 4 to 21 inclusively.|no, defaults to 12|
|tgtHllType|The type of the target HLL sketch. Must be "HLL&lowbar;4", "HLL&lowbar;6" or "HLL&lowbar;8" |no, defaults to "HLL&lowbar;4"|
|round|Round off values to whole numbers. Only affects query-time behavior and is ignored at ingestion-time.|no, defaults to false|
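As a concrete, hypothetical example, an ingestion-time aggregator building an HLL sketch over a `user_id` column, with the default parameters spelled out, might be:

```json
{
  "type": "HLLSketchBuild",
  "name": "user_id_hll",
  "fieldName": "user_id",
  "lgK": 12,
  "tgtHllType": "HLL_4"
}
```

At query time, an `HLLSketchMerge` aggregator over the resulting column would typically be used to merge the pre-built sketches.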
### Post Aggregators
#### Estimate
Returns the distinct count estimate as a double.
```
{
"type" : "HLLSketchEstimate",
"name": <output name>,
"field" : <post aggregator that returns an HLL Sketch>,
"round" : <if true, round the estimate. Default is false>
}
```
#### Estimate with bounds
Returns a distinct count estimate and error bounds from an HLL sketch.
The result will be an array containing three double values: estimate, lower bound and upper bound.
The bounds are provided at a given number of standard deviations (optional, defaults to 1).
This must be an integer value of 1, 2 or 3 corresponding to approximately 68.3%, 95.4% and 99.7% confidence intervals.
```
{
"type" : "HLLSketchEstimateWithBounds",
"name": <output name>,
"field" : <post aggregator that returns an HLL Sketch>,
"numStdDev" : <number of standard deviations: 1 (default), 2 or 3>
}
```
#### Union
```
{
"type" : "HLLSketchUnion",
"name": <output name>,
"fields" : <array of post aggregators that return HLL sketches>,
"lgK": <log2 of K for the target sketch>,
"tgtHllType" : <target HLL type>
}
```
#### Sketch to string
Human-readable sketch summary for debugging.
```
{
"type" : "HLLSketchToString",
"name": <output name>,
"field" : <post aggregator that returns an HLL Sketch>
}
```
View File

@@ -0,0 +1,137 @@
---
id: datasketches-quantiles
title: "DataSketches Quantiles Sketch module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This module provides Apache Druid aggregators based on numeric quantiles DoublesSketch from [Apache DataSketches](https://datasketches.apache.org/) library. Quantiles sketch is a mergeable streaming algorithm to estimate the distribution of values, and approximately answer queries about the rank of a value, probability mass function of the distribution (PMF) or histogram, cumulative distribution function (CDF), and quantiles (median, min, max, 95th percentile and such). See [Quantiles Sketch Overview](https://datasketches.apache.org/docs/Quantiles/QuantilesOverview.html).
There are three major modes of operation:
1. Ingesting sketches built outside of Druid (say, with Pig or Hive)
2. Building sketches from raw data during ingestion
3. Building sketches from raw data at query time
To use this aggregator, make sure you [include](../../development/extensions.md#loading-extensions) the extension in your config file:
```
druid.extensions.loadList=["druid-datasketches"]
```
### Aggregator
The result of the aggregation is a DoublesSketch that is the union of all sketches either built from raw data or read from the segments.
```json
{
"type" : "quantilesDoublesSketch",
"name" : <output_name>,
"fieldName" : <metric_name>,
"k": <parameter that controls size and accuracy>
}
```
|property|description|required?|
|--------|-----------|---------|
|type|This String should always be "quantilesDoublesSketch"|yes|
|name|A String for the output (result) name of the calculation.|yes|
|fieldName|A String for the name of the input field (can contain sketches or raw numeric values).|yes|
|k|Parameter that determines the accuracy and size of the sketch. Higher k means higher accuracy but more space to store sketches. Must be a power of 2 from 2 to 32768. See the [Quantiles Accuracy](https://datasketches.apache.org/docs/Quantiles/QuantilesAccuracy.html) for details. |no, defaults to 128|
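For example, a hypothetical ingestion-time aggregator building a sketch over a `session_duration` column with a larger-than-default `k` could look like:

```json
{
  "type": "quantilesDoublesSketch",
  "name": "duration_sketch",
  "fieldName": "session_duration",
  "k": 256
}
```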
### Post Aggregators
#### Quantile
This returns an approximation to the value that would be preceded by a given fraction of a hypothetical sorted version of the input stream.
```json
{
"type" : "quantilesDoublesSketchToQuantile",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>,
"fraction" : <fractional position in the hypothetical sorted stream, number from 0 to 1 inclusive>
}
```
#### Quantiles
This returns an array of quantiles corresponding to a given array of fractions
```json
{
"type" : "quantilesDoublesSketchToQuantiles",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>,
"fractions" : <array of fractional positions in the hypothetical sorted stream, number from 0 to 1 inclusive>
}
```
#### Histogram
This returns an approximation to the histogram given an array of split points that define the histogram bins or a number of bins (not both). An array of <i>m</i> unique, monotonically increasing split points divide the real number line into <i>m+1</i> consecutive disjoint intervals. The definition of an interval is inclusive of the left split point and exclusive of the right split point. If the number of bins is specified instead of split points, the interval between the minimum and maximum values is divided into the given number of equally-spaced bins.
```json
{
"type" : "quantilesDoublesSketchToHistogram",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>,
"splitPoints" : <array of split points (optional)>,
"numBins" : <number of bins (optional, defaults to 10)>
}
```
#### Rank
This returns an approximation to the rank of a given value that is the fraction of the distribution less than that value.
```json
{
"type" : "quantilesDoublesSketchToRank",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>,
"value" : <value>
}
```
#### CDF
This returns an approximation to the Cumulative Distribution Function given an array of split points that define the edges of the bins. An array of <i>m</i> unique, monotonically increasing split points divide the real number line into <i>m+1</i> consecutive disjoint intervals. The definition of an interval is inclusive of the left split point and exclusive of the right split point. The resulting array of fractions can be viewed as ranks of each split point with one additional rank that is always 1.
```json
{
"type" : "quantilesDoublesSketchToCDF",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>,
"splitPoints" : <array of split points>
}
```
#### Sketch Summary
This returns a summary of the sketch that can be used for debugging. This is the result of calling toString() method.
```json
{
"type" : "quantilesDoublesSketchToString",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>
}
```
View File

@@ -0,0 +1,284 @@
---
id: datasketches-theta
title: "DataSketches Theta Sketch module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This module provides Apache Druid aggregators based on Theta sketch from [Apache DataSketches](https://datasketches.apache.org/) library. Note that sketch algorithms are approximate; see details in the "Accuracy" section of the datasketches doc.
At ingestion time, this aggregator creates the Theta sketch objects which get stored in Druid segments. Logically speaking, a Theta sketch object can be thought of as a Set data structure. At query time, sketches are read and aggregated (set unioned) together. In the end, by default, you receive the estimate of the number of unique entries in the sketch object. Also, you can use post aggregators to do union, intersection or difference on sketch columns in the same row.
Note that you can use the `thetaSketch` aggregator on columns which were not ingested using it; it will return the estimated cardinality of the column. It is recommended to use it at ingestion time as well to make querying faster.
To use this aggregator, make sure you [include](../../development/extensions.md#loading-extensions) the extension in your config file:
```
druid.extensions.loadList=["druid-datasketches"]
```
### Aggregators
```json
{
"type" : "thetaSketch",
"name" : <output_name>,
"fieldName" : <metric_name>,
"isInputThetaSketch": false,
"size": 16384
}
```
|property|description|required?|
|--------|-----------|---------|
|type|This String should always be "thetaSketch"|yes|
|name|A String for the output (result) name of the calculation.|yes|
|fieldName|A String for the name of the aggregator used at ingestion time.|yes|
|isInputThetaSketch|This should only be used at indexing time if your input data contains theta sketch objects. This would be the case if you use datasketches library outside of Druid, say with Pig/Hive, to produce the data that you are ingesting into Druid |no, defaults to false|
|size|Must be a power of 2. Internally, size refers to the maximum number of entries the sketch object will retain. A higher size means higher accuracy but more space to store sketches. Note that after you index with a particular size, Druid will persist the sketch in segments and you should use a size greater than or equal to that at query time. See the [DataSketches site](https://datasketches.apache.org/docs/Theta/ThetaSize.html) for details. In general, we recommend just sticking to the default size.|no, defaults to 16384|
### Post Aggregators
#### Sketch Estimator
```json
{
"type" : "thetaSketchEstimate",
"name": <output name>,
"field" : <post aggregator of type fieldAccess that refers to a thetaSketch aggregator or that of type thetaSketchSetOp>
}
```
#### Sketch Operations
```json
{
"type" : "thetaSketchSetOp",
"name": <output name>,
"func": <UNION|INTERSECT|NOT>,
"fields" : <array of fieldAccess type post aggregators to access the thetaSketch aggregators or thetaSketchSetOp type post aggregators to allow arbitrary combination of set operations>,
"size": <16384 by default, must be max of size from sketches in fields input>
}
```
#### Sketch Summary
This returns a summary of the sketch that can be used for debugging. This is the result of calling the toString() method.
```json
{
"type" : "thetaSketchToString",
"name": <output name>,
"field" : <post aggregator that refers to a Theta sketch (fieldAccess or another post aggregator)>
}
```
### Examples
Assuming you have a dataset containing (timestamp, product, user_id), you want to answer questions like:
How many unique users visited product A?
How many unique users visited both product A and product B?
To answer the above questions, you would index your data using the following aggregator.
```json
{ "type": "thetaSketch", "name": "user_id_sketch", "fieldName": "user_id" }
```
Then, a sample query for "How many unique users visited product A?" is:
```json
{
"queryType": "groupBy",
"dataSource": "test_datasource",
"granularity": "ALL",
"dimensions": [],
"aggregations": [
{ "type": "thetaSketch", "name": "unique_users", "fieldName": "user_id_sketch" }
],
"filter": { "type": "selector", "dimension": "product", "value": "A" },
"intervals": [ "2014-10-19T00:00:00.000Z/2014-10-22T00:00:00.000Z" ]
}
```
A sample query for "How many unique users visited both product A and B?" is:
```json
{
"queryType": "groupBy",
"dataSource": "test_datasource",
"granularity": "ALL",
"dimensions": [],
"filter": {
"type": "or",
"fields": [
{"type": "selector", "dimension": "product", "value": "A"},
{"type": "selector", "dimension": "product", "value": "B"}
]
},
"aggregations": [
{
"type" : "filtered",
"filter" : {
"type" : "selector",
"dimension" : "product",
"value" : "A"
},
"aggregator" : {
"type": "thetaSketch", "name": "A_unique_users", "fieldName": "user_id_sketch"
}
},
{
"type" : "filtered",
"filter" : {
"type" : "selector",
"dimension" : "product",
"value" : "B"
},
"aggregator" : {
"type": "thetaSketch", "name": "B_unique_users", "fieldName": "user_id_sketch"
}
}
],
"postAggregations": [
{
"type": "thetaSketchEstimate",
"name": "final_unique_users",
"field":
{
"type": "thetaSketchSetOp",
"name": "final_unique_users_sketch",
"func": "INTERSECT",
"fields": [
{
"type": "fieldAccess",
"fieldName": "A_unique_users"
},
{
"type": "fieldAccess",
"fieldName": "B_unique_users"
}
]
}
}
],
"intervals": [
"2014-10-19T00:00:00.000Z/2014-10-22T00:00:00.000Z"
]
}
```
#### Retention Analysis Example
Suppose you want to answer a question like, "How many unique users performed a specific action in a particular time period and also performed another specific action in a different time period?"
e.g., "How many unique users signed up in week 1, and purchased something in week 2?"
Using the `(timestamp, product, user_id)` example dataset, data would be indexed with the following aggregator, like in the example above:
```json
{ "type": "thetaSketch", "name": "user_id_sketch", "fieldName": "user_id" }
```
The following query expresses:
"Out of the unique users who visited Product A between 10/01/2014 and 10/07/2014, how many visited Product A again in the week of 10/08/2014 to 10/14/2014?"
```json
{
"queryType": "groupBy",
"dataSource": "test_datasource",
"granularity": "ALL",
"dimensions": [],
"filter": {
"type": "or",
"fields": [
{"type": "selector", "dimension": "product", "value": "A"}
]
},
"aggregations": [
{
"type" : "filtered",
"filter" : {
"type" : "and",
"fields" : [
{
"type" : "selector",
"dimension" : "product",
"value" : "A"
},
{
"type" : "interval",
"dimension" : "__time",
"intervals" : ["2014-10-01T00:00:00.000Z/2014-10-07T00:00:00.000Z"]
}
]
},
"aggregator" : {
"type": "thetaSketch", "name": "A_unique_users_week_1", "fieldName": "user_id_sketch"
}
},
{
"type" : "filtered",
"filter" : {
"type" : "and",
"fields" : [
{
"type" : "selector",
"dimension" : "product",
"value" : "A"
},
{
"type" : "interval",
"dimension" : "__time",
"intervals" : ["2014-10-08T00:00:00.000Z/2014-10-14T00:00:00.000Z"]
}
]
},
"aggregator" : {
"type": "thetaSketch", "name": "A_unique_users_week_2", "fieldName": "user_id_sketch"
}
    }
],
"postAggregations": [
{
"type": "thetaSketchEstimate",
"name": "final_unique_users",
"field":
{
"type": "thetaSketchSetOp",
"name": "final_unique_users_sketch",
"func": "INTERSECT",
"fields": [
{
"type": "fieldAccess",
"fieldName": "A_unique_users_week_1"
},
{
"type": "fieldAccess",
"fieldName": "A_unique_users_week_2"
}
]
}
}
],
"intervals": ["2014-10-01T00:00:00.000Z/2014-10-14T00:00:00.000Z"]
}
```
View File
@@ -0,0 +1,174 @@
---
id: datasketches-tuple
title: "DataSketches Tuple Sketch module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This module provides Apache Druid aggregators based on Tuple sketch from [Apache DataSketches](https://datasketches.apache.org/) library. ArrayOfDoublesSketch sketches extend the functionality of the count-distinct Theta sketches by adding arrays of double values associated with unique keys.
To use this aggregator, make sure you [include](../../development/extensions.md#loading-extensions) the extension in your config file:
```
druid.extensions.loadList=["druid-datasketches"]
```
### Aggregators
```json
{
"type" : "arrayOfDoublesSketch",
"name" : <output_name>,
"fieldName" : <metric_name>,
"nominalEntries": <number>,
"numberOfValues" : <number>,
"metricColumns" : <array of strings>
}
```
|property|description|required?|
|--------|-----------|---------|
|type|This String should always be "arrayOfDoublesSketch"|yes|
|name|A String for the output (result) name of the calculation.|yes|
|fieldName|A String for the name of the input field.|yes|
|nominalEntries|Parameter that determines the accuracy and size of the sketch. Higher k means higher accuracy but more space to store sketches. Must be a power of 2. See the [Theta sketch accuracy](https://datasketches.apache.org/docs/Theta/ThetaErrorTable.html) for details. |no, defaults to 16384|
|numberOfValues|Number of values associated with each distinct key. |no, defaults to 1|
|metricColumns|If building sketches from raw data, an array of names of the input columns containing numeric values to be associated with each distinct key.|no, defaults to empty array|
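For example, to build sketches from raw data at ingestion time, keyed on a hypothetical `user_id` column with a `revenue` value associated with each distinct key (the column names are illustrative, not part of the module), an aggregator along these lines could be used:
```json
{
  "type" : "arrayOfDoublesSketch",
  "name" : "user_revenue_sketch",
  "fieldName" : "user_id",
  "nominalEntries" : 16384,
  "numberOfValues" : 1,
  "metricColumns" : ["revenue"]
}
```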
### Post Aggregators
#### Estimate of the number of distinct keys
Returns a distinct count estimate from a given ArrayOfDoublesSketch.
```json
{
"type" : "arrayOfDoublesSketchToEstimate",
"name": <output name>,
"field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>
}
```
#### Estimate of the number of distinct keys with error bounds
Returns a distinct count estimate and error bounds from a given ArrayOfDoublesSketch. The result will be three double values: estimate of the number of distinct keys, lower bound and upper bound. The bounds are provided at the given number of standard deviations (optional, defaults to 1). This must be an integer value of 1, 2 or 3 corresponding to approximately 68.3%, 95.4% and 99.7% confidence intervals.
```json
{
"type" : "arrayOfDoublesSketchToEstimateAndBounds",
"name": <output name>,
"field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>,
"numStdDevs", <number from 1 to 3>
}
```
#### Number of retained entries
Returns the number of retained entries from a given ArrayOfDoublesSketch.
```json
{
"type" : "arrayOfDoublesSketchToNumEntries",
"name": <output name>,
"field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>
}
```
#### Mean values for each column
Returns a list of mean values from a given ArrayOfDoublesSketch. The result will be N double values, where N is the number of double values kept in the sketch per key.
```json
{
"type" : "arrayOfDoublesSketchToMeans",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>
}
```
#### Variance values for each column
Returns a list of variance values from a given ArrayOfDoublesSketch. The result will be N double values, where N is the number of double values kept in the sketch per key.
```json
{
"type" : "arrayOfDoublesSketchToVariances",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>
}
```
#### Quantiles sketch from a column
Returns a quantiles DoublesSketch constructed from a given column of values from a given ArrayOfDoublesSketch, using the optional parameter k that determines the accuracy and size of the quantiles sketch. See the [Quantiles Sketch Module](datasketches-quantiles.html).
* The column number is 1-based and is optional (the default is 1).
* The parameter k is optional (the default is defined in the sketch library).
* The result is a quantiles sketch.
```json
{
"type" : "arrayOfDoublesSketchToQuantilesSketch",
"name": <output name>,
"field" : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>,
"column" : <number>,
"k" : <parameter that determines the accuracy and size of the quantiles sketch>
}
```
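As an illustration of how this composes with the Quantiles Sketch Module, the following sketch of a post aggregator chain extracts column 1 of a hypothetical `user_revenue_sketch` aggregation into a quantiles sketch and then estimates its median via the `quantilesDoublesSketchToQuantile` post aggregator (the output names are illustrative):
```json
{
  "type" : "quantilesDoublesSketchToQuantile",
  "name" : "median_revenue",
  "fraction" : 0.5,
  "field" : {
    "type" : "arrayOfDoublesSketchToQuantilesSketch",
    "name" : "revenue_quantiles",
    "field" : { "type" : "fieldAccess", "fieldName" : "user_revenue_sketch" },
    "column" : 1
  }
}
```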
#### Set Operations
Returns a result of a specified set operation on the given array of sketches. Supported operations are: union, intersection and set difference (UNION, INTERSECT, NOT).
```json
{
"type" : "arrayOfDoublesSketchSetOp",
"name": <output name>,
"operation": <"UNION"|"INTERSECT"|"NOT">,
"fields" : <array of post aggregators to access sketch aggregators or post aggregators to allow arbitrary combination of set operations>,
"nominalEntries" : <parameter that determines the accuracy and size of the sketch>,
"numberOfValues" : <number of values associated with each distinct key>
}
```
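For example, an estimate of the number of distinct keys present in both of two sketch aggregations (the names `A_sketch` and `B_sketch` are hypothetical) can be obtained by nesting a set operation inside the estimate post aggregator:
```json
{
  "type" : "arrayOfDoublesSketchToEstimate",
  "name" : "a_and_b_estimate",
  "field" : {
    "type" : "arrayOfDoublesSketchSetOp",
    "name" : "a_and_b_sketch",
    "operation" : "INTERSECT",
    "fields" : [
      { "type" : "fieldAccess", "fieldName" : "A_sketch" },
      { "type" : "fieldAccess", "fieldName" : "B_sketch" }
    ]
  }
}
```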
#### Student's t-test
Performs Student's t-test and returns a list of p-values given two instances of ArrayOfDoublesSketch. The result will be N double values, where N is the number of double values kept in the sketch per key. See [t-test documentation](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/inference/TTest.html).
```json
{
"type" : "arrayOfDoublesSketchTTest",
"name": <output name>,
"fields" : <array with two post aggregators to access sketch aggregators or post aggregators referring to an ArrayOfDoublesSketch>,
}
```
#### Sketch summary
Returns a human-readable summary of a given ArrayOfDoublesSketch. This is a string returned by the toString() method of the sketch. This can be useful for debugging.
```json
{
"type" : "arrayOfDoublesSketchToString",
"name": <output name>,
"field" : <post aggregator that refers to an ArrayOfDoublesSketch (fieldAccess or another post aggregator)>
}
```
View File
@@ -0,0 +1,544 @@
---
id: druid-basic-security
title: "Basic Security"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid extension adds:
- an Authenticator which supports [HTTP Basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication) using the Druid metadata store or LDAP as its credentials store
- an Authorizer which implements basic role-based access control for Druid metadata store or LDAP users and groups
Make sure to [include](../../development/extensions.md#loading-extensions) `druid-basic-security` as an extension.
Please see [Authentication and Authorization](../../design/auth.md) for more information on the extension interfaces being implemented.
## Configuration
The examples in this section will use "MyBasicMetadataAuthenticator", "MyBasicLDAPAuthenticator", "MyBasicMetadataAuthorizer", and "MyBasicLDAPAuthorizer" as names for the Authenticators and Authorizers.
These properties are not tied to specific Authenticator or Authorizer instances.
These configuration properties should be added to the common runtime properties file.
### Properties
|Property|Description|Default|required|
|--------|-----------|-------|--------|
|`druid.auth.basic.common.pollingPeriod`|Defines in milliseconds how often processes should poll the Coordinator for the current Druid metadata store authenticator/authorizer state.|60000|No|
|`druid.auth.basic.common.maxRandomDelay`|Defines in milliseconds the amount of random delay to add to the pollingPeriod, to spread polling requests across time.|6000|No|
|`druid.auth.basic.common.maxSyncRetries`|Determines how many times a service will retry if the authentication/authorization Druid metadata store state sync with the Coordinator fails.|10|No|
|`druid.auth.basic.common.cacheDirectory`|If defined, snapshots of the basic Authenticator and Authorizer Druid metadata store caches will be stored on disk in this directory. If this property is defined, when a service is starting, it will attempt to initialize its caches from these on-disk snapshots, if the service is unable to initialize its state by communicating with the Coordinator.|null|No|
### Creating an Authenticator that uses the Druid metadata store to look up and validate credentials
```
druid.auth.authenticatorChain=["MyBasicMetadataAuthenticator"]
druid.auth.authenticator.MyBasicMetadataAuthenticator.type=basic
druid.auth.authenticator.MyBasicMetadataAuthenticator.initialAdminPassword=password1
druid.auth.authenticator.MyBasicMetadataAuthenticator.initialInternalClientPassword=password2
druid.auth.authenticator.MyBasicMetadataAuthenticator.credentialsValidator.type=metadata
druid.auth.authenticator.MyBasicMetadataAuthenticator.skipOnFailure=false
druid.auth.authenticator.MyBasicMetadataAuthenticator.authorizerName=MyBasicMetadataAuthorizer
```
To use the Basic authenticator, add an authenticator with type `basic` to the authenticatorChain.
The authenticator needs to also define a credentialsValidator with type 'metadata' or 'ldap'.
If credentialsValidator is not specified, type 'metadata' will be used as default.
Configuration of the named authenticator is assigned through properties with the form:
```
druid.auth.authenticator.<authenticatorName>.<authenticatorProperty>
```
The authenticator configuration examples in the rest of this document will use "MyBasicMetadataAuthenticator" or "MyBasicLDAPAuthenticator" as the name of the authenticators being configured.
#### Properties for Druid metadata store user authentication
|Property|Description|Default|required|
|--------|-----------|-------|--------|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.initialAdminPassword`|Initial [Password Provider](../../operations/password-provider.md) for the automatically created default admin user. If no password is specified, the default admin user will not be created. If the default admin user already exists, setting this property will not affect its password.|null|No|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.initialInternalClientPassword`|Initial [Password Provider](../../operations/password-provider.md) for the default internal system user, used for internal process communication. If no password is specified, the default internal system user will not be created. If the default internal system user already exists, setting this property will not affect its password.|null|No|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.enableCacheNotifications`|If true, the Coordinator will notify Druid processes whenever a configuration change to this Authenticator occurs, allowing them to immediately update their state without waiting for polling.|true|No|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.cacheNotificationTimeout`|The timeout in milliseconds for the cache notifications.|5000|No|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.credentialIterations`|Number of iterations to use for password hashing.|10000|No|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.credentialsValidator.type`|The type of credentials store (metadata) used to validate request credentials.|metadata|No|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.skipOnFailure`|If true and the request credential doesn't exist or isn't fully configured in the credentials store, the request will proceed to the next Authenticator in the chain.|false|No|
|`druid.auth.authenticator.MyBasicMetadataAuthenticator.authorizerName`|Authorizer that requests should be directed to|N/A|Yes|
#### Properties for LDAP user authentication
|Property|Description|Default|required|
|--------|-----------|-------|--------|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.initialAdminPassword`|Initial [Password Provider](../../operations/password-provider.md) for the automatically created default admin user. If no password is specified, the default admin user will not be created. If the default admin user already exists, setting this property will not affect its password.|null|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.initialInternalClientPassword`|Initial [Password Provider](../../operations/password-provider.md) for the default internal system user, used for internal process communication. If no password is specified, the default internal system user will not be created. If the default internal system user already exists, setting this property will not affect its password.|null|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.enableCacheNotifications`|If true, the Coordinator will notify Druid processes whenever a configuration change to this Authenticator occurs, allowing them to immediately update their state without waiting for polling.|true|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.cacheNotificationTimeout`|The timeout in milliseconds for the cache notifications.|5000|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialIterations`|Number of iterations to use for password hashing.|10000|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.type`|The type of credentials store (ldap) used to validate request credentials.|metadata|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.url`|URL of the LDAP server.|null|Yes|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.bindUser`|LDAP bind user username.|null|Yes|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.bindPassword`|[Password Provider](../../operations/password-provider.md) LDAP bind user password.|null|Yes|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.baseDn`|The point from where the LDAP server will search for users.|null|Yes|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.userSearch`|The filter/expression to use for the search. For example, (&(sAMAccountName=%s)(objectClass=user))|null|Yes|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.userAttribute`|The attribute id identifying the attribute that will be returned as part of the search. For example, sAMAccountName. |null|Yes|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.credentialVerifyDuration`|The duration in seconds for how long valid credentials are verifiable within the cache when not requested.|600|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.credentialMaxDuration`|The max duration in seconds for valid credentials that can reside in cache regardless of how often they are requested.|3600|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.credentialsValidator.credentialCacheSize`|The valid credentials cache size. The cache uses a LRU policy.|100|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.skipOnFailure`|If true and the request credential doesn't exist or isn't fully configured in the credentials store, the request will proceed to the next Authenticator in the chain.|false|No|
|`druid.auth.authenticator.MyBasicLDAPAuthenticator.authorizerName`|Authorizer that requests should be directed to.|N/A|Yes|
### Creating an Escalator
```
# Escalator
druid.escalator.type=basic
druid.escalator.internalClientUsername=druid_system
druid.escalator.internalClientPassword=password2
druid.escalator.authorizerName=MyBasicMetadataAuthorizer
```
#### Properties
|Property|Description|Default|required|
|--------|-----------|-------|--------|
|`druid.escalator.internalClientUsername`|The escalator will use this username for requests made as the internal system user.|n/a|Yes|
|`druid.escalator.internalClientPassword`|The escalator will use this [Password Provider](../../operations/password-provider.md) for requests made as the internal system user.|n/a|Yes|
|`druid.escalator.authorizerName`|Authorizer that requests should be directed to.|n/a|Yes|
### Creating an Authorizer
```
druid.auth.authorizers=["MyBasicMetadataAuthorizer"]
druid.auth.authorizer.MyBasicMetadataAuthorizer.type=basic
```
To use the Basic authorizer, add an authorizer with type `basic` to the authorizers list.
Configuration of the named authorizer is assigned through properties with the form:
```
druid.auth.authorizer.<authorizerName>.<authorizerProperty>
```
The authorizer configuration examples in the rest of this document will use "MyBasicMetadataAuthorizer" or "MyBasicLDAPAuthorizer" as the names of the authorizers being configured.
#### Properties for Druid metadata store user authorization
|Property|Description|Default|required|
|--------|-----------|-------|--------|
|`druid.auth.authorizer.MyBasicMetadataAuthorizer.enableCacheNotifications`|If true, the Coordinator will notify Druid processes whenever a configuration change to this Authorizer occurs, allowing them to immediately update their state without waiting for polling.|true|No|
|`druid.auth.authorizer.MyBasicMetadataAuthorizer.cacheNotificationTimeout`|The timeout in milliseconds for the cache notifications.|5000|No|
|`druid.auth.authorizer.MyBasicMetadataAuthorizer.initialAdminUser`|The initial admin user with role defined in initialAdminRole property if specified, otherwise the default admin role will be assigned.|admin|No|
|`druid.auth.authorizer.MyBasicMetadataAuthorizer.initialAdminRole`|The initial admin role to create if it doesn't already exist.|admin|No|
|`druid.auth.authorizer.MyBasicMetadataAuthorizer.roleProvider.type`|The type of role provider used to authorize request credentials.|metadata|No|
#### Properties for LDAP user authorization
|Property|Description|Default|required|
|--------|-----------|-------|--------|
|`druid.auth.authorizer.MyBasicLDAPAuthorizer.enableCacheNotifications`|If true, the Coordinator will notify Druid processes whenever a configuration change to this Authorizer occurs, allowing them to immediately update their state without waiting for polling.|true|No|
|`druid.auth.authorizer.MyBasicLDAPAuthorizer.cacheNotificationTimeout`|The timeout in milliseconds for the cache notifications.|5000|No|
|`druid.auth.authorizer.MyBasicLDAPAuthorizer.initialAdminUser`|The initial admin user with role defined in initialAdminRole property if specified, otherwise the default admin role will be assigned.|admin|No|
|`druid.auth.authorizer.MyBasicLDAPAuthorizer.initialAdminRole`|The initial admin role to create if it doesn't already exist.|admin|No|
|`druid.auth.authorizer.MyBasicLDAPAuthorizer.initialAdminGroupMapping`|The initial admin group mapping with role defined in initialAdminRole property if specified, otherwise the default admin role will be assigned. The name of this initial admin group mapping will be set to adminGroupMapping|null|No|
|`druid.auth.authorizer.MyBasicLDAPAuthorizer.roleProvider.type`|The type of role provider (ldap) used to authorize request credentials.|metadata|No|
|`druid.auth.authorizer.MyBasicLDAPAuthorizer.roleProvider.groupFilters`|Array of LDAP group filters used to filter out the allowed set of groups returned from LDAP search. Filters can begin with *, or end with ,* to provide flexibility in limiting or filtering the set of groups available to the LDAP Authorizer.|null|No|
## Usage
### Coordinator Security API
To use these APIs, a user needs read/write permissions for the CONFIG resource type with name "security".
#### Authentication API
Root path: `/druid-ext/basic-security/authentication`
Each API endpoint includes {authenticatorName}, specifying which Authenticator instance is being configured.
##### User/Credential Management
`GET(/druid-ext/basic-security/authentication/db/{authenticatorName}/users)`
Return a list of all user names.
`GET(/druid-ext/basic-security/authentication/db/{authenticatorName}/users/{userName})`
Return the name and credentials information of the user with name {userName}
`POST(/druid-ext/basic-security/authentication/db/{authenticatorName}/users/{userName})`
Create a new user with name {userName}
`DELETE(/druid-ext/basic-security/authentication/db/{authenticatorName}/users/{userName})`
Delete the user with name {userName}
`POST(/druid-ext/basic-security/authentication/db/{authenticatorName}/users/{userName}/credentials)`
Assign a password used for HTTP basic authentication for {userName}
Content: JSON password request object
Example request body:
```
{
"password": "helloworld"
}
```
##### Cache Load Status
`GET(/druid-ext/basic-security/authentication/loadStatus)`
Return the current load status of the local caches of the authentication Druid metadata store.
#### Authorization API
Root path: `/druid-ext/basic-security/authorization`
Each API endpoint includes {authorizerName}, specifying which Authorizer instance is being configured.
##### User Creation/Deletion
`GET(/druid-ext/basic-security/authorization/db/{authorizerName}/users)`
Return a list of all user names.
`GET(/druid-ext/basic-security/authorization/db/{authorizerName}/users/{userName})`
Return the name and role information of the user with name {userName}
Example output:
```json
{
"name": "druid2",
"roles": [
"druidRole"
]
}
```
This API supports the following flags:
- `?full`: The response will also include the full information for each role currently assigned to the user.
Example output:
```json
{
"name": "druid2",
"roles": [
{
"name": "druidRole",
"permissions": [
{
"resourceAction": {
"resource": {
"name": "A",
"type": "DATASOURCE"
},
"action": "READ"
},
"resourceNamePattern": "A"
},
{
"resourceAction": {
"resource": {
"name": "C",
"type": "CONFIG"
},
"action": "WRITE"
},
"resourceNamePattern": "C"
}
]
}
]
}
```
The output format of this API when `?full` is specified is deprecated and in later versions will be switched to the output format used when both the `?full` and `?simplifyPermissions` flags are set.
The `resourceNamePattern` is a compiled version of the resource name regex. It is redundant and complicates the use of this API for clients such as frontends that edit the authorization configuration, as the permission format in this output does not match the format used for adding permissions to a role.
- `?full&simplifyPermissions`: When both `?full` and `?simplifyPermissions` are set, the permissions in the output will contain only a list of `resourceAction` objects, without the extraneous `resourceNamePattern` field.
```json
{
"name": "druid2",
"roles": [
{
"name": "druidRole",
"users": null,
"permissions": [
{
"resource": {
"name": "A",
"type": "DATASOURCE"
},
"action": "READ"
},
{
"resource": {
"name": "C",
"type": "CONFIG"
},
"action": "WRITE"
}
]
}
]
}
```
`POST(/druid-ext/basic-security/authorization/db/{authorizerName}/users/{userName})`
Create a new user with name {userName}
`DELETE(/druid-ext/basic-security/authorization/db/{authorizerName}/users/{userName})`
Delete the user with name {userName}
##### Group mapping Creation/Deletion
`GET(/druid-ext/basic-security/authorization/db/{authorizerName}/groupMappings)`
Return a list of all group mappings.
`GET(/druid-ext/basic-security/authorization/db/{authorizerName}/groupMappings/{groupMappingName})`
Return the group mapping and role information of the group mapping with name {groupMappingName}
`POST(/druid-ext/basic-security/authorization/db/{authorizerName}/groupMappings/{groupMappingName})`
Create a new group mapping with name {groupMappingName}
Content: JSON group mapping object
Example request body:
```
{
"name": "user",
"groupPattern": "CN=aaa,OU=aaa,OU=Groupings,DC=corp,DC=company,DC=com",
"roles": [
"user"
]
}
```
`DELETE(/druid-ext/basic-security/authorization/db/{authorizerName}/groupMappings/{groupMappingName})`
Delete the group mapping with name {groupMappingName}
##### Role Creation/Deletion
`GET(/druid-ext/basic-security/authorization/db/{authorizerName}/roles)`
Return a list of all role names.
`GET(/druid-ext/basic-security/authorization/db/{authorizerName}/roles/{roleName})`
Return name and permissions for the role named {roleName}.
Example output:
```json
{
"name": "druidRole2",
"permissions": [
{
"resourceAction": {
"resource": {
"name": "E",
"type": "DATASOURCE"
},
"action": "WRITE"
},
"resourceNamePattern": "E"
}
]
}
```
The default output format of this API is deprecated and in later versions will be switched to the output format used when the `?simplifyPermissions` flag is set. The `resourceNamePattern` is a compiled version of the resource name regex. It is redundant and complicates the use of this API for clients such as frontends that edit the authorization configuration, as the permission format in this output does not match the format used for adding permissions to a role.
This API supports the following flags:
- `?full`: The output will contain an extra `users` list, containing the users that currently have this role.
```json
{"users":["druid"]}
```
- `?simplifyPermissions`: The permissions in the output will contain only a list of `resourceAction` objects, without the extraneous `resourceNamePattern` field. The `users` field will be null when `?full` is not specified.
Example output:
```json
{
"name": "druidRole2",
"users": null,
"permissions": [
{
"resource": {
"name": "E",
"type": "DATASOURCE"
},
"action": "WRITE"
}
]
}
```
`POST(/druid-ext/basic-security/authorization/db/{authorizerName}/roles/{roleName})`
Create a new role with name {roleName}.
Content: username string
`DELETE(/druid-ext/basic-security/authorization/db/{authorizerName}/roles/{roleName})`
Delete the role with name {roleName}.
##### Role Assignment
`POST(/druid-ext/basic-security/authorization/db/{authorizerName}/users/{userName}/roles/{roleName})`
Assign role {roleName} to user {userName}.
`DELETE(/druid-ext/basic-security/authorization/db/{authorizerName}/users/{userName}/roles/{roleName})`
Unassign role {roleName} from user {userName}
`POST(/druid-ext/basic-security/authorization/db/{authorizerName}/groupMappings/{groupMappingName}/roles/{roleName})`
Assign role {roleName} to group mapping {groupMappingName}.
`DELETE(/druid-ext/basic-security/authorization/db/{authorizerName}/groupMappings/{groupMappingName}/roles/{roleName})`
Unassign role {roleName} from group mapping {groupMappingName}
##### Permissions
`POST(/druid-ext/basic-security/authorization/db/{authorizerName}/roles/{roleName}/permissions)`
Set the permissions of {roleName}. This replaces the previous set of permissions on the role.
Content: List of JSON Resource-Action objects, e.g.:
```
[
{
"resource": {
"name": "wiki.*",
"type": "DATASOURCE"
},
"action": "READ"
},
{
"resource": {
"name": "wikiticker",
"type": "DATASOURCE"
},
"action": "WRITE"
}
]
```
The "name" field for resources in the permission definitions are regexes used to match resource names during authorization checks.
Please see [Defining permissions](#defining-permissions) for more details.
##### Cache Load Status
`GET(/druid-ext/basic-security/authorization/loadStatus)`
Return the current load status of the local caches of the authorization Druid metadata store.
## Default user accounts
### Authenticator
If `druid.auth.authenticator.<authenticator-name>.initialAdminPassword` is set, a default admin user named "admin" will be created, with the specified initial password. If this configuration is omitted, the "admin" user will not be created.
If `druid.auth.authenticator.<authenticator-name>.initialInternalClientPassword` is set, a default internal system user named "druid_system" will be created, with the specified initial password. If this configuration is omitted, the "druid_system" user will not be created.
### Authorizer
Each Authorizer will always have a default "admin" and "druid_system" user with full privileges.
## Defining permissions
There are two action types in Druid: READ and WRITE.
There are three resource types in Druid: DATASOURCE, CONFIG, and STATE.
### DATASOURCE
Resource names for this type are datasource names. Specifying a datasource permission allows the administrator to grant users access to specific datasources.
### CONFIG
There are two possible resource names for the "CONFIG" resource type, "CONFIG" and "security". Granting a user access to CONFIG resources allows them to access the following endpoints.
"CONFIG" resource name covers the following endpoints:
|Endpoint|Process Type|
|--------|---------|
|`/druid/coordinator/v1/config`|coordinator|
|`/druid/indexer/v1/worker`|overlord|
|`/druid/indexer/v1/worker/history`|overlord|
|`/druid/worker/v1/disable`|middleManager|
|`/druid/worker/v1/enable`|middleManager|
"security" resource name covers the following endpoint:
|Endpoint|Process Type|
|--------|---------|
|`/druid-ext/basic-security/authentication`|coordinator|
|`/druid-ext/basic-security/authorization`|coordinator|
### STATE
There is only one possible resource name for the "STATE" config resource type, "STATE". Granting a user access to STATE resources allows them to access the following endpoints.
"STATE" resource name covers the following endpoints:
|Endpoint|Process Type|
|--------|---------|
|`/druid/coordinator/v1`|coordinator|
|`/druid/coordinator/v1/rules`|coordinator|
|`/druid/coordinator/v1/rules/history`|coordinator|
|`/druid/coordinator/v1/servers`|coordinator|
|`/druid/coordinator/v1/tiers`|coordinator|
|`/druid/broker/v1`|broker|
|`/druid/v2/candidates`|broker|
|`/druid/indexer/v1/leader`|overlord|
|`/druid/indexer/v1/isLeader`|overlord|
|`/druid/indexer/v1/action`|overlord|
|`/druid/indexer/v1/workers`|overlord|
|`/druid/indexer/v1/scaling`|overlord|
|`/druid/worker/v1/enabled`|middleManager|
|`/druid/worker/v1/tasks`|middleManager|
|`/druid/worker/v1/task/{taskid}/shutdown`|middleManager|
|`/druid/worker/v1/task/{taskid}/log`|middleManager|
|`/druid/historical/v1`|historical|
|`/druid-internal/v1/segments/`|historical|
|`/druid-internal/v1/segments/`|peon|
|`/druid-internal/v1/segments/`|realtime|
|`/status`|all process types|
### HTTP methods
For information on what HTTP methods are supported on a particular request endpoint, please refer to the [API documentation](../../operations/api-reference.md).
GET requires READ permission, while POST and DELETE require WRITE permission.
### SQL Permissions
Queries on Druid datasources require DATASOURCE READ permissions for the specified datasource.
Queries on the [INFORMATION_SCHEMA tables](../../querying/sql.html#information-schema) will
return information about datasources that the caller has DATASOURCE READ access to. Other
datasources will be omitted.
Queries on the [system schema tables](../../querying/sql.html#system-schema) require the following permissions:
- `segments`: Segments will be filtered based on DATASOURCE READ permissions.
- `servers`: The user requires STATE READ permissions.
- `server_segments`: The user requires STATE READ permissions and segments will be filtered based on DATASOURCE READ permissions.
- `tasks`: Tasks will be filtered based on DATASOURCE READ permissions.
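For example, a SQL query submitted to the `/druid/v2/sql` endpoint with a request body like the following (the `wikipedia` datasource name is illustrative) requires DATASOURCE READ permission on that datasource:
```json
{
  "query" : "SELECT COUNT(*) FROM wikipedia WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY"
}
```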
## Configuration Propagation
To prevent excessive load on the Coordinator, the Authenticator and Authorizer user/role Druid metadata store state is cached on each Druid process.
Each process will periodically poll the Coordinator for the latest Druid metadata store state, controlled by the `druid.auth.basic.common.pollingPeriod` and `druid.auth.basic.common.maxRandomDelay` properties.
When a configuration update occurs, the Coordinator can optionally notify each process with the updated Druid metadata store state. This behavior is controlled by the `enableCacheNotifications` and `cacheNotificationTimeout` properties on Authenticators and Authorizers.
Note that because of the caching, changes made to the user/role Druid metadata store may not be immediately reflected at each Druid process.
View File
@@ -0,0 +1,125 @@
---
id: druid-kerberos
title: "Kerberos"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
Apache Druid Extension to enable Authentication for Druid Processes using Kerberos.
This extension adds an Authenticator which is used to protect HTTP Endpoints using the simple and protected GSSAPI negotiation mechanism [SPNEGO](https://en.wikipedia.org/wiki/SPNEGO).
Make sure to [include](../../development/extensions.md#loading-extensions) `druid-kerberos` as an extension.
## Configuration
### Creating an Authenticator
```
druid.auth.authenticatorChain=["MyKerberosAuthenticator"]
druid.auth.authenticator.MyKerberosAuthenticator.type=kerberos
```
To use the Kerberos authenticator, add an authenticator with type `kerberos` to the authenticatorChain. The example above uses the name "MyKerberosAuthenticator" for the Authenticator.
Configuration of the named authenticator is assigned through properties with the form:
```
druid.auth.authenticator.<authenticatorName>.<authenticatorProperty>
```
The configuration examples in the rest of this document will use "kerberos" as the name of the authenticator being configured.
### Properties
|Property|Possible Values|Description|Default|required|
|--------|---------------|-----------|-------|--------|
|`druid.auth.authenticator.kerberos.serverPrincipal`|`HTTP/_HOST@EXAMPLE.COM`| SPNEGO service principal used by druid processes|empty|Yes|
|`druid.auth.authenticator.kerberos.serverKeytab`|`/etc/security/keytabs/spnego.service.keytab`|SPNego service keytab used by druid processes|empty|Yes|
|`druid.auth.authenticator.kerberos.authToLocal`|`RULE:[1:$1@$0](druid@EXAMPLE.COM)s/.*/druid DEFAULT`|It allows you to set a general rule for mapping principal names to local user names. It will be used if there is not an explicit mapping for the principal name that is being translated.|DEFAULT|No|
|`druid.auth.authenticator.kerberos.cookieSignatureSecret`|`secretString`| Secret used to sign authentication cookies. It is advisable to explicitly set it, if you have multiple druid nodes running on same machine with different ports as the Cookie Specification does not guarantee isolation by port.|<Random value>|No|
|`druid.auth.authenticator.kerberos.authorizerName`|Depends on available authorizers|Authorizer that requests should be directed to|Empty|Yes|
As a note, it is required that the SPNEGO principal in use by the Druid processes must start with HTTP (this is specified by [RFC-4559](https://tools.ietf.org/html/rfc4559)) and must be of the form "HTTP/_HOST@REALM".
The special string _HOST will be replaced automatically with the value of the config `druid.host`.
### `druid.auth.authenticator.kerberos.excludedPaths`
In older releases, the Kerberos authenticator had an `excludedPaths` property that allowed the user to specify a list of paths where authentication checks should be skipped. This property has been removed from the Kerberos authenticator because the path exclusion functionality is now handled across all authenticators/authorizers by setting `druid.auth.unsecuredPaths`, as described in the [main auth documentation](../../design/auth.md).
### Auth to Local Syntax
`druid.auth.authenticator.kerberos.authToLocal` allows you to set general rules for mapping principal names to local user names.
The syntax for mapping rules is `RULE:\[n:string](regexp)s/pattern/replacement/g`. The integer n indicates how many components the target principal should have. If this matches, then a string will be formed from string, substituting the realm of the principal for $0 and the nth component of the principal for $n. e.g. if the principal was druid/admin then `\[2:$2$1suffix]` would result in the string `admindruidsuffix`.
If this string matches regexp, then the s//\[g] substitution command will be run over the string. The optional g will cause the substitution to be global over the string, instead of replacing only the first match in the string.
If required, multiple rules can be joined by the newline character and specified as a String.
### Increasing HTTP Header size for large SPNEGO negotiate header
In an Active Directory environment, the SPNEGO token in the Authorization header includes PAC (Privilege Attribute Certificate) information,
which includes all security groups for the user. In some cases, when the user belongs to many security groups, the header may grow beyond what Druid can handle by default.
In such cases, the maximum request header size that Druid can handle can be increased by setting `druid.server.http.maxRequestHeaderSize` (default 8Kb) and `druid.router.http.maxRequestBufferSize` (default 8Kb).
## Configuring Kerberos Escalated Client
Druid internal processes communicate with each other using an escalated HTTP client. A Kerberos-enabled escalated HTTP client can be configured with the following properties:
|Property|Example Values|Description|Default|required|
|--------|---------------|-----------|-------|--------|
|`druid.escalator.type`|`kerberos`| Type of Escalator client used for internal process communication.|n/a|Yes|
|`druid.escalator.internalClientPrincipal`|`druid@EXAMPLE.COM`| Principal user name, used for internal process communication|n/a|Yes|
|`druid.escalator.internalClientKeytab`|`/etc/security/keytabs/druid.keytab`|Path to keytab file used for internal process communication|n/a|Yes|
|`druid.escalator.authorizerName`|`MyBasicAuthorizer`|Authorizer that requests should be directed to.|n/a|Yes|
## Accessing Druid HTTP end points when kerberos security is enabled
1. To access Druid HTTP endpoints via curl, the user will need to first log in using the `kinit` command as follows:
```
kinit -k -t <path_to_keytab_file> user@REALM.COM
```
2. Once the login is successful, verify it using the `klist` command.
3. Now you can access Druid HTTP endpoints using the curl command as follows:
```
curl --negotiate -u:anyUser -b ~/cookies.txt -c ~/cookies.txt -X POST -H'Content-Type: application/json' <HTTP_END_POINT>
```
For example, to send a query from the file `query.json` to the Druid Broker, use this command:
```
curl --negotiate -u:anyUser -b ~/cookies.txt -c ~/cookies.txt -X POST -H'Content-Type: application/json' http://broker-host:port/druid/v2/?pretty -d @query.json
```
Note: the above command will authenticate the user the first time using the SPNEGO negotiate mechanism and store the authentication cookie in a file. For subsequent requests, the cookie will be used for authentication.
## Accessing Coordinator or Overlord console from web browser
To access the Coordinator/Overlord console from a browser, you will need to configure your browser for SPNEGO authentication as follows:
1. Safari - No configurations required.
2. Firefox - Open Firefox and follow these steps:
1. Go to `about:config` and search for `network.negotiate-auth.trusted-uris`.
2. Double-click and add the following values: `"http://druid-coordinator-hostname:ui-port"` and `"http://druid-overlord-hostname:port"`
3. Google Chrome - From the command line, run the following commands:
1. `google-chrome --auth-server-whitelist="druid-coordinator-hostname" --auth-negotiate-delegate-whitelist="druid-coordinator-hostname"`
2. `google-chrome --auth-server-whitelist="druid-overlord-hostname" --auth-negotiate-delegate-whitelist="druid-overlord-hostname"`
4. Internet Explorer -
1. Configure trusted websites to include `"druid-coordinator-hostname"` and `"druid-overlord-hostname"`
2. Allow negotiation for the UI website.
## Sending Queries programmatically
Many HTTP client libraries, such as Apache Commons [HttpComponents](https://hc.apache.org/), already have support for performing SPNEGO authentication. You can use any of the available HTTP client libraries to communicate with the Druid cluster.
View File
@@ -0,0 +1,155 @@
---
id: druid-lookups
title: "Cached Lookup Module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
> Please note that this is an experimental module and development/testing is still at an early stage. Feel free to try it and give us your feedback.
## Description
This Apache Druid module provides a per-lookup caching mechanism for JDBC data sources.
The main goal of this cache is to speed up access to high-latency lookup sources and to provide caching isolation for every lookup source.
Thus users can define various caching strategies and implementations per lookup, even if the source is the same.
This module can be used side by side with other lookup modules, such as the global cached lookup module.
To use this extension please make sure to [include](../../development/extensions.md#loading-extensions) `druid-lookups-cached-single` as an extension.
> If using JDBC, you will need to add your database's client JAR files to the extension's directory.
> For Postgres, the connector JAR is already included.
> For MySQL, you can get it from https://dev.mysql.com/downloads/connector/j/.
> Copy or symlink the downloaded file to `extensions/druid-lookups-cached-single` under the distribution root directory.
## Architecture
Generally speaking, this module can be divided into two main components: the data fetcher layer and the caching layer.
### Data Fetcher layer
The first part is the data fetcher layer API, `DataFetcher`, which exposes a set of fetch methods to fetch data from the actual lookup dimension source.
For instance, `JdbcDataFetcher` provides an implementation of `DataFetcher` that can be used to fetch key/value pairs from an RDBMS via a JDBC driver.
If you need a new type of data fetcher, all you need to do is implement the `DataFetcher` interface and load it via another Druid module.
### Caching layer
This extension comes with two different caching strategies. The first strategy is poll-based and the second is load-based.
#### Poll lookup cache
The polling cache strategy periodically fetches and swaps all the key/value pairs from the lookup source.
Hence, the user should make sure that the cache can fit all the data.
The current implementation provides two types of poll cache: the first is on-heap (using an immutable map), while the second uses a MapDB-based off-heap map.
Users can also implement a different lookup polling cache by implementing the `PollingCacheFactory` and `PollingCache` interfaces.
#### Loading lookup
The loading cache strategy loads the key/value pair upon request for the key itself; the general algorithm is to load the key if absent.
Once the key/value pair is loaded, eviction will occur according to the cache eviction policy.
This module comes with two loading lookup implementations: the first is on-heap, backed by a Guava cache implementation; the second is an off-heap MapDB implementation.
Both implementations offer various eviction strategies.
As with the polling cache, developers can implement a new type of loading cache by implementing the `LookupLoadingCache` interface.
## Configuration and Operation
### Polling Lookup
**Note that the current implementation of `offHeapPolling` and `onHeapPolling` will create two caches: one to look up the value based on the key and the other to reverse-look up the key from the value.**
|Field|Type|Description|Required|default|
|-----|----|-----------|--------|-------|
|dataFetcher|JSON object|Specifies the lookup data fetcher type to use in order to fetch data|yes|null|
|cacheFactory|JSON Object|Cache factory implementation|no |onHeapPolling|
|pollPeriod|Period|polling period |no |null (poll once)|
##### Example of Polling On-heap Lookup
This example demonstrates a polling cache that will update its on-heap cache every 10 minutes
```json
{
"type":"pollingLookup",
"pollPeriod":"PT10M",
"dataFetcher":{ "type":"jdbcDataFetcher", "connectorConfig":"jdbc://mysql://localhost:3306/my_data_base", "table":"lookup_table_name", "keyColumn":"key_column_name", "valueColumn": "value_column_name"},
"cacheFactory":{"type":"onHeapPolling"}
}
```
##### Example Polling Off-heap Lookup
This example demonstrates an off-heap lookup that will be cached once and never swapped `(pollPeriod == null)`
```json
{
"type":"pollingLookup",
"dataFetcher":{ "type":"jdbcDataFetcher", "connectorConfig":"jdbc://mysql://localhost:3306/my_data_base", "table":"lookup_table_name", "keyColumn":"key_column_name", "valueColumn": "value_column_name"},
"cacheFactory":{"type":"offHeapPolling"}
}
```
### Loading lookup
|Field|Type|Description|Required|default|
|-----|----|-----------|--------|-------|
|dataFetcher|JSON object|Specifies the lookup data fetcher type to use in order to fetch data|yes|null|
|loadingCacheSpec|JSON Object|Lookup cache spec implementation|yes |null|
|reverseLoadingCacheSpec|JSON Object| Reverse lookup cache implementation|yes |null|
##### Example Loading On-heap Guava
Guava cache configuration spec.
|Field|Type|Description|Required|default|
|-----|----|-----------|--------|-------|
|concurrencyLevel|int|Allowed concurrency among update operations|no|4|
|initialCapacity|int|Initial capacity size|no |null|
|maximumSize|long| Specifies the maximum number of entries the cache may contain.|no |null (infinite capacity)|
|expireAfterAccess|long| Specifies the eviction time after last read in milliseconds.|no |null (No read-time-based eviction when set to null)|
|expireAfterWrite|long| Specifies the eviction time after last write in milliseconds.|no |null (No write-time-based eviction when set to null)|
```json
{
"type":"loadingLookup",
"dataFetcher":{ "type":"jdbcDataFetcher", "connectorConfig":"jdbc://mysql://localhost:3306/my_data_base", "table":"lookup_table_name", "keyColumn":"key_column_name", "valueColumn": "value_column_name"},
"loadingCacheSpec":{"type":"guava"},
"reverseLoadingCacheSpec":{"type":"guava", "maximumSize":500000, "expireAfterAccess":100000, "expireAfterAccess":10000}
}
```
##### Example Loading Off-heap MapDB
The off-heap cache is backed by a [MapDB](http://www.mapdb.org/) implementation. MapDB uses direct memory as its memory pool; please take that into account when limiting the JVM direct memory setup.
|Field|Type|Description|Required|default|
|-----|----|-----------|--------|-------|
|maxStoreSize|double|Maximum size of the store in GB; if the store is larger, entries will start expiring|no |0|
|maxEntriesSize|long| Specifies the maximum number of entries the cache may contain.|no |0 (infinite capacity)|
|expireAfterAccess|long| Specifies the eviction time after last read in milliseconds.|no |0 (No read-time-based eviction when set to null)|
|expireAfterWrite|long| Specifies the eviction time after last write in milliseconds.|no |0 (No write-time-based eviction when set to null)|
```json
{
"type":"loadingLookup",
"dataFetcher":{ "type":"jdbcDataFetcher", "connectorConfig":"jdbc://mysql://localhost:3306/my_data_base", "table":"lookup_table_name", "keyColumn":"key_column_name", "valueColumn": "value_column_name"},
"loadingCacheSpec":{"type":"mapDb", "maxEntriesSize":100000},
"reverseLoadingCacheSpec":{"type":"mapDb", "maxStoreSize":5, "expireAfterAccess":100000, "expireAfterAccess":10000}
}
```
View File
@@ -0,0 +1,46 @@
---
id: druid-pac4j
title: "Druid pac4j based Security extension"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
Apache Druid Extension to enable [OpenID Connect](https://openid.net/connect/) based Authentication for Druid Processes using [pac4j](https://github.com/pac4j/pac4j) as the underlying client library.
This can be used with any authentication server that supports OpenID Connect, e.g. [Okta](https://developer.okta.com/).
This extension should only be used at the Router node to enable a group of users in an existing authentication server to interact with the Druid cluster using the [Web Console](../../operations/druid-console.html). This extension does not support JDBC client authentication.
## Configuration
### Creating an Authenticator
```
druid.auth.authenticatorChain=["pac4j"]
druid.auth.authenticator.pac4j.type=pac4j
```
### Properties
|Property|Description|Default|required|
|--------|---------------|-----------|-------|
|`druid.auth.pac4j.cookiePassphrase`|Passphrase for encrypting the cookies used to manage the authentication session with the browser. It can be provided as a plaintext string or a [Password Provider](../../operations/password-provider.md).|none|Yes|
|`druid.auth.pac4j.readTimeout`|Socket connect and read timeout duration used when communicating with the authentication server.|PT5S|No|
|`druid.auth.pac4j.enableCustomSslContext`|Whether to use a custom SSLContext set up via the [simple-client-sslcontext](simple-client-sslcontext.md) extension, which must be added to the extensions list when this property is set to true.|false|No|
|`druid.auth.pac4j.oidc.clientID`|OAuth Client Application id.|none|Yes|
|`druid.auth.pac4j.oidc.clientSecret`|OAuth Client Application secret. It can be provided as a plaintext string or a [Password Provider](../../operations/password-provider.md).|none|Yes|
|`druid.auth.pac4j.oidc.discoveryURI`|Discovery URI for fetching OP metadata; [see the OpenID Connect Discovery specification](http://openid.net/specs/openid-connect-discovery-1_0.html).|none|Yes|
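As a sketch, these properties might be combined with the authenticator configuration above in `common.runtime.properties`; the passphrase, client credentials, and discovery URI below are hypothetical placeholders, not values prescribed by this document:
```
druid.auth.authenticatorChain=["pac4j"]
druid.auth.authenticator.pac4j.type=pac4j

# Hypothetical placeholder values -- substitute your own OIDC provider settings
druid.auth.pac4j.cookiePassphrase=change-me-passphrase
druid.auth.pac4j.oidc.clientID=druid-console-client
druid.auth.pac4j.oidc.clientSecret=druid-console-secret
druid.auth.pac4j.oidc.discoveryURI=https://example.okta.com/.well-known/openid-configuration
```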
---
id: druid-ranger-security
title: "Apache Ranger Security"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid extension adds an Authorizer which implements access control for Druid, backed by [Apache Ranger](https://ranger.apache.org/). Please see [Authentication and Authorization](../../design/auth.md) for more information on the basic facilities this extension provides.
Make sure to [include](../../development/extensions.md#loading-extensions) `druid-ranger-security` as an extension.
> The latest release of Apache Ranger is, at the time of writing, version 2.0. This version has a dependency on `log4j 1.2.17`, which has a vulnerability if you configure it to use a `SocketServer` (CVE-2019-17571). It also includes Kafka 2.0.0, which has two known vulnerabilities (CVE-2019-12399, CVE-2018-17196). Kafka can be used by the audit component in Ranger, but is not required.
## Configuration
Support for Apache Ranger authorization consists of three elements:
* configuring the extension in Apache Druid
* configuring the connection to Apache Ranger
* providing the service definition for Druid to Apache Ranger
### Enabling the extension
Ensure that you have a valid authenticator chain and escalator set in your `common.runtime.properties`. For every authenticator you wish to use the authorizer for, set `druid.auth.authenticator.<authenticatorName>.authorizerName` to the name you will give the authorizer, e.g. `ranger`.
Then add the following and amend it to your needs (in case you need to use multiple authorizers):
```
druid.auth.authorizers=["ranger"]
druid.auth.authorizer.ranger.type=ranger
```
The following is an example that showcases using `druid-basic-security` for authentication and `druid-ranger-security` for authorization.
```
druid.auth.authenticatorChain=["basic"]
druid.auth.authenticator.basic.type=basic
druid.auth.authenticator.basic.initialAdminPassword=password1
druid.auth.authenticator.basic.initialInternalClientPassword=password2
druid.auth.authenticator.basic.credentialsValidator.type=metadata
druid.auth.authenticator.basic.skipOnFailure=false
druid.auth.authenticator.basic.enableCacheNotifications=true
druid.auth.authenticator.basic.authorizerName=ranger
druid.auth.authorizers=["ranger"]
druid.auth.authorizer.ranger.type=ranger
# Escalator
druid.escalator.type=basic
druid.escalator.internalClientUsername=druid_system
druid.escalator.internalClientPassword=password2
druid.escalator.authorizerName=ranger
```
> Contrary to the documentation of `druid-basic-auth`, Ranger does not automatically provision a highly privileged system user; you will need to do this yourself. This system user in the case of `druid-basic-auth` is named `druid_system`, and for the escalator it is configurable, as shown above. Make sure to take note of these user names and configure `READ` access to `state:STATE` and to `config:security` in your Ranger policies, otherwise system services will not work properly.
#### Properties to configure the extension in Apache Druid
|Property|Description|Default|required|
|--------|-----------|-------|--------|
|`druid.auth.ranger.keytab`|Defines the keytab to be used while authenticating against Apache Ranger to obtain policies and provide auditing|null|No|
|`druid.auth.ranger.principal`|Defines the principal to be used while authenticating against Apache Ranger to obtain policies and provide auditing|null|No|
|`druid.auth.ranger.use_ugi`|Determines if groups that the authenticated user belongs to should be obtained from Hadoop's `UserGroupInformation`|null|No|
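For example, a sketch of these properties when authenticating to Ranger with Kerberos; the keytab path and principal below are hypothetical examples, not values from this document:
```
druid.auth.ranger.keytab=/etc/security/keytabs/druid.service.keytab
druid.auth.ranger.principal=druid@EXAMPLE.COM
druid.auth.ranger.use_ugi=true
```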
### Configuring the connection to Apache Ranger
The Apache Ranger authorization extension will read several configuration files. Discussing the contents of those files is beyond the scope of this document. Depending on your needs you will need to create them. The minimum you will need to have is a `ranger-druid-security.xml` file that you will need to put in the classpath (e.g. `_common`). For auditing, the configuration is in `ranger-druid-audit.xml`.
### Adding the service definition for Apache Druid to Apache Ranger
At the time of writing this document, Apache Ranger (2.0) does not include an out-of-the-box service and service definition for Druid. You can add the service definition to Apache Ranger by entering the following command:
`curl -u <user>:<password> -d "@ranger-servicedef-druid.json" -X POST -H "Accept: application/json" -H "Content-Type: application/json" http://localhost:6080/service/public/v2/api/servicedef/`
You should get back JSON describing the service definition you just added. You can now go to the web interface of Apache Ranger, which should now include a widget for "Druid". Click the plus sign and create the new service. Ensure your service name is equal to what you configured in `ranger-druid-security.xml`.
#### Configuring Apache Ranger policies
When installing a new Druid service in Apache Ranger for the first time, Ranger will provision the policies to allow the administrative user `read/write` access to all properties and data sources. You might want to limit this. Do not forget to add the correct policies for the `druid_system` user and the `internalClientUserName` of the escalator.
> Loading new data sources requires `write` access to the `datasource` prior to the loading itself. So if you want to create a datasource `wikipedia` you are required to have an `allow` policy inside Apache Ranger before trying to load the spec.
## Usage
### HTTP methods
For information on what HTTP methods are supported for a particular request endpoint, please refer to the [API documentation](../../operations/api-reference.md).
GET requires READ permission, while POST and DELETE require WRITE permission.
### SQL Permissions
Queries on Druid datasources require DATASOURCE READ permissions for the specified datasource.
Queries on the [INFORMATION_SCHEMA tables](../../querying/sql.html#information-schema) will return information about datasources that the caller has DATASOURCE READ access to. Other datasources will be omitted.
Queries on the [system schema tables](../../querying/sql.html#system-schema) require the following permissions:
- `segments`: Segments will be filtered based on DATASOURCE READ permissions.
- `servers`: The user requires STATE READ permissions.
- `server_segments`: The user requires STATE READ permissions and segments will be filtered based on DATASOURCE READ permissions.
- `tasks`: Tasks will be filtered based on DATASOURCE READ permissions.
### Debugging
If you have difficulty understanding why access is denied to certain elements, and the `audit` section in Apache Ranger does not give you any detail, you can enable debug logging for `org.apache.druid.security.ranger`. To do so, add the following to your `log4j2.xml`:
```xml
<!-- Set level="debug" to see access requests to Apache Ranger -->
<Logger name="org.apache.druid.security" level="debug" additivity="false">
<Appender-ref ref="Console"/>
</Logger>
```
---
id: examples
title: "Extension Examples"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This extension was removed in Apache Druid 0.16.0. In prior versions, the extension provided obsolete facilities to ingest data from the Twitter 'Spritzer' data stream as well as the Wikipedia changes IRC channel.
---
id: google
title: "Google Cloud Storage"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
## Google Cloud Storage Extension
This extension allows you to do two things:
* [Ingest data](#reading-data-from-google-cloud-storage) from files stored in Google Cloud Storage.
* Write segments to [deep storage](#deep-storage) in GCS.
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-google-extensions` extension.
### Required Configuration
To configure connectivity to Google Cloud, run Druid processes with `GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_keyfile` set in the environment.
### Reading data from Google Cloud Storage
The [Google Cloud Storage input source](../../ingestion/native-batch.md#google-cloud-storage-input-source) is supported by the [Parallel task](../../ingestion/native-batch.md#parallel-task)
to read objects directly from Google Cloud Storage. If you use the [Hadoop task](../../ingestion/hadoop.md),
you can read data from Google Cloud Storage by specifying the paths in your [`inputSpec`](../../ingestion/hadoop.md#inputspec).
Objects can also be read directly from Google Cloud Storage via the [StaticGoogleBlobStoreFirehose](../../ingestion/native-batch.md#staticgoogleblobstorefirehose).
### Deep Storage
Deep storage can be written to Google Cloud Storage either via this extension or the [druid-hdfs-storage extension](../extensions-core/hdfs.md).
#### Configuration
To configure connectivity to Google Cloud, run Druid processes with `GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_keyfile` set in the environment.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|google||Must be set.|
|`druid.google.bucket`||Google Storage bucket name.|Must be set.|
|`druid.google.prefix`||A prefix string that will be prepended to the blob names for the segments published to Google deep storage.|""|
|`druid.google.maxListingLength`||Maximum number of input files matching a given prefix to retrieve at a time.|1024|
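A sketch of a GCS deep storage configuration using these properties; the bucket name and prefix are hypothetical:
```
druid.storage.type=google
druid.google.bucket=my-druid-deep-storage
druid.google.prefix=druid/segments
```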
---
id: hdfs
title: "HDFS"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-hdfs-storage` as an extension and run druid processes with `GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_keyfile` in the environment.
## Deep Storage
### Configuration for HDFS
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|hdfs||Must be set.|
|`druid.storage.storageDirectory`||Directory for storing segments.|Must be set.|
|`druid.hadoop.security.kerberos.principal`|`druid@EXAMPLE.COM`| Principal user name |empty|
|`druid.hadoop.security.kerberos.keytab`|`/etc/security/keytabs/druid.headlessUser.keytab`|Path to keytab file|empty|
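A minimal sketch of an HDFS deep storage configuration using the properties above; the NameNode address and path are hypothetical:
```
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://namenode.example.com:8020/druid/segments
```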
Besides the above settings, you also need to include all Hadoop configuration files (such as `core-site.xml`, `hdfs-site.xml`)
in the Druid classpath. One way to do this is copying all those files under `${DRUID_HOME}/conf/_common`.
If you are using the Hadoop ingestion, set your output directory to be a location on Hadoop and it will work.
If you want to eagerly authenticate against a secured Hadoop/HDFS cluster, you must set `druid.hadoop.security.kerberos.principal` and `druid.hadoop.security.kerberos.keytab`. This is an alternative to the cron job method that runs the `kinit` command periodically.
### Configuration for Cloud Storage
You can also use AWS S3 or Google Cloud Storage as deep storage via HDFS.
#### Configuration for AWS S3
To use AWS S3 as deep storage, you need to configure `druid.storage.storageDirectory` properly.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|hdfs| |Must be set.|
|`druid.storage.storageDirectory`|s3a://bucket/example/directory or s3n://bucket/example/directory|Path to the deep storage|Must be set.|
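For example, a sketch using an `s3a` path (the bucket name is hypothetical):
```
druid.storage.type=hdfs
druid.storage.storageDirectory=s3a://my-bucket/druid/segments
```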
You also need to include the [Hadoop AWS module](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html), especially the `hadoop-aws.jar` in the Druid classpath.
Run the below command to install the `hadoop-aws.jar` file under `${DRUID_HOME}/extensions/druid-hdfs-storage` in all nodes.
```bash
java -classpath "${DRUID_HOME}/lib/*" org.apache.druid.cli.Main tools pull-deps -h "org.apache.hadoop:hadoop-aws:${HADOOP_VERSION}";
cp ${DRUID_HOME}/hadoop-dependencies/hadoop-aws/${HADOOP_VERSION}/hadoop-aws-${HADOOP_VERSION}.jar ${DRUID_HOME}/extensions/druid-hdfs-storage/
```
Finally, you need to add the below properties in the `core-site.xml`.
For more configurations, see the [Hadoop AWS module](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html).
```xml
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<description>The implementation class of the S3A Filesystem</description>
</property>
<property>
<name>fs.AbstractFileSystem.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3A</value>
<description>The implementation class of the S3A AbstractFileSystem.</description>
</property>
<property>
<name>fs.s3a.access.key</name>
<description>AWS access key ID. Omit for IAM role-based or provider-based authentication.</description>
<value>your access key</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<description>AWS secret key. Omit for IAM role-based or provider-based authentication.</description>
<value>your secret key</value>
</property>
```
#### Configuration for Google Cloud Storage
To use Google Cloud Storage as deep storage, you need to configure `druid.storage.storageDirectory` properly.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|hdfs||Must be set.|
|`druid.storage.storageDirectory`|gs://bucket/example/directory|Path to the deep storage|Must be set.|
All services that need to access GCS need to have the [GCS connector jar](https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage#other_sparkhadoop_clusters) in their class path.
Please read the [install instructions](https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcs/INSTALL.md)
to properly set up the necessary libraries and configurations.
One option is to place this jar in `${DRUID_HOME}/lib/` and `${DRUID_HOME}/extensions/druid-hdfs-storage/`.
Finally, you need to configure the `core-site.xml` file with the filesystem
and authentication properties needed for GCS. You may want to copy the below
example properties. Please follow the instructions at
[https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcs/INSTALL.md](https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcs/INSTALL.md)
for more details.
For more configuration options, see the [GCS core default](https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcs/conf/gcs-core-default.xml)
and the [GCS core template](https://github.com/GoogleCloudPlatform/bdutil/blob/master/conf/hadoop2/gcs-core-template.xml).
```xml
<property>
<name>fs.gs.impl</name>
<value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
<description>The FileSystem for gs: (GCS) uris.</description>
</property>
<property>
<name>fs.AbstractFileSystem.gs.impl</name>
<value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>
<description>The AbstractFileSystem for gs: uris.</description>
</property>
<property>
<name>google.cloud.auth.service.account.enable</name>
<value>true</value>
<description>
Whether to use a service account for GCS authorization.
Setting this property to `false` will disable use of service accounts for
authentication.
</description>
</property>
<property>
<name>google.cloud.auth.service.account.json.keyfile</name>
<value>/path/to/keyfile</value>
<description>
The JSON key file of the service account used for GCS
access when google.cloud.auth.service.account.enable is true.
</description>
</property>
```
Tested with Druid 0.17.0, Hadoop 2.8.5 and gcs-connector jar 2.0.0-hadoop2.
## Reading data from HDFS or Cloud Storage
### Native batch ingestion
The [HDFS input source](../../ingestion/native-batch.md#hdfs-input-source) is supported by the [Parallel task](../../ingestion/native-batch.md#parallel-task)
to read files directly from the HDFS Storage. You may be able to read objects from cloud storage
with the HDFS input source, but we highly recommend using a proper
[Input Source](../../ingestion/native-batch.md#input-sources) instead if possible because
it is simple to set up. For now, only the [S3 input source](../../ingestion/native-batch.md#s3-input-source)
and the [Google Cloud Storage input source](../../ingestion/native-batch.md#google-cloud-storage-input-source)
are supported for cloud storage types, and so you may still want to use the HDFS input source
to read from cloud storage other than those two.
### Hadoop-based ingestion
If you use the [Hadoop ingestion](../../ingestion/hadoop.md), you can read data from HDFS
by specifying the paths in your [`inputSpec`](../../ingestion/hadoop.md#inputspec).
See the [Static](../../ingestion/hadoop.md#static) inputSpec for details.
---
id: kafka-extraction-namespace
title: "Apache Kafka Lookups"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
> Lookups are an [experimental](../experimental.md) feature.
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-lookups-cached-global` and `druid-kafka-extraction-namespace` as extensions.
If you need updates to populate as promptly as possible, it is possible to plug into a Kafka topic whose key is the old value and whose message is the desired new value (both in UTF-8), via a LookupExtractorFactory.
```json
{
"type":"kafka",
"kafkaTopic":"testTopic",
"kafkaProperties":{"zookeeper.connect":"somehost:2181/kafka"}
}
```
|Parameter|Description|Required|Default|
|---------|-----------|--------|-------|
|`kafkaTopic`|The Kafka topic to read the data from|Yes||
|`kafkaProperties`|Kafka consumer properties. At least "zookeeper.connect" must be specified. Only the zookeeper connector is supported|Yes||
|`connectTimeout`|How long to wait for an initial connection|No|`0` (do not wait)|
|`isOneToOne`|The map is one-to-one (see [Lookup DimensionSpecs](../../querying/dimensionspecs.md))|No|`false`|
The extension `kafka-extraction-namespace` enables reading from a Kafka feed which has name/key pairs to allow renaming of dimension values. An example use case would be to rename an ID to a human readable format.
The consumer properties `group.id` and `auto.offset.reset` CANNOT be set in `kafkaProperties` as they are set by the extension as `UUID.randomUUID().toString()` and `smallest` respectively.
See [lookups](../../querying/lookups.md) for how to configure and use lookups.
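Putting the optional parameters from the table above together, a factory spec might look like the following sketch; the timeout value is a hypothetical example, not a recommendation:
```json
{
  "type":"kafka",
  "kafkaTopic":"testTopic",
  "kafkaProperties":{"zookeeper.connect":"somehost:2181/kafka"},
  "connectTimeout":60000,
  "isOneToOne":true
}
```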
## Limitations
Currently the Kafka lookup extractor feeds the entire Kafka stream into a local cache. If you are using on-heap caching, this can easily clobber your Java heap if the Kafka stream spews a lot of unique keys.
Off-heap caching should alleviate these concerns, but there is still a limit to the quantity of data that can be stored.
There is currently no eviction policy.
## Testing the Kafka rename functionality
To test this setup, you can send key/value pairs to a Kafka stream via the following producer console:
```
./bin/kafka-console-producer.sh --property parse.key=true --property key.separator="->" --broker-list localhost:9092 --topic testTopic
```
Renames can then be published as `OLD_VAL->NEW_VAL` followed by newline (enter or return)
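For example, typing the following hypothetical lines into the producer console would map `id_123` and `id_456` to human-readable values in the lookup:
```
id_123->Database Team
id_456->Analytics Team
```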
---
id: kafka-ingestion
title: "Apache Kafka ingestion"
sidebar_label: "Apache Kafka"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
The Kafka indexing service enables the configuration of *supervisors* on the Overlord, which facilitate ingestion from
Kafka by managing the creation and lifetime of Kafka indexing tasks. These indexing tasks read events using Kafka's own
partition and offset mechanism and are therefore able to provide guarantees of exactly-once ingestion.
The supervisor oversees the state of the indexing tasks to coordinate handoffs,
manage failures, and ensure that the scalability and replication requirements are maintained.
This service is provided in the `druid-kafka-indexing-service` core Apache Druid extension (see
[Including Extensions](../../development/extensions.md#loading-extensions)).
> The Kafka indexing service supports transactional topics which were introduced in Kafka 0.11.x. These changes make the
> Kafka consumer that Druid uses incompatible with older brokers. Ensure that your Kafka brokers are version 0.11.x or
> later before using this functionality. Refer to the [Kafka upgrade guide](https://kafka.apache.org/documentation/#upgrade)
> if you are using an older version of Kafka brokers.
## Tutorial
This page contains reference documentation for Apache Kafka-based ingestion.
For a walk-through instead, check out the [Loading from Apache Kafka](../../tutorials/tutorial-kafka.md) tutorial.
## Submitting a Supervisor Spec
The Kafka indexing service requires that the `druid-kafka-indexing-service` extension be loaded on both the Overlord and the
MiddleManagers. A supervisor for a dataSource is started by submitting a supervisor spec via HTTP POST to
`http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/supervisor`, for example:
```
curl -X POST -H 'Content-Type: application/json' -d @supervisor-spec.json http://localhost:8090/druid/indexer/v1/supervisor
```
A sample supervisor spec is shown below:
```json
{
"type": "kafka",
"dataSchema": {
"dataSource": "metrics-kafka",
"timestampSpec": {
"column": "timestamp",
"format": "auto"
},
"dimensionsSpec": {
"dimensions": [],
"dimensionExclusions": [
"timestamp",
"value"
]
},
"metricsSpec": [
{
"name": "count",
"type": "count"
},
{
"name": "value_sum",
"fieldName": "value",
"type": "doubleSum"
},
{
"name": "value_min",
"fieldName": "value",
"type": "doubleMin"
},
{
"name": "value_max",
"fieldName": "value",
"type": "doubleMax"
}
],
"granularitySpec": {
"type": "uniform",
"segmentGranularity": "HOUR",
"queryGranularity": "NONE"
}
},
"ioConfig": {
"topic": "metrics",
"inputFormat": {
"type": "json"
},
"consumerProperties": {
"bootstrap.servers": "localhost:9092"
},
"taskCount": 1,
"replicas": 1,
"taskDuration": "PT1H"
},
"tuningConfig": {
"type": "kafka",
"maxRowsPerSegment": 5000000
}
}
```
## Supervisor Configuration
|Field|Description|Required|
|--------|-----------|---------|
|`type`|The supervisor type, this should always be `kafka`.|yes|
|`dataSchema`|The schema that will be used by the Kafka indexing task during ingestion. See [`dataSchema`](../../ingestion/index.md#dataschema) for details.|yes|
|`ioConfig`|A KafkaSupervisorIOConfig object for configuring Kafka connection and I/O-related settings for the supervisor and indexing task. See [KafkaSupervisorIOConfig](#kafkasupervisorioconfig) below.|yes|
|`tuningConfig`|A KafkaSupervisorTuningConfig object for configuring performance-related settings for the supervisor and indexing tasks. See [KafkaSupervisorTuningConfig](#kafkasupervisortuningconfig) below.|no|
### KafkaSupervisorIOConfig
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`topic`|String|The Kafka topic to read from. This must be a specific topic as topic patterns are not supported.|yes|
|`inputFormat`|Object|[`inputFormat`](../../ingestion/data-formats.md#input-format) to specify how to parse input data. See [the below section](#specifying-data-format) for details about specifying the input format.|yes|
|`consumerProperties`|Map<String, Object>|A map of properties to be passed to the Kafka consumer. This must contain a property `bootstrap.servers` with a list of Kafka brokers in the form: `<BROKER_1>:<PORT_1>,<BROKER_2>:<PORT_2>,...`. For SSL connections, the `keystore`, `truststore` and `key` passwords can be provided as a [Password Provider](../../operations/password-provider.md) or String password.|yes|
|`pollTimeout`|Long|The length of time to wait for the Kafka consumer to poll records, in milliseconds|no (default == 100)|
|`replicas`|Integer|The number of replica sets, where 1 means a single set of tasks (no replication). Replica tasks will always be assigned to different workers to provide resiliency against process failure.|no (default == 1)|
|`taskCount`|Integer|The maximum number of *reading* tasks in a *replica set*. This means that the maximum number of reading tasks will be `taskCount * replicas` and the total number of tasks (*reading* + *publishing*) will be higher than this. See [Capacity Planning](#capacity-planning) below for more details. The number of reading tasks will be less than `taskCount` if `taskCount > {numKafkaPartitions}`.|no (default == 1)|
|`taskDuration`|ISO8601 Period|The length of time before tasks stop reading and begin publishing their segment.|no (default == PT1H)|
|`startDelay`|ISO8601 Period|The period to wait before the supervisor starts managing tasks.|no (default == PT5S)|
|`period`|ISO8601 Period|How often the supervisor will execute its management logic. Note that the supervisor will also run in response to certain events (such as tasks succeeding, failing, and reaching their taskDuration) so this value specifies the maximum time between iterations.|no (default == PT30S)|
|`useEarliestOffset`|Boolean|If a supervisor is managing a dataSource for the first time, it will obtain a set of starting offsets from Kafka. This flag determines whether it retrieves the earliest or latest offsets in Kafka. Under normal circumstances, subsequent tasks will start from where the previous segments ended so this flag will only be used on first run.|no (default == false)|
|`completionTimeout`|ISO8601 Period|The length of time to wait before declaring a publishing task as failed and terminating it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|no (default == PT30M)|
|`lateMessageRejectionStartDateTime`|ISO8601 DateTime|Configure tasks to reject messages with timestamps earlier than this date time; for example if this is set to `2016-01-01T11:00Z` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* will be dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments (e.g. a realtime and a nightly batch ingestion pipeline).|no (default == none)|
|`lateMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps earlier than this period before the task was created; for example if this is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* will be dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments (e.g. a realtime and a nightly batch ingestion pipeline). Please note that only one of `lateMessageRejectionPeriod` or `lateMessageRejectionStartDateTime` can be specified.|no (default == none)|
|`earlyMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps later than this period after the task reached its taskDuration; for example if this is set to `PT1H`, the taskDuration is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps later than *2016-01-01T14:00Z* will be dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting earlyMessageRejectionPeriod too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|no (default == none)|
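To illustrate how several of these fields fit together, the following is a sketch of an `ioConfig` that uses some of the optional settings above; the broker addresses and values are illustrative, not recommendations:
```json
"ioConfig": {
  "topic": "metrics",
  "inputFormat": {
    "type": "json"
  },
  "consumerProperties": {
    "bootstrap.servers": "broker1:9092,broker2:9092"
  },
  "taskCount": 2,
  "replicas": 2,
  "taskDuration": "PT1H",
  "useEarliestOffset": true,
  "completionTimeout": "PT30M",
  "lateMessageRejectionPeriod": "PT1H"
}
```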
#### Specifying data format
Kafka indexing service supports both [`inputFormat`](../../ingestion/data-formats.md#input-format) and [`parser`](../../ingestion/data-formats.md#parser) to specify the data format.
The `inputFormat` is a new and recommended way to specify the data format for Kafka indexing service,
but unfortunately, it doesn't support all data formats supported by the legacy `parser`.
(They will be supported in the future.)
The supported `inputFormat`s include [`csv`](../../ingestion/data-formats.md#csv),
[`delimited`](../../ingestion/data-formats.md#tsv-delimited), and [`json`](../../ingestion/data-formats.md#json).
You can also read [`avro_stream`](../../ingestion/data-formats.md#avro-stream-parser),
[`protobuf`](../../ingestion/data-formats.md#protobuf-parser),
and [`thrift`](../extensions-contrib/thrift.md) formats using `parser`.
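As an illustration, a `csv` `inputFormat` for a stream of comma-separated events might look like the following sketch; the column names are hypothetical:
```json
"inputFormat": {
  "type": "csv",
  "columns": ["timestamp", "metric", "value"]
}
```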
<a name="tuningconfig"></a>
### KafkaSupervisorTuningConfig
The tuningConfig is optional and default parameters will be used if no tuningConfig is specified.
| Field | Type | Description | Required |
|-----------------------------------|----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------|
| `type` | String | The indexing task type, this should always be `kafka`. | yes |
| `maxRowsInMemory` | Integer | The number of rows to aggregate before persisting. This number is the post-aggregation rows, so it is not equivalent to the number of input events, but the number of aggregated rows that those events result in. This is used to manage the required JVM heap size. Maximum heap memory usage for indexing scales with maxRowsInMemory * (2 + maxPendingPersists). Normally user does not need to set this, but depending on the nature of data, if rows are short in terms of bytes, user may not want to store a million rows in memory and this value should be set. | no (default == 1000000) |
| `maxBytesInMemory` | Long | The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally this is computed internally and user does not need to set it. The maximum heap memory usage for indexing is maxBytesInMemory * (2 + maxPendingPersists). | no (default == One-sixth of max JVM memory) |
| `maxRowsPerSegment` | Integer | The number of rows to aggregate into a segment; this number is post-aggregation rows. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier. | no (default == 5000000) |
| `maxTotalRows` | Long | The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier. | no (default == unlimited) |
| `intermediatePersistPeriod` | ISO8601 Period | The period that determines the rate at which intermediate persists occur. | no (default == PT10M) |
| `maxPendingPersists` | Integer | Maximum number of persists that can be pending but not started. If this limit would be exceeded by a new intermediate persist, ingestion will block until the currently-running persist finishes. Maximum heap memory usage for indexing scales with maxRowsInMemory * (2 + maxPendingPersists). | no (default == 0, meaning one persist can be running concurrently with ingestion, and none can be queued up) |
| `indexSpec` | Object | Tune how data is indexed. See [IndexSpec](#indexspec) for more information. | no |
| `indexSpecForIntermediatePersists`| | Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](#indexspec) for possible values. | no (default = same as indexSpec) |
| `reportParseExceptions` | Boolean | *DEPRECATED*. If true, exceptions encountered during parsing will be thrown and will halt ingestion; if false, unparseable rows and fields will be skipped. Setting `reportParseExceptions` to true will override existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to no more than 1. | no (default == false) |
| `handoffConditionTimeout` | Long | Milliseconds to wait for segment handoff. It must be >= 0, where 0 means to wait forever. | no (default == 0) |
| `resetOffsetAutomatically` | Boolean | Controls behavior when Druid needs to read Kafka messages that are no longer available (i.e. when OffsetOutOfRangeException is encountered).<br/><br/>If false, the exception will bubble up, which will cause your tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation; potentially using the [Reset Supervisor API](../../operations/api-reference.html#supervisors). This mode is useful for production, since it will make you aware of issues with ingestion.<br/><br/>If true, Druid will automatically reset to the earlier or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if true, latest if false). Please note that this can lead to data being _DROPPED_ (if `useEarliestOffset` is false) or _DUPLICATED_ (if `useEarliestOffset` is true) without your knowledge. Messages will be logged indicating that a reset has occurred, but ingestion will continue. This mode is useful for non-production situations, since it will make Druid attempt to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.<br/><br/>This feature behaves similarly to the Kafka `auto.offset.reset` consumer property. | no (default == false) |
| `workerThreads` | Integer | The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation. | no (default == min(10, taskCount)) |
| `chatThreads` | Integer | The number of threads that will be used for communicating with indexing tasks. | no (default == min(10, taskCount * replicas)) |
| `chatRetries` | Integer | The number of times HTTP requests to indexing tasks will be retried before considering tasks unresponsive. | no (default == 8) |
| `httpTimeout` | ISO8601 Period | How long to wait for a HTTP response from an indexing task. | no (default == PT10S) |
| `shutdownTimeout` | ISO8601 Period | How long to wait for the supervisor to attempt a graceful shutdown of tasks before exiting. | no (default == PT80S) |
| `offsetFetchPeriod` | ISO8601 Period | How often the supervisor queries Kafka and the indexing tasks to fetch current offsets and calculate lag. | no (default == PT30S, min == PT5S) |
| `segmentWriteOutMediumFactory` | Object | Segment write-out medium to use when creating segments. See below for more information. | no (not specified by default, the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type` is used) |
| `intermediateHandoffPeriod` | ISO8601 Period | How often the tasks should hand off segments. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier. | no (default == P2147483647D) |
| `logParseExceptions` | Boolean | If true, log an error message when a parsing exception occurs, containing information about the row where the error occurred. | no, default == false |
| `maxParseExceptions` | Integer | The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set. | no, unlimited default |
| `maxSavedParseExceptions` | Integer | When a parse exception occurs, Druid can keep track of the most recent parse exceptions. "maxSavedParseExceptions" limits how many exception instances will be saved. These saved exceptions will be made available after the task finishes in the [task completion report](../../ingestion/tasks.md#reports). Overridden if `reportParseExceptions` is set. | no, default == 0 |
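For instance, a `tuningConfig` that overrides a few of the defaults above might look like this sketch; the values are illustrative, not recommendations:
```json
"tuningConfig": {
  "type": "kafka",
  "maxRowsPerSegment": 5000000,
  "maxRowsInMemory": 1000000,
  "resetOffsetAutomatically": false,
  "logParseExceptions": true,
  "maxParseExceptions": 100
}
```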
#### IndexSpec
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|bitmap|Object|Compression format for bitmap indexes. Should be a JSON object. See [Bitmap types](#bitmap-types) below for options.|no (defaults to Roaring)|
|dimensionCompression|String|Compression format for dimension columns. Choose from `LZ4`, `LZF`, or `uncompressed`.|no (default == `LZ4`)|
|metricCompression|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `uncompressed`, or `none`.|no (default == `LZ4`)|
|longEncoding|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using offset or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as is with 8 bytes each.|no (default == `longs`)|
##### Bitmap types
For Roaring bitmaps:
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`type`|String|Must be `roaring`.|yes|
|`compressRunOnSerialization`|Boolean|Use a run-length encoding where it is estimated as more space efficient.|no (default == `true`)|
For Concise bitmaps:
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`type`|String|Must be `concise`.|yes|
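Combining the options above, a sketch of an `indexSpec` that uses Roaring bitmaps might look like:
```json
"indexSpec": {
  "bitmap": {
    "type": "roaring",
    "compressRunOnSerialization": true
  },
  "dimensionCompression": "LZ4",
  "metricCompression": "LZ4",
  "longEncoding": "auto"
}
```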
#### SegmentWriteOutMediumFactory
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`type`|String|See [Additional Peon Configuration: SegmentWriteOutMediumFactory](../../configuration/index.html#segmentwriteoutmediumfactory) for explanation and available options.|yes|
## Operations
This section gives descriptions of how some supervisor APIs work specifically in Kafka Indexing Service.
For all supervisor APIs, please check [Supervisor APIs](../../operations/api-reference.html#supervisors).
### Getting Supervisor Status Report
`GET /druid/indexer/v1/supervisor/<supervisorId>/status` returns a snapshot report of the current state of the tasks managed by the given supervisor. This includes the latest
offsets as reported by Kafka, the consumer lag per partition, as well as the aggregate lag of all partitions. The
consumer lag per partition may be reported as negative values if the supervisor has not received a recent latest offset
response from Kafka. The aggregate lag value will always be >= 0.
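For example, assuming the Overlord runs on `localhost:8090` and the supervisor ID matches the `metrics-kafka` dataSource from the sample spec above, the report can be fetched with:
```
curl -X GET http://localhost:8090/druid/indexer/v1/supervisor/metrics-kafka/status
```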
The status report also contains the supervisor's state and a list of recently thrown exceptions (reported as
`recentErrors`, whose max size can be controlled using the `druid.supervisor.maxStoredExceptionEvents` configuration).
There are two fields related to the supervisor's state - `state` and `detailedState`. The `state` field will always be
one of a small number of generic states that are applicable to any type of supervisor, while the `detailedState` field
will contain a more descriptive, implementation-specific state that may provide more insight into the supervisor's
activities than the generic `state` field.
The list of possible `state` values are: [`PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`]
The list of `detailedState` values and their corresponding `state` mapping is as follows:
|Detailed State|Corresponding State|Description|
|--------------|-------------------|-----------|
|UNHEALTHY_SUPERVISOR|UNHEALTHY_SUPERVISOR|The supervisor has encountered errors on the past `druid.supervisor.unhealthinessThreshold` iterations|
|UNHEALTHY_TASKS|UNHEALTHY_TASKS|The last `druid.supervisor.taskUnhealthinessThreshold` tasks have all failed|
|UNABLE_TO_CONNECT_TO_STREAM|UNHEALTHY_SUPERVISOR|The supervisor is encountering connectivity issues with Kafka and has not successfully connected in the past|
|LOST_CONTACT_WITH_STREAM|UNHEALTHY_SUPERVISOR|The supervisor is encountering connectivity issues with Kafka but has successfully connected in the past|
|PENDING (first iteration only)|PENDING|The supervisor has been initialized and hasn't started connecting to the stream|
|CONNECTING_TO_STREAM (first iteration only)|RUNNING|The supervisor is trying to connect to the stream and update partition data|
|DISCOVERING_INITIAL_TASKS (first iteration only)|RUNNING|The supervisor is discovering already-running tasks|
|CREATING_TASKS (first iteration only)|RUNNING|The supervisor is creating tasks and discovering state|
|RUNNING|RUNNING|The supervisor has started tasks and is waiting for taskDuration to elapse|
|SUSPENDED|SUSPENDED|The supervisor has been suspended|
|STOPPING|STOPPING|The supervisor is stopping|
On each iteration of the supervisor's run loop, the supervisor completes the following tasks in sequence:
1) Fetch the list of partitions from Kafka and determine the starting offset for each partition (either based on the
last processed offset if continuing, or starting from the beginning or ending of the stream if this is a new topic).
2) Discover any running indexing tasks that are writing to the supervisor's datasource and adopt them if they match
the supervisor's configuration, else signal them to stop.
3) Send a status request to each supervised task to update our view of the state of the tasks under our supervision.
4) Handle tasks that have exceeded `taskDuration` and should transition from the reading to publishing state.
5) Handle tasks that have finished publishing and signal redundant replica tasks to stop.
6) Handle tasks that have failed and clean up the supervisor's internal state.
7) Compare the list of healthy tasks to the requested `taskCount` and `replicas` configurations and create additional tasks if required.
The `detailedState` field will show additional values (those marked with "first iteration only") the first time the
supervisor executes this run loop after startup or after resuming from a suspension. This is intended to surface
initialization-type issues, where the supervisor is unable to reach a stable state (perhaps because it can't connect to
Kafka, it can't read from the Kafka topic, or it can't communicate with existing tasks). Once the supervisor is stable -
that is, once it has completed a full execution without encountering any issues - `detailedState` will show a `RUNNING`
state until it is stopped, suspended, or hits a failure threshold and transitions to an unhealthy state.
### Getting Supervisor Ingestion Stats Report
`GET /druid/indexer/v1/supervisor/<supervisorId>/stats` returns a snapshot of the current ingestion row counters for each task being managed by the supervisor, along with moving averages for the row counters.
See [Task Reports: Row Stats](../../ingestion/tasks.md#row-stats) for more information.
### Supervisor Health Check
`GET /druid/indexer/v1/supervisor/<supervisorId>/health` returns `200 OK` if the supervisor is healthy and
`503 Service Unavailable` if it is unhealthy. Healthiness is determined by the supervisor's `state` (as returned by the
`/status` endpoint) and the `druid.supervisor.*` Overlord configuration thresholds.
### Updating Existing Supervisors
`POST /druid/indexer/v1/supervisor` can be used to update existing supervisor spec.
Calling this endpoint when there is already an existing supervisor for the same dataSource will cause:
- The running supervisor to signal its managed tasks to stop reading and begin publishing.
- The running supervisor to exit.
- A new supervisor to be created using the configuration provided in the request body. This supervisor will retain the
existing publishing tasks and will create new tasks starting at the offsets the publishing tasks ended on.
Seamless schema migrations can thus be achieved by simply submitting the new schema using this endpoint.
### Suspending and Resuming Supervisors
You can suspend and resume a supervisor using `POST /druid/indexer/v1/supervisor/<supervisorId>/suspend` and `POST /druid/indexer/v1/supervisor/<supervisorId>/resume`, respectively.
Note that the supervisor itself will still be operating and emitting logs and metrics,
it will just ensure that no indexing tasks are running until the supervisor is resumed.
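For example, assuming the same `metrics-kafka` supervisor and Overlord address used earlier on this page:
```
curl -X POST http://localhost:8090/druid/indexer/v1/supervisor/metrics-kafka/suspend
curl -X POST http://localhost:8090/druid/indexer/v1/supervisor/metrics-kafka/resume
```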
### Resetting Supervisors
The `POST /druid/indexer/v1/supervisor/<supervisorId>/reset` operation clears stored
offsets, causing the supervisor to start reading offsets from either the earliest or latest
offsets in Kafka (depending on the value of `useEarliestOffset`). After clearing stored
offsets, the supervisor kills and recreates any active tasks, so that tasks begin reading
from valid offsets.
Use care when using this operation! Resetting the supervisor may cause Kafka messages
to be skipped or read twice, resulting in missing or duplicate data.
The reason for using this operation is to recover from a state in which the supervisor
ceases operating due to missing offsets. The indexing service keeps track of the latest
persisted Kafka offsets in order to provide exactly-once ingestion guarantees across
tasks. Subsequent tasks must start reading from where the previous task completed in
order for the generated segments to be accepted. If the messages at the expected
starting offsets are no longer available in Kafka (typically because the message retention
period has elapsed or the topic was removed and re-created) the supervisor will refuse
to start and in flight tasks will fail. This operation enables you to recover from this condition.
Note that the supervisor must be running for this endpoint to be available.
### Terminating Supervisors
The `POST /druid/indexer/v1/supervisor/<supervisorId>/terminate` operation terminates a supervisor and causes all
associated indexing tasks managed by this supervisor to immediately stop and begin
publishing their segments. This supervisor will still exist in the metadata store and its history may be retrieved
with the supervisor history API, but will not be listed in the 'get supervisors' API response, nor can its configuration
or status report be retrieved. The only way this supervisor can start again is by submitting a functioning supervisor
spec to the create API.
### Capacity Planning
Kafka indexing tasks run on MiddleManagers and are thus limited by the resources available in the MiddleManager
cluster. In particular, you should make sure that you have sufficient worker capacity (configured using the
`druid.worker.capacity` property) to handle the configuration in the supervisor spec. Note that worker capacity is
shared across all types of indexing tasks, so you should plan your worker capacity to handle your total indexing load
(e.g. batch processing, realtime tasks, merging tasks, etc.). If your workers run out of capacity, Kafka indexing tasks
will queue and wait for the next available worker. This may cause queries to return partial results but will not result
in data loss (assuming the tasks run before Kafka purges those offsets).
A running task will normally be in one of two states: *reading* or *publishing*. A task will remain in reading state for
`taskDuration`, at which point it will transition to publishing state. A task will remain in publishing state for as long
as it takes to generate segments, push segments to deep storage, and have them be loaded and served by a Historical process
(or until `completionTimeout` elapses).
The number of reading tasks is controlled by `replicas` and `taskCount`. In general, there will be `replicas * taskCount`
reading tasks, the exception being if taskCount > {numKafkaPartitions} in which case {numKafkaPartitions} tasks will
be used instead. When `taskDuration` elapses, these tasks will transition to publishing state and `replicas * taskCount`
new reading tasks will be created. Therefore to allow for reading tasks and publishing tasks to run concurrently, there
should be a minimum capacity of:
```
workerCapacity = 2 * replicas * taskCount
```
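For example, with `replicas = 2` and `taskCount = 3`, this works out to at least `2 * 2 * 3 = 12` available worker slots for this supervisor alone.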
This value is for the ideal situation in which there is at most one set of tasks publishing while another set is reading.
In some circumstances, it is possible to have multiple sets of tasks publishing simultaneously. This would happen if the
time-to-publish (generate segment, push to deep storage, loaded on Historical) > `taskDuration`. This is a valid
scenario (correctness-wise) but requires additional worker capacity to support. In general, it is a good idea to have
`taskDuration` be large enough that the previous set of tasks finishes publishing before the current set begins.
### Supervisor Persistence
When a supervisor spec is submitted via the `POST /druid/indexer/v1/supervisor` endpoint, it is persisted in the
configured metadata database. There can only be a single supervisor per dataSource, and submitting a second spec for
the same dataSource will overwrite the previous one.
When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it will spawn
a supervisor for each supervisor spec in the metadata database. The supervisor will then discover running Kafka indexing
tasks and will attempt to adopt them if they are compatible with the supervisor's configuration. If they are not
compatible because they have a different ingestion spec or partition allocation, the tasks will be killed and the
supervisor will create a new set of tasks. In this way, the supervisors are persistent across Overlord restarts and
fail-overs.
A supervisor is stopped via the `POST /druid/indexer/v1/supervisor/<supervisorId>/terminate` endpoint. This places a
tombstone marker in the database (to prevent the supervisor from being reloaded on a restart) and then gracefully
shuts down the currently running supervisor. When a supervisor is shut down in this way, it will instruct its
managed tasks to stop reading and begin publishing their segments immediately. The call to the shutdown endpoint will
return after all tasks have been signaled to stop but before the tasks finish publishing their segments.
### Schema/Configuration Changes
Schema and configuration changes are handled by submitting the new supervisor spec via the same
`POST /druid/indexer/v1/supervisor` endpoint used to initially create the supervisor. The Overlord will initiate a
graceful shutdown of the existing supervisor which will cause the tasks being managed by that supervisor to stop reading
and begin publishing their segments. A new supervisor will then be started which will create a new set of tasks that
will start reading from the offsets where the previous now-publishing tasks left off, but using the updated schema.
In this way, configuration changes can be applied without requiring any pause in ingestion.
### Deployment Notes
#### On the Subject of Segments
Each Kafka Indexing Task puts events consumed from the Kafka partitions assigned to it into a single segment for each segment
granularity interval until the `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod` limit is reached; at that point, a new segment partition
for this segment granularity is created for further events. The Kafka Indexing Task also does incremental hand-offs, which
means that not all of the segments created by a task are held until the task duration is over. As soon as the `maxRowsPerSegment`,
`maxTotalRows`, or `intermediateHandoffPeriod` limit is hit, all the segments held by the task at that point in time are handed off
and a new set of segments is created for further events. This means that the task can run for longer durations of time
without accumulating old segments locally on MiddleManager processes, and it is encouraged to do so.
The Kafka Indexing Service may still produce some small segments. Let's say the task duration is 4 hours, the segment granularity
is set to HOUR, and the supervisor was started at 9:10; then, after 4 hours at 13:10, a new set of tasks will be started, and
events for the interval 13:00 - 14:00 may be split across the previous and new sets of tasks. If this becomes a problem,
you can schedule re-indexing tasks to merge segments together into new segments of an ideal size (in the range of ~500-700 MB per segment).
Details on how to optimize the segment size can be found on [Segment size optimization](../../operations/segment-optimization.md).
There is also ongoing work to support automatic segment compaction of sharded segments as well as compaction not requiring
Hadoop (see [here](https://github.com/apache/druid/pull/5102)).
---
id: kinesis-ingestion
title: "Amazon Kinesis ingestion"
sidebar_label: "Amazon Kinesis"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
Similar to the [Kafka indexing service](./kafka-ingestion.md), the Kinesis indexing service enables the configuration of *supervisors* on the Overlord, which facilitate ingestion from
Kinesis by managing the creation and lifetime of Kinesis indexing tasks. These indexing tasks read events using Kinesis's own
Shards and Sequence Number mechanism and are therefore able to provide guarantees of exactly-once ingestion.
The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures,
and ensure that the scalability and replication requirements are maintained.
The Kinesis indexing service is provided as the `druid-kinesis-indexing-service` core Apache Druid extension (see
[Including Extensions](../../development/extensions.md#loading-extensions)). Please note that this is
currently designated as an *experimental feature* and is subject to the usual
[experimental caveats](../experimental.md).
## Submitting a Supervisor Spec
The Kinesis indexing service requires that the `druid-kinesis-indexing-service` extension be loaded on both the Overlord
and the MiddleManagers. A supervisor for a dataSource is started by submitting a supervisor spec via HTTP POST to
`http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/supervisor`, for example:
```
curl -X POST -H 'Content-Type: application/json' -d @supervisor-spec.json http://localhost:8090/druid/indexer/v1/supervisor
```
A sample supervisor spec is shown below:
```json
{
"type": "kinesis",
"dataSchema": {
"dataSource": "metrics-kinesis",
"timestampSpec": {
"column": "timestamp",
"format": "auto"
},
"dimensionsSpec": {
"dimensions": [],
"dimensionExclusions": [
"timestamp",
"value"
]
},
"metricsSpec": [
{
"name": "count",
"type": "count"
},
{
"name": "value_sum",
"fieldName": "value",
"type": "doubleSum"
},
{
"name": "value_min",
"fieldName": "value",
"type": "doubleMin"
},
{
"name": "value_max",
"fieldName": "value",
"type": "doubleMax"
}
],
"granularitySpec": {
"type": "uniform",
"segmentGranularity": "HOUR",
"queryGranularity": "NONE"
}
},
"ioConfig": {
"stream": "metrics",
"inputFormat": {
"type": "json"
},
"endpoint": "kinesis.us-east-1.amazonaws.com",
"taskCount": 1,
"replicas": 1,
"taskDuration": "PT1H",
"recordsPerFetch": 2000,
"fetchDelayMillis": 1000
},
"tuningConfig": {
"type": "kinesis",
"maxRowsPerSegment": 5000000
}
}
```
## Supervisor Spec
|Field|Description|Required|
|--------|-----------|---------|
|`type`|The supervisor type, this should always be `kinesis`.|yes|
|`dataSchema`|The schema that will be used by the Kinesis indexing task during ingestion. See [`dataSchema`](../../ingestion/index.md#dataschema).|yes|
|`ioConfig`|A KinesisSupervisorIOConfig object for configuring Kinesis connection and I/O-related settings for the supervisor and indexing task. See [KinesisSupervisorIOConfig](#kinesissupervisorioconfig) below.|yes|
|`tuningConfig`|A KinesisSupervisorTuningConfig object for configuring performance-related settings for the supervisor and indexing tasks. See [KinesisSupervisorTuningConfig](#kinesissupervisortuningconfig) below.|no|
### KinesisSupervisorIOConfig
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`stream`|String|The Kinesis stream to read.|yes|
|`inputFormat`|Object|[`inputFormat`](../../ingestion/data-formats.md#input-format) to specify how to parse input data. See [the below section](#specifying-data-format) for details about specifying the input format.|yes|
|`endpoint`|String|The AWS Kinesis stream endpoint for a region. You can find a list of endpoints [here](http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region).|no (default == kinesis.us-east-1.amazonaws.com)|
|`replicas`|Integer|The number of replica sets, where 1 means a single set of tasks (no replication). Replica tasks will always be assigned to different workers to provide resiliency against process failure.|no (default == 1)|
|`taskCount`|Integer|The maximum number of *reading* tasks in a *replica set*. This means that the maximum number of reading tasks will be `taskCount * replicas` and the total number of tasks (*reading* + *publishing*) will be higher than this. See [Capacity Planning](#capacity-planning) below for more details. The number of reading tasks will be less than `taskCount` if `taskCount > {numKinesisShards}`.|no (default == 1)|
|`taskDuration`|ISO8601 Period|The length of time before tasks stop reading and begin publishing their segment.|no (default == PT1H)|
|`startDelay`|ISO8601 Period|The period to wait before the supervisor starts managing tasks.|no (default == PT5S)|
|`period`|ISO8601 Period|How often the supervisor will execute its management logic. Note that the supervisor will also run in response to certain events (such as tasks succeeding, failing, and reaching their taskDuration) so this value specifies the maximum time between iterations.|no (default == PT30S)|
|`useEarliestSequenceNumber`|Boolean|If a supervisor is managing a dataSource for the first time, it will obtain a set of starting sequence numbers from Kinesis. This flag determines whether it retrieves the earliest or latest sequence numbers in Kinesis. Under normal circumstances, subsequent tasks will start from where the previous segments ended so this flag will only be used on first run.|no (default == false)|
|`completionTimeout`|ISO8601 Period|The length of time to wait before declaring a publishing task as failed and terminating it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|no (default == PT6H)|
|`lateMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps earlier than this period before the task was created; for example if this is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* will be dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments (e.g. a realtime and a nightly batch ingestion pipeline).|no (default == none)|
|`earlyMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps later than this period after the task reached its taskDuration; for example if this is set to `PT1H`, the taskDuration is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps later than *2016-01-01T14:00Z* will be dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting earlyMessageRejectionPeriod too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|no (default == none)|
|`recordsPerFetch`|Integer|The number of records to request per GetRecords call to Kinesis. See 'Determining Fetch Settings' below.|no (default == 2000)|
|`fetchDelayMillis`|Integer|Time in milliseconds to wait between subsequent GetRecords calls to Kinesis. See 'Determining Fetch Settings' below.|no (default == 1000)|
|`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|no|
|`awsExternalId`|String|The AWS external id to use for additional permissions.|no|
|`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. See below for details.|no|
#### Specifying data format
The Kinesis indexing service supports both [`inputFormat`](../../ingestion/data-formats.md#input-format) and [`parser`](../../ingestion/data-formats.md#parser) to specify the data format.
The `inputFormat` is the new and recommended way to specify the data format for the Kinesis indexing service,
but unfortunately it doesn't yet support all of the data formats supported by the legacy `parser`.
(They will be supported in the future.)
The supported `inputFormat`s include [`csv`](../../ingestion/data-formats.md#csv),
[`delimited`](../../ingestion/data-formats.md#tsv-delimited), and [`json`](../../ingestion/data-formats.md#json).
You can also read [`avro_stream`](../../ingestion/data-formats.md#avro-stream-parser),
[`protobuf`](../../ingestion/data-formats.md#protobuf-parser),
and [`thrift`](../extensions-contrib/thrift.md) formats using `parser`.
<a name="tuningconfig"></a>
### KinesisSupervisorTuningConfig
The tuningConfig is optional and default parameters will be used if no tuningConfig is specified.
| Field | Type | Description | Required |
|---------------------------------------|----------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------|
| `type` | String | The indexing task type, this should always be `kinesis`. | yes |
| `maxRowsInMemory` | Integer | The number of rows to aggregate before persisting. This number is the post-aggregation rows, so it is not equivalent to the number of input events, but the number of aggregated rows that those events result in. This is used to manage the required JVM heap size. Maximum heap memory usage for indexing scales with maxRowsInMemory * (2 + maxPendingPersists). | no (default == 100000) |
| `maxBytesInMemory` | Long | The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally this is computed internally and user does not need to set it. The maximum heap memory usage for indexing is maxBytesInMemory * (2 + maxPendingPersists). | no (default == One-sixth of max JVM memory) |
| `maxRowsPerSegment` | Integer | The number of rows to aggregate into a segment; this number is post-aggregation rows. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier. | no (default == 5000000) |
| `maxTotalRows` | Long | The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier. | no (default == unlimited) |
| `intermediatePersistPeriod` | ISO8601 Period | The period that determines the rate at which intermediate persists occur. | no (default == PT10M) |
| `maxPendingPersists` | Integer | Maximum number of persists that can be pending but not started. If this limit would be exceeded by a new intermediate persist, ingestion will block until the currently-running persist finishes. Maximum heap memory usage for indexing scales with maxRowsInMemory * (2 + maxPendingPersists). | no (default == 0, meaning one persist can be running concurrently with ingestion, and none can be queued up) |
| `indexSpec` | Object | Tune how data is indexed. See [IndexSpec](#indexspec) for more information. | no |
| `indexSpecForIntermediatePersists` | | Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](#indexspec) for possible values. | no (default = same as indexSpec) |
| `reportParseExceptions` | Boolean | If true, exceptions encountered during parsing will be thrown and will halt ingestion; if false, unparseable rows and fields will be skipped. | no (default == false) |
| `handoffConditionTimeout` | Long | Milliseconds to wait for segment handoff. It must be >= 0, where 0 means to wait forever. | no (default == 0) |
| `resetOffsetAutomatically` | Boolean | Controls behavior when Druid needs to read Kinesis messages that are no longer available.<br/><br/>If false, the exception will bubble up, which will cause your tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation; potentially using the [Reset Supervisor API](../../operations/api-reference.html#supervisors). This mode is useful for production, since it will make you aware of issues with ingestion.<br/><br/>If true, Druid will automatically reset to the earlier or latest sequence number available in Kinesis, based on the value of the `useEarliestSequenceNumber` property (earliest if true, latest if false). Please note that this can lead to data being _DROPPED_ (if `useEarliestSequenceNumber` is false) or _DUPLICATED_ (if `useEarliestSequenceNumber` is true) without your knowledge. Messages will be logged indicating that a reset has occurred, but ingestion will continue. This mode is useful for non-production situations, since it will make Druid attempt to recover from problems automatically, even if they lead to quiet dropping or duplicating of data. | no (default == false) |
| `skipSequenceNumberAvailabilityCheck` | Boolean | Whether to enable checking if the current sequence number is still available in a particular Kinesis shard. If set to false, the indexing task will attempt to reset the current sequence number (or not), depending on the value of `resetOffsetAutomatically`. | no (default == false) |
| `workerThreads` | Integer | The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation. | no (default == min(10, taskCount)) |
| `chatThreads` | Integer | The number of threads that will be used for communicating with indexing tasks. | no (default == min(10, taskCount * replicas)) |
| `chatRetries` | Integer | The number of times HTTP requests to indexing tasks will be retried before considering tasks unresponsive. | no (default == 8) |
| `httpTimeout` | ISO8601 Period | How long to wait for a HTTP response from an indexing task. | no (default == PT10S) |
| `shutdownTimeout` | ISO8601 Period | How long to wait for the supervisor to attempt a graceful shutdown of tasks before exiting. | no (default == PT80S) |
| `recordBufferSize` | Integer | Size of the buffer (number of events) used between the Kinesis fetch threads and the main ingestion thread. | no (default == 10000) |
| `recordBufferOfferTimeout` | Integer | Length of time in milliseconds to wait for space to become available in the buffer before timing out. | no (default == 5000) |
| `recordBufferFullWait` | Integer | Length of time in milliseconds to wait for the buffer to drain before attempting to fetch records from Kinesis again. | no (default == 5000) |
| `fetchSequenceNumberTimeout` | Integer | Length of time in milliseconds to wait for Kinesis to return the earliest or latest sequence number for a shard. Kinesis will not return the latest sequence number if no data is actively being written to that shard. In this case, this fetch call will repeatedly timeout and retry until fresh data is written to the stream. | no (default == 60000) |
| `fetchThreads` | Integer | Size of the pool of threads fetching data from Kinesis. There is no benefit in having more threads than Kinesis shards. | no (default == procs * 2, where "procs" is the number of processors on the server that the task is running on) |
| `segmentWriteOutMediumFactory` | Object | Segment write-out medium to use when creating segments. See below for more information. | no (not specified by default, the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type` is used) |
| `intermediateHandoffPeriod` | ISO8601 Period | How often the tasks should hand off segments. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier. | no (default == P2147483647D) |
| `logParseExceptions` | Boolean | If true, log an error message when a parsing exception occurs, containing information about the row where the error occurred. | no, default == false |
| `maxParseExceptions` | Integer | The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set. | no, unlimited default |
| `maxSavedParseExceptions` | Integer | When a parse exception occurs, Druid can keep track of the most recent parse exceptions. "maxSavedParseExceptions" limits how many exception instances will be saved. These saved exceptions will be made available after the task finishes in the [task completion report](../../ingestion/tasks.md#reports). Overridden if `reportParseExceptions` is set. | no, default == 0 |
| `maxRecordsPerPoll` | Integer | The maximum number of records/events to be fetched from buffer per poll. The actual maximum will be `Max(maxRecordsPerPoll, Max(bufferSize, 1))` | no, default == 100 |
| `repartitionTransitionDuration` | ISO8601 Period | When shards are split or merged, the supervisor will recompute shard -> task group mappings, and signal any running tasks created under the old mappings to stop early at (current time + `repartitionTransitionDuration`). Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split/merge, which helps avoid the issues with empty shard handling described at https://github.com/apache/druid/issues/7600. | no, (default == PT2M) |
#### IndexSpec
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|bitmap|Object|Compression format for bitmap indexes. Should be a JSON object. See [Bitmap types](#bitmap-types) below for options.|no (defaults to Roaring)|
|dimensionCompression|String|Compression format for dimension columns. Choose from `LZ4`, `LZF`, or `uncompressed`.|no (default == `LZ4`)|
|metricCompression|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `uncompressed`, or `none`.|no (default == `LZ4`)|
|longEncoding|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using sequence number or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as is with 8 bytes each.|no (default == `longs`)|
##### Bitmap types
For Roaring bitmaps:
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`type`|String|Must be `roaring`.|yes|
|`compressRunOnSerialization`|Boolean|Use a run-length encoding where it is estimated as more space efficient.|no (default == `true`)|
For Concise bitmaps:
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`type`|String|Must be `concise`.|yes|
#### SegmentWriteOutMediumFactory
|Field|Type|Description|Required|
|-----|----|-----------|--------|
|`type`|String|See [Additional Peon Configuration: SegmentWriteOutMediumFactory](../../configuration/index.html#segmentwriteoutmediumfactory) for explanation and available options.|yes|
## Operations
This section describes how some supervisor APIs work specifically in the Kinesis Indexing Service.
For all supervisor APIs, please check [Supervisor APIs](../../operations/api-reference.html#supervisors).
### AWS Authentication
To authenticate with AWS, you must provide your AWS access key and AWS secret key via runtime.properties, for example:
```
-Ddruid.kinesis.accessKey=123 -Ddruid.kinesis.secretKey=456
```
The AWS access key ID and secret access key are used for Kinesis API requests. If this is not provided, the service will
look for credentials set in environment variables, in the default profile configuration file, and from the EC2 instance
profile provider (in this order).
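Alternatively, the same credentials can be supplied through the standard AWS SDK environment variables instead of runtime.properties (the variable names below are the SDK defaults, not Druid-specific settings):
```
# set in the environment of the Overlord and MiddleManager processes before starting them
export AWS_ACCESS_KEY_ID=123
export AWS_SECRET_ACCESS_KEY=456
```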
### Getting Supervisor Status Report
`GET /druid/indexer/v1/supervisor/<supervisorId>/status` returns a snapshot report of the current state of the tasks
managed by the given supervisor. This includes the latest sequence numbers as reported by Kinesis. Unlike the Kafka
Indexing Service, stats about lag are not yet supported.
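For example, assuming the Overlord runs on localhost:8090 and the supervisor ID matches the sample dataSource above:
```
curl http://localhost:8090/druid/indexer/v1/supervisor/metrics-kinesis/status
```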
The status report also contains the supervisor's state and a list of recently thrown exceptions (reported as
`recentErrors`, whose max size can be controlled using the `druid.supervisor.maxStoredExceptionEvents` configuration).
There are two fields related to the supervisor's state - `state` and `detailedState`. The `state` field will always be
one of a small number of generic states that are applicable to any type of supervisor, while the `detailedState` field
will contain a more descriptive, implementation-specific state that may provide more insight into the supervisor's
activities than the generic `state` field.
The list of possible `state` values is: [`PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`, `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`]
The list of `detailedState` values and their corresponding `state` mapping is as follows:
|Detailed State|Corresponding State|Description|
|--------------|-------------------|-----------|
|UNHEALTHY_SUPERVISOR|UNHEALTHY_SUPERVISOR|The supervisor has encountered errors on the past `druid.supervisor.unhealthinessThreshold` iterations|
|UNHEALTHY_TASKS|UNHEALTHY_TASKS|The last `druid.supervisor.taskUnhealthinessThreshold` tasks have all failed|
|UNABLE_TO_CONNECT_TO_STREAM|UNHEALTHY_SUPERVISOR|The supervisor is encountering connectivity issues with Kinesis and has not successfully connected in the past|
|LOST_CONTACT_WITH_STREAM|UNHEALTHY_SUPERVISOR|The supervisor is encountering connectivity issues with Kinesis but has successfully connected in the past|
|PENDING (first iteration only)|PENDING|The supervisor has been initialized and hasn't started connecting to the stream|
|CONNECTING_TO_STREAM (first iteration only)|RUNNING|The supervisor is trying to connect to the stream and update partition data|
|DISCOVERING_INITIAL_TASKS (first iteration only)|RUNNING|The supervisor is discovering already-running tasks|
|CREATING_TASKS (first iteration only)|RUNNING|The supervisor is creating tasks and discovering state|
|RUNNING|RUNNING|The supervisor has started tasks and is waiting for taskDuration to elapse|
|SUSPENDED|SUSPENDED|The supervisor has been suspended|
|STOPPING|STOPPING|The supervisor is stopping|
On each iteration of the supervisor's run loop, the supervisor completes the following tasks in sequence:
1) Fetch the list of shards from Kinesis and determine the starting sequence number for each shard (either based on the
last processed sequence number if continuing, or starting from the beginning or ending of the stream if this is a new stream).
2) Discover any running indexing tasks that are writing to the supervisor's datasource and adopt them if they match
the supervisor's configuration, else signal them to stop.
3) Send a status request to each supervised task to update our view of the state of the tasks under our supervision.
4) Handle tasks that have exceeded `taskDuration` and should transition from the reading to publishing state.
5) Handle tasks that have finished publishing and signal redundant replica tasks to stop.
6) Handle tasks that have failed and clean up the supervisor's internal state.
7) Compare the list of healthy tasks to the requested `taskCount` and `replicas` configurations and create additional tasks if required.
The `detailedState` field will show additional values (those marked with "first iteration only") the first time the
supervisor executes this run loop after startup or after resuming from a suspension. This is intended to surface
initialization-type issues, where the supervisor is unable to reach a stable state (perhaps because it can't connect to
Kinesis, it can't read from the stream, or it can't communicate with existing tasks). Once the supervisor is stable -
that is, once it has completed a full execution without encountering any issues - `detailedState` will show a `RUNNING`
state until it is stopped, suspended, or hits a failure threshold and transitions to an unhealthy state.
### Updating Existing Supervisors
`POST /druid/indexer/v1/supervisor` can be used to update an existing supervisor spec.
Calling this endpoint when there is already an existing supervisor for the same dataSource will cause:
- The running supervisor to signal its managed tasks to stop reading and begin publishing.
- The running supervisor to exit.
- A new supervisor to be created using the configuration provided in the request body. This supervisor will retain the
existing publishing tasks and will create new tasks starting at the sequence numbers the publishing tasks ended on.
Seamless schema migrations can thus be achieved by simply submitting the new schema using this endpoint.
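For example, resubmitting an updated spec file (the file name below is only an illustration) to the same endpoint used to create the supervisor:
```
curl -X POST -H 'Content-Type: application/json' -d @updated-supervisor-spec.json http://localhost:8090/druid/indexer/v1/supervisor
```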
### Suspending and Resuming Supervisors
You can suspend and resume a supervisor using `POST /druid/indexer/v1/supervisor/<supervisorId>/suspend` and `POST /druid/indexer/v1/supervisor/<supervisorId>/resume`, respectively.
Note that the supervisor itself will still be operating and emitting logs and metrics;
it will just ensure that no indexing tasks are running until the supervisor is resumed.
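For example, assuming the Overlord runs on localhost:8090 and the supervisor ID is `metrics-kinesis` as in the sample spec above:
```
# suspend: stop the supervisor's indexing tasks but keep the supervisor itself running
curl -X POST http://localhost:8090/druid/indexer/v1/supervisor/metrics-kinesis/suspend

# resume: start creating indexing tasks again
curl -X POST http://localhost:8090/druid/indexer/v1/supervisor/metrics-kinesis/resume
```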
### Resetting Supervisors
The `POST /druid/indexer/v1/supervisor/<supervisorId>/reset` operation clears stored
sequence numbers, causing the supervisor to start reading from either the earliest or
latest sequence numbers in Kinesis (depending on the value of `useEarliestSequenceNumber`).
After clearing stored sequence numbers, the supervisor kills and recreates active tasks,
so that tasks begin reading from valid sequence numbers.
Use care when using this operation! Resetting the supervisor may cause Kinesis messages
to be skipped or read twice, resulting in missing or duplicate data.
The reason for using this operation is to recover from a state in which the supervisor
ceases operating due to missing sequence numbers. The indexing service keeps track of the latest
persisted sequence number in order to provide exactly-once ingestion guarantees across
tasks.
Subsequent tasks must start reading from where the previous task completed in
order for the generated segments to be accepted. If the messages at the expected starting sequence numbers are
no longer available in Kinesis (typically because the message retention period has elapsed or the topic was
removed and re-created) the supervisor will refuse to start and in-flight tasks will fail. This operation
enables you to recover from this condition.
Note that the supervisor must be running for this endpoint to be available.
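For example, with the same assumptions as above:
```
# clears stored sequence numbers; may cause skipped or duplicated data (see the warning above)
curl -X POST http://localhost:8090/druid/indexer/v1/supervisor/metrics-kinesis/reset
```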
### Terminating Supervisors
The `POST /druid/indexer/v1/supervisor/<supervisorId>/terminate` operation terminates a supervisor and causes
all associated indexing tasks managed by this supervisor to immediately stop and begin
publishing their segments. This supervisor will still exist in the metadata store and its history may be retrieved
with the supervisor history API, but will not be listed in the 'get supervisors' API response nor can its configuration
or status report be retrieved. The only way this supervisor can start again is by submitting a functioning supervisor
spec to the create API.
### Capacity Planning
Kinesis indexing tasks run on MiddleManagers and are thus limited by the resources available in the MiddleManager
cluster. In particular, you should make sure that you have sufficient worker capacity (configured using the
`druid.worker.capacity` property) to handle the configuration in the supervisor spec. Note that worker capacity is
shared across all types of indexing tasks, so you should plan your worker capacity to handle your total indexing load
(e.g. batch processing, realtime tasks, merging tasks, etc.). If your workers run out of capacity, Kinesis indexing tasks
will queue and wait for the next available worker. This may cause queries to return partial results but will not result
in data loss (assuming the tasks run before Kinesis purges those sequence numbers).
A running task will normally be in one of two states: *reading* or *publishing*. A task will remain in reading state for
`taskDuration`, at which point it will transition to publishing state. A task will remain in publishing state for as long
as it takes to generate segments, push segments to deep storage, and have them be loaded and served by a Historical process
(or until `completionTimeout` elapses).
The number of reading tasks is controlled by `replicas` and `taskCount`. In general, there will be `replicas * taskCount`
reading tasks, the exception being if `taskCount > {numKinesisShards}`, in which case `{numKinesisShards}` tasks will
be used instead. When `taskDuration` elapses, these tasks will transition to publishing state and `replicas * taskCount`
new reading tasks will be created. Therefore, to allow reading tasks and publishing tasks to run concurrently, there
should be a minimum capacity of:
```
workerCapacity = 2 * replicas * taskCount
```
This value is for the ideal situation in which there is at most one set of tasks publishing while another set is reading.
In some circumstances, it is possible to have multiple sets of tasks publishing simultaneously. This would happen if the
time-to-publish (generate segment, push to deep storage, loaded on Historical) > `taskDuration`. This is a valid
scenario (correctness-wise) but requires additional worker capacity to support. In general, it is a good idea to have
`taskDuration` be large enough that the previous set of tasks finishes publishing before the current set begins.
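For example, as a sketch under the assumption that at most one previous set of tasks is publishing while the current set reads:
```
# replicas = 2, taskCount = 4 (and the stream has at least 4 shards)
#   reading tasks                    = replicas * taskCount = 8
#   publishing tasks (previous set)  = up to another 8
#   minimum druid.worker.capacity    = 2 * 2 * 4 = 16 across the MiddleManager tier
```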
### Supervisor Persistence
When a supervisor spec is submitted via the `POST /druid/indexer/v1/supervisor` endpoint, it is persisted in the
configured metadata database. There can only be a single supervisor per dataSource, and submitting a second spec for
the same dataSource will overwrite the previous one.
When an Overlord gains leadership, either by being started or as a result of another Overlord failing, it will spawn
a supervisor for each supervisor spec in the metadata database. The supervisor will then discover running Kinesis indexing
tasks and will attempt to adopt them if they are compatible with the supervisor's configuration. If they are not
compatible because they have a different ingestion spec or shard allocation, the tasks will be killed and the
supervisor will create a new set of tasks. In this way, the supervisors are persistent across Overlord restarts and
fail-overs.
A supervisor is stopped via the `POST /druid/indexer/v1/supervisor/<supervisorId>/terminate` endpoint. This places a
tombstone marker in the database (to prevent the supervisor from being reloaded on a restart) and then gracefully
shuts down the currently running supervisor. When a supervisor is shut down in this way, it will instruct its
managed tasks to stop reading and begin publishing their segments immediately. The call to the shutdown endpoint will
return after all tasks have been signalled to stop but before the tasks finish publishing their segments.
### Schema/Configuration Changes
Schema and configuration changes are handled by submitting the new supervisor spec via the same
`POST /druid/indexer/v1/supervisor` endpoint used to initially create the supervisor. The Overlord will initiate a
graceful shutdown of the existing supervisor which will cause the tasks being managed by that supervisor to stop reading
and begin publishing their segments. A new supervisor will then be started which will create a new set of tasks that
will start reading from the sequence numbers where the previous now-publishing tasks left off, but using the updated schema.
In this way, configuration changes can be applied without requiring any pause in ingestion.
### Deployment Notes
#### On the Subject of Segments
Each Kinesis Indexing Task puts events consumed from the Kinesis shards assigned to it into a single segment for each segment
granularity interval until the `maxRowsPerSegment`, `maxTotalRows`, or `intermediateHandoffPeriod` limit is reached; at that point, a new segment partition
for this segment granularity is created for further events. The Kinesis Indexing Task also does incremental hand-offs, which
means that not all of the segments created by a task are held until the task duration is over. As soon as the `maxRowsPerSegment`,
`maxTotalRows`, or `intermediateHandoffPeriod` limit is hit, all the segments held by the task at that point in time are handed off
and a new set of segments is created for further events. This means that the task can run for longer durations of time
without accumulating old segments locally on MiddleManager processes, and it is encouraged to do so.
The Kinesis Indexing Service may still produce some small segments. Let's say the task duration is 4 hours, the segment granularity
is set to HOUR, and the supervisor was started at 9:10; then, after 4 hours at 13:10, a new set of tasks will be started, and
events for the interval 13:00 - 14:00 may be split across the previous and new sets of tasks. If this becomes a problem,
you can schedule re-indexing tasks to merge segments together into new segments of an ideal size (in the range of ~500-700 MB per segment).
Details on how to optimize the segment size can be found on [Segment size optimization](../../operations/segment-optimization.md).
There is also ongoing work to support automatic segment compaction of sharded segments as well as compaction not requiring
Hadoop (see [here](https://github.com/apache/druid/pull/5102)).
### Determining Fetch Settings
Internally, the Kinesis Indexing Service uses the Kinesis Record Supplier abstraction for fetching Kinesis data records and storing the records
locally. The Kinesis Record Supplier fetches records by running a separate fetch thread for each Kinesis shard; the
maximum number of threads is determined by `fetchThreads`. For example, a Kinesis stream with 3 shards will have 3 threads, each fetching from a shard separately.
There is a delay between fetch operations, which is controlled by `fetchDelayMillis`. The maximum number of records to be fetched per thread per
operation is controlled by `recordsPerFetch`. Note that this is not the same as `maxRecordsPerPoll`.
The records fetched by each thread will be pushed to a queue in the order that they are fetched. The records are stored in this queue until `poll()` is called
by either the supervisor or the indexing task. `poll()` will attempt to drain the internal buffer queue up to a limit of `max(maxRecordsPerPoll, q.size())`.
Here `maxRecordsPerPoll` controls the theoretical maximum number of records to drain out of the buffer queue, so setting this parameter to a reasonable value is essential
to prevent the queue from overflowing or memory usage from exceeding the heap size.
Kinesis places the following restrictions on calls to fetch records:
- Each data record can be up to 1 MB in size.
- Each shard can support up to 5 transactions per second for reads.
- Each shard can read up to 2 MB per second.
- The maximum size of data that GetRecords can return is 10 MB.
Values for `recordsPerFetch` and `fetchDelayMillis` should be chosen to maximize throughput under the above constraints.
The values that you choose will depend on the average size of a record and the number of consumers you have reading from
a given shard (which will be `replicas` unless you have other consumers also reading from this Kinesis stream).
If the above limits are violated, AWS will throw ProvisionedThroughputExceededException errors on subsequent calls to
read data. When this happens, the Kinesis indexing service will pause by `fetchDelayMillis` and then attempt the call
again.
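As a rough illustration of how the defaults interact with these limits (the average record size here is an assumption):
```
# defaults: recordsPerFetch = 2000, fetchDelayMillis = 1000 -> at most ~2000 records/second per shard per consumer
# at an average record size of ~1 KB this is ~2 MB/s, i.e. the full per-shard read limit,
# so with replicas = 2 the two replica tasks together could exceed it; lowering recordsPerFetch
# or raising fetchDelayMillis keeps the combined read rate within 2 MB/s per shard
```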
Internally, each indexing task maintains a buffer that stores the fetched but not yet processed records. `recordsPerFetch` and `fetchDelayMillis`
control this behavior. The number of records that the indexing task fetches from the buffer is controlled by `maxRecordsPerPoll`, which
determines the number of records to be processed per ingestion loop in the task.
## Deaggregation
See [this issue](https://github.com/apache/druid/issues/6714) for background.
The Kinesis indexing service supports de-aggregation of multiple rows packed into a single record by the Kinesis
Producer Library's aggregate method for more efficient data transfer. Currently, enabling the de-aggregate functionality
requires the user to manually provide the Kinesis Client Library on the classpath, since this library has a license not
compatible with Apache projects.
To enable this feature, add the `amazon-kinesis-client` (tested on version `1.9.2`) jar file ([link](https://mvnrepository.com/artifact/com.amazonaws/amazon-kinesis-client/1.9.2)) under `dist/druid/extensions/druid-kinesis-indexing-service/`.
Then when submitting a supervisor-spec, set `deaggregate` to true.
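A minimal command-line sketch of the jar step above (it assumes your current directory is the distribution root and uses the standard Maven Central layout for the 1.9.2 artifact linked above):
```
# download the KCL jar and place it under the Kinesis extension directory
curl -O https://repo1.maven.org/maven2/com/amazonaws/amazon-kinesis-client/1.9.2/amazon-kinesis-client-1.9.2.jar
cp amazon-kinesis-client-1.9.2.jar dist/druid/extensions/druid-kinesis-indexing-service/
```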
## Resharding
When changing the shard count for a Kinesis stream, there will be a window of time around the resharding operation with early shutdown of Kinesis ingestion tasks and possible task failures.
The early shutdowns and task failures are expected. They occur because the supervisor updates the shard -> task group mappings as shards are closed and fully read, both to ensure that tasks are not running
with an assignment of closed shards that have been fully read and to ensure a balanced distribution of active shards across tasks.
This window with early task shutdowns and possible task failures will conclude when:
- All closed shards have been fully read and the Kinesis ingestion tasks have published the data from those shards, committing the "closed" state to metadata storage
- Any remaining tasks that had inactive shards in the assignment have been shutdown (these tasks would have been created before the closed shards were completely drained)
---
id: lookups-cached-global
title: "Globally Cached Lookups"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
> Lookups are an [experimental](../experimental.md) feature.
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-lookups-cached-global` as an extension.
## Configuration
> Static configuration is no longer supported. Lookups can be configured through
> [dynamic configuration](../../querying/lookups.md#configuration).
Globally cached lookups are appropriate for lookups that are too large to pass at query time,
or that should not be passed at query time because the data is meant to reside on and be handled by the Druid servers,
but that are still small enough to reasonably populate in memory. This usually means tens to tens of thousands of entries per lookup.
Globally cached lookups all draw from the same cache pool, allowing each process to have a fixed cache pool that can be used by cached lookups.
Globally cached lookups can be specified as part of the [cluster wide config for lookups](../../querying/lookups.md) as a type of `cachedNamespace`
```json
{
"type": "cachedNamespace",
"extractionNamespace": {
"type": "uri",
"uri": "file:/tmp/prefix/",
"namespaceParseSpec": {
"format": "csv",
"columns": [
"key",
"value"
]
},
"pollPeriod": "PT5M"
},
"firstCacheTimeout": 0
}
```
```json
{
"type": "cachedNamespace",
"extractionNamespace": {
"type": "jdbc",
"connectorConfig": {
"createTables": true,
"connectURI": "jdbc:mysql:\/\/localhost:3306\/druid",
"user": "druid",
"password": "diurd"
},
"table": "lookupTable",
"keyColumn": "mykeyColumn",
"valueColumn": "myValueColumn",
"filter" : "myFilterSQL (Where clause statement e.g LOOKUPTYPE=1)",
"tsColumn": "timeColumn"
},
"firstCacheTimeout": 120000,
"injective":true
}
```
The parameters are as follows
|Property|Description|Required|Default|
|--------|-----------|--------|-------|
|`extractionNamespace`|Specifies how to populate the local cache. See below|Yes|-|
|`firstCacheTimeout`|How long to wait (in ms) for the first run of the cache to populate. 0 indicates to not wait|No|`0` (do not wait)|
|`injective`|If the underlying map is [injective](../../querying/lookups.html#query-execution) (keys and values are unique) then optimizations can occur internally by setting this to `true`|No|`false`|
If `firstCacheTimeout` is set to a non-zero value, it should be less than `druid.manager.lookups.hostUpdateTimeout`. If `firstCacheTimeout` is NOT set, then management is essentially asynchronous and does not know if a lookup succeeded or failed in starting. In such a case logs from the processes using lookups should be monitored for repeated failures.
Proper functionality of globally cached lookups requires the following extension to be loaded on the Broker, Peon, and Historical processes:
`druid-lookups-cached-global`
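As an illustration, the extension can be added to the load list in the common runtime properties shared by those processes (the file location varies by deployment, so treat the path as an assumption):
```properties
# common.runtime.properties on Broker, Historical, and MiddleManager/Peon processes
druid.extensions.loadList=["druid-lookups-cached-global"]
```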
## Example configuration
In a simple case where only one [tier](../../querying/lookups.html#dynamic-configuration) exists (`realtime_customer2`) with one `cachedNamespace` lookup called `country_code`, the resulting configuration JSON looks similar to the following:
```json
{
"realtime_customer2": {
"country_code": {
"version": "v0",
"lookupExtractorFactory": {
"type": "cachedNamespace",
"extractionNamespace": {
"type": "jdbc",
"connectorConfig": {
"createTables": true,
"connectURI": "jdbc:mysql:\/\/localhost:3306\/druid",
"user": "druid",
"password": "diurd"
},
"table": "lookupValues",
"keyColumn": "value_id",
"valueColumn": "value_text",
"filter": "value_type='country'",
"tsColumn": "timeColumn"
},
"firstCacheTimeout": 120000,
"injective": true
}
}
}
}
```
Where the Coordinator endpoint `/druid/coordinator/v1/lookups/realtime_customer2/country_code` should return
```json
{
"version": "v0",
"lookupExtractorFactory": {
"type": "cachedNamespace",
"extractionNamespace": {
"type": "jdbc",
"connectorConfig": {
"createTables": true,
"connectURI": "jdbc:mysql://localhost:3306/druid",
"user": "druid",
"password": "diurd"
},
"table": "lookupValues",
"keyColumn": "value_id",
"valueColumn": "value_text",
"filter": "value_type='country'",
"tsColumn": "timeColumn"
},
"firstCacheTimeout": 120000,
"injective": true
}
}
```
## Cache Settings
Lookups are cached locally on Historical processes. The following are settings used by the processes which service queries when
setting namespaces (Broker, Peon, Historical).
|Property|Description|Default|
|--------|-----------|-------|
|`druid.lookup.namespace.cache.type`|Specifies the type of caching to be used by the namespaces. May be one of [`offHeap`, `onHeap`]. `offHeap` uses a temporary file for off-heap storage of the namespace (memory mapped files). `onHeap` stores all cache on the heap in standard java map types.|`onHeap`|
|`druid.lookup.namespace.numExtractionThreads`|The number of threads in the thread pool dedicated for lookup extraction and updates. This number may need to be scaled up, if you have a lot of lookups and they take long time to extract, to avoid timeouts.|2|
|`druid.lookup.namespace.numBufferedEntries`|If using off-heap caching, the number of records to be stored on an on-heap buffer.|100,000|
The cache is populated in different ways depending on the settings below. In general, most namespaces employ
a `pollPeriod` at the end of which time they poll the remote resource of interest for updates.
`onHeap` uses `ConcurrentMap`s in the java heap, and thus affects garbage collection and heap sizing.
`offHeap` uses an on-heap buffer and MapDB using memory-mapped files in the java temporary directory.
So if the total number of entries in the `cachedNamespace` exceeds the buffer's configured capacity, the extra entries will be kept in memory as page cache and paged in and out by general OS tunings.
It's highly recommended that `druid.lookup.namespace.numBufferedEntries` be set when using `offHeap`; the value should be chosen from the range between 10% and 50% of the number of entries in the lookup.
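As an illustration, an off-heap cache configuration for those processes might look like the following (the buffer size is an example value, to be sized against your largest lookup):
```properties
druid.lookup.namespace.cache.type=offHeap
druid.lookup.namespace.numBufferedEntries=100000
druid.lookup.namespace.numExtractionThreads=2
```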
## Supported lookups
For additional lookups, please see our [extensions list](../extensions.md).
### URI lookup
The remapping values for each globally cached lookup can be specified by a JSON object as per the following examples:
```json
{
"type":"uri",
"uri": "s3://bucket/some/key/prefix/renames-0003.gz",
"namespaceParseSpec":{
"format":"csv",
"columns":["key","value"]
},
"pollPeriod":"PT5M"
}
```
```json
{
"type":"uri",
"uriPrefix": "s3://bucket/some/key/prefix/",
"fileRegex":"renames-[0-9]*\\.gz",
"namespaceParseSpec":{
"format":"csv",
"columns":["key","value"]
},
"pollPeriod":"PT5M"
}
```
|Property|Description|Required|Default|
|--------|-----------|--------|-------|
|`pollPeriod`|Period between polling for updates|No|0 (only once)|
|`uri`|URI for the file of interest, specified as a file, hdfs, or s3 path|No|Use `uriPrefix`|
|`uriPrefix`|A URI that specifies a directory (or other searchable resource) in which to search for files|No|Use `uri`|
|`fileRegex`|Optional regex for matching the file name under `uriPrefix`. Only used if `uriPrefix` is used|No|`".*"`|
|`namespaceParseSpec`|How to interpret the data at the URI|Yes||
One of either `uri` or `uriPrefix` must be specified, as either a local file system (file://), HDFS (hdfs://), or S3 (s3://) location. HTTP location is not currently supported.
The `pollPeriod` value specifies the period in ISO 8601 format between checks for replacement data for the lookup. If the source of the lookup is capable of providing a timestamp, the lookup will only be updated if it has changed since the prior tick of `pollPeriod`. A value of 0, an absent parameter, or `null` all mean populate once and do not attempt to look for new data later. Whenever a poll occurs, the updating system will look for the file with the most recent timestamp, assume it has the most recent data set, and replace the local cache of the lookup data with it.
The `namespaceParseSpec` can be one of a number of values. Each of the examples below would rename foo to bar, baz to bat, and buck to truck. All parseSpec types assume each entry is delimited by a newline. See below for the types of parseSpec supported.
Only ONE file that matches the search will be used. For most implementations, the discriminator for choosing among the URIs is whichever one reports the most recent timestamp for its modification time.
#### csv lookupParseSpec
|Parameter|Description|Required|Default|
|---------|-----------|--------|-------|
|`columns`|The list of columns in the csv file|no if `hasHeaderRow` is set|`null`|
|`keyColumn`|The name of the column containing the key|no|The first column|
|`valueColumn`|The name of the column containing the value|no|The second column|
|`hasHeaderRow`|A flag to indicate that column information can be extracted from the input files' header row|no|false|
|`skipHeaderRows`|Number of header rows to be skipped|no|0|
If both `skipHeaderRows` and `hasHeaderRow` options are set, `skipHeaderRows` is first applied. For example, if you set
`skipHeaderRows` to 2 and `hasHeaderRow` to true, Druid will skip the first two lines and then extract column information
from the third line.
*example input*
```
bar,something,foo
bat,something2,baz
truck,something3,buck
```
*example namespaceParseSpec*
```json
"namespaceParseSpec": {
"format": "csv",
"columns": ["value","somethingElse","key"],
"keyColumn": "key",
"valueColumn": "value"
}
```
#### tsv lookupParseSpec
|Parameter|Description|Required|Default|
|---------|-----------|--------|-------|
|`columns`|The list of columns in the tsv file|yes|`null`|
|`keyColumn`|The name of the column containing the key|no|The first column|
|`valueColumn`|The name of the column containing the value|no|The second column|
|`delimiter`|The delimiter in the file|no|tab (`\t`)|
|`listDelimiter`|The list delimiter in the file|no| (`\u0001`)|
|`hasHeaderRow`|A flag to indicate that column information can be extracted from the input files' header row|no|false|
|`skipHeaderRows`|Number of header rows to be skipped|no|0|
If both `skipHeaderRows` and `hasHeaderRow` options are set, `skipHeaderRows` is first applied. For example, if you set
`skipHeaderRows` to 2 and `hasHeaderRow` to true, Druid will skip the first two lines and then extract column information
from the third line.
*example input*
```
bar|something,1|foo
bat|something,2|baz
truck|something,3|buck
```
*example namespaceParseSpec*
```json
"namespaceParseSpec": {
"format": "tsv",
"columns": ["value","somethingElse","key"],
"keyColumn": "key",
"valueColumn": "value",
"delimiter": "|"
}
```
#### customJson lookupParseSpec
|Parameter|Description|Required|Default|
|---------|-----------|--------|-------|
|`keyFieldName`|The field name of the key|yes|null|
|`valueFieldName`|The field name of the value|yes|null|
*example input*
```json
{"key": "foo", "value": "bar", "somethingElse" : "something"}
{"key": "baz", "value": "bat", "somethingElse" : "something"}
{"key": "buck", "somethingElse": "something", "value": "truck"}
```
*example namespaceParseSpec*
```json
"namespaceParseSpec": {
"format": "customJson",
"keyFieldName": "key",
"valueFieldName": "value"
}
```
With customJson parsing, if the value field for a particular row is missing or null then that line will be skipped, and
will not be included in the lookup.
#### simpleJson lookupParseSpec
The `simpleJson` lookupParseSpec does not take any parameters. It is simply a line delimited JSON file where the field is the key, and the field's value is the value.
*example input*
```json
{"foo": "bar"}
{"baz": "bat"}
{"buck": "truck"}
```
*example namespaceParseSpec*
```json
"namespaceParseSpec":{
"format": "simpleJson"
}
```
### JDBC lookup
A JDBC lookup will poll a database to populate its local cache. If `tsColumn` is set, it must be able to accept comparisons in the format `'2015-01-01 00:00:00'`. For example, the following must be valid SQL for the table: `SELECT * FROM some_lookup_table WHERE timestamp_column > '2015-01-01 00:00:00'`. If `tsColumn` is set, the caching service will attempt to only poll values that were written *after* the last sync. If `tsColumn` is not set, the entire table is pulled every time.
|Parameter|Description|Required|Default|
|---------|-----------|--------|-------|
|`namespace`|The namespace to define|Yes||
|`connectorConfig`|The connector config to use|Yes||
|`table`|The table which contains the key value pairs|Yes||
|`keyColumn`|The column in `table` which contains the keys|Yes||
|`valueColumn`|The column in `table` which contains the values|Yes||
|`filter`|The filter to use when selecting lookups, this is used to create a where clause on lookup population|No|No Filter|
|`tsColumn`| The column in `table` which contains when the key was updated|No|Not used|
|`pollPeriod`|How often to poll the DB|No|0 (only once)|
```json
{
"type":"jdbc",
"namespace":"some_lookup",
"connectorConfig":{
"createTables":true,
"connectURI":"jdbc:mysql://localhost:3306/druid",
"user":"druid",
"password":"diurd"
},
"table":"some_lookup_table",
"keyColumn":"the_old_dim_value",
"valueColumn":"the_new_dim_value",
"tsColumn":"timestamp_column",
"pollPeriod":600000
}
```
> If using JDBC, you will need to add your database's client JAR files to the extension's directory.
> For Postgres, the connector JAR is already included.
> For MySQL, you can get it from https://dev.mysql.com/downloads/connector/j/.
> Copy or symlink the downloaded file to `extensions/druid-lookups-cached-global` under the distribution root directory.
## Introspection
Globally cached lookups have introspection points at `/keys` and `/values` which return a complete set of the keys and values (respectively) in the lookup. Introspection to `/` returns the entire map. Introspection to `/version` returns the version indicator for the lookup.
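For example, assuming the lookup `country_code` from the example configuration above is loaded, and assuming the introspection endpoints are exposed under the `/druid/v1/lookups/introspect/<lookupId>` base path on the process hosting the lookup (verify the path and port for your Druid version and deployment):
```
curl http://<PROCESS_IP>:<PROCESS_PORT>/druid/v1/lookups/introspect/country_code/keys
curl http://<PROCESS_IP>:<PROCESS_PORT>/druid/v1/lookups/introspect/country_code/values
curl http://<PROCESS_IP>:<PROCESS_PORT>/druid/v1/lookups/introspect/country_code/version
curl http://<PROCESS_IP>:<PROCESS_PORT>/druid/v1/lookups/introspect/country_code
```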
---
id: mysql
title: "MySQL Metadata Store"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `mysql-metadata-storage` as an extension.
> The MySQL extension requires the MySQL Connector/J library which is not included in the Druid distribution.
> Refer to the following section for instructions on how to install this library.
## Installing the MySQL connector library
This extension uses Oracle's MySQL JDBC driver which is not included in the Druid distribution and must be
installed separately. There are a few ways to obtain this library:
- It can be downloaded from the MySQL site at: https://dev.mysql.com/downloads/connector/j/
- It can be fetched from Maven Central at: https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.48/mysql-connector-java-5.1.48.jar
- It may be available through your package manager, e.g. as `libmysql-java` on APT for a Debian-based OS
This should fetch a JAR file named something like `mysql-connector-java-x.x.xx.jar`.
Copy or symlink this file to `extensions/mysql-metadata-storage` under the distribution root directory.
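For example, a minimal sketch of fetching the connector from Maven Central and placing it in the extension directory (the version shown is simply the one referenced above; substitute whichever version you downloaded):
```bash
# Download the MySQL Connector/J JAR (version is illustrative)
curl -O https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.48/mysql-connector-java-5.1.48.jar
# Place it in the extension directory under the Druid distribution root
cp mysql-connector-java-5.1.48.jar extensions/mysql-metadata-storage/
```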
## Setting up MySQL
1. Install MySQL
Use your favorite package manager to install mysql, e.g.:
- on Ubuntu/Debian using apt `apt-get install mysql-server`
- on OS X, using [Homebrew](http://brew.sh/) `brew install mysql`
Alternatively, download and follow installation instructions for MySQL
Community Server here:
[http://dev.mysql.com/downloads/mysql/](http://dev.mysql.com/downloads/mysql/)
2. Create a druid database and user
Connect to MySQL from the machine where it is installed.
```bash
> mysql -u root
```
Paste the following snippet into the mysql prompt:
```sql
-- create a druid database, make sure to use utf8mb4 as encoding
CREATE DATABASE druid DEFAULT CHARACTER SET utf8mb4;
-- create a druid user
CREATE USER 'druid'@'localhost' IDENTIFIED BY 'diurd';
-- grant the user all the permissions on the database we just created
GRANT ALL PRIVILEGES ON druid.* TO 'druid'@'localhost';
```
3. Configure your Druid metadata storage extension:
Add the following parameters to your Druid configuration, replacing `<host>`
with the location (host name and port) of the database.
```properties
druid.extensions.loadList=["mysql-metadata-storage"]
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://<host>/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=diurd
```
## Encrypting MySQL connections
This extension provides support for encrypting MySQL connections. To get more information about encrypting MySQL connections using TLS/SSL in general, please refer to this [guide](https://dev.mysql.com/doc/refman/5.7/en/using-encrypted-connections.html).
## Configuration
|Property|Description|Default|Required|
|--------|-----------|-------|--------|
|`druid.metadata.mysql.ssl.useSSL`|Enable SSL|`false`|no|
|`druid.metadata.mysql.ssl.clientCertificateKeyStoreUrl`|The file path URL to the client certificate key store.|none|no|
|`druid.metadata.mysql.ssl.clientCertificateKeyStoreType`|The type of the key store where the client certificate is stored.|none|no|
|`druid.metadata.mysql.ssl.clientCertificateKeyStorePassword`|The [Password Provider](../../operations/password-provider.md) or String password for the client key store.|none|no|
|`druid.metadata.mysql.ssl.verifyServerCertificate`|Enables server certificate verification.|false|no|
|`druid.metadata.mysql.ssl.trustCertificateKeyStoreUrl`|The file path to the trusted root certificate key store.|Default trust store provided by MySQL|yes if `verifyServerCertificate` is set to true and a custom trust store is used|
|`druid.metadata.mysql.ssl.trustCertificateKeyStoreType`|The type of the key store where trusted root certificates are stored.|JKS|yes if `verifyServerCertificate` is set to true and keystore type is not JKS|
|`druid.metadata.mysql.ssl.trustCertificateKeyStorePassword`|The [Password Provider](../../operations/password-provider.md) or String password for the trust store.|none|yes if `verifyServerCertificate` is set to true and password is not null|
|`druid.metadata.mysql.ssl.enabledSSLCipherSuites`|Overrides the existing cipher suites with these cipher suites.|none|no|
|`druid.metadata.mysql.ssl.enabledTLSProtocols`|Overrides the TLS protocols with these protocols.|none|no|
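For example, a minimal sketch of a `runtime.properties` snippet that enables TLS with server certificate verification against a custom trust store (the path and password are placeholders, not defaults):
```properties
druid.metadata.mysql.ssl.useSSL=true
druid.metadata.mysql.ssl.verifyServerCertificate=true
druid.metadata.mysql.ssl.trustCertificateKeyStoreUrl=file:///path/to/truststore.jks
druid.metadata.mysql.ssl.trustCertificateKeyStoreType=JKS
druid.metadata.mysql.ssl.trustCertificateKeyStorePassword=<truststore-password>
```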
### MySQL Firehose
The MySQL extension provides an implementation of an [SqlFirehose](../../ingestion/native-batch.md#firehoses-deprecated) which can be used to ingest data into Druid from a MySQL database.
```json
{
"type": "index_parallel",
"spec": {
"dataSchema": {
"dataSource": "some_datasource",
"parser": {
"parseSpec": {
"format": "timeAndDims",
"dimensionsSpec": {
"dimensionExclusions": [],
"dimensions": [
"dim1",
"dim2",
"dim3"
]
},
"timestampSpec": {
"format": "auto",
"column": "ts"
}
}
},
"metricsSpec": [],
"granularitySpec": {
"type": "uniform",
"segmentGranularity": "DAY",
"queryGranularity": {
"type": "none"
},
"rollup": false,
"intervals": null
},
"transformSpec": {
"filter": null,
"transforms": []
}
},
"ioConfig": {
"type": "index_parallel",
"firehose": {
"type": "sql",
"database": {
"type": "mysql",
"connectorConfig": {
"connectURI": "jdbc:mysql://some-rds-host.us-west-1.rds.amazonaws.com:3306/druid",
"user": "admin",
"password": "secret"
}
},
"sqls": [
"SELECT * FROM some_table"
]
}
},
"tuningconfig": {
"type": "index_parallel"
}
}
}
```
---
id: orc
title: "ORC Extension"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
## ORC extension
This Apache Druid extension enables Druid to ingest and understand the Apache ORC data format.
The extension provides the [ORC input format](../../ingestion/data-formats.md#orc) and the [ORC Hadoop parser](../../ingestion/data-formats.md#orc-hadoop-parser)
for [native batch ingestion](../../ingestion/native-batch.md) and [Hadoop batch ingestion](../../ingestion/hadoop.md), respectively.
Please see corresponding docs for details.
To use this extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-orc-extensions`.
### Migration from 'contrib' extension
This extension, first available in version 0.15.0, replaces the previous 'contrib' extension which was available until
0.14.0-incubating. While this extension can index any data the 'contrib' extension could, the JSON spec for the
ingestion task is *incompatible*, and will need to be modified to work with the newer 'core' extension.
To migrate to 0.15.0+:
* In `inputSpec` of `ioConfig`, `inputFormat` must be changed from `"org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat"` to
`"org.apache.orc.mapreduce.OrcInputFormat"`
* The 'contrib' extension supported a `typeString` property, which provided the schema of the
ORC file. It essentially required the types to be correct, but notably _not_ the column names, which
facilitated column renaming. In the 'core' extension, column renaming can be achieved with
[`flattenSpec`](../../ingestion/index.md#flattenspec). For example, `"typeString":"struct<time:string,name:string>"`
with the actual schema `struct<_col0:string,_col1:string>` would, to preserve the Druid schema, need to be replaced with:
```json
"flattenSpec": {
"fields": [
{
"type": "path",
"name": "time",
"expr": "$._col0"
},
{
"type": "path",
"name": "name",
"expr": "$._col1"
}
]
...
}
```
* The 'contrib' extension supported a `mapFieldNameFormat` property, which provided a way to specify a dimension to
flatten `OrcMap` columns with primitive types. This functionality has also been replaced with
[`flattenSpec`](../../ingestion/index.md#flattenspec). For example, `"mapFieldNameFormat": "<PARENT>_<CHILD>"`
for a dimension `nestedData_dim1` could, to preserve the Druid schema, be replaced with:
```json
"flattenSpec": {
"fields": [
{
"type": "path",
"name": "nestedData_dim1",
"expr": "$.nestedData.dim1"
}
]
...
}
```
---
id: parquet
title: "Apache Parquet Extension"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid module extends [Druid Hadoop based indexing](../../ingestion/hadoop.md) to ingest data directly from offline
Apache Parquet files.
Note: If using the `parquet-avro` parser for Apache Hadoop based indexing, `druid-parquet-extensions` depends on the `druid-avro-extensions` module, so be sure to
[include both](../../development/extensions.md#loading-extensions).
The `druid-parquet-extensions` provides the [Parquet input format](../../ingestion/data-formats.md#parquet), the [Parquet Hadoop parser](../../ingestion/data-formats.md#parquet-hadoop-parser),
and the [Parquet Avro Hadoop Parser](../../ingestion/data-formats.md#parquet-avro-hadoop-parser) with `druid-avro-extensions`.
The Parquet input format is available for [native batch ingestion](../../ingestion/native-batch.md)
and the other 2 parsers are for [Hadoop batch ingestion](../../ingestion/hadoop.md).
Please see corresponding docs for details.
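For example, to load the Parquet extension together with the Avro extension it depends on for the `parquet-avro` parser, the load list might look like this (assuming no other extensions are needed):
```properties
druid.extensions.loadList=["druid-parquet-extensions", "druid-avro-extensions"]
```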
---
id: postgresql
title: "PostgreSQL Metadata Store"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `postgresql-metadata-storage` as an extension.
## Setting up PostgreSQL
1. Install PostgreSQL
Use your favorite package manager to install PostgreSQL, e.g.:
- on Ubuntu/Debian using apt `apt-get install postgresql`
- on OS X, using [Homebrew](http://brew.sh/) `brew install postgresql`
2. Create a druid database and user
On the machine where PostgreSQL is installed, using an account with proper
postgresql permissions:
Create a druid user, enter `diurd` when prompted for the password.
```bash
createuser druid -P
```
Create a druid database owned by the user we just created
```bash
createdb druid -O druid
```
*Note:* On Ubuntu / Debian you may have to prefix the `createuser` and
`createdb` commands with `sudo -u postgres` in order to gain proper
permissions.
3. Configure your Druid metadata storage extension:
Add the following parameters to your Druid configuration, replacing `<host>`
with the location (host name and port) of the database.
```properties
druid.extensions.loadList=["postgresql-metadata-storage"]
druid.metadata.storage.type=postgresql
druid.metadata.storage.connector.connectURI=jdbc:postgresql://<host>/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=diurd
```
## Configuration
In most cases, the configuration options map directly to the [postgres JDBC connection options](https://jdbc.postgresql.org/documentation/head/connect.html).
|Property|Description|Default|Required|
|--------|-----------|-------|--------|
| `druid.metadata.postgres.ssl.useSSL` | Enables SSL | `false` | no |
| `druid.metadata.postgres.ssl.sslPassword` | The [Password Provider](../../operations/password-provider.md) or String password for the client's key. | none | no |
| `druid.metadata.postgres.ssl.sslFactory` | The class name to use as the `SSLSocketFactory` | none | no |
| `druid.metadata.postgres.ssl.sslFactoryArg` | An optional argument passed to the sslFactory's constructor | none | no |
| `druid.metadata.postgres.ssl.sslMode` | The sslMode. Possible values are "disable", "require", "verify-ca", "verify-full", "allow" and "prefer"| none | no |
| `druid.metadata.postgres.ssl.sslCert` | The full path to the certificate file. | none | no |
| `druid.metadata.postgres.ssl.sslKey` | The full path to the key file. | none | no |
| `druid.metadata.postgres.ssl.sslRootCert` | The full path to the root certificate. | none | no |
| `druid.metadata.postgres.ssl.sslHostNameVerifier` | The classname of the hostname verifier. | none | no |
| `druid.metadata.postgres.ssl.sslPasswordCallback` | The classname of the SSL password provider. | none | no |
| `druid.metadata.postgres.dbTableSchema` | druid meta table schema | `public` | no |
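For example, a minimal sketch of a `runtime.properties` snippet enabling SSL with full server verification (the file paths and password are placeholders):
```properties
druid.metadata.postgres.ssl.useSSL=true
druid.metadata.postgres.ssl.sslMode=verify-full
druid.metadata.postgres.ssl.sslRootCert=/path/to/root.crt
druid.metadata.postgres.ssl.sslCert=/path/to/client.crt
druid.metadata.postgres.ssl.sslKey=/path/to/client.key
druid.metadata.postgres.ssl.sslPassword=<key-password>
```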
### PostgreSQL Firehose
The PostgreSQL extension provides an implementation of an [SqlFirehose](../../ingestion/native-batch.md#firehoses-deprecated) which can be used to ingest data into Druid from a PostgreSQL database.
```json
{
"type": "index_parallel",
"spec": {
"dataSchema": {
"dataSource": "some_datasource",
"parser": {
"parseSpec": {
"format": "timeAndDims",
"dimensionsSpec": {
"dimensionExclusions": [],
"dimensions": [
"dim1",
"dim2",
"dim3"
]
},
"timestampSpec": {
"format": "auto",
"column": "ts"
}
}
},
"metricsSpec": [],
"granularitySpec": {
"type": "uniform",
"segmentGranularity": "DAY",
"queryGranularity": {
"type": "none"
},
"rollup": false,
"intervals": null
},
"transformSpec": {
"filter": null,
"transforms": []
}
},
"ioConfig": {
"type": "index_parallel",
"firehose": {
"type": "sql",
"database": {
"type": "postgresql",
"connectorConfig": {
"connectURI": "jdbc:postgresql://some-rds-host.us-west-1.rds.amazonaws.com:5432/druid",
"user": "admin",
"password": "secret"
}
},
"sqls": [
"SELECT * FROM some_table"
]
}
},
"tuningconfig": {
"type": "index_parallel"
}
}
}
```
---
id: protobuf
title: "Protobuf"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid extension enables Druid to ingest and understand the Protobuf data format. Make sure to [include](../../development/extensions.md#loading-extensions) `druid-protobuf-extensions` as an extension.
The `druid-protobuf-extensions` provides the [Protobuf Parser](../../ingestion/data-formats.md#protobuf-parser)
for [stream ingestion](../../ingestion/index.md#streaming). See corresponding docs for details.
## Example: Load Protobuf messages from Kafka
This example demonstrates how to load Protobuf messages from Kafka. Please read the [Load from Kafka tutorial](../../tutorials/tutorial-kafka.md) first, and see [Kafka Indexing Service](./kafka-ingestion.md) documentation for more details.
The files used in this example are found at [`./examples/quickstart/protobuf` in your Druid directory](https://github.com/apache/druid/tree/master/examples/quickstart/protobuf).
For this example:
- Kafka broker host is `localhost:9092`
- Kafka topic is `metrics_pb`
- Datasource name is `metrics-protobuf`
Here is a JSON example of the 'metrics' data schema used in the example.
```json
{
"unit": "milliseconds",
"http_method": "GET",
"value": 44,
"timestamp": "2017-04-06T02:36:22Z",
"http_code": "200",
"page": "/",
"metricType": "request/latency",
"server": "www1.example.com"
}
```
### Proto file
The corresponding proto file for our 'metrics' dataset looks like this.
```
syntax = "proto3";
message Metrics {
string unit = 1;
string http_method = 2;
int32 value = 3;
string timestamp = 4;
string http_code = 5;
string page = 6;
string metricType = 7;
string server = 8;
}
```
### Descriptor file
Next, we use the `protoc` Protobuf compiler to generate the descriptor file and save it as `metrics.desc`. The descriptor file must be either in the classpath or reachable by URL. In this example the descriptor file was saved at `/tmp/metrics.desc`, however this file is also available in the example files. From your Druid install directory:
```
protoc -o /tmp/metrics.desc ./quickstart/protobuf/metrics.proto
```
## Create Kafka Supervisor
Below is the complete Supervisor spec JSON to be submitted to the Overlord.
Make sure these keys are properly configured for successful ingestion.
Important supervisor properties
- `descriptor` for the descriptor file URL
- `protoMessageType` from the proto definition
- `parser` should have `type` set to `protobuf`, but note that the `format` of the `parseSpec` must be `json`
```json
{
"type": "kafka",
"dataSchema": {
"dataSource": "metrics-protobuf",
"parser": {
"type": "protobuf",
"descriptor": "file:///tmp/metrics.desc",
"protoMessageType": "Metrics",
"parseSpec": {
"format": "json",
"timestampSpec": {
"column": "timestamp",
"format": "auto"
},
"dimensionsSpec": {
"dimensions": [
"unit",
"http_method",
"http_code",
"page",
"metricType",
"server"
],
"dimensionExclusions": [
"timestamp",
"value"
]
}
}
},
"metricsSpec": [
{
"name": "count",
"type": "count"
},
{
"name": "value_sum",
"fieldName": "value",
"type": "doubleSum"
},
{
"name": "value_min",
"fieldName": "value",
"type": "doubleMin"
},
{
"name": "value_max",
"fieldName": "value",
"type": "doubleMax"
}
],
"granularitySpec": {
"type": "uniform",
"segmentGranularity": "HOUR",
"queryGranularity": "NONE"
}
},
"tuningConfig": {
"type": "kafka",
"maxRowsPerSegment": 5000000
},
"ioConfig": {
"topic": "metrics_pb",
"consumerProperties": {
"bootstrap.servers": "localhost:9092"
},
"taskCount": 1,
"replicas": 1,
"taskDuration": "PT1H"
}
}
```
## Adding Protobuf messages to Kafka
If necessary, from your Kafka installation directory run the following command to create the Kafka topic
```
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic metrics_pb
```
This example script requires the `protobuf` and `kafka-python` Python modules. With the topic in place, messages can be inserted by running the following command from your Druid installation directory
```
./bin/generate-example-metrics | ./quickstart/protobuf/pb_publisher.py
```
You can confirm that data has been inserted to your Kafka topic using the following command from your Kafka installation directory
```
./bin/kafka-console-consumer.sh --zookeeper localhost --topic metrics_pb
```
which should print messages like this
```
millisecondsGETR"2017-04-06T03:23:56Z*2002/list:request/latencyBwww1.example.com
```
If the supervisor you created in the previous step is running, the indexing tasks should begin consuming the messages, and the data will soon be available for querying in Druid.
## Generating the example files
The files provided in the example quickstart can be generated in the following manner starting with only `metrics.proto`.
### `metrics.desc`
The descriptor file is generated using `protoc` Protobuf compiler. Given a `.proto` file, a `.desc` file can be generated like so.
```
protoc -o metrics.desc metrics.proto
```
### `metrics_pb2.py`
`metrics_pb2.py` is also generated with `protoc`
```
protoc -o metrics.desc metrics.proto --python_out=.
```
### `pb_publisher.py`
After `metrics_pb2.py` is generated, another script can be constructed to parse JSON data, convert it to Protobuf, and produce to a Kafka topic
```python
#!/usr/bin/env python
import sys
import json
from kafka import KafkaProducer
from metrics_pb2 import Metrics
producer = KafkaProducer(bootstrap_servers='localhost:9092')
topic = 'metrics_pb'
for row in iter(sys.stdin):
d = json.loads(row)
metrics = Metrics()
for k, v in d.items():
setattr(metrics, k, v)
pb = metrics.SerializeToString()
producer.send(topic, pb)
producer.flush()
```
---
id: s3
title: "S3-compatible"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
## S3 extension
This extension allows you to do 2 things:
* [Ingest data](#reading-data-from-s3) from files stored in S3.
* Write segments to [deep storage](#deep-storage) in S3.
To use this Apache Druid extension, make sure to [include](../../development/extensions.md#loading-extensions) `druid-s3-extensions` as an extension.
### Reading data from S3
The [S3 input source](../../ingestion/native-batch.md#s3-input-source) is supported by the [Parallel task](../../ingestion/native-batch.md#parallel-task)
to read objects directly from S3. If you use the [Hadoop task](../../ingestion/hadoop.md),
you can read data from S3 by specifying the S3 paths in your [`inputSpec`](../../ingestion/hadoop.md#inputspec).
To configure the extension to read objects from S3 you need to configure how to [connect to S3](#configuration).
### Deep Storage
S3-compatible deep storage means either AWS S3 or a compatible service like Google Storage which exposes the same API as S3.
S3 deep storage needs to be explicitly enabled by setting `druid.storage.type=s3`. **Only after setting the storage type to S3 will any of the settings below take effect.**
To correctly configure this extension for deep storage in S3, first configure how to [connect to S3](#configuration).
In addition to this you need to set additional configuration, specific for [deep storage](#deep-storage-specific-configuration)
#### Deep storage specific configuration
|Property|Description|Default|
|--------|-----------|-------|
|`druid.storage.bucket`|Bucket to store in.|Must be set.|
|`druid.storage.baseKey`|A prefix string that will be prepended to the object names for the segments published to S3 deep storage|Must be set.|
|`druid.storage.type`|Global deep storage provider. Must be set to `s3` to make use of this extension.|Must be set (likely `s3`).|
|`druid.storage.archiveBucket`|S3 bucket name for archiving when running the *archive task*.|none|
|`druid.storage.archiveBaseKey`|S3 object key prefix for archiving.|none|
|`druid.storage.disableAcl`|Boolean flag to disable ACL. If this is set to `false`, full control is granted to the bucket owner. This may require setting additional permissions. See [S3 permissions settings](#s3-permissions-settings).|false|
|`druid.storage.useS3aSchema`|If true, use the "s3a" filesystem when using Hadoop-based ingestion. If false, the "s3n" filesystem will be used. Only affects Hadoop-based ingestion.|false|
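For example, a minimal sketch of the properties needed to store segments in an S3 bucket (bucket name and key prefix are placeholders):
```properties
druid.extensions.loadList=["druid-s3-extensions"]
druid.storage.type=s3
druid.storage.bucket=<your-bucket>
druid.storage.baseKey=druid/segments
```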
## Configuration
### S3 authentication methods
Druid uses the following credentials provider chain to connect to your S3 bucket (whether a deep storage bucket or source bucket).
**Note :** *You can override the default credentials provider chain for connecting to source bucket by specifying an access key and secret key using [Properties Object](../../ingestion/native-batch.md#s3-input-source) parameters in the ingestionSpec.*
|order|type|details|
|--------|-----------|-------|
|1|Druid config file|Based on your runtime.properties if it contains values `druid.s3.accessKey` and `druid.s3.secretKey` |
|2|Custom properties file| Based on custom properties file where you can supply `sessionToken`, `accessKey` and `secretKey` values. This file is provided to Druid through `druid.s3.fileSessionCredentials` properties|
|3|Environment variables|Based on environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`|
|4|Java system properties|Based on JVM properties `aws.accessKeyId` and `aws.secretKey` |
|5|Profile information|Based on credentials you may have on your druid instance (generally in `~/.aws/credentials`)|
|6|ECS container credentials|Based on environment variables available on AWS ECS (AWS_CONTAINER_CREDENTIALS_RELATIVE_URI or AWS_CONTAINER_CREDENTIALS_FULL_URI) as described in the [EC2ContainerCredentialsProviderWrapper documentation](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/EC2ContainerCredentialsProviderWrapper.html)|
|7|Instance profile information|Based on the instance profile you may have attached to your druid instance|
You can find more information about authentication methods [here](https://docs.aws.amazon.com/fr_fr/sdk-for-java/v1/developer-guide/credentials.html)<br/>
**Note :** *Order is important here as it indicates the precedence of authentication methods.<br/>
So if you are trying to use Instance profile information, you **must not** set `druid.s3.accessKey` and `druid.s3.secretKey` in your Druid runtime.properties*
### S3 permissions settings
`s3:GetObject` and `s3:PutObject` are basically required for pushing/loading segments to/from S3.
If `druid.storage.disableAcl` is set to `false`, then `s3:GetBucketAcl` and `s3:PutObjectAcl` are additionally required to set ACL for objects.
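As an illustration only, an IAM policy granting these actions on a hypothetical bucket might look like the following sketch (include the two ACL actions only when `druid.storage.disableAcl` is `false`):
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:GetBucketAcl", "s3:PutObjectAcl"],
      "Resource": ["arn:aws:s3:::your-druid-bucket", "arn:aws:s3:::your-druid-bucket/*"]
    }
  ]
}
```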
### AWS region
The AWS SDK requires that the target region be specified. Two ways of doing this are by using the JVM system property `aws.region` or the environment variable `AWS_REGION`.
As an example, to set the region to 'us-east-1' through system properties:
- Add `-Daws.region=us-east-1` to the jvm.config file for all Druid services.
- Add `-Daws.region=us-east-1` to `druid.indexer.runner.javaOpts` in [Middle Manager configuration](../../configuration/index.md#middlemanager-configuration) so that the property will be passed to Peon (worker) processes.
### Connecting to S3 configuration
|Property|Description|Default|
|--------|-----------|-------|
|`druid.s3.accessKey`|S3 access key. See [S3 authentication methods](#s3-authentication-methods) for more details|Can be omitted according to authentication methods chosen.|
|`druid.s3.secretKey`|S3 secret key. See [S3 authentication methods](#s3-authentication-methods) for more details|Can be omitted according to authentication methods chosen.|
|`druid.s3.fileSessionCredentials`|Path to properties file containing `sessionToken`, `accessKey` and `secretKey` value. One key/value pair per line (format `key=value`). See [S3 authentication methods](#s3-authentication-methods) for more details |Can be omitted according to authentication methods chosen.|
|`druid.s3.protocol`|Communication protocol type to use when sending requests to AWS. `http` or `https` can be used. This configuration would be ignored if `druid.s3.endpoint.url` is filled with a URL with a different protocol.|`https`|
|`druid.s3.disableChunkedEncoding`|Disables chunked encoding. See [AWS document](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Builder.html#disableChunkedEncoding--) for details.|false|
|`druid.s3.enablePathStyleAccess`|Enables path style access. See [AWS document](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Builder.html#enablePathStyleAccess--) for details.|false|
|`druid.s3.forceGlobalBucketAccessEnabled`|Enables global bucket access. See [AWS document](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Builder.html#setForceGlobalBucketAccessEnabled-java.lang.Boolean-) for details.|false|
|`druid.s3.endpoint.url`|Service endpoint either with or without the protocol.|None|
|`druid.s3.endpoint.signingRegion`|Region to use for SigV4 signing of requests (e.g. us-west-1).|None|
|`druid.s3.proxy.host`|Proxy host to connect through.|None|
|`druid.s3.proxy.port`|Port on the proxy host to connect through.|None|
|`druid.s3.proxy.username`|User name to use when connecting through a proxy.|None|
|`druid.s3.proxy.password`|Password to use when connecting through a proxy.|None|
|`druid.storage.sse.type`|Server-side encryption type. Should be one of `s3`, `kms`, and `custom`. See the below [Server-side encryption section](#server-side-encryption) for more details.|None|
|`druid.storage.sse.kms.keyId`|AWS KMS key ID. This is used only when `druid.storage.sse.type` is `kms` and can be empty to use the default key ID.|None|
|`druid.storage.sse.custom.base64EncodedKey`|Base64-encoded key. Should be specified if `druid.storage.sse.type` is `custom`.|None|
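For example, a minimal sketch connecting with explicit credentials to an S3-compatible endpoint (all values are placeholders; omit the keys to fall back to the credentials provider chain above):
```properties
druid.s3.accessKey=<your-access-key>
druid.s3.secretKey=<your-secret-key>
druid.s3.endpoint.url=https://s3.us-west-1.amazonaws.com
druid.s3.endpoint.signingRegion=us-west-1
```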
## Server-side encryption
You can enable [server-side encryption](https://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html) by setting
`druid.storage.sse.type` to a supported type of server-side encryption. The current supported types are:
- s3: [Server-side encryption with S3-managed encryption keys](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html)
- kms: [Server-side encryption with AWS KMSManaged Keys](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html)
- custom: [Server-side encryption with Customer-Provided Encryption Keys](https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html)
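For example, a sketch enabling SSE-KMS with an explicit key (the key ID is a placeholder; it can be left empty to use the default KMS key):
```properties
druid.storage.sse.type=kms
druid.storage.sse.kms.keyId=<your-kms-key-id>
```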
---
id: simple-client-sslcontext
title: "Simple SSLContext Provider Module"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid module contains a simple implementation of [SSLContext](http://docs.oracle.com/javase/8/docs/api/javax/net/ssl/SSLContext.html)
that will be injected to be used with HttpClient that Druid processes use internally to communicate with each other. To learn more about
Java's SSL support, please refer to [this](http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html) guide.
|Property|Description|Default|Required|
|--------|-----------|-------|--------|
|`druid.client.https.protocol`|SSL protocol to use.|`TLSv1.2`|no|
|`druid.client.https.trustStoreType`|The type of the key store where trusted root certificates are stored.|`java.security.KeyStore.getDefaultType()`|no|
|`druid.client.https.trustStorePath`|The file path or URL of the TLS/SSL Key store where trusted root certificates are stored.|none|yes|
|`druid.client.https.trustStoreAlgorithm`|Algorithm to be used by TrustManager to validate certificate chains|`javax.net.ssl.TrustManagerFactory.getDefaultAlgorithm()`|no|
|`druid.client.https.trustStorePassword`|The [Password Provider](../../operations/password-provider.md) or String password for the Trust Store.|none|yes|
The following table contains optional parameters for supporting client certificate authentication:
|Property|Description|Default|Required|
|--------|-----------|-------|--------|
|`druid.client.https.keyStorePath`|The file path or URL of the TLS/SSL Key store containing the client certificate that Druid will use when communicating with other Druid services. If this is null, the other properties in this table are ignored.|none|yes|
|`druid.client.https.keyStoreType`|The type of the key store.|none|yes|
|`druid.client.https.certAlias`|Alias of TLS client certificate in the keystore.|none|yes|
|`druid.client.https.keyStorePassword`|The [Password Provider](../../operations/password-provider.md) or String password for the Key Store.|none|no|
|`druid.client.https.keyManagerFactoryAlgorithm`|Algorithm to use for creating KeyManager, more details [here](https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html#KeyManager).|`javax.net.ssl.KeyManagerFactory.getDefaultAlgorithm()`|no|
|`druid.client.https.keyManagerPassword`|The [Password Provider](../../operations/password-provider.md) or String password for the Key Manager.|none|no|
|`druid.client.https.validateHostnames`|Validate the hostname of the server. This should not be disabled unless you are using [custom TLS certificate checks](../../operations/tls-support.md) and know that standard hostname validation is not needed.|true|no|
This [document](http://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html) lists all the possible
values for the above mentioned configs among others provided by Java implementation.
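For example, a minimal sketch of the client-side TLS properties (trust store path and password are placeholders):
```properties
druid.client.https.protocol=TLSv1.2
druid.client.https.trustStoreType=jks
druid.client.https.trustStorePath=/path/to/truststore.jks
druid.client.https.trustStorePassword=<truststore-password>
```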
---
id: stats
title: "Stats aggregator"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid extension includes stat-related aggregators, including variance and standard deviations, etc. Make sure to [include](../../development/extensions.md#loading-extensions) `druid-stats` as an extension.
## Variance aggregator
The algorithm of this aggregator is the same as that of Apache Hive; the description below comes from `GenericUDAFVariance` in Hive.

Evaluate the variance using the algorithm described by Chan, Golub, and LeVeque in
"Algorithms for computing the sample variance: analysis and recommendations",
The American Statistician, 37 (1983) pp. 242--247:

variance = variance1 + variance2 + n/(m*(m+n)) * pow(((m/n)*t1 - t2), 2)

where:

- `variance` is sum((x - avg)^2) (this is actually n times the variance) and is updated at every step
- `n` is the count of elements in chunk1
- `m` is the count of elements in chunk2
- `t1` is the sum of elements in chunk1, `t2` is the sum of elements in chunk2

This algorithm was proven to be numerically stable by J.L. Barlow in
"Error analysis of a pairwise summation algorithm to compute sample variance",
Numer. Math, 58 (1991) pp. 583--590.
### Pre-aggregating variance at ingestion time
To use this feature, an "variance" aggregator must be included at indexing time.
The ingestion aggregator can only apply to numeric values. If you use "variance"
then any input rows missing the value will be considered to have a value of 0.
User can specify expected input type as one of "float", "double", "long", "variance" for ingestion, which is by default "float".
```json
{
"type" : "variance",
"name" : <output_name>,
"fieldName" : <metric_name>,
"inputType" : <input_type>,
"estimator" : <string>
}
```
To query for results, a "variance" aggregator with "variance" input type, or simply a "varianceFold" aggregator, must be included in the query.
```json
{
"type" : "varianceFold",
"name" : <output_name>,
"fieldName" : <metric_name>,
"estimator" : <string>
}
```
|Property |Description |Default |
|-------------------------|------------------------------|----------------------------------|
|`estimator`|Set "population" to get variance_pop rather than variance_sample, which is default.|null|
### Standard deviation post-aggregator
To obtain the standard deviation from a variance, use the "stddev" post-aggregator.
```json
{
"type": "stddev",
"name": "<output_name>",
"fieldName": "<aggregator_name>",
"estimator": <string>
}
```
## Query examples
### Timeseries query
```json
{
"queryType": "timeseries",
"dataSource": "testing",
"granularity": "day",
"aggregations": [
{
"type": "variance",
"name": "index_var",
"fieldName": "index_var"
}
],
"intervals": [
"2016-03-01T00:00:00.000/2013-03-20T00:00:00.000"
]
}
```
### TopN query
```json
{
"queryType": "topN",
"dataSource": "testing",
"dimensions": ["alias"],
"threshold": 5,
"granularity": "all",
"aggregations": [
{
"type": "variance",
"name": "index_var",
"fieldName": "index"
}
],
"postAggregations": [
{
"type": "stddev",
"name": "index_stddev",
"fieldName": "index_var"
}
],
"intervals": [
"2016-03-06T00:00:00/2016-03-06T23:59:59"
]
}
```
### GroupBy query
```json
{
"queryType": "groupBy",
"dataSource": "testing",
"dimensions": ["alias"],
"granularity": "all",
"aggregations": [
{
"type": "variance",
"name": "index_var",
"fieldName": "index"
}
],
"postAggregations": [
{
"type": "stddev",
"name": "index_stddev",
"fieldName": "index_var"
}
],
"intervals": [
"2016-03-06T00:00:00/2016-03-06T23:59:59"
]
}
```
---
id: test-stats
title: "Test Stats Aggregators"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
This Apache Druid extension incorporates test statistics related aggregators, including z-score and p-value. Please refer to [https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/](https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/) for math background and details.
Make sure to include `druid-stats` extension in order to use these aggregators.
## Z-Score for two sample ztests post aggregator
Please refer to [https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-proportions-test/](https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-proportions-test/) and [http://www.ucs.louisiana.edu/~jcb0773/Berry_statbook/Berry_statbook_chpt6.pdf](http://www.ucs.louisiana.edu/~jcb0773/Berry_statbook/Berry_statbook_chpt6.pdf) for more details.
z = (p1 - p2) / S.E. (assuming the null hypothesis is true)

Please see below for p1 and p2. Note that S.E. stands for standard error, where:

S.E. = sqrt( p1 * (1 - p1)/n1 + p2 * (1 - p2)/n2 )

(p1 - p2) is the observed difference between the two sample proportions.
### zscore2sample post aggregator
* **`zscore2sample`**: calculate the z-score using two-sample z-test while converting binary variables (***e.g.*** success or not) to continuous variables (***e.g.*** conversion rate).
```json
{
"type": "zscore2sample",
"name": "<output_name>",
"successCount1": <post_aggregator> success count of sample 1,
"sample1Size": <post_aggregaror> sample 1 size,
"successCount2": <post_aggregator> success count of sample 2,
"sample2Size" : <post_aggregator> sample 2 size
}
```
Please note that the post-aggregator converts the binary variables into continuous variables (proportions) for the two populations. Specifically:
p1 = (successCount1) / (sample1Size)
p2 = (successCount2) / (sample2Size)
### pvalue2tailedZtest post aggregator
* **`pvalue2tailedZtest`**: calculate p-value of two-sided z-test from zscore
- ***pvalue2tailedZtest(zscore)*** - the input is a z-score which can be calculated using the zscore2sample post aggregator
```json
{
"type": "pvalue2tailedZtest",
"name": "<output_name>",
"zScore": <zscore post_aggregator>
}
```
## Example Usage
In this example, we use zscore2sample post aggregator to calculate z-score, and then feed the z-score to pvalue2tailedZtest post aggregator to calculate p-value.
A JSON query example can be as follows:
```json
{
...
"postAggregations" : {
"type" : "pvalue2tailedZtest",
"name" : "pvalue",
"zScore" :
{
"type" : "zscore2sample",
"name" : "zscore",
"successCount1" :
{ "type" : "constant",
"name" : "successCountFromPopulation1Sample",
"value" : 300
},
"sample1Size" :
{ "type" : "constant",
"name" : "sampleSizeOfPopulation1",
"value" : 500
},
"successCount2":
{ "type" : "constant",
"name" : "successCountFromPopulation2Sample",
"value" : 450
},
"sample2Size" :
{ "type" : "constant",
"name" : "sampleSizeOfPopulation2",
"value" : 600
}
}
}
}
```
# Druid Development Guide
[Development overview](overview.md ':include')
# Creating Druid Extensions
Druid uses a well-defined module system that allows its functionality to be extended at runtime by loading extensions.
## Writing your own extensions
Druid's extensions leverage Guice in order to add things at runtime.
Basically, Guice is a framework for Dependency Injection, but we use it to hold the expected object graph of the Druid process.
Extensions can make any changes they want/need to the object graph via adding Guice bindings.
While the extensions actually give you the capability to change almost anything however you want, in general, we expect people to want to extend one of the things listed below.
This means that we honor our [versioning strategy](./versioning.md) for changes that affect the interfaces called out on this page, but other interfaces are deemed "internal" and can be changed in an incompatible manner even between patch releases.
1. Add a new deep storage implementation by extending the `org.apache.druid.segment.loading.DataSegment*` and
`org.apache.druid.tasklogs.TaskLog*` classes.
1. Add a new input source by extending `org.apache.druid.data.input.InputSource`.
1. Add a new input entity by extending `org.apache.druid.data.input.InputEntity`.
1. Add a new input source reader if necessary by extending `org.apache.druid.data.input.InputSourceReader`. You can use `org.apache.druid.data.input.impl.InputEntityIteratingReader` in most cases.
1. Add a new input format by extending `org.apache.druid.data.input.InputFormat`.
1. Add a new input entity reader by extending `org.apache.druid.data.input.TextReader` for text formats or `org.apache.druid.data.input.IntermediateRowParsingReader` for binary formats.
1. Add Aggregators by extending `org.apache.druid.query.aggregation.AggregatorFactory`, `org.apache.druid.query.aggregation.Aggregator`,
and `org.apache.druid.query.aggregation.BufferAggregator`.
1. Add PostAggregators by extending `org.apache.druid.query.aggregation.PostAggregator`.
1. Add ExtractionFns by extending `org.apache.druid.query.extraction.ExtractionFn`.
1. Add Complex metrics by extending `org.apache.druid.segment.serde.ComplexMetricSerde`.
1. Add new Query types by extending `org.apache.druid.query.QueryRunnerFactory`, `org.apache.druid.query.QueryToolChest`, and
`org.apache.druid.query.Query`.
1. Add new Jersey resources by calling `Jerseys.addResource(binder, clazz)`.
1. Add new Jetty filters by extending `org.apache.druid.server.initialization.jetty.ServletFilterHolder`.
1. Add new secret providers by extending `org.apache.druid.metadata.PasswordProvider`.
1. Add new ingest transform by implementing the `org.apache.druid.segment.transform.Transform` interface from the `druid-processing` package.
1. Bundle your extension with all the other Druid extensions
Extensions are added to the system via an implementation of `org.apache.druid.initialization.DruidModule`.
### Creating a Druid Module
The DruidModule class has two methods:
1. A `configure(Binder)` method
2. A `getJacksonModules()` method
The `configure(Binder)` method is the same method that a normal Guice module would have.
The `getJacksonModules()` method provides a list of Jackson modules that are used to help initialize the Jackson ObjectMapper instances used by Druid. This is how you add extensions that are instantiated via Jackson (like AggregatorFactory and InputSource objects) to Druid.
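For illustration, a minimal, do-nothing module might look like the following sketch (the class name is hypothetical; real modules add Guice bindings and Jackson subtype registrations as shown in the sections below):
```java
import java.util.Collections;
import java.util.List;

import com.fasterxml.jackson.databind.Module;
import com.google.inject.Binder;
import org.apache.druid.initialization.DruidModule;

public class MyExtensionModule implements DruidModule
{
  @Override
  public void configure(Binder binder)
  {
    // Add Guice bindings here, e.g. deep storage or query-type bindings.
  }

  @Override
  public List<? extends Module> getJacksonModules()
  {
    // Return Jackson modules that register polymorphic subtypes
    // (AggregatorFactory, InputSource, and so on).
    return Collections.emptyList();
  }
}
```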
### Registering your Druid Module
Once you have your DruidModule created, you will need to package an extra file in the `META-INF/services` directory of your jar. This is easiest to accomplish with a maven project by creating files in the `src/main/resources` directory. There are examples of this in the Druid code under the `cassandra-storage`, `hdfs-storage` and `s3-extensions` modules, for example.
The file that should exist in your jar is
`META-INF/services/org.apache.druid.initialization.DruidModule`
It should be a text file with a new-line delimited list of package-qualified classes that implement DruidModule like
```
org.apache.druid.storage.cassandra.CassandraDruidModule
```
If your jar has this file, then when it is added to the classpath or as an extension, Druid will notice the file and will instantiate instances of the Module. Your Module should have a default constructor, but if you need access to runtime configuration properties, it can have a method with @Inject on it to get a Properties object injected into it from Guice.
### Adding a new deep storage implementation
Check the `azure-storage`, `google-storage`, `cassandra-storage`, `hdfs-storage` and `s3-extensions` modules for examples of how to do this.
The basic idea behind the extension is that you need to add bindings for your DataSegmentPusher and DataSegmentPuller objects. The way to add them is something like (taken from HdfsStorageDruidModule)
``` java
Binders.dataSegmentPullerBinder(binder)
.addBinding("hdfs")
.to(HdfsDataSegmentPuller.class).in(LazySingleton.class);
Binders.dataSegmentPusherBinder(binder)
.addBinding("hdfs")
.to(HdfsDataSegmentPusher.class).in(LazySingleton.class);
```
`Binders.dataSegment*Binder()` is a call provided by the druid-core jar which sets up a Guice multibind "MapBinder". If that doesn't make sense, don't worry about it, just think of it as a magical incantation.
`addBinding("hdfs")` for the Puller binder creates a new handler for loadSpec objects of type "hdfs". For the Pusher binder it creates a new type value that you can specify for the `druid.storage.type` parameter.
`to(...).in(...);` is normal Guice stuff.
In addition to DataSegmentPusher and DataSegmentPuller, you can also bind:
* DataSegmentKiller: Removes segments, used as part of the Kill Task to delete unused segments, i.e. perform garbage collection of segments that are either superseded by newer versions or that have been dropped from the cluster.
* DataSegmentMover: Allow migrating segments from one place to another, currently this is only used as part of the MoveTask to move unused segments to a different S3 bucket or prefix, typically to reduce storage costs of unused data (e.g. move to glacier or cheaper storage)
* DataSegmentArchiver: Just a wrapper around Mover, but comes with a pre-configured target bucket/path, so it doesn't have to be specified at runtime as part of the ArchiveTask.
### Validating your deep storage implementation
**WARNING!** This is not a formal procedure, but a collection of hints to validate whether your new deep storage implementation is able to push, pull and kill segments.
It's recommended to use batch ingestion tasks to validate your implementation.
The segment will be automatically handed off to a Historical node after ~20 seconds.
In this way, you can validate both push (at realtime process) and pull (at Historical process) segments.
* DataSegmentPusher
Wherever your data storage (cloud storage service, distributed file system, etc.) is, you should be able to see one new file: `index.zip` (`partitionNum_index.zip` for HDFS data storage) after your ingestion task ends.
* DataSegmentPuller
About 20 seconds after your ingestion task ends, you should be able to see your Historical process trying to load the new segment.
The following example was retrieved from a Historical process configured to use Azure for deep storage:
```
2015-04-14T02:42:33,450 INFO [ZkCoordinator-0] org.apache.druid.server.coordination.ZkCoordinator - New request[LOAD: dde_2015-01-02T00:00:00.000Z_2015-01-03T00:00:00
.000Z_2015-04-14T02:41:09.484Z] with zNode[/druid/dev/loadQueue/192.168.33.104:8081/dde_2015-01-02T00:00:00.000Z_2015-01-03T00:00:00.000Z_2015-04-14T02:41:09.
484Z].
2015-04-14T02:42:33,451 INFO [ZkCoordinator-0] org.apache.druid.server.coordination.ZkCoordinator - Loading segment dde_2015-01-02T00:00:00.000Z_2015-01-03T00:00:00.0
00Z_2015-04-14T02:41:09.484Z
2015-04-14T02:42:33,463 INFO [ZkCoordinator-0] org.apache.druid.guice.JsonConfigurator - Loaded class[class org.apache.druid.storage.azure.AzureAccountConfig] from props[drui
d.azure.] as [org.apache.druid.storage.azure.AzureAccountConfig@759c9ad9]
2015-04-14T02:49:08,275 INFO [ZkCoordinator-0] org.apache.druid.utils.CompressionUtils - Unzipping file[/opt/druid/tmp/compressionUtilZipCache1263964429587449785.z
ip] to [/opt/druid/zk_druid/dde/2015-01-02T00:00:00.000Z_2015-01-03T00:00:00.000Z/2015-04-14T02:41:09.484Z/0]
2015-04-14T02:49:08,276 INFO [ZkCoordinator-0] org.apache.druid.storage.azure.AzureDataSegmentPuller - Loaded 1196 bytes from [dde/2015-01-02T00:00:00.000Z_2015-01-03
T00:00:00.000Z/2015-04-14T02:41:09.484Z/0/index.zip] to [/opt/druid/zk_druid/dde/2015-01-02T00:00:00.000Z_2015-01-03T00:00:00.000Z/2015-04-14T02:41:09.484Z/0]
2015-04-14T02:49:08,277 WARN [ZkCoordinator-0] org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager - Segment [dde_2015-01-02T00:00:00.000Z_2015-01-03T00:00:00.000Z_2015-04-14T02:41:09.484Z] is different than expected size. Expected [0] found [1196]
2015-04-14T02:49:08,282 INFO [ZkCoordinator-0] org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Announcing segment[dde_2015-01-02T00:00:00.000Z_2015-01-03T00:00:00.000Z_2015-04-14T02:41:09.484Z] at path[/druid/dev/segments/192.168.33.104:8081/192.168.33.104:8081_historical__default_tier_2015-04-14T02:49:08.282Z_7bb87230ebf940188511dd4a53ffd7351]
2015-04-14T02:49:08,292 INFO [ZkCoordinator-0] org.apache.druid.server.coordination.ZkCoordinator - Completed request [LOAD: dde_2015-01-02T00:00:00.000Z_2015-01-03T00:00:00.000Z_2015-04-14T02:41:09.484Z]
```
* DataSegmentKiller
The easiest way of testing the segment killing is marking a segment as not used and then starting a killing task through the old Coordinator console.
To mark a segment as not used, you need to connect to your metadata storage and update the `used` column to `false` on the segment table rows.
To start a segment killing task, you need to access the old Coordinator console `http://<COORDINATOR_IP>:<COORDINATOR_PORT>/old-console/kill.html`, then select the appropriate datasource and input a time range (e.g. `2000/3000`).
After the killing task ends, `index.zip` (`partitionNum_index.zip` for HDFS data storage) file should be deleted from the data storage.
### Adding support for a new input source
Adding support for a new input source requires implementing three interfaces: `InputSource`, `InputEntity`, and `InputSourceReader`.
`InputSource` defines where the input data is stored. `InputEntity` defines how data can be read in parallel
in [native parallel indexing](../ingestion/native-batch.md).
`InputSourceReader` defines how to read your new input source; you can simply use the provided `InputEntityIteratingReader` in most cases.
There is an example of this in the `druid-s3-extensions` module with the `S3InputSource` and `S3Entity`.
Adding an InputSource is done almost entirely through the Jackson Modules instead of Guice. Specifically, note the implementation
``` java
@Override
public List<? extends Module> getJacksonModules()
{
return ImmutableList.of(
new SimpleModule().registerSubtypes(new NamedType(S3InputSource.class, "s3"))
);
}
```
This is registering the InputSource with Jackson's polymorphic serialization/deserialization layer. More concretely, having this will mean that if you specify an `"inputSource": { "type": "s3", ... }` in your IO config, then the system will use your `InputSource` implementation for that type.
Note that inside of Druid, we have made the `@JacksonInject` annotation for Jackson deserialized objects actually use the base Guice injector to resolve the object to be injected. So, if your InputSource needs access to some object, you can add a `@JacksonInject` annotation on a setter and it will get set on instantiation.
### Adding support for a new data format
Adding support for a new data format requires implementing two interfaces: `InputFormat` and `InputEntityReader`.
`InputFormat` defines how your data is formatted. `InputEntityReader` defines how to parse your data and convert it into a Druid `InputRow`.
There is an example in the `druid-orc-extensions` module with the `OrcInputFormat` and `OrcReader`.
Adding an InputFormat is very similar to adding an InputSource. They operate purely through Jackson and thus should just be additions to the Jackson modules returned by your DruidModule.
### Adding Aggregators
Adding AggregatorFactory objects is very similar to InputSource objects. They operate purely through Jackson and thus should just be additions to the Jackson modules returned by your DruidModule.
### Adding Complex Metrics
Adding ComplexMetrics is a little ugly in the current version. The method of getting at complex metrics is through registration with the `ComplexMetrics.registerSerde()` method. There is no special Guice stuff to get this working, just in your `configure(Binder)` method register the serialization/deserialization.
### Adding new Query types
Adding a new Query type requires the implementation of three interfaces.
1. `org.apache.druid.query.Query`
1. `org.apache.druid.query.QueryToolChest`
1. `org.apache.druid.query.QueryRunnerFactory`
Registering these uses the same general strategy as a deep storage mechanism does. You do something like
``` java
DruidBinders.queryToolChestBinder(binder)
.addBinding(SegmentMetadataQuery.class)
.to(SegmentMetadataQueryQueryToolChest.class);
DruidBinders.queryRunnerFactoryBinder(binder)
.addBinding(SegmentMetadataQuery.class)
.to(SegmentMetadataQueryRunnerFactory.class);
```
The first one binds the SegmentMetadataQueryQueryToolChest for usage when a SegmentMetadataQuery is used. The second one does the same thing but for the QueryRunnerFactory instead.
### Adding new Jersey resources
Adding new Jersey resources to a module requires calling the following code to bind the resource in the module:
```java
Jerseys.addResource(binder, NewResource.class);
```
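For illustration, a minimal resource class could look like the following; the path and payload here are assumptions, not an existing Druid endpoint:
```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/druid/my-extension/v1/status")
public class NewResource
{
  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public Response getStatus()
  {
    // Returns a trivial JSON payload; a real resource would obtain whatever
    // services it needs via Guice injection and query them here.
    return Response.ok("{\"status\":\"ok\"}").build();
  }
}
```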
### Adding a new Password Provider implementation
You will need to implement the `org.apache.druid.metadata.PasswordProvider` interface. A new instance of the implementation is created for every place where Druid uses a PasswordProvider,
so make sure all the information required for fetching each password is supplied during object instantiation.
In your implementation of `org.apache.druid.initialization.DruidModule`, `getJacksonModules` should look something like this:
``` java
return ImmutableList.of(
new SimpleModule("SomePasswordProviderModule")
.registerSubtypes(
new NamedType(SomePasswordProvider.class, "some")
)
);
```
where `SomePasswordProvider` is your implementation of the `PasswordProvider` interface; see `org.apache.druid.metadata.EnvironmentVariablePasswordProvider` for an example.
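As a minimal sketch, assuming a provider that reads the secret from an environment variable named by a hypothetical `keyName` property, `SomePasswordProvider` might look like this:
```java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import org.apache.druid.metadata.PasswordProvider;

public class SomePasswordProvider implements PasswordProvider
{
  private final String keyName;

  @JsonCreator
  public SomePasswordProvider(@JsonProperty("keyName") final String keyName)
  {
    this.keyName = keyName;
  }

  @JsonProperty
  public String getKeyName()
  {
    return keyName;
  }

  @Override
  public String getPassword()
  {
    // Fetch the secret from wherever it lives (a vault, a file, an env var, ...).
    // Here we simply read the environment variable named by "keyName".
    return System.getenv(keyName);
  }
}
```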
### Adding a Transform Extension
To create a transform extension, implement the `org.apache.druid.segment.transform.Transform` interface. You'll need a dependency on the `druid-processing` module to import `org.apache.druid.segment.transform`.
```java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import org.apache.druid.data.input.Row;
import org.apache.druid.segment.transform.RowFunction;
import org.apache.druid.segment.transform.Transform;
public class MyTransform implements Transform {
private final String name;
@JsonCreator
public MyTransform(
@JsonProperty("name") final String name
) {
this.name = name;
}
@JsonProperty
@Override
public String getName() {
return name;
}
@Override
public RowFunction getRowFunction() {
return new MyRowFunction();
}
static class MyRowFunction implements RowFunction {
@Override
public Object eval(Row row) {
return "transformed-value";
}
}
}
```
Then register your transform as a Jackson module.
```java
import com.fasterxml.jackson.databind.Module;
import com.fasterxml.jackson.databind.jsontype.NamedType;
import com.fasterxml.jackson.databind.module.SimpleModule;
import com.google.common.collect.ImmutableList;
import com.google.inject.Binder;
import org.apache.druid.initialization.DruidModule;

import java.util.List;

public class MyTransformModule implements DruidModule {
  @Override
  public List<? extends Module> getJacksonModules() {
    return ImmutableList.of(
        new SimpleModule("MyTransformModule").registerSubtypes(
            new NamedType(MyTransform.class, "my-transform")
        )
    );
  }

  @Override
  public void configure(Binder binder) {
  }
}
```
### Bundle your extension with all the other Druid extensions
When you do `mvn install`, Druid extensions will be packaged within the Druid tarball and `extensions` directory, which are both underneath `distribution/target/`.
If you want your extension to be included, you can add your extension's Maven coordinate as an argument in
[distribution/pom.xml](https://github.com/apache/druid/blob/master/distribution/pom.xml#L95)
During `mvn install`, maven will install your extension to the local maven repository, and then call [pull-deps](../operations/pull-deps.md) to pull your extension from
there. In the end, you should see your extension underneath `distribution/target/extensions` and within Druid tarball.
### Managing dependencies
Managing library collisions can be daunting for extensions that draw in commonly used libraries.
For dependencies with the following group IDs, we recommend using Maven's `provided` scope to avoid conflicts with the versions of those packages already used by Druid:
```
"org.apache.druid",
"com.metamx.druid",
"asm",
"org.ow2.asm",
"org.jboss.netty",
"com.google.guava",
"com.google.code.findbugs",
"com.google.protobuf",
"com.esotericsoftware.minlog",
"log4j",
"org.slf4j",
"commons-logging",
"org.eclipse.jetty",
"org.mortbay.jetty",
"com.sun.jersey",
"com.sun.jersey.contribs",
"common-beanutils",
"commons-codec",
"commons-lang",
"commons-cli",
"commons-io",
"javax.activation",
"org.apache.httpcomponents",
"org.apache.zookeeper",
"org.codehaus.jackson",
"com.fasterxml.jackson",
"com.fasterxml.jackson.core",
"com.fasterxml.jackson.dataformat",
"com.fasterxml.jackson.datatype",
"org.roaringbitmap",
"net.java.dev.jets3t"
```
See the documentation comments in the `org.apache.druid.cli.PullDependencies` source code for more information.

development/overview.md
---
id: extensions
title: "Extensions"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
Druid implements an extension system that allows for adding functionality at runtime. Extensions
are commonly used to add support for deep storages (like HDFS and S3), metadata stores (like MySQL
and PostgreSQL), new aggregators, new input formats, and so on.
Production clusters will generally use at least two extensions; one for deep storage and one for a
metadata store. Many clusters will also use additional extensions.
## Core extensions
Core extensions are maintained by Druid committers.
|Name|Description|Docs|
|----|-----------|----|
|druid-avro-extensions|Support for data in Apache Avro data format.|[link](../development/extensions-core/avro.md)|
|druid-azure-extensions|Microsoft Azure deep storage.|[link](../development/extensions-core/azure.md)|
|druid-basic-security|Support for Basic HTTP authentication and role-based access control.|[link](../development/extensions-core/druid-basic-security.md)|
|druid-bloom-filter|Support for providing Bloom filters in druid queries.|[link](../development/extensions-core/bloom-filter.md)|
|druid-datasketches|Support for approximate counts and set operations with [Apache DataSketches](https://datasketches.apache.org/).|[link](../development/extensions-core/datasketches-extension.md)|
|druid-google-extensions|Google Cloud Storage deep storage.|[link](../development/extensions-core/google.md)|
|druid-hdfs-storage|HDFS deep storage.|[link](../development/extensions-core/hdfs.md)|
|druid-histogram|Approximate histograms and quantiles aggregator. Deprecated, please use the [DataSketches quantiles aggregator](../development/extensions-core/datasketches-quantiles.md) from the `druid-datasketches` extension instead.|[link](../development/extensions-core/approximate-histograms.md)|
|druid-kafka-extraction-namespace|Apache Kafka-based namespaced lookup. Requires namespace lookup extension.|[link](../development/extensions-core/kafka-extraction-namespace.md)|
|druid-kafka-indexing-service|Supervised exactly-once Apache Kafka ingestion for the indexing service.|[link](../development/extensions-core/kafka-ingestion.md)|
|druid-kinesis-indexing-service|Supervised exactly-once Kinesis ingestion for the indexing service.|[link](../development/extensions-core/kinesis-ingestion.md)|
|druid-kerberos|Kerberos authentication for druid processes.|[link](../development/extensions-core/druid-kerberos.md)|
|druid-lookups-cached-global|A module for [lookups](../querying/lookups.md) providing a jvm-global eager caching for lookups. It provides JDBC and URI implementations for fetching lookup data.|[link](../development/extensions-core/lookups-cached-global.md)|
|druid-lookups-cached-single|Per-lookup caching module to support use cases where a lookup needs to be isolated from the global pool of lookups.|[link](../development/extensions-core/druid-lookups.md)|
|druid-orc-extensions|Support for data in Apache ORC data format.|[link](../development/extensions-core/orc.md)|
|druid-parquet-extensions|Support for data in Apache Parquet data format. Requires druid-avro-extensions to be loaded.|[link](../development/extensions-core/parquet.md)|
|druid-protobuf-extensions| Support for data in Protobuf data format.|[link](../development/extensions-core/protobuf.md)|
|druid-ranger-security|Support for access control through Apache Ranger.|[link](../development/extensions-core/druid-ranger-security.md)|
|druid-s3-extensions|Interfacing with data in AWS S3, and using S3 as deep storage.|[link](../development/extensions-core/s3.md)|
|druid-ec2-extensions|Interfacing with AWS EC2 for autoscaling middle managers|UNDOCUMENTED|
|druid-stats|Statistics related module including variance and standard deviation.|[link](../development/extensions-core/stats.md)|
|mysql-metadata-storage|MySQL metadata store.|[link](../development/extensions-core/mysql.md)|
|postgresql-metadata-storage|PostgreSQL metadata store.|[link](../development/extensions-core/postgresql.md)|
|simple-client-sslcontext|Simple SSLContext provider module to be used by Druid's internal HttpClient when talking to other Druid processes over HTTPS.|[link](../development/extensions-core/simple-client-sslcontext.md)|
|druid-pac4j|OpenID Connect authentication for druid processes.|[link](../development/extensions-core/druid-pac4j.md)|
## Community extensions
> Community extensions are not maintained by Druid committers, although we accept patches from community members using these extensions. They may not have been as extensively tested as the core extensions.
A number of community members have contributed their own extensions to Druid that are not packaged with the default Druid tarball.
If you'd like to take on maintenance for a community extension, please post on [dev@druid.apache.org](https://lists.apache.org/list.html?dev@druid.apache.org) to let us know!
All of these community extensions can be downloaded using [pull-deps](../operations/pull-deps.md) while specifying a `-c` coordinate option to pull `org.apache.druid.extensions.contrib:{EXTENSION_NAME}:{DRUID_VERSION}`.
|Name|Description|Docs|
|----|-----------|----|
|aliyun-oss-extensions|Aliyun OSS deep storage |[link](../development/extensions-contrib/aliyun-oss-extensions.md)|
|ambari-metrics-emitter|Ambari Metrics Emitter |[link](../development/extensions-contrib/ambari-metrics-emitter.md)|
|druid-cassandra-storage|Apache Cassandra deep storage.|[link](../development/extensions-contrib/cassandra.md)|
|druid-cloudfiles-extensions|Rackspace Cloudfiles deep storage and firehose.|[link](../development/extensions-contrib/cloudfiles.md)|
|druid-distinctcount|DistinctCount aggregator|[link](../development/extensions-contrib/distinctcount.md)|
|druid-redis-cache|A cache implementation for Druid based on Redis.|[link](../development/extensions-contrib/redis-cache.md)|
|druid-time-min-max|Min/Max aggregator for timestamp.|[link](../development/extensions-contrib/time-min-max.md)|
|sqlserver-metadata-storage|Microsoft SQL Server metadata store.|[link](../development/extensions-contrib/sqlserver.md)|
|graphite-emitter|Graphite metrics emitter|[link](../development/extensions-contrib/graphite.md)|
|statsd-emitter|StatsD metrics emitter|[link](../development/extensions-contrib/statsd.md)|
|kafka-emitter|Kafka metrics emitter|[link](../development/extensions-contrib/kafka-emitter.md)|
|druid-thrift-extensions|Support thrift ingestion |[link](../development/extensions-contrib/thrift.md)|
|druid-opentsdb-emitter|OpenTSDB metrics emitter |[link](../development/extensions-contrib/opentsdb-emitter.md)|
|materialized-view-selection, materialized-view-maintenance|Materialized View|[link](../development/extensions-contrib/materialized-view.md)|
|druid-moving-average-query|Support for [Moving Average](https://en.wikipedia.org/wiki/Moving_average) and other Aggregate [Window Functions](https://en.wikibooks.org/wiki/Structured_Query_Language/Window_functions) in Druid queries.|[link](../development/extensions-contrib/moving-average-query.md)|
|druid-influxdb-emitter|InfluxDB metrics emitter|[link](../development/extensions-contrib/influxdb-emitter.md)|
|druid-momentsketch|Support for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library|[link](../development/extensions-contrib/momentsketch-quantiles.md)|
|druid-tdigestsketch|Support for approximate sketch aggregators based on [T-Digest](https://github.com/tdunning/t-digest)|[link](../development/extensions-contrib/tdigestsketch-quantiles.md)|
|gce-extensions|GCE Extensions|[link](../development/extensions-contrib/gce-extensions.md)|
## Promoting community extensions to core extensions
Please post on [dev@druid.apache.org](https://lists.apache.org/list.html?dev@druid.apache.org) if you'd like an extension to be promoted to core.
If we see a community extension actively supported by the community, we can promote it to core based on community feedback.
For information on how to create your own extension, see [here](../development/modules.md).
## Loading extensions
### Loading core extensions
Apache Druid bundles all [core extensions](../development/extensions.md#core-extensions) out of the box.
See the [list of extensions](../development/extensions.md#core-extensions) for your options. You
can load bundled extensions by adding their names to your common.runtime.properties
`druid.extensions.loadList` property. For example, to load the *postgresql-metadata-storage* and
*druid-hdfs-storage* extensions, use the configuration:
```
druid.extensions.loadList=["postgresql-metadata-storage", "druid-hdfs-storage"]
```
These extensions are located in the `extensions` directory of the distribution.
> Druid bundles two sets of configurations: one for the [quickstart](../tutorials/index.md) and
> one for a [clustered configuration](../tutorials/cluster.md). Make sure you are updating the correct
> common.runtime.properties for your setup.
> Because of licensing, the mysql-metadata-storage extension does not include the required MySQL JDBC driver. For instructions
> on how to install this library, see the [MySQL extension page](../development/extensions-core/mysql.md).
### Loading community extensions
You can also load community and third-party extensions not already bundled with Druid. To do this, first download the extension and
then install it into your `extensions` directory. You can download extensions from their distributors directly, or
if they are available from Maven, the included [pull-deps](../operations/pull-deps.md) can download them for you. To use *pull-deps*,
specify the full Maven coordinate of the extension in the form `groupId:artifactId:version`. For example,
for the (hypothetical) extension *com.example:druid-example-extension:1.0.0*, run:
```
java \
-cp "lib/*" \
-Ddruid.extensions.directory="extensions" \
-Ddruid.extensions.hadoopDependenciesDir="hadoop-dependencies" \
org.apache.druid.cli.Main tools pull-deps \
--no-default-hadoop \
-c "com.example:druid-example-extension:1.0.0"
```
You only have to install the extension once. Then, add `"druid-example-extension"` to
`druid.extensions.loadList` in common.runtime.properties to instruct Druid to load the extension.
> Please make sure all the Extensions related configuration properties listed [here](../configuration/index.md#extensions) are set correctly.
> The Maven groupId for almost every [community extension](../development/extensions.md#community-extensions) is org.apache.druid.extensions.contrib. The artifactId is the name
> of the extension, and the version is the latest Druid stable version.
### Loading extensions from the classpath
If you add your extension jar to the classpath at runtime, Druid will also load it into the system. This mechanism is relatively easy to reason about,
but it also means that you have to ensure that all dependency jars on the classpath are compatible. That is, Druid makes no provisions while using
this method to maintain class loader isolation so you must make sure that the jars on your classpath are mutually compatible.
