mirror of https://github.com/apache/druid.git

Add more Apache branding to docs (#7515)

parent 9929f8b022
commit 74960e82bf
@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid vs Elasticsearch"
+title: "Apache Druid (incubating) vs Elasticsearch"
 ---

 <!--

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid vs. Key/Value Stores (HBase/Cassandra/OpenTSDB)"
+title: "Apache Druid (incubating) vs. Key/Value Stores (HBase/Cassandra/OpenTSDB)"
 ---

 <!--

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid vs Kudu"
+title: "Apache Druid (incubating) vs Kudu"
 ---

 <!--

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid vs Redshift"
+title: "Apache Druid (incubating) vs Redshift"
 ---

 <!--

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid vs Spark"
+title: "Apache Druid (incubating) vs Spark"
 ---

 <!--

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid vs SQL-on-Hadoop"
+title: "Apache Druid (incubating) vs SQL-on-Hadoop"
 ---

 <!--

@@ -24,7 +24,7 @@ title: "Configuration Reference"

 # Configuration Reference

-This page documents all of the configuration properties for each Druid service type.
+This page documents all of the configuration properties for each Apache Druid (incubating) service type.

 ## Table of Contents
 * [Recommended Configuration File Organization](#recommended-configuration-file-organization)

@@ -24,7 +24,7 @@ title: "Logging"

 # Logging

-Druid processes will emit logs that are useful for debugging to the console. Druid processes also emit periodic metrics about their state. For more about metrics, see [Configuration](../configuration/index.html#enabling-metrics). Metric logs are printed to the console by default, and can be disabled with `-Ddruid.emitter.logging.logLevel=debug`.
+Apache Druid (incubating) processes will emit logs that are useful for debugging to the console. Druid processes also emit periodic metrics about their state. For more about metrics, see [Configuration](../configuration/index.html#enabling-metrics). Metric logs are printed to the console by default, and can be disabled with `-Ddruid.emitter.logging.logLevel=debug`.

 Druid uses [log4j2](http://logging.apache.org/log4j/2.x/) for logging. Logging can be configured with a log4j2.xml file. Add the path to the directory containing the log4j2.xml file (e.g. the _common/ dir) to your classpath if you want to override default Druid log configuration. Note that this directory should be earlier in the classpath than the druid jars. The easiest way to do this is to prefix the classpath with the config dir.

@@ -24,7 +24,7 @@ title: "Realtime Process Configuration"

 # Realtime Process Configuration

-For general Realtime Process information, see [here](../design/realtime.html).
+For general Apache Druid (incubating) Realtime Process information, see [here](../design/realtime.html).

 Runtime Configuration
 ---------------------

@@ -26,7 +26,7 @@ title: "Cassandra Deep Storage"

 ## Introduction

-Druid can use Cassandra as a deep storage mechanism. Segments and their metadata are stored in Cassandra in two tables:
+Apache Druid (incubating) can use Apache Cassandra as a deep storage mechanism. Segments and their metadata are stored in Cassandra in two tables:
 `index_storage` and `descriptor_storage`. Underneath the hood, the Cassandra integration leverages Astyanax. The
 index storage table is a [Chunked Object](https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store) repository. It contains
 compressed segments for distribution to Historical processes. Since segments can be large, the Chunked Object storage allows the integration to multi-thread

@@ -24,7 +24,7 @@ title: "Deep Storage"

 # Deep Storage

-Deep storage is where segments are stored. It is a storage mechanism that Druid does not provide. This deep storage infrastructure defines the level of durability of your data, as long as Druid processes can see this storage infrastructure and get at the segments stored on it, you will not lose data no matter how many Druid nodes you lose. If segments disappear from this storage layer, then you will lose whatever data those segments represented.
+Deep storage is where segments are stored. It is a storage mechanism that Apache Druid (incubating) does not provide. This deep storage infrastructure defines the level of durability of your data, as long as Druid processes can see this storage infrastructure and get at the segments stored on it, you will not lose data no matter how many Druid nodes you lose. If segments disappear from this storage layer, then you will lose whatever data those segments represented.

 ## Local Mount
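The "Local Mount" section referenced in the hunk above configures deep storage on a locally mounted filesystem. As a hedged illustration only (the property names come from the Druid configuration reference; the path is a placeholder, not something this diff specifies), the relevant runtime properties look like:

```properties
# Example only: store segments on a shared local mount (e.g. NFS).
druid.storage.type=local
druid.storage.storageDirectory=/mnt/druid/segments
```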
@ -24,7 +24,7 @@ title: "Metadata Storage"
|
|||
|
||||
# Metadata Storage
|
||||
|
||||
The Metadata Storage is an external dependency of Druid. Druid uses it to store
|
||||
The Metadata Storage is an external dependency of Apache Druid (incubating). Druid uses it to store
|
||||
various metadata about the system, but not to store the actual data. There are
|
||||
a number of tables used for various purposes described below.
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "ZooKeeper"
|
|||
|
||||
# ZooKeeper
|
||||
|
||||
Druid uses [ZooKeeper](http://zookeeper.apache.org/) (ZK) for management of current cluster state. The operations that happen over ZK are
|
||||
Apache Druid (incubating) uses [Apache ZooKeeper](http://zookeeper.apache.org/) (ZK) for management of current cluster state. The operations that happen over ZK are
|
||||
|
||||
1. [Coordinator](../design/coordinator.html) leader election
|
||||
2. Segment "publishing" protocol from [Historical](../design/historical.html) and [Realtime](../design/realtime.html)
|
||||
|
|
|
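For orientation on the ZooKeeper hunk above: a process is pointed at the ZooKeeper ensemble through runtime properties. A minimal, hedged sketch (property names from the Druid configuration reference; hosts are placeholders):

```properties
# Example only: ZooKeeper ensemble used for cluster coordination.
druid.zk.service.host=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
druid.zk.paths.base=/druid
```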
@ -24,6 +24,8 @@ title: "Authentication and Authorization"
|
|||
|
||||
# Authentication and Authorization
|
||||
|
||||
This document describes non-extension specific Apache Druid (incubating) authentication and authorization configurations.
|
||||
|
||||
|Property|Type|Description|Default|Required|
|
||||
|--------|-----------|--------|--------|--------|
|
||||
|`druid.auth.authenticatorChain`|JSON List of Strings|List of Authenticator type names|["allowAll"]|no|
|
||||
|
|
|
@ -26,7 +26,7 @@ title: "Broker"
|
|||
|
||||
### Configuration
|
||||
|
||||
For Broker Process Configuration, see [Broker Configuration](../configuration/index.html#broker).
|
||||
For Apache Druid (incubating) Broker Process Configuration, see [Broker Configuration](../configuration/index.html#broker).
|
||||
|
||||
### HTTP endpoints
|
||||
|
||||
|
@ -45,7 +45,7 @@ org.apache.druid.cli.Main server broker
|
|||
|
||||
### Forwarding Queries
|
||||
|
||||
Most druid queries contain an interval object that indicates a span of time for which data is requested. Likewise, Druid [Segments](../design/segments.html) are partitioned to contain data for some interval of time and segments are distributed across a cluster. Consider a simple datasource with 7 segments where each segment contains data for a given day of the week. Any query issued to the datasource for more than one day of data will hit more than one segment. These segments will likely be distributed across multiple processes, and hence, the query will likely hit multiple processes.
|
||||
Most Druid queries contain an interval object that indicates a span of time for which data is requested. Likewise, Druid [Segments](../design/segments.html) are partitioned to contain data for some interval of time and segments are distributed across a cluster. Consider a simple datasource with 7 segments where each segment contains data for a given day of the week. Any query issued to the datasource for more than one day of data will hit more than one segment. These segments will likely be distributed across multiple processes, and hence, the query will likely hit multiple processes.
|
||||
|
||||
To determine which processes to forward queries to, the Broker process first builds a view of the world from information in Zookeeper. Zookeeper maintains information about [Historical](../design/historical.html) and streaming ingestion [Peon](../design/peons.html) processes and the segments they are serving. For every datasource in Zookeeper, the Broker process builds a timeline of segments and the processes that serve them. When queries are received for a specific datasource and interval, the Broker process performs a lookup into the timeline associated with the query datasource for the query interval and retrieves the processes that contain data for the query. The Broker process then forwards down the query to the selected processes.
|
||||
|
||||
|
|
|
@ -26,7 +26,7 @@ title: "Coordinator Process"
|
|||
|
||||
### Configuration
|
||||
|
||||
For Coordinator Process Configuration, see [Coordinator Configuration](../configuration/index.html#coordinator).
|
||||
For Apache Druid (incubating) Coordinator Process Configuration, see [Coordinator Configuration](../configuration/index.html#coordinator).
|
||||
|
||||
### HTTP endpoints
|
||||
|
||||
|
|
|
@ -26,7 +26,7 @@ title: "Historical Process"
|
|||
|
||||
### Configuration
|
||||
|
||||
For Historical Process Configuration, see [Historical Configuration](../configuration/index.html#historical).
|
||||
For Apache Druid (incubating) Historical Process Configuration, see [Historical Configuration](../configuration/index.html#historical).
|
||||
|
||||
### HTTP Endpoints
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: doc_page
|
||||
title: "Design"
|
||||
title: "Apache Druid (incubating) Design"
|
||||
---
|
||||
|
||||
<!--
|
||||
|
@ -24,7 +24,7 @@ title: "Design"
|
|||
|
||||
# What is Druid?<a id="what-is-druid"></a>
|
||||
|
||||
Druid is a data store designed for high-performance slice-and-dice analytics
|
||||
Apache Druid (incubating) is a data store designed for high-performance slice-and-dice analytics
|
||||
("[OLAP](http://en.wikipedia.org/wiki/Online_analytical_processing)"-style) on large data sets. Druid is most often
|
||||
used as a data store for powering GUI analytical applications, or as a backend for highly-concurrent APIs that need
|
||||
fast aggregations. Common application areas for Druid include:
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Indexing Service"
|
|||
|
||||
# Indexing Service
|
||||
|
||||
The indexing service is a highly-available, distributed service that runs indexing related tasks.
|
||||
The Apache Druid (incubating) indexing service is a highly-available, distributed service that runs indexing related tasks.
|
||||
|
||||
Indexing [tasks](../ingestion/tasks.html) create (and sometimes destroy) Druid [segments](../design/segments.html). The indexing service has a master/slave like architecture.
|
||||
|
||||
|
|
|
@ -26,7 +26,7 @@ title: "MiddleManager Process"
|
|||
|
||||
### Configuration
|
||||
|
||||
For Middlemanager Process Configuration, see [Indexing Service Configuration](../configuration/index.html#middlemanager-and-peons).
|
||||
For Apache Druid (incubating) Middlemanager Process Configuration, see [Indexing Service Configuration](../configuration/index.html#middlemanager-and-peons).
|
||||
|
||||
### HTTP Endpoints
|
||||
|
||||
|
|
|
@ -26,7 +26,7 @@ title: "Overlord Process"
|
|||
|
||||
### Configuration
|
||||
|
||||
For Overlord Process Configuration, see [Overlord Configuration](../configuration/index.html#overlord).
|
||||
For Apache Druid (incubating) Overlord Process Configuration, see [Overlord Configuration](../configuration/index.html#overlord).
|
||||
|
||||
### HTTP Endpoints
|
||||
|
||||
|
|
|
@ -26,7 +26,7 @@ title: "Peons"
|
|||
|
||||
### Configuration
|
||||
|
||||
For Peon Configuration, see [Peon Query Configuration](../configuration/index.html#peon-query-configuration) and [Additional Peon Configuration](../configuration/index.html#additional-peon-configuration).
|
||||
For Apache Druid (incubating) Peon Configuration, see [Peon Query Configuration](../configuration/index.html#peon-query-configuration) and [Additional Peon Configuration](../configuration/index.html#additional-peon-configuration).
|
||||
|
||||
### HTTP Endpoints
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: doc_page
|
||||
title: "Druid Plumbers"
|
||||
title: "Apache Druid (incubating) Plumbers"
|
||||
---
|
||||
|
||||
<!--
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: doc_page
|
||||
title: "Druid Processes and Servers"
|
||||
title: "Apache Druid (incubating) Processes and Servers"
|
||||
---
|
||||
|
||||
<!--
|
||||
|
|
|
@ -28,7 +28,7 @@ title: "Real-time Process"
|
|||
NOTE: Realtime processes are deprecated. Please use the <a href="../development/extensions-core/kafka-ingestion.html">Kafka Indexing Service</a> for stream pull use cases instead.
|
||||
</div>
|
||||
|
||||
For Real-time Process Configuration, see [Realtime Configuration](../configuration/realtime.html).
|
||||
For Apache Druid (incubating) Real-time Process Configuration, see [Realtime Configuration](../configuration/realtime.html).
|
||||
|
||||
For Real-time Ingestion, see [Realtime Ingestion](../ingestion/stream-ingestion.html).
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Segments"
|
|||
|
||||
# Segments
|
||||
|
||||
Druid stores its index in *segment files*, which are partitioned by
|
||||
Apache Druid (incubating) stores its index in *segment files*, which are partitioned by
|
||||
time. In a basic setup, one segment file is created for each time
|
||||
interval, where the time interval is configurable in the
|
||||
`segmentGranularity` parameter of the `granularitySpec`, which is
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Build from Source"
|
|||
|
||||
# Build from Source
|
||||
|
||||
You can build Druid directly from source. Please note that these instructions are for building the latest stable version of Druid.
|
||||
You can build Apache Druid (incubating) directly from source. Please note that these instructions are for building the latest stable version of Druid.
|
||||
For building the latest code in master, follow the instructions [here](https://github.com/apache/incubator-druid/blob/master/docs/content/development/build.md).
|
||||
|
||||
|
||||
|
|
|
@ -36,4 +36,4 @@ To enable experimental features, include their artifacts in the configuration ru
|
|||
druid.extensions.loadList=["druid-histogram"]
|
||||
```
|
||||
|
||||
The configuration files for all the Druid processes need to be updated with this.
|
||||
The configuration files for all the Apache Druid (incubating) processes need to be updated with this.
|
||||
|
|
|
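The hunk above ends on the note that every process's configuration must carry the extension list. As a hedged sketch of what that usually means in practice (the file path is the conventional layout, not something this diff specifies), the shared properties file carries the `loadList`:

```properties
# conf/druid/_common/common.runtime.properties (path is an example)
# Every Druid process that reads this file loads the extensions named here.
druid.extensions.loadList=["druid-histogram"]
```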
@ -24,11 +24,11 @@ title: "Ambari Metrics Emitter"
|
|||
|
||||
# Ambari Metrics Emitter
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `ambari-metrics-emitter` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `ambari-metrics-emitter` extension.
|
||||
|
||||
## Introduction
|
||||
|
||||
This extension emits druid metrics to a ambari-metrics carbon server.
|
||||
This extension emits Druid metrics to a ambari-metrics carbon server.
|
||||
Events are sent after been [pickled](http://ambari-metrics.readthedocs.org/en/latest/feeding-carbon.html#the-pickle-protocol); the size of the batch is configurable.
|
||||
|
||||
## Configuration
|
||||
|
|
|
@ -24,11 +24,11 @@ title: "Microsoft Azure"
|
|||
|
||||
# Microsoft Azure
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-azure-extensions` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-azure-extensions` extension.
|
||||
|
||||
## Deep Storage
|
||||
|
||||
[Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional druid configuration.
|
||||
[Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional Druid configuration.
|
||||
|
||||
|Property|Possible Values|Description|Default|
|
||||
|--------|---------------|-----------|-------|
|
||||
|
|
|
@ -24,8 +24,8 @@ title: "Apache Cassandra"
|
|||
|
||||
# Apache Cassandra
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-cassandra-storage` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-cassandra-storage` extension.
|
||||
|
||||
[Apache Cassandra](http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra) can also
|
||||
be leveraged for deep storage. This requires some additional druid configuration as well as setting up the necessary
|
||||
be leveraged for deep storage. This requires some additional Druid configuration as well as setting up the necessary
|
||||
schema within a Cassandra keystore.
|
||||
|
|
|
@ -24,9 +24,11 @@ title: "Rackspace Cloud Files"
|
|||
|
||||
# Rackspace Cloud Files
|
||||
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-cloudfiles-extensions` extension.
|
||||
|
||||
## Deep Storage
|
||||
|
||||
[Rackspace Cloud Files](http://www.rackspace.com/cloud/files/) is another option for deep storage. This requires some additional druid configuration.
|
||||
[Rackspace Cloud Files](http://www.rackspace.com/cloud/files/) is another option for deep storage. This requires some additional Druid configuration.
|
||||
|
||||
|Property|Possible Values|Description|Default|
|
||||
|--------|---------------|-----------|-------|
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "DistinctCount Aggregator"
|
|||
|
||||
# DistinctCount Aggregator
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) the `druid-distinctcount` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) the `druid-distinctcount` extension.
|
||||
|
||||
Additionally, follow these steps:
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Google Cloud Storage"
|
|||
|
||||
# Google Cloud Storage
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-google-extensions` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-google-extensions` extension.
|
||||
|
||||
## Deep Storage
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Graphite Emitter"
|
|||
|
||||
# Graphite Emitter
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `graphite-emitter` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `graphite-emitter` extension.
|
||||
|
||||
## Introduction
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "InfluxDB Line Protocol Parser"
|
|||
|
||||
# InfluxDB Line Protocol Parser
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-influx-extensions`.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-influx-extensions`.
|
||||
|
||||
This extension enables Druid to parse the [InfluxDB Line Protocol](https://docs.influxdata.com/influxdb/v1.5/write_protocols/line_protocol_tutorial/), a popular text-based timeseries metric serialization format.
|
||||
|
||||
|
|
|
@ -24,11 +24,11 @@ title: "Kafka Emitter"
|
|||
|
||||
# Kafka Emitter
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `kafka-emitter` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `kafka-emitter` extension.
|
||||
|
||||
## Introduction
|
||||
|
||||
This extension emits Druid metrics to a [Kafka](https://kafka.apache.org) directly with JSON format.<br>
|
||||
This extension emits Druid metrics to [Apache Kafka](https://kafka.apache.org) directly with JSON format.<br>
|
||||
Currently, Kafka has not only their nice ecosystem but also consumer API readily available.
|
||||
So, If you currently use Kafka, It's easy to integrate various tool or UI
|
||||
to monitor the status of your Druid cluster with this extension.
|
||||
|
|
|
@ -24,11 +24,11 @@ title: "Kafka Simple Consumer"
|
|||
|
||||
# Kafka Simple Consumer
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-kafka-eight-simpleConsumer` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-kafka-eight-simpleConsumer` extension.
|
||||
|
||||
## Firehose
|
||||
|
||||
This is an experimental firehose to ingest data from kafka using kafka simple consumer api. Currently, this firehose would only work inside standalone realtime processes.
|
||||
This is an experimental firehose to ingest data from Apache Kafka using the Kafka simple consumer api. Currently, this firehose would only work inside standalone realtime processes.
|
||||
The configuration for KafkaSimpleConsumerFirehose is similar to the Kafka Eight Firehose , except `firehose` should be replaced with `firehoseV2` like this:
|
||||
|
||||
```json
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Materialized View"
|
|||
|
||||
# Materialized View
|
||||
|
||||
To use this feature, make sure to only load `materialized-view-selection` on Broker and load `materialized-view-maintenance` on Overlord. In addtion, this feature currently requires a Hadoop cluster.
|
||||
To use this Apache Druid (incubating) feature, make sure to only load `materialized-view-selection` on Broker and load `materialized-view-maintenance` on Overlord. In addtion, this feature currently requires a Hadoop cluster.
|
||||
|
||||
This feature enables Druid to greatly improve the query performance, especially when the query dataSource has a very large number of dimensions but the query only required several dimensions. This feature includes two parts. One is `materialized-view-maintenance`, and the other is `materialized-view-selection`.
|
||||
|
||||
|
|
|
@ -24,10 +24,10 @@ title: "Moment Sketches for Approximate Quantiles module"
|
|||
|
||||
# MomentSketch Quantiles Sketch module
|
||||
|
||||
This module provides Druid aggregators for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library.
|
||||
This module provides aggregators for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library.
|
||||
The momentsketch provides coarse quantile estimates with less space and aggregation time overheads than traditional sketches, approaching the performance of counts and sums by reconstructing distributions from computed statistics.
|
||||
|
||||
To use this aggregator, make sure you [include](../../operations/including-extensions.html) the extension in your config file:
|
||||
To use this Apache Druid (incubating) extension, make sure you [include](../../operations/including-extensions.html) the extension in your config file:
|
||||
|
||||
```
|
||||
druid.extensions.loadList=["druid-momentsketch"]
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "OpenTSDB Emitter"
|
|||
|
||||
# OpenTSDB Emitter
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `opentsdb-emitter` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `opentsdb-emitter` extension.
|
||||
|
||||
## Introduction
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "RabbitMQ"
|
|||
|
||||
# RabbitMQ
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-rabbitmq` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-rabbitmq` extension.
|
||||
|
||||
## Firehose
|
||||
|
||||
|
|
|
@ -24,6 +24,8 @@ title: "Druid Redis Cache"
|
|||
|
||||
# Druid Redis Cache
|
||||
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-redis-cache` extension.
|
||||
|
||||
A cache implementation for Druid based on [Redis](https://github.com/antirez/redis).
|
||||
|
||||
# Configuration
|
||||
|
|
|
@ -24,6 +24,6 @@ title: "RocketMQ"
|
|||
|
||||
# RocketMQ
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-rocketmq` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-rocketmq` extension.
|
||||
|
||||
Original author: [https://github.com/lizhanhui](https://github.com/lizhanhui).
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Microsoft SQLServer"
|
|||
|
||||
# Microsoft SQLServer
|
||||
|
||||
Make sure to [include](../../operations/including-extensions.html) `sqlserver-metadata-storage` as an extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `sqlserver-metadata-storage` as an extension.
|
||||
|
||||
## Setting up SQLServer
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "StatsD Emitter"
|
|||
|
||||
# StatsD Emitter
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `statsd-emitter` extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `statsd-emitter` extension.
|
||||
|
||||
## Introduction
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Thrift"
|
|||
|
||||
# Thrift
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-thrift-extensions`.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-thrift-extensions`.
|
||||
|
||||
This extension enables Druid to ingest thrift compact data online (`ByteBuffer`) and offline (SequenceFile of type `<Writable, BytesWritable>` or LzoThriftBlock File).
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Timestamp Min/Max aggregators"
|
|||
|
||||
# Timestamp Min/Max aggregators
|
||||
|
||||
To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-time-min-max`.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-time-min-max`.
|
||||
|
||||
These aggregators enable more precise calculation of min and max time of given events than `__time` column whose granularity is sparse, the same as query granularity.
|
||||
To use this feature, a "timeMin" or "timeMax" aggregator must be included at indexing time.
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Approximate Histogram aggregators"
|
|||
|
||||
# Approximate Histogram aggregators
|
||||
|
||||
Make sure to [include](../../operations/including-extensions.html) `druid-histogram` as an extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-histogram` as an extension.
|
||||
|
||||
The `druid-histogram` extension provides an approximate histogram aggregator and a fixed buckets histogram aggregator.
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Avro"
|
|||
|
||||
# Avro
|
||||
|
||||
This extension enables Druid to ingest and understand the Apache Avro data format. Make sure to [include](../../operations/including-extensions.html) `druid-avro-extensions` as an extension.
|
||||
This Apache Druid (incubating) extension enables Druid to ingest and understand the Apache Avro data format. Make sure to [include](../../operations/including-extensions.html) `druid-avro-extensions` as an extension.
|
||||
|
||||
### Avro Stream Parser
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Bloom Filter"
|
|||
|
||||
# Bloom Filter
|
||||
|
||||
This extension adds the ability to both construct bloom filters from query results, and filter query results by testing
|
||||
This Apache Druid (incubating) extension adds the ability to both construct bloom filters from query results, and filter query results by testing
|
||||
against a bloom filter. Make sure to [include](../../operations/including-extensions.html) `druid-bloom-filter` as an
|
||||
extension.
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "DataSketches extension"
|
|||
|
||||
# DataSketches extension
|
||||
|
||||
Druid aggregators based on [datasketches](http://datasketches.github.io/) library. Sketches are data structures implementing approximate streaming mergeable algorithms. Sketches can be ingested from the outside of Druid or built from raw data at ingestion time. Sketches can be stored in Druid segments as additive metrics.
|
||||
Apache Druid (incubating) aggregators based on [datasketches](http://datasketches.github.io/) library. Sketches are data structures implementing approximate streaming mergeable algorithms. Sketches can be ingested from the outside of Druid or built from raw data at ingestion time. Sketches can be stored in Druid segments as additive metrics.
|
||||
|
||||
To use the datasketches aggregators, make sure you [include](../../operations/including-extensions.html) the extension in your config file:
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "DataSketches HLL Sketch module"
|
|||
|
||||
# DataSketches HLL Sketch module
|
||||
|
||||
This module provides Druid aggregators for distinct counting based on HLL sketch from [datasketches](http://datasketches.github.io/) library. At ingestion time, this aggregator creates the HLL sketch objects to be stored in Druid segments. At query time, sketches are read and merged together. In the end, by default, you receive the estimate of the number of distinct values presented to the sketch. Also, you can use post aggregator to produce a union of sketch columns in the same row.
|
||||
This module provides Apache Druid (incubating) aggregators for distinct counting based on HLL sketch from [datasketches](http://datasketches.github.io/) library. At ingestion time, this aggregator creates the HLL sketch objects to be stored in Druid segments. At query time, sketches are read and merged together. In the end, by default, you receive the estimate of the number of distinct values presented to the sketch. Also, you can use post aggregator to produce a union of sketch columns in the same row.
|
||||
You can use the HLL sketch aggregator on columns of any identifiers. It will return estimated cardinality of the column.
|
||||
|
||||
To use this aggregator, make sure you [include](../../operations/including-extensions.html) the extension in your config file:
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "DataSketches Quantiles Sketch module"
|
|||
|
||||
# DataSketches Quantiles Sketch module
|
||||
|
||||
This module provides Druid aggregators based on numeric quantiles DoublesSketch from [datasketches](http://datasketches.github.io/) library. Quantiles sketch is a mergeable streaming algorithm to estimate the distribution of values, and approximately answer queries about the rank of a value, probability mass function of the distribution (PMF) or histogram, cummulative distribution function (CDF), and quantiles (median, min, max, 95th percentile and such). See [Quantiles Sketch Overview](https://datasketches.github.io/docs/Quantiles/QuantilesOverview.html).
|
||||
This module provides Apache Druid (incubating) aggregators based on numeric quantiles DoublesSketch from [datasketches](http://datasketches.github.io/) library. Quantiles sketch is a mergeable streaming algorithm to estimate the distribution of values, and approximately answer queries about the rank of a value, probability mass function of the distribution (PMF) or histogram, cummulative distribution function (CDF), and quantiles (median, min, max, 95th percentile and such). See [Quantiles Sketch Overview](https://datasketches.github.io/docs/Quantiles/QuantilesOverview.html).
|
||||
|
||||
There are three major modes of operation:
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "DataSketches Theta Sketch module"
|
|||
|
||||
# DataSketches Theta Sketch module
|
||||
|
||||
This module provides Druid aggregators based on Theta sketch from [datasketches](http://datasketches.github.io/) library. Note that sketch algorithms are approximate; see details in the "Accuracy" section of the datasketches doc.
|
||||
This module provides Apache Druid (incubating) aggregators based on Theta sketch from [datasketches](http://datasketches.github.io/) library. Note that sketch algorithms are approximate; see details in the "Accuracy" section of the datasketches doc.
|
||||
At ingestion time, this aggregator creates the Theta sketch objects which get stored in Druid segments. Logically speaking, a Theta sketch object can be thought of as a Set data structure. At query time, sketches are read and aggregated (set unioned) together. In the end, by default, you receive the estimate of the number of unique entries in the sketch object. Also, you can use post aggregators to do union, intersection or difference on sketch columns in the same row.
|
||||
Note that you can use `thetaSketch` aggregator on columns which were not ingested using the same. It will return estimated cardinality of the column. It is recommended to use it at ingestion time as well to make querying faster.
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "DataSketches Tuple Sketch module"
|
|||
|
||||
# DataSketches Tuple Sketch module
|
||||
|
||||
This module provides Druid aggregators based on Tuple sketch from [datasketches](http://datasketches.github.io/) library. ArrayOfDoublesSketch sketches extend the functionality of the count-distinct Theta sketches by adding arrays of double values associated with unique keys.
|
||||
This module provides Apache Druid (incubating) aggregators based on Tuple sketch from [datasketches](http://datasketches.github.io/) library. ArrayOfDoublesSketch sketches extend the functionality of the count-distinct Theta sketches by adding arrays of double values associated with unique keys.
|
||||
|
||||
To use this aggregator, make sure you [include](../../operations/including-extensions.html) the extension in your config file:
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Basic Security"
|
|||
|
||||
# Druid Basic Security
|
||||
|
||||
This extension adds:
|
||||
This Apache Druid (incubating) extension adds:
|
||||
- an Authenticator which supports [HTTP Basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication)
|
||||
- an Authorizer which implements basic role-based access control
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "Kerberos"
|
|||
|
||||
# Kerberos
|
||||
|
||||
Druid Extension to enable Authentication for Druid Processes using Kerberos.
|
||||
Apache Druid (incubating) Extension to enable Authentication for Druid Processes using Kerberos.
|
||||
This extension adds an Authenticator which is used to protect HTTP Endpoints using the simple and protected GSSAPI negotiation mechanism [SPNEGO](https://en.wikipedia.org/wiki/SPNEGO).
|
||||
Make sure to [include](../../operations/including-extensions.html) `druid-kerberos` as an extension.
|
||||
|
||||
|
|
|
@ -27,7 +27,7 @@ title: "Cached Lookup Module"
|
|||
<div class="note info">Please note that this is an experimental module and the development/testing still at early stage. Feel free to try it and give us your feedback.</div>
|
||||
|
||||
## Description
|
||||
This module provides a per-lookup caching mechanism for JDBC data sources.
|
||||
This Apache Druid (incubating) module provides a per-lookup caching mechanism for JDBC data sources.
|
||||
The main goal of this cache is to speed up the access to a high latency lookup sources and to provide a caching isolation for every lookup source.
|
||||
Thus user can define various caching strategies or and implementation per lookup, even if the source is the same.
|
||||
This module can be used side to side with other lookup module like the global cached lookup module.
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "HDFS"
|
|||
|
||||
# HDFS
|
||||
|
||||
Make sure to [include](../../operations/including-extensions.html) `druid-hdfs-storage` as an extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-hdfs-storage` as an extension.
|
||||
|
||||
## Deep Storage
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: doc_page
|
||||
title: "Kafka Eight Firehose"
|
||||
title: "Apache Kafka Eight Firehose"
|
||||
---
|
||||
|
||||
<!--
|
||||
|
@ -24,7 +24,7 @@ title: "Kafka Eight Firehose"
|
|||
|
||||
# Kafka Eight Firehose
|
||||
|
||||
Make sure to [include](../../operations/including-extensions.html) `druid-kafka-eight` as an extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-kafka-eight` as an extension.
|
||||
|
||||
This firehose acts as a Kafka 0.8.x consumer and ingests data from Kafka.
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: doc_page
|
||||
title: "Kafka Lookups"
|
||||
title: "Apache Kafka Lookups"
|
||||
---
|
||||
|
||||
<!--
|
||||
|
@ -28,7 +28,7 @@ title: "Kafka Lookups"
|
|||
Lookups are an <a href="../experimental.html">experimental</a> feature.
|
||||
</div>
|
||||
|
||||
Make sure to [include](../../operations/including-extensions.html) `druid-lookups-cached-global` and `druid-kafka-extraction-namespace` as an extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-lookups-cached-global` and `druid-kafka-extraction-namespace` as an extension.
|
||||
|
||||
If you need updates to populate as promptly as possible, it is possible to plug into a kafka topic whose key is the old value and message is the desired new value (both in UTF-8) as a LookupExtractorFactory.
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: doc_page
|
||||
title: "Kafka Indexing Service"
|
||||
title: "Apache Kafka Indexing Service"
|
||||
---
|
||||
|
||||
<!--
|
||||
|
@ -31,7 +31,7 @@ able to read non-recent events from Kafka and are not subject to the window peri
|
|||
ingestion mechanisms using Tranquility. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures,
|
||||
and ensure that the scalability and replication requirements are maintained.
|
||||
|
||||
This service is provided in the `druid-kafka-indexing-service` core extension (see
|
||||
This service is provided in the `druid-kafka-indexing-service` core Apache Druid (incubating) extension (see
|
||||
[Including Extensions](../../operations/including-extensions.html)).
|
||||
|
||||
<div class="note info">
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: doc_page
|
||||
title: "Kinesis Indexing Service"
|
||||
title: "Amazon Kinesis Indexing Service"
|
||||
---
|
||||
|
||||
<!--
|
||||
|
@ -31,7 +31,7 @@ able to read non-recent events from Kinesis and are not subject to the window pe
|
|||
ingestion mechanisms using Tranquility. The supervisor oversees the state of the indexing tasks to coordinate handoffs, manage failures,
|
||||
and ensure that the scalability and replication requirements are maintained.
|
||||
|
||||
The Kinesis indexing service is provided as the `druid-kinesis-indexing-service` core extension (see
|
||||
The Kinesis indexing service is provided as the `druid-kinesis-indexing-service` core Apache Druid (incubating) extension (see
|
||||
[Including Extensions](../../operations/including-extensions.html)). Please note that this is
|
||||
currently designated as an *experimental feature* and is subject to the usual
|
||||
[experimental caveats](../experimental.html).
|
||||
|
|
|
@ -28,7 +28,7 @@ title: "Globally Cached Lookups"
|
|||
Lookups are an <a href="../experimental.html">experimental</a> feature.
|
||||
</div>
|
||||
|
||||
Make sure to [include](../../operations/including-extensions.html) `druid-lookups-cached-global` as an extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-lookups-cached-global` as an extension.
|
||||
|
||||
## Configuration
|
||||
<div class="note caution">
|
||||
|
|
|
@ -24,7 +24,7 @@ title: "MySQL Metadata Store"
|
|||
|
||||
# MySQL Metadata Store
|
||||
|
||||
Make sure to [include](../../operations/including-extensions.html) `mysql-metadata-storage` as an extension.
|
||||
To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `mysql-metadata-storage` as an extension.
|
||||
|
||||
<div class="note caution">
|
||||
The MySQL extension requires the MySQL Connector/J library which is not included in the Druid distribution.
|
||||
|
|
|
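For context on the MySQL metadata store hunk above, a hedged sketch of the runtime properties involved (property names come from the Druid configuration reference; host, database, and credentials are placeholders):

```properties
# Example only: point the Druid metadata store at MySQL.
druid.extensions.loadList=["mysql-metadata-storage"]
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://db.example.com:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=diurd
```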
@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid ORC Extension"
+title: "ORC Extension"
 ---

 <!--

@@ -22,9 +22,9 @@ title: "Druid ORC Extension"
 ~ under the License.
 -->

-# Druid ORC Extension
+# ORC Extension

-This module extends [Druid Hadoop based indexing](../../ingestion/hadoop.html) to ingest data directly from offline
+This Apache Druid (incubating) module extends [Druid Hadoop based indexing](../../ingestion/hadoop.html) to ingest data directly from offline
 Apache ORC files.

 To use this extension, make sure to [include](../../operations/including-extensions.html) `druid-orc-extensions`.

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid Parquet Extension"
+title: "Apache Parquet Extension"
 ---

 <!--

@@ -22,9 +22,9 @@ title: "Druid Parquet Extension"
 ~ under the License.
 -->

-# Druid Parquet Extension
+# Apache Parquet Extension

-This module extends [Druid Hadoop based indexing](../../ingestion/hadoop.html) to ingest data directly from offline
+This Apache Druid (incubating) module extends [Druid Hadoop based indexing](../../ingestion/hadoop.html) to ingest data directly from offline
 Apache Parquet files.

 Note: `druid-parquet-extensions` depends on the `druid-avro-extensions` module, so be sure to

@@ -24,7 +24,7 @@ title: "PostgreSQL Metadata Store"

 # PostgreSQL Metadata Store

-Make sure to [include](../../operations/including-extensions.html) `postgresql-metadata-storage` as an extension.
+To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `postgresql-metadata-storage` as an extension.

 ## Setting up PostgreSQL

@@ -24,7 +24,7 @@ title: "Protobuf"

 # Protobuf

-This extension enables Druid to ingest and understand the Protobuf data format. Make sure to [include](../../operations/including-extensions.html) `druid-protobuf-extensions` as an extension.
+This Apache Druid (incubating) extension enables Druid to ingest and understand the Protobuf data format. Make sure to [include](../../operations/including-extensions.html) `druid-protobuf-extensions` as an extension.

 ## Protobuf Parser

@@ -24,7 +24,7 @@ title: "S3-compatible"

 # S3-compatible

-Make sure to [include](../../operations/including-extensions.html) `druid-s3-extensions` as an extension.
+To use this Apache Druid (incubating) extension, make sure to [include](../../operations/including-extensions.html) `druid-s3-extensions` as an extension.

 ## Deep Storage

@@ -24,7 +24,7 @@ title: "Simple SSLContext Provider Module"

 # Simple SSLContext Provider Module

-This module contains a simple implementation of [SSLContext](http://docs.oracle.com/javase/8/docs/api/javax/net/ssl/SSLContext.html)
+This Apache Druid (incubating) module contains a simple implementation of [SSLContext](http://docs.oracle.com/javase/8/docs/api/javax/net/ssl/SSLContext.html)
 that will be injected to be used with HttpClient that Druid processes use internally to communicate with each other. To learn more about
 Java's SSL support, please refer to [this](http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html) guide.

@@ -24,7 +24,7 @@ title: "Stats aggregator"

 # Stats aggregator

-Includes stat-related aggregators, including variance and standard deviations, etc. Make sure to [include](../../operations/including-extensions.html) `druid-stats` as an extension.
+This Apache Druid (incubating) extension includes stat-related aggregators, including variance and standard deviations, etc. Make sure to [include](../../operations/including-extensions.html) `druid-stats` as an extension.

 ## Variance aggregator

@@ -24,7 +24,7 @@ title: "Test Stats Aggregators"

 # Test Stats Aggregators

-Incorporates test statistics related aggregators, including z-score and p-value. Please refer to [https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/](https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/) for math background and details.
+This Apache Druid (incubating) extension incorporates test statistics related aggregators, including z-score and p-value. Please refer to [https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/](https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/) for math background and details.

 Make sure to include `druid-stats` extension in order to use these aggregrators.

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid extensions"
+title: "Apache Druid (incubating) extensions"
 ---

 <!--

@@ -24,7 +24,7 @@ title: "Geographic Queries"

 # Geographic Queries

-Druid supports filtering specially spatially indexed columns based on an origin and a bound.
+Apache Druid (incubating) supports filtering specially spatially indexed columns based on an origin and a bound.

 # Spatial Indexing
 In any of the data specs, there is the option of providing spatial dimensions. For example, for a JSON data spec, spatial dimensions can be specified as follows:

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Integrating Druid With Other Technologies"
+title: "Integrating Apache Druid (incubating) With Other Technologies"
 ---

 <!--

@@ -24,7 +24,7 @@ title: "JavaScript Programming Guide"

 # JavaScript Programming Guide

-This page discusses how to use JavaScript to extend Druid.
+This page discusses how to use JavaScript to extend Apache Druid (incubating).

 ## Examples

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Extending Druid With Custom Modules"
+title: "Extending Apache Druid (incubating) With Custom Modules"
 ---

 <!--

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Developing on Druid"
+title: "Developing on Apache Druid (incubating)"
 ---

 <!--

@@ -24,7 +24,7 @@ title: "Router Process"

 # Router Process

-The Router process can be used to route queries to different Broker processes. By default, the broker routes queries based on how [Rules](../operations/rule-configuration.html) are set up. For example, if 1 month of recent data is loaded into a `hot` cluster, queries that fall within the recent month can be routed to a dedicated set of brokers. Queries outside this range are routed to another set of brokers. This set up provides query isolation such that queries for more important data are not impacted by queries for less important data.
+The Apache Druid (incubating) Router process can be used to route queries to different Broker processes. By default, the broker routes queries based on how [Rules](../operations/rule-configuration.html) are set up. For example, if 1 month of recent data is loaded into a `hot` cluster, queries that fall within the recent month can be routed to a dedicated set of brokers. Queries outside this range are routed to another set of brokers. This set up provides query isolation such that queries for more important data are not impacted by queries for less important data.

 For query routing purposes, you should only ever need the Router process if you have a Druid cluster well into the terabyte range.

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Versioning Druid"
+title: "Versioning Apache Druid (incubating)"
 ---

 <!--

@@ -24,7 +24,7 @@ title: "Batch Data Ingestion"

 # Batch Data Ingestion

-Druid can load data from static files through a variety of methods described here.
+Apache Druid (incubating) can load data from static files through a variety of methods described here.

 ## Native Batch Ingestion

@@ -32,7 +32,7 @@ java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath lib/*:<hadoop

 ## Options

-- "--coordinate" - provide a version of Hadoop to use. This property will override the default Hadoop coordinates. Once specified, Druid will look for those Hadoop dependencies from the location specified by `druid.extensions.hadoopDependenciesDir`.
+- "--coordinate" - provide a version of Apache Hadoop to use. This property will override the default Hadoop coordinates. Once specified, Apache Druid (incubating) will look for those Hadoop dependencies from the location specified by `druid.extensions.hadoopDependenciesDir`.
 - "--no-default-hadoop" - don't pull down the default hadoop version

 ## Spec file

@@ -90,7 +90,7 @@ data segments loaded in it (or if the interval you specify is empty).

 The output segment can have different metadata from the input segments unless all input segments have the same metadata.

-- Dimensions: since Druid supports schema change, the dimensions can be different across segments even if they are a part of the same dataSource.
+- Dimensions: since Apache Druid (incubating) supports schema change, the dimensions can be different across segments even if they are a part of the same dataSource.
 If the input segments have different dimensions, the output segment basically includes all dimensions of the input segments.
 However, even if the input segments have the same set of dimensions, the dimension order or the data type of dimensions can be different. For example, the data type of some dimensions can be
 changed from `string` to primitive types, or the order of dimensions can be changed for better locality.

@@ -24,7 +24,7 @@ title: "Data Formats for Ingestion"

 # Data Formats for Ingestion

-Druid can ingest denormalized data in JSON, CSV, or a delimited form such as TSV, or any custom format. While most examples in the documentation use data in JSON format, it is not difficult to configure Druid to ingest any other delimited data.
+Apache Druid (incubating) can ingest denormalized data in JSON, CSV, or a delimited form such as TSV, or any custom format. While most examples in the documentation use data in JSON format, it is not difficult to configure Druid to ingest any other delimited data.
 We welcome any contributions to new formats.

 For additional data formats, please see our [extensions list](../development/extensions.html).

@@ -24,7 +24,7 @@ title: "Deleting Data"

 # Deleting Data

-Permanent deletion of a Druid segment has two steps:
+Permanent deletion of a segment in Apache Druid (incubating) has two steps:

 1. The segment must first be marked as "unused". This occurs when a segment is dropped by retention rules, and when a user manually disables a segment through the Coordinator API.
 2. After segments have been marked as "unused", a Kill Task will delete any "unused" segments from Druid's metadata store as well as deep storage.

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "My Data isn't being loaded"
+title: "Apache Druid (incubating) FAQ"
 ---

 <!--

@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Druid Firehoses"
+title: "Apache Druid (incubating) Firehoses"
 ---

 <!--

@@ -24,7 +24,7 @@ title: "Hadoop-based Batch Ingestion VS Native Batch Ingestion"

 # Comparison of Batch Ingestion Methods

-Druid basically supports three types of batch ingestion: Hadoop-based
+Apache Druid (incubating) basically supports three types of batch ingestion: Apache Hadoop-based
 batch ingestion, native parallel batch ingestion, and native local batch
 ingestion. The below table shows what features are supported by each
 ingestion method.

@@ -24,7 +24,7 @@ title: "Hadoop-based Batch Ingestion"

 # Hadoop-based Batch Ingestion

-Hadoop-based batch ingestion in Druid is supported via a Hadoop-ingestion task. These tasks can be posted to a running
+Apache Hadoop-based batch ingestion in Apache Druid (incubating) is supported via a Hadoop-ingestion task. These tasks can be posted to a running
 instance of a Druid [Overlord](../design/overlord.html).

 Please check [Hadoop-based Batch Ingestion VS Native Batch Ingestion](./hadoop-vs-native-batch.html) for differences between native batch ingestion and Hadoop-based ingestion.

@@ -30,7 +30,7 @@ title: "Ingestion"

 ### Datasources and segments

-Druid data is stored in "datasources", which are similar to tables in a traditional RDBMS. Each datasource is
+Apache Druid (incubating) data is stored in "datasources", which are similar to tables in a traditional RDBMS. Each datasource is
 partitioned by time and, optionally, further partitioned by other attributes. Each time range is called a "chunk" (for
 example, a single day, if your datasource is partitioned by day). Within a chunk, data is partitioned into one or more
 "segments". Each segment is a single file, typically comprising up to a few million rows of data. Since segments are

@@ -24,7 +24,7 @@ title: "Ingestion Spec"

 # Ingestion Spec

-A Druid ingestion spec consists of 3 components:
+An Apache Druid (incubating) ingestion spec consists of 3 components:

 ```json
 {
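The hunk above cuts off at the opening brace of the doc's spec example. For orientation only, a minimal, hedged skeleton of the three top-level components an ingestion spec is built from (values are descriptive placeholders, not real field contents):

```json
{
  "dataSchema": { "...": "datasource name, parser, dimensions, metrics, granularity" },
  "ioConfig": { "...": "where the data comes from and, for batch tasks, where it goes" },
  "tuningConfig": { "...": "optional knobs controlling how the ingestion runs" }
}
```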
@@ -43,7 +43,7 @@ Tasks are also part of a "task group", which is a set of tasks that can share in

 ## Priority

-Druid's indexing tasks use locks for atomic data ingestion. Each lock is acquired for the combination of a dataSource and an interval. Once a task acquires a lock, it can write data for the dataSource and the interval of the acquired lock unless the lock is released or preempted. Please see [the below Locking section](#locking)
+Apache Druid (incubating)'s indexing tasks use locks for atomic data ingestion. Each lock is acquired for the combination of a dataSource and an interval. Once a task acquires a lock, it can write data for the dataSource and the interval of the acquired lock unless the lock is released or preempted. Please see [the below Locking section](#locking)

 Each task has a priority which is used for lock acquisition. The locks of higher-priority tasks can preempt the locks of lower-priority tasks if they try to acquire for the same dataSource and interval. If some locks of a task are preempted, the behavior of the preempted task depends on the task implementation. Usually, most tasks finish as failed if they are preempted.

@@ -24,7 +24,7 @@ title: "Native Index Tasks"

 # Native Index Tasks

-Druid currently has two types of native batch indexing tasks, `index_parallel` which runs tasks
+Apache Druid (incubating) currently has two types of native batch indexing tasks, `index_parallel` which runs tasks
 in parallel on multiple MiddleManager processes, and `index` which will run a single indexing task locally on a single
 MiddleManager.

@@ -24,7 +24,7 @@ title: "Schema Changes"

 # Schema Changes

-Schemas for datasources can change at any time and Druid supports different schemas among segments.
+Schemas for datasources can change at any time and Apache Druid (incubating) supports different schemas among segments.

 ## Replacing Segments

@@ -24,7 +24,7 @@ title: "Schema Design"

 # Schema Design

-This page is meant to assist users in designing a schema for data to be ingested in Druid. Druid offers a unique data
+This page is meant to assist users in designing a schema for data to be ingested in Apache Druid (incubating). Druid offers a unique data
 modeling system that bears similarity to both relational and timeseries models. The key factors are:

 * Druid data is stored in [datasources](index.html#datasources), which are similar to tables in a traditional RDBMS.

@@ -24,7 +24,7 @@ title: "Loading Streams"

 # Loading Streams

-Streams can be ingested in Druid using either [Tranquility](https://github.com/druid-io/tranquility) (a Druid-aware
+Streams can be ingested in Apache Druid (incubating) using either [Tranquility](https://github.com/druid-io/tranquility) (a Druid-aware
 client) or the [Kafka Indexing Service](../development/extensions-core/kafka-ingestion.html).

 ## Tranquility (Stream Push)

@@ -29,7 +29,7 @@ NOTE: Realtime processes are deprecated. Please use the <a href="../development/
 # Stream Pull Ingestion

 If you have an external service that you want to pull data from, you have two options. The simplest
-option is to set up a "copying" service that reads from the data source and writes to Druid using
+option is to set up a "copying" service that reads from the data source and writes to Apache Druid (incubating) using
 the [stream push method](stream-push.html).

 Another option is *stream pull*. With this approach, a Druid Realtime Process ingests data from a

@@ -24,7 +24,7 @@ title: "Stream Push"

 # Stream Push

-Druid can connect to any streaming data source through
+Apache Druid (incubating) can connect to any streaming data source through
 [Tranquility](https://github.com/druid-io/tranquility/blob/master/README.md), a package for pushing
 streams to Druid in real-time. Druid does not come bundled with Tranquility, and you will have to download the distribution.
Some files were not shown because too many files have changed in this diff.