Merge pull request #2537 from druid-io/refactor-ext

refactor extensions into core and contrib
Charles Allen 2016-03-09 08:18:42 -08:00
commit 4c3a3f8da6
271 changed files with 652 additions and 550 deletions

View File

@ -71,10 +71,6 @@
<argument>-c</argument>
<argument>io.druid.extensions:druid-examples</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-azure-extensions</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-cassandra-storage</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-datasketches</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-hdfs-storage</argument>
@ -83,8 +79,6 @@
<argument>-c</argument>
<argument>io.druid.extensions:druid-kafka-eight</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-kafka-eight-simple-consumer</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-kafka-extraction-namespace</argument>
<argument>-c</argument>
<argument>io.druid.extensions:mysql-metadata-storage</argument>
@ -93,11 +87,7 @@
<argument>-c</argument>
<argument>io.druid.extensions:postgresql-metadata-storage</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-rabbitmq</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-s3-extensions</argument>
<argument>-c</argument>
<argument>io.druid.extensions:druid-cloudfiles-extensions</argument>
</arguments>
</configuration>
</execution>

View File

@ -4,9 +4,7 @@ layout: doc_page
# Deep Storage
Deep storage is where segments are stored. It is a storage mechanism that Druid does not provide. This deep storage infrastructure defines the level of durability of your data; as long as Druid nodes can see this storage infrastructure and can get at the segments stored on it, you will not lose data no matter how many Druid nodes you lose. If segments disappear from this storage layer, then you will lose whatever data those segments represented.
## Production Tested Deep Stores
### Local Mount
## Local Mount
A local mount can be used for storage of segments as well. This allows you to use just your local file system, or anything else that can be mounted locally, like NFS, Ceph, etc. This is the default deep storage implementation.
@ -21,14 +19,12 @@ Note that you should generally set `druid.storage.storageDirectory` to something
If you are using the Hadoop indexer in local mode, then just give it a local file as your output directory and it will work.
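For reference, a minimal common-configuration sketch for a local mount might look like the following; the directory path is illustrative, and `local` is the default value of `druid.storage.type`.
```
druid.storage.type=local
druid.storage.storageDirectory=/mnt/druid/segments
```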
### S3-compatible
## S3-compatible
S3-compatible deep storage is either S3 itself or something like Google Storage that exposes the same API as S3.
S3 configuration parameters are:
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.s3.accessKey`||S3 access key.|Must be set.|
@ -36,7 +32,7 @@ S3 configuration parameters are
|`druid.storage.bucket`||Bucket to store in.|Must be set.|
|`druid.storage.baseKey`||Base key prefix to use, i.e. what directory.|Must be set.|
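As an illustrative sketch, the corresponding common configuration might look like the following; the values are placeholders, and `druid.s3.secretKey` (the usual counterpart to the access key) is assumed here since it is not shown in the excerpt above.
```
druid.storage.type=s3
druid.s3.accessKey=YOUR_ACCESS_KEY
druid.s3.secretKey=YOUR_SECRET_KEY
druid.storage.bucket=your-druid-bucket
druid.storage.baseKey=druid/segments
```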
### HDFS
## HDFS
In order to use HDFS for deep storage, you need to set the following configuration in your common configs.
@ -46,44 +42,3 @@ In order to use hdfs for deep storage, you need to set the following configurati
|`druid.storage.storageDirectory`||Directory for storing segments.|Must be set.|
If you are using the Hadoop indexer, set your output directory to be a location on Hadoop and it will work.
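A minimal common-configuration sketch, assuming a typical HDFS URI for the segment directory:
```
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://namenode:8020/druid/segments
```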
## Community Contributed Deep Stores
### Cassandra
[Apache Cassandra](http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra) can also be leveraged for deep storage. This requires some additional Druid configuration as well as setting up the necessary schema within a Cassandra keyspace.
Please note that this is a community contributed module and does not support Cassandra 2.x or Hadoop-based batch indexing. For more information on using Cassandra as deep storage, see [Cassandra Deep Storage](../dependencies/cassandra-deep-storage.html).
### Azure
[Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional druid configuration.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|azure||Must be set.|
|`druid.azure.account`||Azure Storage account name.|Must be set.|
|`druid.azure.key`||Azure Storage account key.|Must be set.|
|`druid.azure.container`||Azure Storage container name.|Must be set.|
|`druid.azure.protocol`|http or https||https|
|`druid.azure.maxTries`||Number of tries before canceling an Azure operation.|3|
Please note that this is a community contributed module. See [Azure Services](http://azure.microsoft.com/en-us/pricing/free-trial/) for more information.
### Rackspace
[Rackspace Cloud Files](http://www.rackspace.com/cloud/files/) is another option for deep storage. This requires some additional druid configuration.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|cloudfiles||Must be set.|
|`druid.storage.region`||Rackspace Cloud Files region.|Must be set.|
|`druid.storage.container`||Rackspace Cloud Files container name.|Must be set.|
|`druid.storage.basePath`||Rackspace Cloud Files base path to use in the container.|Must be set.|
|`druid.storage.operationMaxRetries`||Number of tries before canceling a Rackspace operation.|10|
|`druid.cloudfiles.userName`||Rackspace Cloud username.|Must be set.|
|`druid.cloudfiles.apiKey`||Rackspace Cloud API key.|Must be set.|
|`druid.cloudfiles.provider`|rackspace-cloudfiles-us,rackspace-cloudfiles-uk|Name of the provider depending on the region.|Must be set.|
|`druid.cloudfiles.useServiceNet`|true,false|Whether to use the internal service net.|true|
Please note that this is a community contributed module.

View File

@ -0,0 +1,62 @@
---
layout: doc_page
---
# Microsoft Azure
## Deep Storage
[Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional druid configuration.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|azure||Must be set.|
|`druid.azure.account`||Azure Storage account name.|Must be set.|
|`druid.azure.key`||Azure Storage account key.|Must be set.|
|`druid.azure.container`||Azure Storage container name.|Must be set.|
|`druid.azure.protocol`|http or https||https|
|`druid.azure.maxTries`||Number of tries before canceling an Azure operation.|3|
See [Azure Services](http://azure.microsoft.com/en-us/pricing/free-trial/) for more information.
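Putting the properties above together, a common-configuration sketch might look like this (account, key, and container values are placeholders):
```
druid.storage.type=azure
druid.azure.account=yourStorageAccount
druid.azure.key=yourStorageAccountKey
druid.azure.container=druid-segments
```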
## Firehose
#### StaticAzureBlobStoreFirehose
This firehose ingests events, similar to the StaticS3Firehose, but from an Azure Blob Store.
Data is newline delimited, with one JSON object per line and parsed as per the `InputRowParser` configuration.
The storage account is shared with the one used for Azure deep storage functionality, but blobs can be in a different container.
As with the S3 blobstore, the file is assumed to be gzipped if its name ends in .gz.
Sample spec:
```json
"firehose" : {
"type" : "static-azure-blobstore",
"blobs": [
{
"container": "container",
"path": "/path/to/your/file.json"
},
{
"container": "anothercontainer",
"path": "/another/path.json"
}
]
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "static-azure-blobstore".|N/A|yes|
|blobs|JSON array of [Azure blobs](https://msdn.microsoft.com/en-us/library/azure/ee691964.aspx).|N/A|yes|
Azure Blobs:
|property|description|default|required?|
|--------|-----------|-------|---------|
|container|Name of the azure [container](https://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-how-to-use-blobs/#create-a-container)|N/A|yes|
|path|The path where data is located.|N/A|yes|

View File

@ -0,0 +1,9 @@
---
layout: doc_page
---
# Apache Cassandra
[Apache Cassandra](http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra) can also
be leveraged for deep storage. This requires some additional Druid configuration as well as setting up the necessary
schema within a Cassandra keyspace.

View File

@ -0,0 +1,65 @@
---
layout: doc_page
---
# Rackspace Cloud Files
## Deep Storage
[Rackspace Cloud Files](http://www.rackspace.com/cloud/files/) is another option for deep storage. This requires some additional druid configuration.
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.storage.type`|cloudfiles||Must be set.|
|`druid.storage.region`||Rackspace Cloud Files region.|Must be set.|
|`druid.storage.container`||Rackspace Cloud Files container name.|Must be set.|
|`druid.storage.basePath`||Rackspace Cloud Files base path to use in the container.|Must be set.|
|`druid.storage.operationMaxRetries`||Number of tries before canceling a Rackspace operation.|10|
|`druid.cloudfiles.userName`||Rackspace Cloud username.|Must be set.|
|`druid.cloudfiles.apiKey`||Rackspace Cloud API key.|Must be set.|
|`druid.cloudfiles.provider`|rackspace-cloudfiles-us,rackspace-cloudfiles-uk|Name of the provider depending on the region.|Must be set.|
|`druid.cloudfiles.useServiceNet`|true,false|Whether to use the internal service net.|true|
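For illustration, a common-configuration sketch using the properties above (region, container, and credential values are placeholders):
```
druid.storage.type=cloudfiles
druid.storage.region=DFW
druid.storage.container=druid-segments
druid.storage.basePath=druid/segments
druid.cloudfiles.userName=yourRackspaceUser
druid.cloudfiles.apiKey=yourRackspaceApiKey
druid.cloudfiles.provider=rackspace-cloudfiles-us
```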
## Firehose
#### StaticCloudFilesFirehose
This firehose ingests events, similar to the StaticAzureBlobStoreFirehose, but from Rackspace's Cloud Files.
Data is newline delimited, with one JSON object per line and parsed as per the `InputRowParser` configuration.
The storage account is shared with the one used for Rackspace's Cloud Files deep storage functionality, but blobs can be in a different region and container.
As with the Azure blobstore, the file is assumed to be gzipped if its name ends in .gz.
Sample spec:
```json
"firehose" : {
"type" : "static-cloudfiles",
"blobs": [
{
"region": "DFW"
"container": "container",
"path": "/path/to/your/file.json"
},
{
"region": "ORD"
"container": "anothercontainer",
"path": "/another/path.json"
}
]
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "static-cloudfiles".|N/A|yes|
|blobs|JSON array of Cloud Files blobs.|N/A|yes|
Cloud Files Blobs:
|property|description|default|required?|
|--------|-----------|-------|---------|
|container|Name of the Cloud Files container|N/A|yes|
|path|The path where data is located.|N/A|yes|

View File

@ -1,9 +1,15 @@
## introduction
---
layout: doc_page
---
# Graphite Emitter
## Introduction
This extension emits druid metrics to a graphite carbon server.
Events are sent after being [pickled](http://graphite.readthedocs.org/en/latest/feeding-carbon.html#the-pickle-protocol); the size of the batch is configurable.
## configuration
## Configuration
All the configuration parameters for the graphite emitter are under `druid.emitter.graphite`.
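As a rough sketch, enabling the emitter might look like the following; aside from `eventConverter`, the host and port property names are assumptions here and should be checked against this extension's parameter list (2004 is the conventional Carbon pickle receiver port):
```
druid.emitter=graphite
druid.emitter.graphite.hostname=graphite.example.com
druid.emitter.graphite.port=2004
druid.emitter.graphite.eventConverter={"type":"whiteList", "namespacePrefix": "druid"}
```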
@ -69,4 +75,4 @@ druid.emitter.graphite.eventConverter={"type":"whiteList", "namespacePrefix": "d
```
**Druid emits a huge number of metrics; we highly recommend using the `whiteList` converter.**

View File

@ -1,19 +1,23 @@
---
layout: doc_page
---
# KafkaSimpleConsumerFirehose
# Kafka Simple Consumer
## Firehose
This is an experimental firehose to ingest data from Kafka using the Kafka simple consumer API. Currently, this firehose only works inside standalone realtime nodes.
The configuration for KafkaSimpleConsumerFirehose is similar to the KafkaFirehose [Kafka firehose example](../ingestion/stream-pull.html#realtime-specfile), except `firehose` should be replaced with `firehoseV2` like this:
The configuration for KafkaSimpleConsumerFirehose is similar to the Kafka Eight Firehose, except `firehose` should be replaced with `firehoseV2` like this:
```json
"firehoseV2": {
"type" : "kafka-0.8-v2",
"brokerList" : ["localhost:4443"],
"queueBufferLength":10001,
"resetOffsetToEarliest":"true",
"partitionIdList" : ["0"],
"clientId" : "localclient",
"feed": "wikipedia"
"type" : "kafka-0.8-v2",
"brokerList" : ["localhost:4443"],
"queueBufferLength":10001,
"resetOffsetToEarliest":"true",
"partitionIdList" : ["0"],
"clientId" : "localclient",
"feed": "wikipedia"
}
```

View File

@ -0,0 +1,59 @@
---
layout: doc_page
---
# RabbitMQ
## Firehose
#### RabbitMQFirehose
This firehose ingests events from a defined RabbitMQ queue.
**Note:** Add **amqp-client-3.2.1.jar** to the lib directory of Druid to use this firehose.
A sample spec for rabbitmq firehose:
```json
"firehose" : {
"type" : "rabbitmq",
"connection" : {
"host": "localhost",
"port": "5672",
"username": "test-dude",
"password": "test-word",
"virtualHost": "test-vhost",
"uri": "amqp://mqserver:1234/vhost"
},
"config" : {
"exchange": "test-exchange",
"queue" : "druidtest",
"routingKey": "#",
"durable": "true",
"exclusive": "false",
"autoDelete": "false",
"maxRetries": "10",
"retryIntervalSeconds": "1",
"maxDurationSeconds": "300"
}
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "rabbitmq"|N/A|yes|
|host|The hostname of the RabbitMQ broker to connect to|localhost|no|
|port|The port number to connect to on the RabbitMQ broker|5672|no|
|username|The username to use to connect to RabbitMQ|guest|no|
|password|The password to use to connect to RabbitMQ|guest|no|
|virtualHost|The virtual host to connect to|/|no|
|uri|The URI string to use to connect to RabbitMQ| |no|
|exchange|The exchange to connect to| |yes|
|queue|The queue to connect to or create| |yes|
|routingKey|The routing key to use to bind the queue to the exchange| |yes|
|durable|Whether the queue should be durable|false|no|
|exclusive|Whether the queue should be exclusive|false|no|
|autoDelete|Whether the queue should auto-delete on disconnect|false|no|
|maxRetries|The max number of reconnection retry attempts| |yes|
|retryIntervalSeconds|The reconnection interval| |yes|
|maxDurationSeconds|The max duration of trying to reconnect| |yes|

View File

@ -0,0 +1,7 @@
---
layout: doc_page
---
# RocketMQ
Original author: [https://github.com/lizhanhui](https://github.com/lizhanhui).

View File

@ -0,0 +1,54 @@
---
layout: doc_page
---
# Druid extensions
Druid implements an extension system that allows for adding functionality at runtime. Extensions
are commonly used to add support for deep storages (like HDFS and S3), metadata stores (like MySQL
and PostgreSQL), new aggregators, new input formats, and so on.
Production clusters will generally use at least two extensions; one for deep storage and one for a
metadata store. Many clusters will also use additional extensions.
## Including extensions
Please see [here](../operations/including-extensions.html).
## Core extensions
Core extensions are maintained by Druid committers.
|Name|Description|Docs|
|----|-----------|----|
|druid-avro-extensions|Support for data in Apache Avro data format.|[link](../ingestion/index.html)|
|druid-datasketches|Support for approximate counts and set operations with [DataSketches](http://datasketches.github.io/).|[link](../development/datasketches-aggregators.html)|
|druid-hdfs-storage|HDFS deep storage.|[link](../dependencies/deep-storage.html#hdfs)|
|druid-histogram|Approximate histograms and quantiles aggregator.|[link](../development/approximate-histograms.html)|
|druid-kafka-eight|Kafka ingest firehose (high level consumer).|[link](../ingestion/firehose.html#kafkaeightfirehose)|
|druid-kafka-extraction-namespace|Kafka namespaced lookup.|[link](../querying/lookups.html#kafka-namespaced-lookup)|
|druid-namespace-lookup|Namespaced lookups.|[link](../querying/lookups.html)|
|druid-s3-extensions|S3 deep storage.|[link](../dependencies/deep-storage.html#s3-compatible)|
|mysql-metadata-storage|MySQL metadata store.|[link](../dependencies/metadata-storage.html#setting-up-mysql)|
|postgresql-metadata-storage|PostgreSQL metadata store.|[link](../dependencies/metadata-storage.html#setting-up-postgresql)|
## Community Extensions
A number of community members have contributed their own extensions to Druid that are not packaged with the default Druid tarball.
Community extensions are not maintained by Druid committers, although we accept patches from community members using these extensions.
If you'd like to take on maintenance for a community extension, please post on [druid-development group](https://groups.google.com/forum/#!forum/druid-development) to let us know!
|Name|Description|Docs|
|----|-----------|----|
|druid-azure-extensions|Microsoft Azure deep storage.|[link](../development/community-extensions/azure.html)|
|druid-cassandra-storage|Apache Cassandra deep storage.|[link](../development/community-extensions/cassandra.html)|
|druid-cloudfiles-extensions|Rackspace Cloudfiles deep storage and firehose.|[link](../development/community-extensions/cloudfiles.html)|
|druid-kafka-eight-simple-consumer|Kafka ingest firehose (low level consumer).|[link](../development/community-extensions/kafka-simple.html)|
|druid-rabbitmq|RabbitMQ firehose.|[link](../development/community-extensions/rabbitmq.html)|
|druid-rocketmq|RocketMQ firehose.|[link](../development/community-extensions/rocketmq.html)|
|graphite-emitter|Graphite metrics emitter.|[link](../development/community-extensions/graphite.html)|
## Promoting Community Extension to Core Extension
Please [let us know](https://groups.google.com/forum/#!forum/druid-development) if you'd like an extension to be promoted to core.
If we see a community extension actively supported by the community, we can promote it to core based on community feedback.

View File

@ -3,6 +3,7 @@ layout: doc_page
---
# Druid Firehoses
Firehoses describe the data stream source. They are pluggable and thus the configuration schema can and will vary based on the `type` of the firehose.
| Field | Type | Description | Required |
@ -14,8 +15,7 @@ We describe the configuration of the [Kafka firehose example](../ingestion/strea
- `consumerProps` is a map of properties for the Kafka consumer. The JSON object is converted into a Properties object and passed along to the Kafka consumer.
- `feed` is the feed that the Kafka consumer should read from. A minimal spec combining these fields is sketched below.
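As a minimal sketch, assuming the `kafka-0.8` firehose type provided by the druid-kafka-eight extension and typical Kafka 0.8 high-level consumer properties (both illustrative here):
```json
"firehose": {
  "type": "kafka-0.8",
  "consumerProps": {
    "zookeeper.connect": "localhost:2181",
    "group.id": "druid-example",
    "auto.offset.reset": "largest"
  },
  "feed": "wikipedia"
}
```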
Available Firehoses
-------------------
## Available Firehoses
There are several firehoses readily available in Druid; some are meant for examples, while others can be used directly in a production environment.
@ -66,160 +66,6 @@ Sample spec:
|type|This should be "static-s3"|N/A|yes|
|uris|JSON array of URIs where s3 files to be ingested are located.|N/A|yes|
#### StaticAzureBlobStoreFirehose
This firehose ingests events, similar to the StaticS3Firehose, but from an Azure Blob Store.
Data is newline delimited, with one JSON object per line and parsed as per the `InputRowParser` configuration.
The storage account is shared with the one used for Azure deep storage functionality, but blobs can be in a different container.
As with the S3 blobstore, it is assumed to be gzipped if the extension ends in .gz
Sample spec:
```json
"firehose" : {
"type" : "static-azure-blobstore",
"blobs": [
{
"container": "container",
"path": "/path/to/your/file.json"
},
{
"container": "anothercontainer",
"path": "/another/path.json"
}
]
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "static-azure-blobstore".|N/A|yes|
|blobs|JSON array of [Azure blobs](https://msdn.microsoft.com/en-us/library/azure/ee691964.aspx).|N/A|yes|
Azure Blobs:
|property|description|default|required?|
|--------|-----------|-------|---------|
|container|Name of the azure [container](https://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-how-to-use-blobs/#create-a-container)|N/A|yes|
|path|The path where data is located.|N/A|yes|
#### StaticCloudFilesFirehose
This firehose ingests events, similar to the StaticAzureBlobStoreFirehose, but from Rackspace's Cloud Files.
Data is newline delimited, with one JSON object per line and parsed as per the `InputRowParser` configuration.
The storage account is shared with the one used for Rackspace's Cloud Files deep storage functionality, but blobs can be in a different region and container.
As with the Azure blobstore, it is assumed to be gzipped if the extension ends in .gz
Sample spec:
```json
"firehose" : {
"type" : "static-cloudfiles",
"blobs": [
{
"region": "DFW"
"container": "container",
"path": "/path/to/your/file.json"
},
{
"region": "ORD"
"container": "anothercontainer",
"path": "/another/path.json"
}
]
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "static-cloudfiles".|N/A|yes|
|blobs|JSON array of Cloud Files blobs.|N/A|yes|
Cloud Files Blobs:
|property|description|default|required?|
|--------|-----------|-------|---------|
|container|Name of the Cloud Files container|N/A|yes|
|path|The path where data is located.|N/A|yes|
#### TwitterSpritzerFirehose
This firehose connects directly to the twitter spritzer data stream.
Sample spec:
```json
"firehose" : {
"type" : "twitzer",
"maxEventCount": -1,
"maxRunMinutes": 0
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "twitzer"|N/A|yes|
|maxEventCount|max events to receive, -1 is infinite, 0 means nothing is delivered; use this to prevent infinite space consumption or to prevent getting throttled at an inconvenient time.|N/A|yes|
|maxRunMinutes|maximum number of minutes to fetch Twitter events. Use this to prevent getting throttled at an inconvenient time. If zero or less, no time limit for run.|N/A|yes|
#### RabbitMQFirehose
This firehose ingests events from a defined RabbitMQ queue.
**Note:** Add **amqp-client-3.2.1.jar** to the lib directory of Druid to use this firehose.
A sample spec for rabbitmq firehose:
```json
"firehose" : {
"type" : "rabbitmq",
"connection" : {
"host": "localhost",
"port": "5672",
"username": "test-dude",
"password": "test-word",
"virtualHost": "test-vhost",
"uri": "amqp://mqserver:1234/vhost"
},
"config" : {
"exchange": "test-exchange",
"queue" : "druidtest",
"routingKey": "#",
"durable": "true",
"exclusive": "false",
"autoDelete": "false",
"maxRetries": "10",
"retryIntervalSeconds": "1",
"maxDurationSeconds": "300"
}
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "rabbitmq"|N/A|yes|
|host|The hostname of the RabbitMQ broker to connect to|localhost|no|
|port|The port number to connect to on the RabbitMQ broker|5672|no|
|username|The username to use to connect to RabbitMQ|guest|no|
|password|The password to use to connect to RabbitMQ|guest|no|
|virtualHost|The virtual host to connect to|/|no|
|uri|The URI string to use to connect to RabbitMQ| |no|
|exchange|The exchange to connect to| |yes|
|queue|The queue to connect to or create| |yes|
|routingKey|The routing key to use to bind the queue to the exchange| |yes|
|durable|Whether the queue should be durable|false|no|
|exclusive|Whether the queue should be exclusive|false|no|
|autoDelete|Whether the queue should auto-delete on disconnect|false|no|
|maxRetries|The max number of reconnection retry attempts| |yes|
|retryIntervalSeconds|The reconnection interval| |yes|
|maxDurationSeconds|The max duration of trying to reconnect| |yes|
#### LocalFirehose
This Firehose can be used to read the data from files on local disk.
@ -280,7 +126,6 @@ This can be used to merge data from more than one firehose.
|type|This should be "combining"|yes|
|delegates|list of firehoses to combine data from|yes|
#### EventReceiverFirehose
EventReceiverFirehoseFactory can be used to ingest events using an http endpoint.
@ -324,4 +169,23 @@ An example is shown below:
|type|This should be "timed"|yes|
|shutoffTime|time at which the firehose should shut down, in ISO8601 format|yes|
|delegate|firehose to use|yes|
#### TwitterSpritzerFirehose
This firehose connects directly to the twitter spritzer data stream.
Sample spec:
```json
"firehose" : {
"type" : "twitzer",
"maxEventCount": -1,
"maxRunMinutes": 0
}
```
|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be "twitzer"|N/A|yes|
|maxEventCount|max events to receive, -1 is infinite, 0 means nothing is delivered; use this to prevent infinite space consumption or to prevent getting throttled at an inconvenient time.|N/A|yes|
|maxRunMinutes|maximum number of minutes to fetch Twitter events. Use this to prevent getting throttled at an inconvenient time. If zero or less, no time limit for run.|N/A|yes|

View File

@ -3,7 +3,8 @@ layout: doc_page
---
# Including Extensions
Druid uses a module system that allows for the addition of extensions at runtime. To instruct Druid to load extensions, follow the steps below.
Druid uses a module system that allows for the addition of extensions at runtime. Core extensions are bundled with the Druid tarball.
Community extensions can be downloaded locally via the [pull-deps](../operations/pull-deps.html) tool.
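As a sketch, a cluster that loads the bundled MySQL metadata store extension plus a downloaded community extension might reference them like this (this assumes the 0.9.0-style `druid.extensions.loadList` property; `druid.extensions.directory` also appears in the pull-deps documentation):
```
druid.extensions.directory=/opt/druid/extensions
druid.extensions.loadList=["mysql-metadata-storage", "druid-rabbitmq"]
```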
## Download extensions

View File

@ -44,9 +44,11 @@ To run `pull-deps`, you should
Example:
Suppose you want to download ```druid-examples```, ```mysql-metadata-storage``` and ```hadoop-client```(both 2.3.0 and 2.4.0) with a specific version, you can run `pull-deps` command with `-c io.druid.extensions:druid-examples:0.9.0`, `-c io.druid.extensions:mysql-metadata-storage:0.9.0`, `-h org.apache.hadoop:hadoop-client:2.3.0` and `-h org.apache.hadoop:hadoop-client:2.4.0`, an example command would be:
Suppose you want to download ```druid-rabbitmq```, ```mysql-metadata-storage``` and ```hadoop-client``` (both 2.3.0 and 2.4.0) with a specific version. You can run the `pull-deps` command with `-c io.druid.extensions.contrib:druid-rabbitmq:0.9.0`, `-c io.druid.extensions:mysql-metadata-storage:0.9.0`, `-h org.apache.hadoop:hadoop-client:2.3.0` and `-h org.apache.hadoop:hadoop-client:2.4.0`; an example command would be:
```java -classpath "/my/druid/library/*" io.druid.cli.Main tools pull-deps --clean -c io.druid.extensions:mysql-metadata-storage:0.9.0 -c io.druid.extensions:druid-examples:0.9.0 -h org.apache.hadoop:hadoop-client:2.3.0 -h org.apache.hadoop:hadoop-client:2.4.0```
```
java -classpath "/my/druid/library/*" io.druid.cli.Main tools pull-deps --clean -c io.druid.extensions:mysql-metadata-storage:0.9.0 -c io.druid.extensions.contrib:druid-rabbitmq:0.9.0 -h org.apache.hadoop:hadoop-client:2.3.0 -h org.apache.hadoop:hadoop-client:2.4.0
```
Because `--clean` is supplied, this command will first remove the directories specified at `druid.extensions.directory` and `druid.extensions.hadoopDependenciesDir`, then recreate them and start downloading the extensions there. After finishing downloading, if you go to the extension directories you specified, you will see
@ -90,8 +92,14 @@ hadoop-dependencies/
..... lots of jars
```
Note that if you specify `--defaultVersion`, you don't have to put version information in the coordinate. For example, if you want both `druid-examples` and `mysql-metadata-storage` to use version `0.9.0`, you can change the command above to
Note that if you specify `--defaultVersion`, you don't have to put version information in the coordinate. For example, if you want both `druid-rabbitmq` and `mysql-metadata-storage` to use version `0.9.0`, you can change the command above to
```
java -classpath "/my/druid/library/*" io.druid.cli.Main tools pull-deps --defaultVersion 0.9.0 --clean -c io.druid.extensions:mysql-metadata-storage -c io.druid.extensions:druid-examples -h org.apache.hadoop:hadoop-client:2.3.0 -h org.apache.hadoop:hadoop-client:2.4.0
java -classpath "/my/druid/library/*" io.druid.cli.Main tools pull-deps --defaultVersion 0.9.0 --clean -c io.druid.extensions:mysql-metadata-storage -c io.druid.extensions.contrib:druid-rabbitmq -h org.apache.hadoop:hadoop-client:2.3.0 -h org.apache.hadoop:hadoop-client:2.4.0
```
<div class="note info">
Please note to use the pull-deps tool you must know the Maven groupId, artifactId, and version of your extension.
For Druid community extensions listed <a href="../development/extensions.html">here</a>, the groupId is "io.druid.extensions.contrib" and the artifactId is the name of the extension.
</div>

View File

@ -85,8 +85,9 @@
## Development
* [Overview](../development/overview.html)
* [Libraries](../development/libraries.html)
* [Libraries](../development/libraries.html)
* [Extending Druid](../development/modules.html)
* [Available Modules](../development/extensions.html)
* [Build From Source](../development/build.html)
* [Versioning](../development/versioning.html)
* [Integration](../development/integrating-druid-with-other-technologies.html)
@ -95,8 +96,7 @@
* [Geographic Queries](../development/geo.html)
* [Approximate Histograms and Quantiles](../development/approximate-histograms.html)
* [Datasketches](../development/datasketches-aggregators.html)
* [Router](../development/router.html)
* [Kafka Simple Consumer Firehose](../development/kafka-simple-consumer-firehose.html)
* [Router](../development/router.html)
## Misc
* [Papers & Talks](../misc/papers-and-talks.html)

View File

@ -0,0 +1,6 @@
# Community Extensions
Please contribute all community extensions in this directory and include documentation on how your extension can be used under /docs/content/development/community-extensions/.
Please note that community extensions are maintained by their original contributors and are not packaged with the core Druid distribution.
If you'd like to take on maintenance for a community extension, please post on [druid-development group](https://groups.google.com/forum/#!forum/druid-development) to let us know!

View File

@ -1,25 +1,28 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Druid - a distributed column store.
~ Copyright 2012 - 2015 Metamarkets Group Inc.
~
~ Licensed under the Apache License, Version 2.0 (the "License");
~ you may not use this file except in compliance with the License.
~ You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>io.druid.extensions</groupId>
<groupId>io.druid.extensions.contrib</groupId>
<artifactId>druid-azure-extensions</artifactId>
<name>druid-azure-extensions</name>
<description>druid-azure-extensions</description>

View File

@ -1,24 +1,27 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Druid - a distributed column store.
~ Copyright 2012 - 2015 Metamarkets Group Inc.
~
~ Licensed under the Apache License, Version 2.0 (the "License");
~ you may not use this file except in compliance with the License.
~ You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>io.druid.extensions</groupId>
<groupId>io.druid.extensions.contrib</groupId>
<artifactId>druid-cassandra-storage</artifactId>
<name>druid-cassandra-storage</name>
<description>druid-cassandra-storage</description>

View File

@ -1,29 +1,28 @@
<?xml version="1.0"?>
<!--
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
<project
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"
xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"
xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<modelVersion>4.0.0</modelVersion>
<groupId>io.druid.extensions</groupId>
<groupId>io.druid.extensions.contrib</groupId>
<artifactId>druid-cloudfiles-extensions</artifactId>
<name>druid-cloudfiles-extensions</name>
<description>druid-cloudfiles-extensions</description>

View File

@ -25,41 +25,41 @@ import javax.validation.constraints.NotNull;
public class CloudFilesBlob
{
@JsonProperty
@NotNull
private String container = null;
@JsonProperty
@NotNull
private String container = null;
@JsonProperty
@NotNull
private String path = null;
@JsonProperty
@NotNull
private String path = null;
@JsonProperty
@NotNull
private String region = null;
@JsonProperty
@NotNull
private String region = null;
public CloudFilesBlob()
{
}
public CloudFilesBlob()
{
}
public CloudFilesBlob(String container, String path, String region)
{
this.container = container;
this.path = path;
this.region = region;
}
public CloudFilesBlob(String container, String path, String region)
{
this.container = container;
this.path = path;
this.region = region;
}
public String getContainer()
{
return container;
}
public String getContainer()
{
return container;
}
public String getPath()
{
return path;
}
public String getPath()
{
return path;
}
public String getRegion()
{
return region;
}
public String getRegion()
{
return region;
}
}

View File

@ -19,31 +19,30 @@
package io.druid.firehose.cloudfiles;
import java.util.List;
import com.fasterxml.jackson.databind.Module;
import com.fasterxml.jackson.databind.jsontype.NamedType;
import com.fasterxml.jackson.databind.module.SimpleModule;
import com.google.common.collect.ImmutableList;
import com.google.inject.Binder;
import io.druid.initialization.DruidModule;
import java.util.List;
public class CloudFilesFirehoseDruidModule implements DruidModule
{
@Override
public List<? extends Module> getJacksonModules()
{
return ImmutableList.of(
new SimpleModule().registerSubtypes(
new NamedType(StaticCloudFilesFirehoseFactory.class, "static-cloudfiles")));
}
@Override
public List<? extends Module> getJacksonModules()
{
return ImmutableList.of(
new SimpleModule().registerSubtypes(
new NamedType(StaticCloudFilesFirehoseFactory.class, "staticcloudfiles")));
}
@Override
public void configure(Binder arg0)
{
@Override
public void configure(Binder arg0)
{
}
}
}

View File

@ -0,0 +1,138 @@
/*
* Licensed to Metamarkets Group Inc. (Metamarkets) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. Metamarkets licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package io.druid.firehose.cloudfiles;
import com.fasterxml.jackson.annotation.JacksonInject;
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.google.common.base.Charsets;
import com.google.common.base.Preconditions;
import com.google.common.base.Throwables;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Lists;
import com.metamx.common.CompressionUtils;
import com.metamx.common.logger.Logger;
import com.metamx.common.parsers.ParseException;
import io.druid.data.input.Firehose;
import io.druid.data.input.FirehoseFactory;
import io.druid.data.input.impl.FileIteratingFirehose;
import io.druid.data.input.impl.StringInputRowParser;
import io.druid.storage.cloudfiles.CloudFilesByteSource;
import io.druid.storage.cloudfiles.CloudFilesObjectApiProxy;
import org.apache.commons.io.IOUtils;
import org.apache.commons.io.LineIterator;
import org.jclouds.rackspace.cloudfiles.v1.CloudFilesApi;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
public class StaticCloudFilesFirehoseFactory implements FirehoseFactory<StringInputRowParser>
{
private static final Logger log = new Logger(StaticCloudFilesFirehoseFactory.class);
private final CloudFilesApi cloudFilesApi;
private final List<CloudFilesBlob> blobs;
@JsonCreator
public StaticCloudFilesFirehoseFactory(
@JacksonInject("objectApi") CloudFilesApi cloudFilesApi,
@JsonProperty("blobs") CloudFilesBlob[] blobs
)
{
this.cloudFilesApi = cloudFilesApi;
this.blobs = ImmutableList.copyOf(blobs);
}
@JsonProperty
public List<CloudFilesBlob> getBlobs()
{
return blobs;
}
@Override
public Firehose connect(StringInputRowParser stringInputRowParser) throws IOException, ParseException
{
Preconditions.checkNotNull(cloudFilesApi, "null cloudFilesApi");
final LinkedList<CloudFilesBlob> objectQueue = Lists.newLinkedList(blobs);
return new FileIteratingFirehose(
new Iterator<LineIterator>()
{
@Override
public boolean hasNext()
{
return !objectQueue.isEmpty();
}
@Override
public LineIterator next()
{
final CloudFilesBlob nextURI = objectQueue.poll();
final String region = nextURI.getRegion();
final String container = nextURI.getContainer();
final String path = nextURI.getPath();
log.info("Retrieving file from region[%s], container[%s] and path [%s]",
region, container, path
);
CloudFilesObjectApiProxy objectApi = new CloudFilesObjectApiProxy(
cloudFilesApi, region, container);
final CloudFilesByteSource byteSource = new CloudFilesByteSource(objectApi, path);
try {
final InputStream innerInputStream = byteSource.openStream();
final InputStream outerInputStream = path.endsWith(".gz")
? CompressionUtils.gzipInputStream(innerInputStream)
: innerInputStream;
return IOUtils.lineIterator(
new BufferedReader(
new InputStreamReader(outerInputStream, Charsets.UTF_8)));
}
catch (IOException e) {
log.error(e,
"Exception opening container[%s] blob[%s] from region[%s]",
container, path, region
);
throw Throwables.propagate(e);
}
}
@Override
public void remove()
{
throw new UnsupportedOperationException();
}
},
stringInputRowParser
);
}
}

View File

@ -27,6 +27,8 @@
<version>0.9.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
<groupId>io.druid.extensions.contrib</groupId>
<artifactId>druid-rocketmq</artifactId>
<properties>

View File

@ -1,22 +1,22 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
@ -29,7 +29,7 @@
<relativePath>../../pom.xml</relativePath>
</parent>
<groupId>io.druid.extensions</groupId>
<groupId>io.druid.extensions.contrib</groupId>
<artifactId>graphite-emitter</artifactId>
<name>graphite-emitter</name>
<description>Druid emitter extension to support graphite</description>

View File

@ -19,7 +19,7 @@
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>io.druid.extensions</groupId>
<groupId>io.druid.extensions.contrib</groupId>
<artifactId>druid-kafka-eight-simple-consumer</artifactId>
<name>druid-kafka-eight-simple-consumer</name>
<description>druid-kafka-eight-simple-consumer</description>

View File

@ -1,25 +1,27 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Druid - a distributed column store.
~ Copyright 2012 - 2015 Metamarkets Group Inc.
~
~ Licensed under the Apache License, Version 2.0 (the "License");
~ you may not use this file except in compliance with the License.
~ You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>io.druid.extensions</groupId>
<groupId>io.druid.extensions.contrib</groupId>
<artifactId>druid-rabbitmq</artifactId>
<name>druid-rabbitmq</name>
<description>druid-rabbitmq</description>

View File

@ -1,24 +1,27 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Druid - a distributed column store.
~ Copyright 2012 - 2015 Metamarkets Group Inc.
~
~ Licensed under the Apache License, Version 2.0 (the "License");
~ you may not use this file except in compliance with the License.
~ You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. Metamarkets licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>io.druid.extensions</groupId>
<artifactId>druid-avro-extensions</artifactId>
<name>druid-avro-extensions</name>

Some files were not shown because too many files have changed in this diff.