---
layout: doc_page
title: "Apache Druid (incubating) extensions"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
# Apache Druid (incubating) extensions

Druid implements an extension system that allows for adding functionality at runtime. Extensions
are commonly used to add support for deep storages (like HDFS and S3), metadata stores (like MySQL
and PostgreSQL), new aggregators, new input formats, and so on.

Production clusters will generally use at least two extensions: one for deep storage and one for a
metadata store. Many clusters will also use additional extensions.

## Including extensions

Please see [here](../operations/including-extensions.html).
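
As a brief illustration of what the linked page covers: extensions are typically enabled by naming them in the `druid.extensions.loadList` property of `common.runtime.properties`. The sketch below assumes a hypothetical cluster using HDFS deep storage and a MySQL metadata store:

```properties
# Hypothetical excerpt from conf/druid/_common/common.runtime.properties.
# Each name must match an extension directory on disk.
druid.extensions.loadList=["druid-hdfs-storage", "mysql-metadata-storage"]
```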

## Core extensions

Core extensions are maintained by Druid committers.

|Name|Description|Docs|
|----|-----------|----|
|druid-avro-extensions|Support for data in Apache Avro data format.|[link](../development/extensions-core/avro.html)|
|druid-basic-security|Support for Basic HTTP authentication and role-based access control.|[link](../development/extensions-core/druid-basic-security.html)|
|druid-bloom-filter|Support for providing Bloom filters in Druid queries.|[link](../development/extensions-core/bloom-filter.html)|
|druid-caffeine-cache|A local cache implementation backed by Caffeine.|[link](../configuration/index.html#cache-configuration)|
|druid-datasketches|Support for approximate counts and set operations with [DataSketches](https://datasketches.github.io/).|[link](../development/extensions-core/datasketches-extension.html)|
|druid-hdfs-storage|HDFS deep storage.|[link](../development/extensions-core/hdfs.html)|
|druid-histogram|Approximate histograms and quantiles aggregator. Deprecated, please use the [DataSketches quantiles aggregator](../development/extensions-core/datasketches-quantiles.html) from the `druid-datasketches` extension instead.|[link](../development/extensions-core/approximate-histograms.html)|
|druid-kafka-extraction-namespace|Kafka-based namespaced lookup. Requires namespace lookup extension.|[link](../development/extensions-core/kafka-extraction-namespace.html)|
|druid-kafka-indexing-service|Supervised exactly-once Kafka ingestion for the indexing service.|[link](../development/extensions-core/kafka-ingestion.html)|
|druid-kinesis-indexing-service|Supervised exactly-once Kinesis ingestion for the indexing service.|[link](../development/extensions-core/kinesis-ingestion.html)|
|druid-kerberos|Kerberos authentication for Druid processes.|[link](../development/extensions-core/druid-kerberos.html)|
|druid-lookups-cached-global|A module for [lookups](../querying/lookups.html) providing JVM-global eager caching. It provides JDBC and URI implementations for fetching lookup data.|[link](../development/extensions-core/lookups-cached-global.html)|
|druid-lookups-cached-single|Per-lookup caching module to support use cases where a lookup needs to be isolated from the global pool of lookups.|[link](../development/extensions-core/druid-lookups.html)|
|druid-orc-extensions|Support for data in Apache ORC data format.|[link](../development/extensions-core/orc.html)|
|druid-parquet-extensions|Support for data in Apache Parquet data format. Requires druid-avro-extensions to be loaded.|[link](../development/extensions-core/parquet.html)|
|druid-protobuf-extensions|Support for data in Protobuf data format.|[link](../development/extensions-core/protobuf.html)|
|druid-s3-extensions|Interfacing with data in AWS S3, and using S3 as deep storage.|[link](../development/extensions-core/s3.html)|
|druid-ec2-extensions|Interfacing with AWS EC2 for autoscaling middle managers.|UNDOCUMENTED|
|druid-stats|Statistics-related module, including variance and standard deviation aggregators.|[link](../development/extensions-core/stats.html)|
|mysql-metadata-storage|MySQL metadata store.|[link](../development/extensions-core/mysql.html)|
|postgresql-metadata-storage|PostgreSQL metadata store.|[link](../development/extensions-core/postgresql.html)|
|simple-client-sslcontext|Simple SSLContext provider module to be used by Druid's internal HttpClient when talking to other Druid processes over HTTPS.|[link](../development/extensions-core/simple-client-sslcontext.html)|

## Community Extensions

<div class="note caution">
Community extensions are not maintained by Druid committers, although we accept patches from community members using these extensions. They may not have been as extensively tested as the core extensions.
</div>

A number of community members have contributed their own extensions to Druid that are not packaged with the default Druid tarball.
If you'd like to take on maintenance for a community extension, please post on [dev@druid.apache.org](https://lists.apache.org/list.html?dev@druid.apache.org) to let us know!

All of these community extensions can be downloaded using [pull-deps](../operations/pull-deps.html) with the `-c` coordinate option to pull `org.apache.druid.extensions.contrib:{EXTENSION_NAME}:{DRUID_VERSION}`; see the sketch below.
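
For example, a sketch of pulling the `druid-redis-cache` extension from a Druid installation directory (the extension name and version placeholder are illustrative):

```bash
# Downloads the extension and its transitive dependencies into the local
# extensions directory; substitute a real Druid version for the placeholder.
java -classpath "lib/*" org.apache.druid.cli.Main tools pull-deps \
  -c "org.apache.druid.extensions.contrib:druid-redis-cache:{DRUID_VERSION}"
```
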
|Name|Description|Docs|
|----|-----------|----|
|ambari-metrics-emitter|Ambari Metrics emitter.|[link](../development/extensions-contrib/ambari-metrics-emitter.html)|
|druid-azure-extensions|Microsoft Azure deep storage.|[link](../development/extensions-contrib/azure.html)|
|druid-cassandra-storage|Apache Cassandra deep storage.|[link](../development/extensions-contrib/cassandra.html)|
|druid-cloudfiles-extensions|Rackspace Cloudfiles deep storage and firehose.|[link](../development/extensions-contrib/cloudfiles.html)|
|druid-distinctcount|DistinctCount aggregator.|[link](../development/extensions-contrib/distinctcount.html)|
|druid-redis-cache|A cache implementation for Druid based on Redis.|[link](../development/extensions-contrib/redis-cache.html)|
|druid-time-min-max|Min/Max aggregator for timestamp.|[link](../development/extensions-contrib/time-min-max.html)|
|druid-google-extensions|Google Cloud Storage deep storage.|[link](../development/extensions-contrib/google.html)|
|sqlserver-metadata-storage|Microsoft SQL Server metadata store.|[link](../development/extensions-contrib/sqlserver.html)|
|graphite-emitter|Graphite metrics emitter.|[link](../development/extensions-contrib/graphite.html)|
|statsd-emitter|StatsD metrics emitter.|[link](../development/extensions-contrib/statsd.html)|
|kafka-emitter|Kafka metrics emitter.|[link](../development/extensions-contrib/kafka-emitter.html)|
|druid-thrift-extensions|Support for data in Apache Thrift data format.|[link](../development/extensions-contrib/thrift.html)|
|druid-opentsdb-emitter|OpenTSDB metrics emitter.|[link](../development/extensions-contrib/opentsdb-emitter.html)|
|materialized-view-selection, materialized-view-maintenance|Materialized view maintenance and query-time view selection.|[link](../development/extensions-contrib/materialized-view.html)|
|druid-moving-average-query|Support for [Moving Average](https://en.wikipedia.org/wiki/Moving_average) and other Aggregate [Window Functions](https://en.wikibooks.org/wiki/Structured_Query_Language/Window_functions) in Druid queries.|[link](../development/extensions-contrib/moving-average-query.html)|
|druid-influxdb-emitter|InfluxDB metrics emitter.|[link](../development/extensions-contrib/influxdb-emitter.html)|
|druid-momentsketch|Support for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library.|[link](../development/extensions-contrib/momentsketch-quantiles.html)|
|druid-tdigestsketch|Support for approximate sketch aggregators based on [T-Digest](https://github.com/tdunning/t-digest).|[link](../development/extensions-contrib/tdigestsketch-quantiles.html)|

## Promoting Community Extensions to Core Extensions

Please post on [dev@druid.apache.org](https://lists.apache.org/list.html?dev@druid.apache.org) if you'd like an extension to be promoted to core.
If we see a community extension actively supported by the community, we can promote it to core based on community feedback.

## Creating your own Extensions

For information on how to create your own extension, please see [here](../development/modules.html).
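
As a minimal sketch of what that page describes: an extension is a JAR that implements Druid's `DruidModule` interface and registers the implementation in a `META-INF/services/org.apache.druid.initialization.DruidModule` file. The package and class names below are hypothetical:

```java
package org.example.druid.extension;

import com.fasterxml.jackson.databind.Module;
import com.google.inject.Binder;
import org.apache.druid.initialization.DruidModule;

import java.util.Collections;
import java.util.List;

// Hypothetical extension module; Druid discovers it via the
// META-INF/services entry and invokes these methods at startup.
public class ExampleExtensionModule implements DruidModule
{
  @Override
  public List<? extends Module> getJacksonModules()
  {
    // Return Jackson modules registering any custom serializable types
    // (aggregators, input formats, and so on); this sketch adds none.
    return Collections.emptyList();
  }

  @Override
  public void configure(Binder binder)
  {
    // Wire up the extension's components with the Guice binder.
  }
}
```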