Apache Druid: a high-performance real-time analytics database.
Agustin Gonzalez 9efa6cc9c8
Make persists concurrent with adding rows in batch ingestion (#11536)
* Make persists concurrent with ingestion

* Remove semaphore but keep concurrent persists (with add) and add push in the background as well

* Go back to documented default persists (zero)

* Move to debug

* Remove unnecessary Atomics

* Comments on synchronization (or not) for sinks & sinkMetadata

* Some cleanup for unit tests but they still need further work

* Shutdown & wait for persists and push on close

* Provide support for three existing batch appenderators using batchProcessingMode flag

* Fix reference to wrong appenderator

* Fix doc typos

* Add BatchAppenderators class test coverage

* Add log message to batchProcessingMode final value, fix typo in enum name

* Another typo and minor fix to log message

* LEGACY->OPEN_SEGMENTS, Edit docs

* Minor update legacy->open segments log message

* More code comments, mostly small adjustments to naming etc

* fix spelling

* Exclude BatchAppenderators from Jacoco since it is fully tested but Jacoco still refuses to ack coverage

* Coverage for Appenderators & BatchAppenderators, name change of a method that was still using "legacy" rather than "openSegments"

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
2021-09-08 13:31:52 -07:00
.github Lock hadoop dependencies to 2.8.5 (#11583) 2021-08-12 15:16:47 +05:30
.idea Use ExecutorService variables to assign ExecutorService Instances (#11373) 2021-06-25 16:56:34 -07:00
benchmarks Configurable maxStreamLength for doubles sketches (#11574) 2021-08-31 14:56:37 -07:00
cloud update jackson dependencies to use bom (#11353) 2021-06-16 18:37:30 -07:00
codestyle handle timestamps of complex types when parsing protobuf messages (#11293) 2021-06-07 15:19:39 +05:30
core Improve error message when buckets are null for cloud objects (#11644) 2021-09-07 17:31:17 -07:00
dev chore: fix case of GitHub (#10928) 2021-05-07 01:15:43 -07:00
distribution fix integration tests (#11638) 2021-08-30 13:53:13 -07:00
docs Make persists concurrent with adding rows in batch ingestion (#11536) 2021-09-08 13:31:52 -07:00
examples Allow spaces in java home. (#11407) 2021-07-05 18:50:36 +05:30
extendedset add single input string expression dimension vector selector and better expression planning (#11213) 2021-07-06 11:20:49 -07:00
extensions-contrib Fix an exception when using redis cluster as cache (#11369) 2021-08-30 16:59:53 -07:00
extensions-core Make persists concurrent with adding rows in batch ingestion (#11536) 2021-09-08 13:31:52 -07:00
helm/druid remove DEPRECATION part (#11326) 2021-06-09 15:52:43 +08:00
hll Bump dev version to 0.22.0-SNAPSHOT (#10759) 2021-01-15 13:16:23 -08:00
hooks Add git pre-commit hook to source control (#9554) 2020-06-05 11:19:42 -10:00
indexing-hadoop Cleanup test dependencies in hdfs-storage extension (#11563) 2021-08-10 07:52:32 -07:00
indexing-service Make persists concurrent with adding rows in batch ingestion (#11536) 2021-09-08 13:31:52 -07:00
integration-tests Cancel API for sqls (#11643) 2021-09-05 10:57:45 -07:00
licenses Web console: Better hotkeys and library upgrades (#11365) 2021-06-17 18:24:29 -07:00
processing fix goldilocks bug with HashVectorGrouper improperly initializing memory (#11649) 2021-09-02 02:25:26 -07:00
publications De-incubation cleanup in code, docs, packaging (#9108) 2020-01-03 12:33:19 -05:00
server Make persists concurrent with adding rows in batch ingestion (#11536) 2021-09-08 13:31:52 -07:00
services Cancel API for sqls (#11643) 2021-09-05 10:57:45 -07:00
sql Add test for IS NOT NULL filter on join column in left join (#11636) 2021-09-06 12:20:41 +05:30
web-console Web console: Improve the lookup view UX (#11620) 2021-08-30 14:36:23 -07:00
website Configurable maxStreamLength for doubles sketches (#11574) 2021-08-31 14:56:37 -07:00
.asf.yaml Add .asf.yaml. (#9083) 2019-12-20 16:45:38 -08:00
.backportrc.json Add 0.18.0 to .backportrc.json to facilitate backport. (#9661) 2020-04-11 13:49:04 -07:00
.codecov.yml Use Codecov (#8388) 2019-08-28 08:49:30 -07:00
.dockerignore Add docker container for druid (#6896) 2019-02-08 12:12:28 +00:00
.gitignore Web console basic end-to-end-test (#9595) 2020-04-09 12:38:09 -07:00
.lgtm.yml Suppress LGTM warnings about stack trace exposure (#9631) 2020-04-09 17:31:03 -07:00
.travis.yml Support custom coordinator duties (#11601) 2021-08-19 11:54:11 +07:00
CONTRIBUTING.md Fix numbered list formatting in markdown. (#9664) 2020-04-21 20:18:12 -07:00
LABELS Add plain text README.txt, use relative link from README.md to build.md (#7611) 2019-05-09 21:29:26 -07:00
LICENSE support Aliyun OSS service as deep storage (#9898) 2020-07-01 22:20:53 -07:00
NOTICE license.yaml fixes for code introduced related to AWS RDS token based password provider in PR #9518 (#10885) 2021-03-10 12:59:25 -08:00
README.md Updates to source and doc build pages (#11464) 2021-07-20 18:07:34 -07:00
README.template De-incubation cleanup in code, docs, packaging (#9108) 2020-01-03 12:33:19 -05:00
check_test_suite.py chill, travis (#11283) 2021-05-27 05:40:40 -07:00
check_test_suite_test.py chill, travis (#11283) 2021-05-27 05:40:40 -07:00
licenses.yaml Bump parquet.version from 1.11.1 to 1.12.0 (#11346) 2021-08-13 19:17:57 -07:00
owasp-dependency-check-suppressions.xml Suppress CVEs for jdom2, kafka-clients, libthrift, solr-solrj (#11572) 2021-08-11 15:46:57 +05:30
pom.xml Put sleep in an extension (#11632) 2021-08-25 01:27:45 -07:00
setup-hooks.sh Add git pre-commit hook to source control (#9554) 2020-06-05 11:19:42 -10:00
upload.sh Adding licenses and enable apache-rat-plugin. (#6215) 2018-09-18 08:39:26 -07:00

README.md



Website | Documentation | Developer Mailing List | User Mailing List | Slack | Twitter | Download


Apache Druid

Druid is a high-performance real-time analytics database. Its main value is reducing time to insight and action.

Druid is designed for workflows where fast queries and fast ingestion matter. Druid excels at powering UIs, running operational (ad-hoc) queries, and handling high concurrency. Consider Druid as an open-source alternative to data warehouses for a variety of use cases. The design documentation explains the key concepts.

Getting started

You can get started with Druid using our local or Docker quickstart.

Druid provides a rich set of APIs (via HTTP and JDBC) for loading, managing, and querying your data. You can also interact with Druid via the built-in console (shown below).
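As a rough illustration of the HTTP API, the sketch below builds a Druid SQL request in Python. It assumes a quickstart-style deployment with the Router listening on localhost:8888 and posts to the `/druid/v2/sql` endpoint. The helper names `build_sql_request` and `run_sql` are hypothetical, not part of Druid. Adjust the URL for your own cluster.

```python
import json
import urllib.request

# Assumption: Router of a local quickstart deployment; change for your cluster.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"


def build_sql_request(sql, url=DRUID_SQL_URL):
    """Build (but do not send) an HTTP POST request carrying a Druid SQL query."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(sql, url=DRUID_SQL_URL):
    """Send the query and return the decoded JSON response.

    This needs a running Druid cluster reachable at `url`.
    """
    with urllib.request.urlopen(build_sql_request(sql, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a cluster running, calling `run_sql` with a query against one of your datasources should return the result rows as a list of JSON objects. Building the request separately from sending it makes the payload easy to inspect without a live cluster.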

Load data

[Screenshot: data loader (Kafka)]

Load streaming and batch data using a point-and-click wizard to guide you through ingestion setup. Monitor one-off tasks and ingestion supervisors.

Manage the cluster

[Screenshot: management view]

Manage your cluster with ease. Get a view of your datasources, segments, ingestion tasks, and services from one convenient location. Every view is powered by SQL system tables, so you can see the underlying query for each one.
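To give a feel for what queries against the system tables can look like, here is a small sketch in Python that prepares JSON payloads for Druid's SQL API. The queries are illustrative only and are not the console's actual queries. The names `SYSTEM_TABLE_QUERIES` and `sql_payload` are hypothetical. The tables used (`sys.segments`, `sys.tasks`, `sys.servers`) belong to Druid's `sys` schema.

```python
import json

# Illustrative queries only; the console's own queries may differ.
SYSTEM_TABLE_QUERIES = {
    # Number of segments per datasource.
    "segments_per_datasource": (
        'SELECT "datasource", COUNT(*) AS "segment_count" '
        "FROM sys.segments GROUP BY 1"
    ),
    # Ingestion tasks and their current status.
    "tasks": 'SELECT "task_id", "datasource", "status" FROM sys.tasks',
    # Services (servers) known to the cluster.
    "services": 'SELECT "server", "server_type" FROM sys.servers',
}


def sql_payload(view_name):
    """Return the JSON body for a SQL API request for the named view."""
    return json.dumps({"query": SYSTEM_TABLE_QUERIES[view_name]})
```

Each payload can be sent as the body of a POST request to the SQL endpoint of a running cluster.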

Issue queries

[Screenshot: query view]

Use the built-in query workbench to prototype Druid SQL and native queries, or connect one of the many tools that help you make the most of Druid.

Documentation

You can find the documentation for the latest Druid release on the project website.

If you would like to contribute documentation, please do so under /docs in this repository and submit a pull request.

Community

Community support is available on the druid-user mailing list, which is hosted at Google Groups.

Development discussions occur on dev@druid.apache.org, which you can subscribe to by emailing dev-subscribe@druid.apache.org.

Chat with Druid committers and users in real time on the #druid channel in the Apache Slack team. Please use this invitation link to join the ASF Slack, and once you have joined, go to the #druid channel.

Building from source

Please note that JDK 8 is required to build Druid.

For instructions on building Druid from source, see docs/development/build.md.

Contributing

Please follow the community guidelines for contributing.

For instructions on setting up IntelliJ, see dev/intellij-setup.md.

License

Apache License, Version 2.0