Apache Druid: a high performance real-time analytics database.
Agustin Gonzalez 7e61042794
Bound memory utilization for dynamic partitioning (i.e. memory growth is constant) (#11294)
* Bound memory in native batch ingest create segments

* Move BatchAppenderatorDriverTest to indexing service... note that we had to put the sink back in sinks in mergeandpush since the persistent data needs to be dropped and the sink is required for that

* Remove sinks from memory and clean up intermediate persists dirs manually after sink has been merged

* Changed name from RealtimeAppenderator to StreamAppenderator

* Style

* Incorporating tests from StreamAppenderatorTest

* Keep totalRows and cleanup code

* Added missing dep

* Fix unit test

* Checkstyle

* allowIncrementalPersists should always be true for batch

* Added sinks metadata

* clear sinks metadata when closing appenderator

* Style + minor edits to log msgs

* Update sinks metadata & totalRows when dropping a sink (segment)

* Remove max

* Intelli-j check

* Keep a count of hydrants persisted by sink for sanity check before merge

* Move out sanity

* Add previous hydrant count to sink metadata

* Remove redundant field from SinkMetadata

* Remove unneeded functions

* Cleanup unused code

* Removed unused code

* Remove unused field

* Exclude it from jacoco because it is very hard to get branch coverage

* Remove segment announcement and some other minor cleanup

* Add fallback flag

* Minor code cleanup

* Checkstyle

* Code review changes

* Update batchMemoryMappedIndex name

* Code review comments

* Exclude class from coverage, will include again when packaging gets fixed

* Moved test classes to server module

* More BatchAppenderator cleanup

* Fix bug in wrong counting of totalHydrants plus minor cleanup in add

* Removed left over comments

* Have BatchAppenderator follow the Appenderator contract for push & getSegments

* Fix LGTM violations

* Review comments

* Add stats after push is done

* Code review comments (cleanup, remove rest of synchronization constructs in batch appenderator, rename feature flag,
remove real time flag stuff from stream appenderator, etc.)

* Update javadocs

* Add thread safety notice to BatchAppenderator

* Further cleanup config

* More config cleanup
2021-07-09 00:10:29 -07:00
.github revert commons-io to 2.6 (#11392) 2021-06-29 23:04:38 -07:00
.idea Use ExecutorService variables to assign ExecutorService Instances (#11373) 2021-06-25 16:56:34 -07:00
benchmarks add single input string expression dimension vector selector and better expression planning (#11213) 2021-07-06 11:20:49 -07:00
cloud update jackson dependencies to use bom (#11353) 2021-06-16 18:37:30 -07:00
codestyle handle timestamps of complex types when parsing protobuf messages (#11293) 2021-06-07 15:19:39 +05:30
core support using mariadb connector with mysql extensions (#11402) 2021-07-08 12:25:37 -07:00
dev chore: fix case of GitHub (#10928) 2021-05-07 01:15:43 -07:00
distribution support using mariadb connector with mysql extensions (#11402) 2021-07-08 12:25:37 -07:00
docs Bound memory utilization for dynamic partitioning (i.e. memory growth is constant) (#11294) 2021-07-09 00:10:29 -07:00
examples Allow spaces in java home. (#11407) 2021-07-05 18:50:36 +05:30
extendedset add single input string expression dimension vector selector and better expression planning (#11213) 2021-07-06 11:20:49 -07:00
extensions-contrib Delete buildV9Directly in Kafka and Kinesis Indexing Service (#11351) 2021-06-23 16:36:46 -07:00
extensions-core support using mariadb connector with mysql extensions (#11402) 2021-07-08 12:25:37 -07:00
helm/druid remove DEPRECATION part (#11326) 2021-06-09 15:52:43 +08:00
hll Bump dev version to 0.22.0-SNAPSHOT (#10759) 2021-01-15 13:16:23 -08:00
hooks Add git pre-commit hook to source control (#9554) 2020-06-05 11:19:42 -10:00
indexing-hadoop Eliminate ambiguities of KB/MB/GB in the doc (#11333) 2021-06-30 13:42:45 -07:00
indexing-service Bound memory utilization for dynamic partitioning (i.e. memory growth is constant) (#11294) 2021-07-09 00:10:29 -07:00
integration-tests support using mariadb connector with mysql extensions (#11402) 2021-07-08 12:25:37 -07:00
licenses Web console: Better hotkeys and library upgrades (#11365) 2021-06-17 18:24:29 -07:00
processing add single input string expression dimension vector selector and better expression planning (#11213) 2021-07-06 11:20:49 -07:00
publications De-incubation cleanup in code, docs, packaging (#9108) 2020-01-03 12:33:19 -05:00
server Bound memory utilization for dynamic partitioning (i.e. memory growth is constant) (#11294) 2021-07-09 00:10:29 -07:00
services Replace Processing ExecutorService with QueryProcessingPool (#11382) 2021-07-01 16:03:08 +05:30
sql Better error message for unsupported double values (#11409) 2021-07-08 16:55:17 +05:30
web-console Web console: allow encoding of ASCII control chars (#10795) 2021-06-26 18:54:41 -07:00
website Avro union support (#10505) 2021-07-06 22:05:41 -07:00
.asf.yaml Add .asf.yaml. (#9083) 2019-12-20 16:45:38 -08:00
.backportrc.json Add 0.18.0 to .backportrc.json to facilitate backport. (#9661) 2020-04-11 13:49:04 -07:00
.codecov.yml Use Codecov (#8388) 2019-08-28 08:49:30 -07:00
.dockerignore Add docker container for druid (#6896) 2019-02-08 12:12:28 +00:00
.gitignore Web console basic end-to-end-test (#9595) 2020-04-09 12:38:09 -07:00
.lgtm.yml Suppress LGTM warnings about stack trace exposure (#9631) 2020-04-09 17:31:03 -07:00
.travis.yml support using mariadb connector with mysql extensions (#11402) 2021-07-08 12:25:37 -07:00
CONTRIBUTING.md Fix numbered list formatting in markdown. (#9664) 2020-04-21 20:18:12 -07:00
LABELS Add plain text README.txt, use relative link from README.md to build.md (#7611) 2019-05-09 21:29:26 -07:00
LICENSE support Aliyun OSS service as deep storage (#9898) 2020-07-01 22:20:53 -07:00
NOTICE license.yaml fixes for code introduced related to AWS RDS token based password provider in PR #9518 (#10885) 2021-03-10 12:59:25 -08:00
README.md Update badge for travis in README.md (#10717) 2021-01-07 18:39:58 -08:00
README.template De-incubation cleanup in code, docs, packaging (#9108) 2020-01-03 12:33:19 -05:00
check_test_suite.py chill, travis (#11283) 2021-05-27 05:40:40 -07:00
check_test_suite_test.py chill, travis (#11283) 2021-05-27 05:40:40 -07:00
licenses.yaml revert commons-io to 2.6 (#11392) 2021-06-29 23:04:38 -07:00
owasp-dependency-check-suppressions.xml i dig the optimism, but need more time (#11250) 2021-05-13 11:16:10 -07:00
pom.xml support using mariadb connector with mysql extensions (#11402) 2021-07-08 12:25:37 -07:00
setup-hooks.sh Add git pre-commit hook to source control (#9554) 2020-06-05 11:19:42 -10:00
upload.sh Adding licenses and enable apache-rat-plugin. (#6215) 2018-09-18 08:39:26 -07:00

README.md



Website | Documentation | Developer Mailing List | User Mailing List | Slack | Twitter | Download


Apache Druid

Druid is a high performance real-time analytics database. Its main value is reducing time to insight and action.

Druid is designed for workflows where fast queries and fast ingestion really matter. Druid excels at powering UIs, running operational (ad-hoc) queries, and handling high concurrency. Consider Druid as an open-source alternative to data warehouses for a variety of use cases.

Getting started

You can get started with Druid using our local or Docker quickstart.

Druid provides a rich set of APIs (via HTTP and JDBC) for loading, managing, and querying your data. You can also interact with Druid via the built-in console (shown below).

Load data

[Screenshot: data loader (Kafka)]

Load streaming and batch data using a point-and-click wizard that guides you through ingestion setup. Monitor one-off tasks and ingestion supervisors.

Manage the cluster

[Screenshot: cluster management]

Manage your cluster with ease. Get a view of your datasources, segments, ingestion tasks, and services from one convenient location. Every view is powered by SQL system tables, so you can see the underlying query for each one.
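As a rough illustration of what "powered by SQL system tables" means, the sketch below lists a few queries against the `sys` schema and wraps one in the JSON body that Druid's SQL API accepts. The queries are illustrative assumptions, not the exact statements the console issues, and the names `SYSTEM_TABLE_QUERIES` and `sql_payload` are hypothetical helpers.

```python
import json

# Illustrative queries against Druid's sys schema (assumed, not copied from the console).
SYSTEM_TABLE_QUERIES = {
    # One row per datasource: segment count and total size in bytes.
    "datasources": (
        'SELECT datasource, COUNT(*) AS num_segments, SUM("size") AS total_size '
        "FROM sys.segments GROUP BY 1"
    ),
    # One row per service known to the cluster.
    "services": "SELECT server, server_type, tier FROM sys.servers",
    # One row per ingestion task.
    "tasks": "SELECT task_id, type, datasource, status FROM sys.tasks",
}


def sql_payload(sql):
    """Return the JSON request body for a SQL statement (statement goes in "query")."""
    return json.dumps({"query": sql})


for view, sql in SYSTEM_TABLE_QUERIES.items():
    print(view, "->", sql_payload(sql))
```

Each payload can be pasted into the console's query view or posted to the SQL endpoint of a running cluster.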

Issue queries

[Screenshot: query view]

Use the built-in query workbench to prototype Druid SQL and native queries, or connect one of the many tools that help you make the most out of Druid.
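Outside the console, the same two query styles can be sent over HTTP. The sketch below builds one Druid SQL request and one native (JSON) request for a comparable row count. It assumes a quickstart Router at `localhost:8888` and a datasource named `wikipedia`; both are assumptions for illustration, as are the helper names `build_request` and `send`.

```python
import json
import urllib.request

# Assumption: a quickstart Router listening on localhost:8888.
BASE_URL = "http://localhost:8888"


def build_request(path, body):
    """Build (but do not send) a JSON POST request for a Druid query endpoint."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the JSON response. Needs a running cluster."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Druid SQL: the statement travels in the "query" field.
sql_request = build_request(
    "/druid/v2/sql",
    {
        "query": 'SELECT COUNT(*) AS "rows" FROM "wikipedia" '
        "WHERE __time >= TIMESTAMP '2016-06-27' AND __time < TIMESTAMP '2016-06-28'"
    },
)

# Native query: a JSON object describing a timeseries count over the same interval.
native_request = build_request(
    "/druid/v2/",
    {
        "queryType": "timeseries",
        "dataSource": "wikipedia",
        "granularity": "all",
        "intervals": ["2016-06-27/2016-06-28"],
        "aggregations": [{"type": "count", "name": "rows"}],
    },
)
```

Calling `send(sql_request)` or `send(native_request)` against a running cluster returns the decoded JSON result. The SQL form is usually easier to prototype, while the native form exposes the query type and aggregators directly.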

Documentation

You can find the documentation for the latest Druid release on the project website.

If you would like to contribute documentation, please do so under /docs in this repository and submit a pull request.

Community

Community support is available on the druid-user mailing list, which is hosted at Google Groups.

Development discussions occur on dev@druid.apache.org, which you can subscribe to by emailing dev-subscribe@druid.apache.org.

Chat with Druid committers and users in real time on the #druid channel in the Apache Slack team. Please use this invitation link to join the ASF Slack, then go to the #druid channel.

Building from source

Please note that JDK 8 is required to build Druid.

For instructions on building Druid from source, see docs/development/build.md.

Contributing

Please follow the community guidelines for contributing.

For instructions on setting up IntelliJ, see dev/intellij-setup.md.

License

Apache License, Version 2.0