druid

mirror of https://github.com/apache/druid.git synced 2025-02-11 20:45:01 +00:00

Clint Wylie 7aafcf8bca parallel broker merges on fork join pool (#8578 )

* sketch of broker parallel merges done in small batches on fork join pool

* fix non-terminating sequences, auto compute parallelism

* adjust benches

* adjust benchmarks

* now hella more faster, fixed dumb

* fix

* remove comments

* log.info for debug

* javadoc

* safer block for sequence to yielder conversion

* refactor LifecycleForkJoinPool into LifecycleForkJoinPoolProvider which wraps a ForkJoinPool

* smooth yield rate adjustment, more logs to help tune

* cleanup, less logs

* error handling, bug fixes, on by default, more parallel, more tests

* remove unused var

* comments

* timeboundary mergeFn

* simplify, more javadoc

* formatting

* pushdown config

* use nanos consistently, move logs back to debug level, bit more javadoc

* static terminal result batch

* javadoc for nullability of createMergeFn

* cleanup

* oops

* fix race, add docs

* spelling, remove todo, add unhandled exception log

* cleanup, revert unintended change

* another unintended change

* review stuff

* add ParallelMergeCombiningSequenceBenchmark, fixes

* hyper-threading is the enemy

* fix initial start delay, lol

* parallelism computer now balances partition sizes to partition counts using sqrt of sequence count instead of sequence count by 2

* fix those important style issues with the benchmarks code

* lazy sequence creation for benchmarks

* more benchmark comments

* stable sequence generation time

* update defaults to use 100ms target time, 4096 batch size, 16384 initial yield, also update user docs

* add jmh thread based benchmarks, cleanup some stuff

* oops

* style

* add spread to jmh thread benchmark start range, more comments to benchmarks parameters and purpose

* retool benchmark to allow modeling more typical heterogeneous heavy workloads

* spelling

* fix

* refactor benchmarks

* formatting

* docs

* add maxThreadStartDelay parameter to threaded benchmark

* why does catch need to be on its own line but else doesn't

2019-11-07 11:58:46 -08:00

.github

add checkbox for licenses.yaml in PR template, mention it in CONTRIBUTING.md (#8367 )

2019-08-22 14:14:24 -07:00

.idea

Implementing dropwizard emitter for druid (#7363 )

2019-10-01 14:59:30 -07:00

benchmarks

parallel broker merges on fork join pool (#8578 )

2019-11-07 11:58:46 -08:00

cloud

Add credentials for ECS (#8651 )

2019-10-12 09:12:14 -07:00

codestyle

Fix dependency analyze warnings (#8230 )

2019-09-09 14:37:21 -07:00

core

parallel broker merges on fork join pool (#8578 )

2019-11-07 11:58:46 -08:00

dev

Add an item to concurrency checklist about assertions in parall… (#8701 )

2019-10-29 11:38:04 +03:00

distribution

update how to release doc (#8590 )

2019-10-02 08:51:25 -07:00

docs

parallel broker merges on fork join pool (#8578 )

2019-11-07 11:58:46 -08:00

examples

Fix verify script. (#8798 )

2019-10-30 23:30:01 -07:00

extendedset

bump master version to 0.17.0-incubating-SNAPSHOT (#8421 )

2019-08-28 01:58:36 -07:00

extensions-contrib

parallel broker merges on fork join pool (#8578 )

2019-11-07 11:58:46 -08:00

extensions-core

Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 )

2019-11-06 11:07:04 -08:00

hll

Fix dependency analyze warnings (#8230 )

2019-09-09 14:37:21 -07:00

indexing-hadoop

Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 )

2019-11-06 11:07:04 -08:00

indexing-service

Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 )

2019-11-06 11:07:04 -08:00

integration-tests

remove select query (#8739 )

2019-10-30 19:29:56 -07:00

licenses

add jaxb-runtime to fix exception with newer versions of java (#8409 )

2019-08-27 14:25:05 -06:00

processing

parallel broker merges on fork join pool (#8578 )

2019-11-07 11:58:46 -08:00

publications

[ImgBot] Optimize images (#7873 )

2019-06-24 21:27:48 -07:00

server

parallel broker merges on fork join pool (#8578 )

2019-11-07 11:58:46 -08:00

services

Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 )

2019-11-06 11:07:04 -08:00

sql

Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 )

2019-11-06 11:07:04 -08:00

web-console

Web console: fine grained capabilities / graceful degradation (#8805 )

2019-11-05 23:39:14 -08:00

website

parallel broker merges on fork join pool (#8578 )

2019-11-07 11:58:46 -08:00

.codecov.yml

Use Codecov (#8388 )

2019-08-28 08:49:30 -07:00

.dockerignore

Add docker container for druid (#6896 )

2019-02-08 12:12:28 +00:00

.gitignore

autogenerate NOTICE.BINARY from NOTICE and licenses.yaml (#8306 )

2019-08-21 12:46:27 -07:00

.travis.yml

Spellcheck docs (#8548 )

2019-09-17 12:47:30 -07:00

CONTRIBUTING.md

Fix incorrect build from source path in README.md and druid repo url. (#8531 )

2019-09-12 19:48:01 -07:00

DISCLAIMER

add missing license headers, in particular to MD files; clean up RAT … (#6563 )

2018-11-13 09:38:37 -08:00

LABELS

Add plain text README.txt, use relative link from README.md to build.md (#7611 )

2019-05-09 21:29:26 -07:00

LICENSE

Add missing license pointer for Porter Stemmer (#7941 )

2019-06-24 12:21:40 -07:00

licenses.yaml

Upgrade joda-time to 2.10.5 (#8821 )

2019-11-06 14:30:22 -08:00

NOTICE

add copyright info back to NOTICE and NOTICE.BINARY (#8298 )

2019-08-14 19:42:47 -05:00

pom.xml

Upgrade joda-time to 2.10.5 (#8821 )

2019-11-06 14:30:22 -08:00

README.md

Update README.md (#8829 )

2019-11-06 08:59:00 -08:00

README.template

switch links from druid.io to druid.apache.org (#7914 )

2019-06-18 09:06:27 -07:00

upload.sh

Adding licenses and enable apache-rat-plugin. (#6215 )

2018-09-18 08:39:26 -07:00

README.md

Apache Druid (incubating)

Apache Druid (incubating) is a high performance real-time analytics database.

Druid is a next-gen open source alternative to analytical databases such as Vertica, Greenplum, and Exadata, and data warehouses such as Snowflake, BigQuery, and Redshift.

Getting started

You can get started with Druid with our quickstart.

Druid provides a rich set of APIs (via HTTP and JDBC) for loading, managing, and querying your data. You can also interact with Druid via the built-in console (shown below).

Load data

Load streaming and batch data using a point-and-click wizard to guide you through ingestion setup. Monitor one-off tasks and ingestion supervisors.

Manage the cluster

Manage your cluster with ease. Get a view of your datasources, segments, ingestion tasks, and servers from one convenient location. All of this is powered by SQL system tables, so you can see the underlying query for each view.

Issue queries

Use the built-in query workbench to prototype DruidSQL and native queries, or connect one of the many tools that help you make the most out of Druid.
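
Queries can also be issued outside the console over the HTTP API. The following is a minimal sketch, not an official client: it only builds the JSON body and target URI for a SQL request to the `/druid/v2/sql` endpoint. The class name `DruidSqlRequest`, its helper methods, the Broker address `http://localhost:8082`, and the `wikipedia` datasource are all assumptions made for illustration; adjust them to your own cluster.

```java
import java.net.URI;

// Hypothetical helper (not part of Druid): prepares a SQL-over-HTTP request.
final class DruidSqlRequest {

    // Escapes the characters that must be escaped inside a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \u00XX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Wraps a SQL string in the JSON object the SQL endpoint expects: {"query": "..."}.
    static String buildPayload(String sql) {
        return "{\"query\":\"" + jsonEscape(sql) + "\"}";
    }

    // Appends the SQL endpoint path to an assumed Broker base URL.
    static URI sqlEndpoint(String brokerBaseUrl) {
        return URI.create(brokerBaseUrl + "/druid/v2/sql");
    }

    public static void main(String[] args) {
        // "wikipedia" is an assumed datasource name, used only as an example.
        String sql = "SELECT COUNT(*) AS cnt FROM \"wikipedia\"";
        System.out.println("POST " + sqlEndpoint("http://localhost:8082"));
        System.out.println("Content-Type: application/json");
        System.out.println(buildPayload(sql));
    }
}
```

The sketch stops short of sending the request so that it runs without a cluster. To execute the query, POST the printed body with any HTTP client and a `Content-Type: application/json` header.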

Documentation

You can find the documentation for the latest Druid release on the project website.

If you would like to contribute documentation, please do so under /docs in this repository and submit a pull request.

Community

Community support is available on the druid-user mailing list, which is hosted at Google Groups.

Development discussions occur on dev@druid.apache.org, which you can subscribe to by emailing dev-subscribe@druid.apache.org.

Chat with Druid committers and users in real time on the #druid channel in the Apache Slack team. Please use this invitation link to join the ASF Slack, and once joined, go into the #druid channel.

Building from source

Please note that JDK 8 is required to build Druid.

For instructions on building Druid from source, see docs/development/build.md

Contributing

Please follow the community guidelines for contributing.

License

Apache License, Version 2.0

Disclaimer: Apache Druid is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

Languages

Java 62.4%

ReScript 30.7%

TypeScript 3.1%

Euphoria 0.9%

Csound 0.8%

Other 1.9%