Apache Druid: a high performance real-time analytics database.
Gian Merlino 501dcb43fa Some changes that make it possible to restart tasks on the same hardware.
This is done by killing and respawning the JVMs rather than reconnecting to existing
JVMs, for a couple of reasons. One is that it lets you restore tasks after server reboots
too, and another is that it lets you upgrade all the software on a box at once by just
restarting everything.

The main changes are:

1) Add "canRestore" and "stopGracefully" methods to Tasks that say if a task can
   stop gracefully, and actually do a graceful stop. RealtimeIndexTask is the only
   one that currently implements this.

2) Add "stop" method to TaskRunners that attempts to do an orderly shutdown.
   ThreadPoolTaskRunner: call stopGracefully on restorable tasks, wait for exit
   ForkingTaskRunner: close output stream to restorable tasks, wait for exit
   RemoteTaskRunner: do nothing special, we actually don't want to shut down

3) Add "restore" method to TaskRunners that attempts to bootstrap tasks from last run.
   Only ForkingTaskRunner does anything here. It maintains a "restore.json" file with
   a list of restorable tasks.

4) Have the CliPeon's ExecutorLifecycle lock the task base directory to prevent a restored
   task and a zombie old task from stomping on each other.
2015-11-23 11:22:08 -08:00
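
The graceful-stop flow described in items 1 and 2 of the commit message can be illustrated with a minimal sketch. This is not Druid's actual code: `SketchTask`, `LoopingTask`, and `SketchThreadPoolRunner` are hypothetical names invented for illustration, and only the method names `canRestore` and `stopGracefully` come from the commit message. The sketch assumes a thread-pool runner that asks restorable tasks to stop, waits for them to exit, and then interrupts whatever is left.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a task; not Druid's real Task interface.
interface SketchTask {
    String getId();
    boolean canRestore();      // can this task stop gracefully and be restored later?
    void stopGracefully();     // request an orderly stop
    void run() throws Exception;
}

// Hypothetical task that loops until it is asked to stop.
class LoopingTask implements SketchTask {
    private final String id;
    private final boolean restorable;
    private volatile boolean stopRequested = false;
    private volatile boolean exitedCleanly = false;

    LoopingTask(String id, boolean restorable) {
        this.id = id;
        this.restorable = restorable;
    }

    public String getId() { return id; }
    public boolean canRestore() { return restorable; }
    public void stopGracefully() { stopRequested = true; }

    public void run() throws InterruptedException {
        while (!stopRequested) {
            Thread.sleep(10);  // stands in for real work
        }
        exitedCleanly = true;  // reached only on an orderly stop
    }

    boolean exitedCleanly() { return exitedCleanly; }
}

// Hypothetical runner: stop() signals restorable tasks, then waits for them to exit.
class SketchThreadPoolRunner {
    private final ExecutorService exec = Executors.newCachedThreadPool();
    private final List<SketchTask> tasks = new ArrayList<>();
    private final List<Future<?>> futures = new ArrayList<>();

    synchronized void submit(SketchTask task) {
        tasks.add(task);
        futures.add(exec.submit(() -> { task.run(); return null; }));
    }

    // Returns the ids of the tasks that exited gracefully before the timeout.
    synchronized List<String> stop(long timeoutMillis) {
        for (SketchTask task : tasks) {
            if (task.canRestore()) {
                task.stopGracefully();
            }
        }
        List<String> stopped = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        for (int i = 0; i < tasks.size(); i++) {
            if (!tasks.get(i).canRestore()) {
                continue;
            }
            long remaining = Math.max(0L, deadline - System.nanoTime());
            try {
                futures.get(i).get(remaining, TimeUnit.NANOSECONDS);
                stopped.add(tasks.get(i).getId());
            } catch (TimeoutException | ExecutionException e) {
                // The task did not exit in time, or it failed; it is not counted.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        exec.shutdownNow();    // interrupt anything still running
        return stopped;
    }
}
```

In this sketch, submitting one restorable and one non-restorable `LoopingTask` and then calling `stop(5000)` returns only the restorable task's id; the other task is interrupted rather than stopped in an orderly way.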
aws-common cleanup and remove unused imports 2015-11-11 12:25:21 -08:00
benchmarks cleanup and remove unused imports 2015-11-11 12:25:21 -08:00
common Merge pull request #1896 from gianm/allocate-segment 2015-11-18 21:05:46 -08:00
distribution Added cloudfiles-extensions in order to support Rackspace's cloudfiles as deep storage 2015-11-04 17:44:48 +01:00
docs Some changes that make it possible to restart tasks on the same hardware. 2015-11-23 11:22:08 -08:00
examples New extension loading mechanism 2015-10-21 14:22:36 -05:00
extensions reformat datasketches module to satisfy druid style guidelines 2015-11-19 01:07:03 -06:00
indexing-hadoop Merge pull request #1896 from gianm/allocate-segment 2015-11-18 21:05:46 -08:00
indexing-service Some changes that make it possible to restart tasks on the same hardware. 2015-11-23 11:22:08 -08:00
integration-tests switch to Java 8 + cleanup 2015-11-17 13:35:06 -08:00
processing Merge pull request #1974 from jon-wei/dim_order_merge 2015-11-18 19:51:34 -06:00
publications more edits to radstack paper 2015-10-18 19:52:44 -07:00
server Some changes that make it possible to restart tasks on the same hardware. 2015-11-23 11:22:08 -08:00
services Merge pull request #1896 from gianm/allocate-segment 2015-11-18 21:05:46 -08:00
.gitignore move distribution artifacts to distribution/target 2015-10-30 12:40:05 -05:00
.travis.yml faster build: cache maven dependencies in travis 2015-08-07 18:05:30 -07:00
CONTRIBUTING.md Minor documentation fixes for CONTRIBUTING.md 2015-09-29 12:22:35 -06:00
DruidCorporateCLA.pdf fix CLA email / mailing address 2014-04-17 15:26:28 -07:00
DruidIndividualCLA.pdf fix CLA email / mailing address 2014-04-17 15:26:28 -07:00
LICENSE Clean up README and license 2015-02-18 23:09:28 -08:00
NOTICE support for PasswordProvider interface to enable writing druid extension which can get metadata store password from secured location or anywhere instead of plain text properties file 2015-02-25 14:05:19 -06:00
README.md Revert "Update README.md" 2015-09-22 20:38:12 -07:00
eclipse_formatting.xml towards a community led druid 2015-01-31 20:57:36 -08:00
intellij_formatting.jar rename code style jar 2014-11-21 14:51:23 -08:00
pom.xml adding datasketches module to top level pom 2015-11-12 00:04:33 -06:00
upload.sh rework tarball distribution: 2015-08-18 18:32:33 -07:00

README.md


Druid

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments.

Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

Druid can load both streaming and batch data and integrates with Samza, Kafka, Storm, and Hadoop.

License

Apache License, Version 2.0

More Information

More information about Druid can be found at http://www.druid.io.

Documentation

You can find the latest Druid Documentation on the project website.

If you would like to contribute documentation, please do so under /docs/content in this repository and submit a pull request.

Tutorials

We have a series of tutorials to help you get started with Druid. If you are new to Druid, we suggest starting with the first Druid tutorial.

Reporting Issues

If you find any bugs, please file a GitHub issue.

Community

Community support is available on the druid-user mailing list (druid-user@googlegroups.com).

Development discussions occur on the druid-development list (druid-development@googlegroups.com).

We also have a couple of people hanging out on IRC in #druid-dev on irc.freenode.net.