druid

Apache Druid: a high performance real-time analytics database.

Gian Merlino 501dcb43fa	2015-11-23 11:22:08 -08:00

Some changes that make it possible to restart tasks on the same hardware.

This is done by killing and respawning the jvms rather than reconnecting to existing jvms, for a couple reasons. One is that it lets you restore tasks after server reboots too, and another is that it lets you upgrade all the software on a box at once by just restarting everything.

The main changes are,

1) Add "canRestore" and "stopGracefully" methods to Tasks that say if a task can stop gracefully, and actually do a graceful stop. RealtimeIndexTask is the only one that currently implements this.

2) Add "stop" method to TaskRunners that attempts to do an orderly shutdown.
   ThreadPoolTaskRunner- call stopGracefully on restorable tasks, wait for exit
   ForkingTaskRunner- close output stream to restorable tasks, wait for exit
   RemoteTaskRunner- do nothing special, we actually don't want to shutdown

3) Add "restore" method to TaskRunners that attempts to bootstrap tasks from last run. Only ForkingTaskRunner does anything here. It maintains a "restore.json" file with a list of restorable tasks.

4) Have the CliPeon's ExecutorLifecycle lock the task base directory to avoid a restored task and a zombie old task from stomping on each other.

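The graceful-stop flow described in the commit message above can be illustrated with a minimal sketch. This is not Druid's actual code: only the method names `canRestore`, `stopGracefully`, and `stop` come from the commit message, while the `Task` interface shape, `LoopingTask`, `SimpleRunner`, and the timeout parameter are hypothetical names invented for illustration. The sketch assumes a thread-pool style runner that asks restorable tasks to stop, waits for them to exit, and interrupts whatever is left. Persisting the returned ids (the role the commit gives to "restore.json") is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RestorableTaskSketch {
  /** Hypothetical, simplified task interface; not Druid's real Task API. */
  interface Task {
    String getId();
    void run() throws InterruptedException;
    boolean canRestore();     // can this task stop gracefully and be restored later?
    void stopGracefully();    // ask the task to finish up and exit
  }

  /** Hypothetical task that blocks until it is asked to stop (or is interrupted). */
  static class LoopingTask implements Task {
    private final String id;
    private final boolean restorable;
    private final CountDownLatch stopRequested = new CountDownLatch(1);

    LoopingTask(String id, boolean restorable) {
      this.id = id;
      this.restorable = restorable;
    }

    public String getId() { return id; }
    public void run() throws InterruptedException { stopRequested.await(); }
    public boolean canRestore() { return restorable; }
    public void stopGracefully() { stopRequested.countDown(); }
  }

  /** Hypothetical thread-pool runner: stop() signals restorable tasks, then waits for exit. */
  static class SimpleRunner {
    private final ExecutorService exec = Executors.newCachedThreadPool();
    private final List<Task> running = new ArrayList<>();

    synchronized void submit(Task task) {
      running.add(task);
      exec.submit(() -> { task.run(); return null; });
    }

    /** Returns the ids of the tasks that were asked to stop gracefully. */
    synchronized List<String> stop(long timeoutMillis) {
      List<String> restorable = new ArrayList<>();
      for (Task task : running) {
        if (task.canRestore()) {
          task.stopGracefully();
          restorable.add(task.getId());
        }
      }
      exec.shutdown();
      try {
        // Assumption: tasks still running after the timeout are simply interrupted.
        if (!exec.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
          exec.shutdownNow();
        }
      } catch (InterruptedException e) {
        exec.shutdownNow();
        Thread.currentThread().interrupt();
      }
      return restorable;
    }
  }

  public static void main(String[] args) {
    SimpleRunner runner = new SimpleRunner();
    runner.submit(new LoopingTask("realtime-1", true));
    runner.submit(new LoopingTask("batch-1", false));
    System.out.println(runner.stop(200)); // prints [realtime-1]
  }
}
```

A runner that forks one JVM per task would need a different stop signal (the commit message mentions closing the task's output stream), so the latch used here is only a stand-in for whatever signal the real runner uses.
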
aws-common	cleanup and remove unused imports	2015-11-11 12:25:21 -08:00
benchmarks	cleanup and remove unused imports	2015-11-11 12:25:21 -08:00
common	Merge pull request #1896 from gianm/allocate-segment	2015-11-18 21:05:46 -08:00
distribution	Added cloudfiles-extensions in order to support Rackspace's cloudfiles as deep storage	2015-11-04 17:44:48 +01:00
docs	Some changes that make it possible to restart tasks on the same hardware.	2015-11-23 11:22:08 -08:00
examples	New extension loading mechanism	2015-10-21 14:22:36 -05:00
extensions	reformat datasketches module to satisfy druid style guidelines	2015-11-19 01:07:03 -06:00
indexing-hadoop	Merge pull request #1896 from gianm/allocate-segment	2015-11-18 21:05:46 -08:00
indexing-service	Some changes that make it possible to restart tasks on the same hardware.	2015-11-23 11:22:08 -08:00
integration-tests	switch to Java 8 + cleanup	2015-11-17 13:35:06 -08:00
processing	Merge pull request #1974 from jon-wei/dim_order_merge	2015-11-18 19:51:34 -06:00
publications	more edits to radstack paper	2015-10-18 19:52:44 -07:00
server	Some changes that make it possible to restart tasks on the same hardware.	2015-11-23 11:22:08 -08:00
services	Merge pull request #1896 from gianm/allocate-segment	2015-11-18 21:05:46 -08:00
.gitignore	move distribution artifacts to distribution/target	2015-10-30 12:40:05 -05:00
.travis.yml	faster build: cache maven dependencies in travis	2015-08-07 18:05:30 -07:00
CONTRIBUTING.md	Minor documentation fixes for CONTRIBUTING.md	2015-09-29 12:22:35 -06:00
DruidCorporateCLA.pdf	fix CLA email / mailing address	2014-04-17 15:26:28 -07:00
DruidIndividualCLA.pdf	fix CLA email / mailing address	2014-04-17 15:26:28 -07:00
LICENSE	Clean up README and license	2015-02-18 23:09:28 -08:00
NOTICE	support for PasswordProvider interface to enable writing druid extension which can get metadata store password from secured location or anywhere instead of plain text properties file	2015-02-25 14:05:19 -06:00
README.md	Revert "Update README.md"	2015-09-22 20:38:12 -07:00
eclipse_formatting.xml	towards a community led druid	2015-01-31 20:57:36 -08:00
intellij_formatting.jar	rename code style jar	2014-11-21 14:51:23 -08:00
pom.xml	adding datasketches module to top level pom	2015-11-12 00:04:33 -06:00
upload.sh	rework tarball distribution:	2015-08-18 18:32:33 -07:00

README.md

Druid

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments.

Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

Druid can load both streaming and batch data and integrates with Samza, Kafka, Storm, and Hadoop.

License

Apache License, Version 2.0

More Information

More information about Druid can be found at http://www.druid.io.

Documentation

You can find the latest Druid documentation on the project website.

If you would like to contribute documentation, please do so under /docs/content in this repository and submit a pull request.

Tutorials

We have a series of tutorials to help you get started with Druid. If you are new to Druid, we suggest going over the first Druid tutorial.

Reporting Issues

If you find any bugs, please file a GitHub issue.

Community

Community support is available on the druid-user mailing list (druid-user@googlegroups.com).

Development discussions occur on the druid-development list (druid-development@googlegroups.com).

We also have a couple of people hanging out on IRC in #druid-dev on irc.freenode.net.