2be7068f6e
Fixes and improvements to SQL metadata caching. Also adds support for MultipleSpecificSegmentSpec to CachingClusteredClient.

SQL changes:

- Cache metadata at a per-segment level, in addition to per-dataSource, so we don't need to re-query all segments whenever a single new one appears. This should lower the load placed on the cluster by metadata queries.
- Fix a race condition in DruidSchema that could cause us to miss metadata. It was possible to notice new segments, then issue a query, have that query not actually hit those segments, and not notice the miss. The metadata from those segments would then be ignored.
- Fix the assumption in DruidSchema that all segments are immutable. Now, mutable segments are periodically re-queried.
- Fix inappropriate re-use of SchemaPlus. We now create one for each planning cycle rather than sharing one. SchemaPlus caches table objects, which we want to avoid since it can cause stale metadata; we do the caching in DruidSchema instead, so we don't need the SchemaPlus caching.

Server changes:

- Add a TimelineCallback to TimelineServerView, for callers that want to get updates when the timeline has been modified.
- Change CachingClusteredClient from a QueryRunner to a QuerySegmentWalker. This allows it to accept queries that are segment-descriptor-based rather than interval-based. In particular, it will now support MultipleSpecificSegmentSpec.

Follow-up commits: fix DruidSchema and unused imports; remove an unused import; fix SqlBenchmark.
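The TimelineCallback change described above is an observer-style hook: callers register a listener and are notified when segments enter or leave the timeline. The sketch below illustrates that pattern only; the class and method names (`SegmentTimeline`, `segmentAdded`, `segmentRemoved`) are illustrative assumptions, not Druid's actual TimelineServerView API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Minimal sketch of a timeline that notifies registered callbacks when
// segments are added or removed. Names are hypothetical, for illustration.
public class SegmentTimeline
{
  public interface TimelineCallback
  {
    void segmentAdded(String segmentId);

    void segmentRemoved(String segmentId);
  }

  private final List<String> segments = new ArrayList<>();
  // CopyOnWriteArrayList lets callbacks be registered concurrently with notification.
  private final List<TimelineCallback> callbacks = new CopyOnWriteArrayList<>();

  public void registerCallback(TimelineCallback callback)
  {
    callbacks.add(callback);
  }

  public void addSegment(String segmentId)
  {
    segments.add(segmentId);
    for (TimelineCallback cb : callbacks) {
      cb.segmentAdded(segmentId);
    }
  }

  public void removeSegment(String segmentId)
  {
    segments.remove(segmentId);
    for (TimelineCallback cb : callbacks) {
      cb.segmentRemoved(segmentId);
    }
  }

  public static void main(String[] args)
  {
    SegmentTimeline timeline = new SegmentTimeline();
    List<String> events = new ArrayList<>();
    timeline.registerCallback(new TimelineCallback() {
      @Override public void segmentAdded(String id) { events.add("added:" + id); }
      @Override public void segmentRemoved(String id) { events.add("removed:" + id); }
    });
    timeline.addSegment("seg1");
    timeline.removeSegment("seg1");
    System.out.println(events); // [added:seg1, removed:seg1]
  }
}
```

A consumer such as a metadata cache would register a callback like this once at startup, then re-query only the segments it is told about, rather than polling the whole timeline.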
.idea
api
aws-common
benchmarks
bytebuffer-collections
ci
codestyle
common
distribution
docs
examples
extendedset
extensions-contrib
extensions-core
hll
indexing-hadoop
indexing-service
integration-tests
java-util
processing
publications
server
services
sql
.gitignore
.travis.yml
CONTRIBUTING.md
DruidCorporateCLA.pdf
DruidIndividualCLA.pdf
INTELLIJ_SETUP.md
LICENSE
NOTICE
README.md
druid_intellij_formatting.xml
eclipse.importorder
eclipse_formatting.xml
pom.xml
upload.sh
README.md
Druid
Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments.
Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.
Druid can load both streaming and batch data and integrates with Samza, Kafka, Storm, Spark, and Hadoop.
License
More Information
More information about Druid can be found at http://www.druid.io.
Documentation
You can find the documentation for the latest Druid release on the project website.
If you would like to contribute documentation, please do so under /docs/content in this repository and submit a pull request.
Getting Started
You can get started with Druid with our quickstart.
Reporting Issues
If you find any bugs, please file a GitHub issue.
Community
Community support is available on the druid-user mailing list (druid-user@googlegroups.com).
Development discussions occur on the druid-development list (druid-development@googlegroups.com).
We also have a couple of people hanging out on IRC in #druid-dev on irc.freenode.net.
Contributing
Please follow the guidelines listed here.