Apache Druid: a high-performance real-time analytics database.
Gian Merlino 2be7068f6e Fixes and improvements to SQL metadata caching. (#4551)
* Fixes and improvements to SQL metadata caching.

Also adds support for MultipleSpecificSegmentSpec to CachingClusteredClient.

SQL changes:
- Cache metadata on a per-segment level, in addition to per-dataSource, so
  we don't need to re-query all segments whenever a single new one appears.
  This should lower the load placed on the cluster by metadata queries.
- Fix race condition in DruidSchema that can cause us to miss metadata. It was
  possible to notice new segments, then issue a query, and have that query
  not actually hit those segments, and not notice that it didn't hit those segments.
  Then, the metadata from those segments would be ignored.
- Fix assumption in DruidSchema that all segments are immutable. Now, mutable
  segments are periodically re-queried.
- Fix inappropriate re-use of SchemaPlus. Now we create one for each planning
  cycle, rather than sharing one. It caches table objects, which we want to
  avoid, since it can cause stale metadata. We do the caching in DruidSchema
  so we don't need the SchemaPlus caching.
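
The per-segment caching idea described in the SQL changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual DruidSchema code: the class `SegmentMetadataCache`, its method names, and the use of plain strings for segment ids and column names are all hypothetical. It shows three points from the list: only segments that need it are queried, a segment stays pending until a query actually returns metadata for it, and mutable segments are periodically marked for re-query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical sketch of per-segment metadata caching; not Druid's real API.
class SegmentMetadataCache {
  // Column names cached per segment id.
  private final Map<String, Set<String>> columnsBySegment = new HashMap<>();
  // Segments whose metadata is missing or possibly stale.
  private final Set<String> segmentsNeedingRefresh = new HashSet<>();
  // Segments that may still change and so need periodic re-querying.
  private final Set<String> mutableSegments = new HashSet<>();

  void segmentAdded(String segmentId, boolean mutable) {
    segmentsNeedingRefresh.add(segmentId);
    if (mutable) {
      mutableSegments.add(segmentId);
    }
  }

  void segmentRemoved(String segmentId) {
    columnsBySegment.remove(segmentId);
    segmentsNeedingRefresh.remove(segmentId);
    mutableSegments.remove(segmentId);
  }

  // Intended to be called on a timer: mutable segments become pending again.
  void markMutableSegmentsForRefresh() {
    segmentsNeedingRefresh.addAll(mutableSegments);
  }

  // Queries only pending segments and returns how many were refreshed.
  // metadataQuery returns null when the query did not reach the segment;
  // such a segment stays pending instead of being silently skipped.
  int refresh(Function<String, Set<String>> metadataQuery) {
    int refreshed = 0;
    for (String segmentId : new ArrayList<>(segmentsNeedingRefresh)) {
      Set<String> columns = metadataQuery.apply(segmentId);
      if (columns != null) {
        columnsBySegment.put(segmentId, columns);
        segmentsNeedingRefresh.remove(segmentId);
        refreshed++;
      }
    }
    return refreshed;
  }

  // Table-level view: the union of columns across all cached segments.
  Set<String> tableColumns() {
    Set<String> merged = new TreeSet<>();
    for (Set<String> columns : columnsBySegment.values()) {
      merged.addAll(columns);
    }
    return merged;
  }
}
```

In this sketch, a newly announced segment triggers one metadata query for that segment only, rather than a re-query of every segment in the dataSource.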

Server changes:
- Add a TimelineCallback to TimelineServerView, for callers that want to get updates
  when the timeline has been modified.
- Change CachingClusteredClient from a QueryRunner to a QuerySegmentWalker. This
  allows it to accept queries that are segment-descriptor-based rather than
  intervals-based. In particular it will now support MultipleSpecificSegmentSpec.

* Fix DruidSchema, and unused imports.

* Remove unused import.

* Fix SqlBenchmark.
2017-07-20 10:14:15 -07:00
.idea Add PMD and prohibit unnecessary fully qualified class names in code (#4350) 2017-07-17 22:22:29 +09:00
api Adding double colums supports (#4491) 2017-07-20 10:14:14 +03:00
aws-common Add PMD and prohibit unnecessary fully qualified class names in code (#4350) 2017-07-17 22:22:29 +09:00
benchmarks Fixes and improvements to SQL metadata caching. (#4551) 2017-07-20 10:14:15 -07:00
bytebuffer-collections PolygonBound.contains() fix (#4553) 2017-07-20 10:12:46 +03:00
ci Run integration tests on travis (#4344) 2017-05-31 18:27:34 -07:00
codestyle Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474) 2017-07-13 12:15:32 -07:00
common Add PMD and prohibit unnecessary fully qualified class names in code (#4350) 2017-07-17 22:22:29 +09:00
distribution adding notice file to distribution (#4522) 2017-07-10 12:59:50 -07:00
docs Adding double colums supports (#4491) 2017-07-20 10:14:14 +03:00
examples Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474) 2017-07-13 12:15:32 -07:00
extendedset Add PMD and prohibit unnecessary fully qualified class names in code (#4350) 2017-07-17 22:22:29 +09:00
extensions-contrib Adding double colums supports (#4491) 2017-07-20 10:14:14 +03:00
extensions-core Fixes and improvements to SQL metadata caching. (#4551) 2017-07-20 10:14:15 -07:00
hll Use Double.NEGATIVE_INFINITY and Double.POSITIVE_INFINITY (#4496) 2017-07-07 09:10:13 -06:00
indexing-hadoop Adding double colums supports (#4491) 2017-07-20 10:14:14 +03:00
indexing-service Fix issue-4539 (#4546) 2017-07-19 09:38:29 -07:00
integration-tests Adding double colums supports (#4491) 2017-07-20 10:14:14 +03:00
java-util Add PMD and prohibit unnecessary fully qualified class names in code (#4350) 2017-07-17 22:22:29 +09:00
processing Adding double colums supports (#4491) 2017-07-20 10:14:14 +03:00
publications Changes to lambda architecture paper required for HICSS (#3382) 2016-09-06 21:32:21 -07:00
server Fixes and improvements to SQL metadata caching. (#4551) 2017-07-20 10:14:15 -07:00
services Fix RemoteTaskRunner's auto-scaling (#3768) 2017-07-14 09:11:39 +09:00
sql Fixes and improvements to SQL metadata caching. (#4551) 2017-07-20 10:14:15 -07:00
.gitignore move distribution artifacts to distribution/target 2015-10-30 12:40:05 -05:00
.travis.yml Use Ubuntu Precise for Travis unit test jobs (#4572) 2017-07-19 00:19:33 -06:00
CONTRIBUTING.md Update git workflow (#4418) 2017-06-16 19:27:46 -07:00
DruidCorporateCLA.pdf fix CLA email / mailing address 2014-04-17 15:26:28 -07:00
DruidIndividualCLA.pdf fix CLA email / mailing address 2014-04-17 15:26:28 -07:00
INTELLIJ_SETUP.md Add INTELLIJ_SETUP.md (#4261) 2017-05-17 01:26:16 +09:00
LICENSE Clean up README and license 2015-02-18 23:09:28 -08:00
NOTICE Copy closer into Druid codebase (fixes #3652) (#4153) 2017-04-10 09:38:45 +09:00
README.md Add TeamCity inspections badge (#4351) 2017-06-06 20:53:25 -04:00
druid_intellij_formatting.xml Make formatting IntelliJ 2016 friendly (#2978) 2016-05-18 12:42:21 -07:00
eclipse.importorder Merge pull request #2905 from javasoze/eclipse_formatting 2016-04-29 18:42:03 -07:00
eclipse_formatting.xml Merge pull request #2905 from javasoze/eclipse_formatting 2016-04-29 18:42:03 -07:00
pom.xml Add PMD and prohibit unnecessary fully qualified class names in code (#4350) 2017-07-17 22:22:29 +09:00
upload.sh upload.sh: Use awscli if s3cmd is not available. (#3114) 2016-06-08 17:01:46 -07:00

README.md


Druid

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments.

Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

Druid can load both streaming and batch data and integrates with Samza, Kafka, Storm, Spark, and Hadoop.

License

Apache License, Version 2.0

More Information

More information about Druid can be found on http://www.druid.io.

Documentation

You can find the documentation for the latest Druid release on the project website.

If you would like to contribute documentation, please do so under /docs/content in this repository and submit a pull request.

Getting Started

You can get started with Druid with our quickstart.

Reporting Issues

If you find any bugs, please file a GitHub issue.

Community

Community support is available on the druid-user mailing list (druid-user@googlegroups.com).

Development discussions occur on the druid-development list (druid-development@googlegroups.com).

We also have a couple of people hanging out on IRC in #druid-dev on irc.freenode.net.

Contributing

Please follow the guidelines listed in CONTRIBUTING.md.