* Frame format for data transfer and short-term storage. As we move towards query execution plans that involve more transfer of data between servers, it's important to have a data format that provides for doing this more efficiently than the options available to us today. This patch adds: - Columnar frames, which support fast querying. - Row-based frames, which support fast sorting via memory comparison and fast whole-row copies via memory copying. - Frame files, a container format that can be stored on disk or transferred between servers. The idea is we should use row-based frames when data is expected to be sorted, and columnar frames when data is expected to be queried. The code in this patch is not used in production yet. Therefore, the patch involves minimal changes outside of the org.apache.druid.frame package. The main ones are adjustments to SqlBenchmark to add benchmarks for queries on frames, and the addition of a "forEach" method to Sequence. * Fixes based on tests, static analysis. * Additional fixes. * Skip DS mapping tests on JDK 14+ * Better JDK checking in tests. * Fix imports. * Additional comment. * Adjustments from code review. * Update test case.
Website | Documentation | Developer Mailing List | User Mailing List | Slack | Twitter | Download
Apache Druid
Druid is a high performance real-time analytics database. Druid's main value add is to reduce time to insight and action.
Druid is designed for workflows where fast queries and ingest really matter. Druid excels at powering UIs, running operational (ad-hoc) queries, or handling high concurrency. Consider Druid as an open source alternative to data warehouses for a variety of use cases. The design documentation explains the key concepts.
Getting started
You can get started with Druid with our local or Docker quickstart.
Druid provides a rich set of APIs (via HTTP and JDBC) for loading, managing, and querying your data. You can also interact with Druid via the built-in console (shown below).
Load data
Load streaming and batch data using a point-and-click wizard to guide you through ingestion setup. Monitor one off tasks and ingestion supervisors.
Manage the cluster
Manage your cluster with ease. Get a view of your datasources, segments, ingestion tasks, and services from one convenient location. All powered by SQL systems tables, allowing you to see the underlying query for each view.
Issue queries
Use the built-in query workbench to prototype DruidSQL and native queries or connect one of the many tools that help you make the most out of Druid.
Documentation
You can find the documentation for the latest Druid release on the project website.
If you would like to contribute documentation, please do so under
/docs
in this repository and submit a pull request.
Community
Community support is available on the druid-user mailing list, which is hosted at Google Groups.
Development discussions occur on dev@druid.apache.org, which you can subscribe to by emailing dev-subscribe@druid.apache.org.
Chat with Druid committers and users in real-time on the Apache Druid Slack channel. Please use this invitation link to join and invite others.
Building from source
Please note that JDK 8 or JDK 11 is required to build Druid.
For instructions on building Druid from source, see docs/development/build.md
Contributing
Please follow the community guidelines for contributing.
For instructions on setting up IntelliJ dev/intellij-setup.md