druid/docs/querying/joins.md
Gian Merlino 42590ae64b
Refresh query docs. (#9704)
* Refresh query docs.

Larger changes:

- New doc: querying/datasource.md describes the various kinds of
datasources you can use, and has examples for both SQL and native.
- New doc: querying/query-execution.md describes how native queries
are executed at a high level. It doesn't go into the details of specific
query engines or how queries run at a per-segment level. But I think it
would be good to add or link that content here in the future.
- Refreshed doc: querying/sql.md updated to refer to joins, reformatted
a bit, added a new "Query translation" section that explains how
queries are translated from SQL to native, and removed configuration
details (moved to configuration/index.md).
- Refreshed doc: querying/joins.md updated to refer to join datasources.

Smaller changes:

- Add helpful banners to the top of query documentation pages telling
people whether a given page describes SQL, native, or both.
- Add SQL metrics to operations/metrics.md.
- Add some color and cross-links in various places.
- Add native query component docs to the sidebar, and renamed them so
they look nicer.
- Remove Select query from the sidebar.
- Fix Broker SQL configs in configuration/index.md. Remove them from
querying/sql.md.
- Combined querying/searchquery.md and querying/searchqueryspec.md.

* Updates.

* Fix numbering.

* Fix glitches.

* Add new words to spellcheck file.

* Assorted changes.

* Further adjustments.

* Add missing punctuation.
2020-04-15 16:12:20 -07:00

2.0 KiB

id title
joins Joins

Druid has two features related to joining of data:

  1. Join operators. These are available using a join datasource in native queries, or using the JOIN operator in Druid SQL. Refer to the join datasource documentation for information about how joins work in Druid.
  2. Query-time lookups, simple key-to-value mappings. These are preloaded on all servers that are involved in queries and can be accessed with or without an explicit join operator. Refer to the lookups documentation for more details.

Whenever possible, for best performance it is good to avoid joins at query time. Often this can be accomplished by joining data before it is loaded into Druid. However, there are situations where joins or lookups are the best solution available despite the performance overhead, including:

  • The fact-to-dimension (star and snowflake schema) case: you need to change dimension values after initial ingestion, and aren't able to reingest to do this. In this case, you can use lookups for your dimension tables.
  • Your workload requires joins or filters on subqueries.