2018-12-13 14:47:20 -05:00
|
|
|
---
|
2019-08-21 00:48:59 -04:00
|
|
|
id: joins
|
2018-12-13 14:47:20 -05:00
|
|
|
title: "Joins"
|
|
|
|
---
|
|
|
|
|
2018-11-13 12:38:37 -05:00
|
|
|
<!--
|
|
|
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
|
|
|
~ or more contributor license agreements. See the NOTICE file
|
|
|
|
~ distributed with this work for additional information
|
|
|
|
~ regarding copyright ownership. The ASF licenses this file
|
|
|
|
~ to you under the Apache License, Version 2.0 (the
|
|
|
|
~ "License"); you may not use this file except in compliance
|
|
|
|
~ with the License. You may obtain a copy of the License at
|
|
|
|
~
|
|
|
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
~
|
|
|
|
~ Unless required by applicable law or agreed to in writing,
|
|
|
|
~ software distributed under the License is distributed on an
|
|
|
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
|
|
|
~ KIND, either express or implied. See the License for the
|
|
|
|
~ specific language governing permissions and limitations
|
|
|
|
~ under the License.
|
|
|
|
-->
|
|
|
|
|
2022-05-17 19:56:31 -04:00
|
|
|
Apache Druid has two features related to joining of data:
|
2015-08-21 16:54:00 -04:00
|
|
|
|
2020-04-15 19:12:20 -04:00
|
|
|
1. [Join](datasource.md#join) operators. These are available using a [join datasource](datasource.md#join) in native
|
2022-05-17 19:56:31 -04:00
|
|
|
queries, or using the [JOIN operator](sql.md) in Druid SQL. Refer to the
|
2020-04-15 19:12:20 -04:00
|
|
|
[join datasource](datasource.md#join) documentation for information about how joins work in Druid.
|
|
|
|
2. [Query-time lookups](lookups.md), simple key-to-value mappings. These are preloaded on all servers that are involved
|
|
|
|
in queries and can be accessed with or without an explicit join operator. Refer to the [lookups](lookups.md)
|
|
|
|
documentation for more details.
|
2015-08-21 16:54:00 -04:00
|
|
|
|
2020-04-15 19:12:20 -04:00
|
|
|
Whenever possible, for best performance it is good to avoid joins at query time. Often this can be accomplished by
|
|
|
|
joining data before it is loaded into Druid. However, there are situations where joins or lookups are the best solution
|
|
|
|
available despite the performance overhead, including:
|
2015-08-21 16:54:00 -04:00
|
|
|
|
2020-04-15 19:12:20 -04:00
|
|
|
- The fact-to-dimension (star and snowflake schema) case: you need to change dimension values after initial ingestion,
|
|
|
|
and aren't able to reingest to do this. In this case, you can use lookups for your dimension tables.
|
|
|
|
- Your workload requires joins or filters on subqueries.
|