mirror of https://github.com/apache/druid.git
docs: copyedits for MSQ join algos (#14012)
This commit is contained in:
parent f128b9b666
commit cc37987dff

@@ -244,16 +244,22 @@ The following table lists the context parameters for the MSQ task engine:

## Joins

Joins in multi-stage queries use one of two algorithms based on what you set the [context parameter](#context-parameters) `sqlJoinAlgorithm` to:

- [`broadcast`](#broadcast) (default)
- [`sortMerge`](#sort-merge)

If you omit this context parameter, the MSQ task engine uses broadcast since it's the default join algorithm. The context parameter applies to the entire SQL statement, so you can't mix different join algorithms in the same query.

### Broadcast

The default join algorithm for multi-stage queries is a broadcast hash join, which is similar to how [joins are executed with native queries](../querying/query-execution.md#join).

To use broadcast joins, either omit the `sqlJoinAlgorithm` context parameter or set it to `broadcast`.

For a broadcast join, any adjacent joins are flattened into a structure with a "base" input (the bottom-leftmost one) and other leaf inputs (the rest). Next, any subqueries that are inputs to the join (either base or other leaves) are planned into independent stages. Then, the non-base leaf inputs are all connected as broadcast inputs to the "base" stage.

@@ -266,13 +272,14 @@ Only LEFT JOIN, INNER JOIN, and CROSS JOIN are supported with `broadcast`.

Join conditions, if present, must be equalities. It is not necessary to include a join condition; for example, `CROSS JOIN` and comma join do not require join conditions.

The following example has a single join chain where `orders` is the base input while `products` and `customers` are non-base leaf inputs. The broadcast inputs (`products` and `customers`) must fall under the limit on broadcast table footprint, but the base `orders` input can be unlimited in size.

The query reads `products` and `customers` and then broadcasts both to the stage that reads `orders`. That stage loads the broadcast inputs (`products` and `customers`) in memory and walks through `orders` row by row. The results are aggregated and written to the table `orders_enriched`.

```
REPLACE INTO orders_enriched
OVERWRITE ALL

@@ -291,26 +298,23 @@ CLUSTERED BY product_name

### Sort-merge

You can use the sort-merge join algorithm to make queries more scalable at the cost of performance. If your goal is performance, consider [broadcast joins](#broadcast). There are various scenarios where a broadcast join would return a [`BroadcastTablesTooLarge`](#error-codes) error, but a sort-merge join would succeed.

To use the sort-merge join algorithm, set the context parameter `sqlJoinAlgorithm` to `sortMerge`.

In a sort-merge join, each pairwise join is planned into its own stage with two inputs. The two inputs are partitioned and sorted using a hash partitioning on the same key.

When using the sort-merge algorithm, keep the following in mind:

- There is no limit on the overall size of either input, so sort-merge is a good choice for performing a join of two large inputs or for performing a self-join of a large input with itself.
- There is a limit on the amount of data associated with each individual key. If _both_ sides of the join exceed this limit, the query returns a [`TooManyRowsWithSameKey`](#error-codes) error. If only one side exceeds the limit, the query does not return this error.
- Join conditions are optional but must be equalities if they are present. For example, `CROSS JOIN` and comma join do not require join conditions.
- All join types are supported with `sortMerge`: LEFT, RIGHT, INNER, FULL, and CROSS.

The following example runs using a single sort-merge join stage that receives `eventstream` (partitioned on `user_id`) and `users` (partitioned on `id`) as inputs. There is no limit on the size of either input.

```
@@ -328,6 +332,8 @@ PARTITIONED BY HOUR
CLUSTERED BY user
```

The context parameter that sets `sqlJoinAlgorithm` to `sortMerge` is not shown in the above example.

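One way to supply it is through the `context` object of the request to the SQL task API. The snippet below is a minimal sketch, assuming a Router or Broker at `http://localhost:8888`, the Python `requests` library, and a placeholder query string that you would replace with the statement above.

```python
# Minimal sketch: submit an MSQ statement with sortMerge joins enabled.
# Assumptions: a Druid Router/Broker at http://localhost:8888 exposing the
# SQL task endpoint, and the `requests` library installed. The query string
# is a placeholder, not the full example from this page.
import requests

payload = {
    "query": "<your REPLACE INTO ... SELECT ... statement>",  # placeholder
    "context": {
        "sqlJoinAlgorithm": "sortMerge",  # applies to the whole SQL statement
    },
}

response = requests.post("http://localhost:8888/druid/v2/sql/task", json=payload)
response.raise_for_status()
print(response.json())  # includes the ID of the ingestion task that was launched
```

Setting `sqlJoinAlgorithm` to `broadcast`, or omitting it entirely, selects the broadcast algorithm instead.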

## Durable Storage

Using durable storage with your SQL-based ingestions can improve their reliability by writing intermediate files to a storage location temporarily.
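
As an illustration of the same submission pattern, the sketch below assumes durable storage is already configured on the cluster and that your Druid version accepts the `durableShuffleStorage` context parameter; both are assumptions for this example rather than statements from this page.

```python
# Illustrative sketch: request durable storage for one MSQ ingestion.
# Assumptions: the cluster has durable storage configured, the
# durableShuffleStorage context parameter is supported by your version,
# and the SQL task endpoint is reachable at http://localhost:8888.
import requests

payload = {
    "query": "<your REPLACE INTO ... SELECT ... statement>",  # placeholder
    "context": {
        "durableShuffleStorage": True,  # write intermediate files to durable storage
    },
}

requests.post("http://localhost:8888/druid/v2/sql/task", json=payload).raise_for_status()
```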