druid/processing
Gian Merlino 202c78c8f3
Enable rewriting certain inner joins as filters. (#11068)
* Enable rewriting certain inner joins as filters.

The main logic for doing the rewrite is in JoinableFactoryWrapper's
segmentMapFn method. The requirements are:

- It must be an inner equi-join.
- The right-hand columns referenced by the condition must not contain any
  duplicate values. (If they did, the inner join would not be guaranteed
  to return at most one row for each left-hand-side row.)
- No columns from the right-hand side can be used by anything other than
  the join condition itself.

HashJoinSegmentStorageAdapter is also modified to pass through to
the base adapter (even allowing vectorization!) in the case where 100%
of join clauses could be rewritten as filters.

In support of this goal:

- Add Query getRequiredColumns() method to help us figure out whether
  the right-hand side of a join datasource is being used or not.
- Add JoinConditionAnalysis getRequiredColumns() method to help us
  figure out if the right-hand side of a join is being used by later
  join clauses acting on the same base.
- Add Joinable getNonNullColumnValuesIfAllUnique method to enable
  retrieving the set of values that will form the "in" filter.
- Add LookupExtractor canGetKeySet() and keySet() methods to support
  LookupJoinable in its efforts to implement the new Joinable method.
- Add "enableRewriteJoinToFilter" feature flag to
  JoinFilterRewriteConfig. The default is disabled.

* Test improvements.

* Test fixes.

* Avoid slow size() call.

* Remove invalid test.

* Fix style.

* Fix mistaken default.

* Small fixes.

* Fix logic error.
2021-04-14 10:49:27 -07:00
..
src Enable rewriting certain inner joins as filters. (#11068) 2021-04-14 10:49:27 -07:00
pom.xml Bump dev version to 0.22.0-SNAPSHOT (#10759) 2021-01-15 13:16:23 -08:00