druid/extensions-contrib/kubernetes-overlord-extensions
Gian Merlino e40b96e026
Reverse lookup fixes and enhancements. (#15611)
* Reverse lookup fixes and enhancements.

1) Add a "mayIncludeUnknown" parameter to DimFilter#optimize. This is important
   because otherwise the reverse-lookup optimization is done improperly when
   the "in" filter appears under a "not", and the lookup extractionFn may return
   null for some possible values of the filtered column. The "includeUnknown" test
   cases in InDimFilterTest illustrate the difference in behavior.

2) Enhance InDimFilter#optimizeLookup to handle "mayIncludeUnknown", and to be able
   to do a reverse lookup in a wider variety of cases.

3) Make "unapply" protected in LookupExtractor, and move callers to "unapplyAll".
   The main reason is that MapLookupExtractor, a common implementation, lacks a
   reverse mapping and therefore does a scan of the map for each call to "unapply".
   For performance's sake, these calls need to be batched.

* Remove optimize call from BloomDimFilter.

* Follow the law.

* Fix tests.

* Fix imports.

* Switch function.

* Fix tests.

* More tests.
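
Item 3 of the commit message above says that a map-backed lookup without a reverse mapping has to scan the whole map on every "unapply" call, which is why callers are moved to a batched "unapplyAll". The following is a minimal Java sketch of that idea, assuming a plain in-memory `Map<String, String>`. The class `ReverseLookupSketch` and the signatures of its two methods are hypothetical and chosen for illustration; they are not Druid's `LookupExtractor` or `MapLookupExtractor` API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Druid's LookupExtractor or MapLookupExtractor.
final class ReverseLookupSketch {

    // Per-value reverse lookup: one full scan of the forward map per call.
    static List<String> unapply(Map<String, String> forward, String value) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> e : forward.entrySet()) {
            if (value != null && value.equals(e.getValue())) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Batched reverse lookup: a single scan serves every requested value.
    static Set<String> unapplyAll(Map<String, String> forward, Set<String> values) {
        Set<String> keys = new HashSet<>();
        for (Map.Entry<String, String> e : forward.entrySet()) {
            String v = e.getValue();
            if (v != null && values.contains(v)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, String> forward = new HashMap<>();
        forward.put("us", "North America");
        forward.put("ca", "North America");
        forward.put("fr", "Europe");
        forward.put("jp", "Asia");

        Set<String> wanted = new HashSet<>();
        wanted.add("North America");
        wanted.add("Asia");

        // One pass over the map, regardless of how many values are requested.
        System.out.println(unapplyAll(forward, wanted));
    }
}
```

With n map entries and k requested values, calling `unapply` once per value costs roughly n × k comparisons. The batched form scans the map once and does a hash-set membership check per entry.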
2024-01-03 13:28:44 -08:00
src Reverse lookup fixes and enhancements. (#15611) 2024-01-03 13:28:44 -08:00
README.md Add readme for kubernetes-overlord-extensions and update docs (#14674) 2023-08-01 13:29:44 -07:00
pom.xml unpin snakeyaml, add suppressions and licenses (#15549) 2023-12-15 10:33:14 -08:00

README.md

druid-kubernetes-overlord-extensions

Overview

The Kubernetes Task Scheduling extension allows a Druid cluster running on Kubernetes to schedule its tasks as Kubernetes Jobs instead of sending them to workers (middle managers or indexers).

Documentation

More detailed documentation on configuring and using the extension is available in the Druid documentation.