druid


Kashif Faraz e648b01afb	Improve memory estimates in Aggregator and DimensionIndexer (#12073)	2022-02-03 10:34:02 +05:30

Fixes #12022

### Description

The current implementations of memory estimation in `OnHeapIncrementalIndex` and `StringDimensionIndexer` tend to over-estimate which leads to more persistence cycles than necessary. This PR replaces the max estimation mechanism with getting the incremental memory used by the aggregator or indexer at each invocation of `aggregate` or `encode` respectively.

### Changes

- Add new flag `useMaxMemoryEstimates` in the task context. This overrides the same flag in DefaultTaskConfig i.e. `druid.indexer.task.default.context` map
- Add method `AggregatorFactory.factorizeWithSize()` that returns an `AggregatorAndSize` which contains the aggregator instance and the estimated initial size of the aggregator
- Add method `Aggregator.aggregateWithSize()` which returns the incremental memory used by this aggregation step
- Update the method `DimensionIndexer.processRowValsToKeyComponent()` to return the encoded key component as well as its effective size in bytes
- Update `OnHeapIncrementalIndex` to use the new estimations only if `useMaxMemoryEstimates = false`
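
The difference between the two estimation modes described in this commit can be sketched roughly as follows. This is an illustrative toy, not Druid's code: `SizedAggregator`, `CollectingAggregator`, `MemoryEstimateSketch`, and all byte constants are hypothetical stand-ins. Only the method name `aggregateWithSize` and the flag name `useMaxMemoryEstimates` come from the commit message, and the signatures used here are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in; NOT Druid's actual Aggregator interface.
interface SizedAggregator {
    // Aggregates one value and returns the estimated extra bytes used by this call.
    long aggregateWithSize(long value);
}

// Toy aggregator that keeps every value, so its footprint grows per call.
final class CollectingAggregator implements SizedAggregator {
    // Assumed fixed overhead of a freshly created aggregator (made-up number).
    static final long INITIAL_SIZE_BYTES = 16L;
    // Assumed cost of storing one more value (ignores boxing and list overhead).
    static final long BYTES_PER_VALUE = Long.BYTES;

    private final List<Long> values = new ArrayList<>();

    @Override
    public long aggregateWithSize(long value) {
        values.add(value);
        return BYTES_PER_VALUE;
    }
}

final class MemoryEstimateSketch {
    // Returns the estimated bytes charged for aggregating rowValues.
    // With useMaxMemoryEstimates = true, a fixed worst-case size is charged up front.
    // With useMaxMemoryEstimates = false, the initial size plus each per-call
    // increment reported by the aggregator is charged instead.
    static long estimate(long[] rowValues, boolean useMaxMemoryEstimates, long maxBytesPerAggregator) {
        if (useMaxMemoryEstimates) {
            return maxBytesPerAggregator;
        }
        SizedAggregator aggregator = new CollectingAggregator();
        long total = CollectingAggregator.INITIAL_SIZE_BYTES;
        for (long value : rowValues) {
            total += aggregator.aggregateWithSize(value);
        }
        return total;
    }
}
```

With three values and an assumed worst case of 1024 bytes, the incremental mode in this sketch charges 16 + 3 × 8 = 40 bytes, while the max mode charges the full 1024. A lower running estimate means the index reaches its memory limit later, which is the reduction in persistence cycles the commit describes.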
src/test	Improve memory estimates in Aggregator and DimensionIndexer (#12073)	2022-02-03 10:34:02 +05:30
assembly.xml	Fix for building in Eclipse & VS Code. (#7481)	2020-02-13 14:58:32 -08:00
pom.xml	bump version to 0.23.0-SNAPSHOT (#11670)	2021-09-08 15:56:04 -07:00