druid

Gian Merlino 93aeaf4801 Improve on-heap aggregator footprint estimates. (#11950)	2021-11-28 13:21:24 +05:30

Add a "guessAggregatorHeapFootprint" method to AggregatorFactory that mitigates #6743 by enabling heap footprint estimates based on a specific number of rows.

The idea is that at ingestion time, the number of rows that go into an aggregator will be 1 (if rollup is off) or will likely be a small number (if rollup is on). It's a heuristic, because of course nothing guarantees that the rollup ratio is a small number. But it's a common case, and I expect this logic to go wrong much less often than the current logic. Also, when it does go wrong, users can fix it by lowering maxRowsInMemory or maxBytesInMemory.

The current situation is unintuitive: when the estimation goes wrong, users get an OOME, but actually they need to raise these limits to fix it.
src	Improve on-heap aggregator footprint estimates. (#11950)	2021-11-28 13:21:24 +05:30
README.md	update links datasketches.github.io to datasketches.apache.org (#10107)	2020-07-01 14:56:17 -07:00
pom.xml	bump version to 0.23.0-SNAPSHOT (#11670)	2021-09-08 15:56:04 -07:00

This module provides Druid aggregators based on the Apache DataSketches library (https://datasketches.apache.org/).

Credits: This module is the result of feedback and work done by the following people.