OpenSearch/x-pack/plugin
Nik Everett 2f38aeb5e2
Save memory when numeric terms agg is not top (#55873) (#56454)
Right now all implementations of the `terms` agg allocate a new
`Aggregator` per bucket. This uses a bunch of memory. Exactly how much
isn't clear but each `Aggregator` ends up making its own objects to read
doc values which have non-trivial buffers. And it forces all of its
sub-aggregations to do the same. We allocate a new `Aggregator` per
bucket for two reasons:

1. We didn't have an appropriate data structure to track the
   sub-ordinals of each parent bucket.
2. You can only make a single call to `runDeferredCollections(long...)`
   per `Aggregator` which was the only way to delay collection of
   sub-aggregations.

This change switches the method that builds aggregation results from
building them one at a time to building all of the results for the
entire aggregator at the same time.

It also adds a fairly simplistic data structure to track the sub-ordinals
for `long`-keyed buckets.
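
To illustrate the idea, here is a minimal sketch of what tracking
sub-ordinals for `long`-keyed buckets could look like. It is not the class
added by this change: `SubOrdinalTracker`, its methods, and the
`-1 - ord` return convention for already-seen pairs are all hypothetical
names and choices made for this example, and it uses plain `HashMap`s
rather than anything memory-efficient.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: maps (owning bucket ordinal, long key) pairs to
 * dense ordinals so one aggregator can serve many parent buckets.
 */
final class SubOrdinalTracker {
    // owning bucket ordinal -> (long key -> assigned ordinal)
    private final Map<Long, Map<Long, Long>> ordsByOwner = new HashMap<>();
    private long nextOrd = 0;

    /**
     * Registers a key under a parent bucket.
     * Returns the newly assigned ordinal if the pair is new, or
     * (-1 - existingOrd) if the pair was already registered.
     * (The negative encoding is an assumption of this sketch.)
     */
    long add(long owningBucketOrd, long value) {
        Map<Long, Long> ords =
            ordsByOwner.computeIfAbsent(owningBucketOrd, k -> new HashMap<>());
        Long existing = ords.get(value);
        if (existing != null) {
            return -1 - existing;
        }
        ords.put(value, nextOrd);
        return nextOrd++;
    }

    /** Number of distinct (owning bucket, key) pairs seen so far. */
    long size() {
        return nextOrd;
    }
}
```

In this sketch the same key under two different parent buckets gets two
different ordinals, so per-bucket state (doc counts, sub-aggregation
slots) can be kept in flat arrays indexed by ordinal instead of in one
`Aggregator` per parent bucket.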

It uses both of those to power numeric `terms` aggregations and removes
the per-bucket allocation of their `Aggregator`. This fairly
substantially reduces memory consumption of numeric `terms` aggregations
that are not the "top level", especially when those aggregations contain
many sub-aggregations. It also is a pretty big speed up, especially when
the aggregation is under a non-selective aggregation like
the `date_histogram`.

I picked numeric `terms` aggregations because those have the simplest
implementation. At least, I could kind of fit it in my head. And I
haven't fully understood the "bytes"-based terms aggregations, but I
imagine I'll be able to make similar optimizations to them in follow up
changes.
2020-05-08 20:38:53 -04:00
..
analytics Save memory when numeric terms agg is not top (#55873) (#56454) 2020-05-08 20:38:53 -04:00
async-search Async Search: correct shards counting (#55758) 2020-05-06 12:13:30 +02:00
autoscaling Add wire tests for get autoscaling decision objects 2020-04-05 21:34:36 -04:00
ccr Add Functionality to Consistently Read RepositoryData For CS Updates (#55773) (#56091) 2020-05-04 08:13:14 +02:00
core [7.x][ML] Allow stopping DF analytics whose config is missing (#56360) (#56408) 2020-05-08 13:54:44 +03:00
deprecation Add xpack setting deprecations to deprecation API (#56290) 2020-05-07 10:28:17 -04:00
enrich Deprecated xpack "enable" settings should be no-ops (#55416) (#56167) 2020-05-05 10:40:49 -04:00
eql EQL: simplify equals/not-equals TRUE/FALSE expressions (#56191) (#56306) 2020-05-07 03:02:04 +03:00
frozen-indices Allow searching of snapshot taken while indexing (#55511) 2020-04-21 13:21:38 +01:00
graph Convert remaining license methods to isAllowed (#55908) (#55991) 2020-04-30 15:52:22 -07:00
identity-provider Backport: Deprecate the kibana reserved user (#54967) (#55822) 2020-04-28 10:30:25 -04:00
ilm Deprecated xpack "enable" settings should be no-ops (#55416) (#56167) 2020-05-05 10:40:49 -04:00
logstash Deprecated xpack "enable" settings should be no-ops (#55416) (#56167) 2020-05-05 10:40:49 -04:00
mapper-constant-keyword Simplify signature of FieldMapper#parseCreateField. (#56144) 2020-05-06 11:12:09 -07:00
mapper-flattened Simplify signature of FieldMapper#parseCreateField. (#56144) 2020-05-06 11:12:09 -07:00
ml [7.x][ML] Use non-zero timeout when force stopping DF analytics (#56423) (#56428) 2020-05-08 21:12:11 +03:00
monitoring Serialize Monitoring Bulk Request Compressed (#56410) (#56442) 2020-05-08 23:16:07 +02:00
ql EQL: simplify equals/not-equals TRUE/FALSE expressions (#56191) (#56306) 2020-05-07 03:02:04 +03:00
rollup Save memory when numeric terms agg is not top (#55873) (#56454) 2020-05-08 20:38:53 -04:00
search-business-rules [7.x] Create new `geo` module and migrate geo_shape registration (#53562) (#54924) 2020-04-07 16:30:58 -07:00
searchable-snapshots Use snapshot information to build searchable snapshot store MetadataSnapshot (#56289) (#56403) 2020-05-08 14:16:19 +02:00
security Let realms gracefully terminate the authN chain (#55623) 2020-05-05 10:11:49 +03:00
spatial Save memory when numeric terms agg is not top (#55873) (#56454) 2020-05-08 20:38:53 -04:00
sql Upgrade to Jackson 2.10.4 (#56188) 2020-05-06 17:20:23 -04:00
src/test [Transform] fixes http status code when bad scripts are provided (#56117) (#56219) 2020-05-05 12:36:22 -04:00
transform [7.x] Get index includes parent data stream for backing indices (#56238) 2020-05-05 15:43:42 -05:00
vectors Simplify signature of FieldMapper#parseCreateField. (#56144) 2020-05-06 11:12:09 -07:00
voting-only-node Convert remaining license methods to isAllowed (#55908) (#55991) 2020-04-30 15:52:22 -07:00
watcher Deprecated xpack "enable" settings should be no-ops (#55416) (#56167) 2020-05-05 10:40:49 -04:00
wildcard Simplify signature of FieldMapper#parseCreateField. (#56144) 2020-05-06 11:12:09 -07:00
build.gradle [7.x] json spec - add description for autoscaling (#55748) (#55901) 2020-04-29 08:40:11 -05:00