OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nick Knize	923ea001f5	[Rename] o.e.action.support classes (#253 ) This commit refactors the classes in o.e.action.support to o.opensearch.action.support. The remaining directories will be refactored in a separate commit. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	3eee5183d1	[Rename] server/rest (#229 ) This commit refactors the `server/rest` package as part of the Elasticsearch to OpenSearch renaming. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	0e25c23e5f	[Rename] modules/parent-join (#216 ) Refactor parent-join module as part of the Elasticsearch to OpenSearch renaming effort. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Nick Knize	e60906fc11	[Rename] ElasticsearchDirectoryReader class in server module (#176 ) This commit refactors the ElasticsearchDirectoryReader class located in the server module to OpenSearchDirectoryReader. References and usages, along with method names, throughout the rest of the codebase are fully refactored. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Nick Knize	ccceb381db	[Rename] ElasticsearchException class in server module (#165 ) This commit refactors the ElasticsearchException class located in the server module to OpenSearchException. References and usages throughout the rest of the codebase are fully refactored. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Julie Tibshirani	24c0f01543	Ensure all query builder tests consider older versions. (#66401 ) This PR removes outdated overrides in some tests that prevent them from testing older index versions. Also removes an old comment + logic from AggregatorFactoriesTests.	2020-12-16 11:57:01 -08:00
Nik Everett	0c47d49784	Make sure non-collecting aggs include sub-aggs (backport of #64214 ) (#64247 ) Now that we're consistently using `can_match` to filter which shards we run on we can get this confusing case: 1. You have a search with, say, a range and a sub-agg. 2. That search has a query that `can_match` can recognize will match no docs. On any shard. 3. So we dutifully run it on a single shard so it can produce the "empty" aggs. 4. The shard we pick happens to not have the target of the range mapped. 5. This kicks in the special range aggregator that doesn't collect any documents. 6. Before this commit, that range aggregator also never produced any sub-aggs. So, without this change, it was quite possible for a search that happened to match no documents to "throw away" the sub-aggs of a range and a few other aggs. We've had this problem for a long, long time but it is more confusing now because `can_match` is really kicking in and causing us to see cases where it looks like you are targeting a lot of shards but you really are only targeting a couple. It used to be that to get the "no sub-aggs" behavior you had to explicitly target only shards that didn't map the target field of the `range` agg. And, like, in that case it isn't too bad because you targeted a sort of degenerate shard. But now that `can_match` is doing its thing you can end up with the confusing steps above. It took me several hours to track down what was happening. I know how the individual pieces of all of this work. It took four hours to figure out how they fit together in this case.... Anyway! This replaces all the aggregator implementations that throw out the sub-aggregators with ones that keep them. I think this'll be less confusing in the future. Closes #64142	2020-10-28 08:38:05 -04:00
Nik Everett	5583db5a73	Fix broken parent and child aggregator (backport #63811 ) (#63892 ) In #57892 I broke some sub-aggregations inside of the `parent` and `child` aggregator, specifically any sub-aggregations that do work in the `postCollect` phase. This fixes it by delaying the post collect phase of aggs under `parent` and `child` until `beforeBuildingBuckets` because, well, we haven't done any collection until after that phase.	2020-10-19 13:05:22 -04:00
Julie Tibshirani	ae2fc4118d	Add factory methods for common value fetchers. (#63438 ) This PR adds factory methods for the most common implementations: * `SourceValueFetcher.identity` to pass through the source value untouched. * `SourceValueFetcher.toString` to simply convert the source value to a string.	2020-10-08 12:14:53 -07:00
Julie Tibshirani	f17ca18dfa	Make array value parsing flag more robust. (#63371 ) When constructing a value fetcher, the 'parsesArrayValue' flag must match `FieldMapper#parsesArrayValue`. However there is nothing in code or tests to help enforce this. This PR reworks the value fetcher constructors so that `parsesArrayValue` is 'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must explicitly set it to true and ensure the behavior is covered by tests. Follow-up to #62974.	2020-10-06 17:49:25 -07:00
Alan Woodward	01950bc80f	Move FieldMapper#valueFetcher to MappedFieldType (#62974 ) (#63220 ) For runtime fields, we will want to do all search-time interaction with a field definition via a MappedFieldType, rather than a FieldMapper, to avoid interfering with the logic of document parsing. Currently, fetching values for runtime scripts and for building top hits responses need to call a method on FieldMapper. This commit moves this method to MappedFieldType, incidentally simplifying the current call sites and freeing us up to implement runtime fields as pure MappedFieldType objects.	2020-10-04 14:54:59 +01:00
Luca Cavanna	862fab06d3	Share same existsQuery impl throughout mappers (#57607 ) Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers. There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available. This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method. At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.	2020-09-23 11:00:53 +02:00
Luca Cavanna	5ca86d541c	Move stored flag from TextSearchInfo to MappedFieldType (#62717 ) (#62770 )	2020-09-23 09:40:34 +02:00
Nik Everett	24a24d050a	Implement fields fetch for runtime fields (backport of #61995 ) (#62416 ) This implements the `fields` API in `_search` for runtime fields using doc values. Most of that implementation is stolen from the `docvalue_fields` fetch sub-phase, just moved into the same API that the `fields` API uses. At this point the `docvalue_fields` fetch phase looks like a special case of the `fields` API. While I was at it I moved the "which doc values sub-implementation should I use for fetching?" question from a bunch of `instanceof`s to a method on `LeafFieldData` so we can be much more flexible with what is returned and we're not forced to extend certain classes just to make the fetch phase happy. Relates to #59332	2020-09-15 20:24:10 -04:00
Julie Tibshirani	4a19bdb2ea	Support the 'fields' option in inner_hits and top_hits. (#62337 ) This PR adds support for the 'fields' option in the following places: * Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing * The `top_hits` aggregation Addresses #61949.	2020-09-14 11:51:45 -07:00
Nhat Nguyen	3d69b5c41e	Introduce point in time APIs in x-pack basic (#61062 ) This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes an `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` has elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: #52741 - Allow searches with a specific reader context: #53989 - Add the ability to acquire readers in IndexShard: #54966 Relates #46523 Relates #26472 Co-authored-by: Jim Ferenczi <jimczi@apache.org>	2020-09-10 19:25:47 -04:00
Nik Everett	c19f67ce30	Support longs in BitArray (backport of #61867 ) (#61871 ) We frequently use `long`s with `BitArray` in aggs and right now we have to assert that the `long` fits in an `int`. This adds support for `long` to `BitArray` so we don't need those assertions.	2020-09-02 17:24:31 -04:00
Luca Cavanna	f769821bc8	Pass SearchLookup supplier through to fielddataBuilder (#61430 ) (#61638 ) Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not. To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method. As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors. With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition. Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch. This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>. Relates to #59332 Co-authored-by: Nik Everett <nik9000@gmail.com>	2020-08-27 18:09:56 +02:00
Przemyslaw Gomulka	9f566644af	Do not create two loggers for DeprecationLogger backport(#58435 ) (#61530 ) DeprecationLogger's constructor should not create two loggers. It was taking parent logger instance, changing its name with a .deprecation prefix and creating a new logger. Most of the time parent logger was not needed. It was causing Log4j to unnecessarily cache the unused parent logger instance. depends on #61515 backports #58435	2020-08-26 16:04:02 +02:00
Przemyslaw Gomulka	f3f7d25316	Header warning logging refactoring backport(#55941 ) (#61515 ) Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog). Introducing a ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning headers and logging throttling were needed. relates #55699 relates #52369 backports #55941	2020-08-25 16:35:54 +02:00
Julie Tibshirani	997c73ec17	Correct how field retrieval handles multifields and copy_to. (#61391 ) Before when a value was copied to a field through a parent field or `copy_to`, we parsed it using the `FieldMapper` from the source field. Instead we should parse it using the target `FieldMapper`. This ensures that we apply the appropriate mapping type and options to the copied value. To implement the fix cleanly, this PR refactors the value parsing strategy. Now instead of looking up values directly, field mappers produce a helper object `ValueFetcher`. The value fetchers are responsible for almost all aspects of fetching, including looking up the right paths in the _source. The PR is fairly big but each commit can be reviewed individually. Fixes #61033.	2020-08-20 15:53:35 -07:00
Mark Tozzi	db1df6cc30	[7.x] Remove a bunch of type boilerplate from Aggs (#60852 ) (#61031 )	2020-08-17 12:13:05 -04:00
Nik Everett	664ba0a80a	Fix the parent join aggregator test case (#60991 ) The test was putting parent and child documents into different segments which is unrealistic and was causing errors. Closes #60980	2020-08-11 17:53:15 -04:00
Nhat Nguyen	4bdf283619	Mute ChildrenToParentAggregatorTests Tracked at #60980	2020-08-11 12:56:29 -04:00
Jim Ferenczi	f30f1f04e2	Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce (#60816 ) This commit removes the ability to test the top level result of an aggregator before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test the final output (the one sent to the end user) rather than an intermediary result that could be different. This change also removes spurious commits triggered on top of a random index writer. These commits slow down the tests and are redundant with the commits that the random index writer performs.	2020-08-10 17:23:00 +02:00
Alan Woodward	b3ae5d26bd	Move mapper validation to the mappers themselves (#60072 ) (#60649 ) Currently, validation of mappers (checking that cross-references are correct, limits on field name lengths and object depths, multiple definitions, etc) is performed by the MapperService. This means that any mapper-specific validation, for example that done on the CompletionFieldMapper, needs to be called specifically from core server code, and so we can't add validation to mappers that live in plugins. This commit reworks the validation framework so that mapper-specific validation is done on the Mapper itself. Mapper gets a new `validate(MappingLookup)` method (already present on `MetadataFieldMapper` and now pulled up to the parent interface), which is called from a new `DocumentMapper.validate()` method. All the validation code currently living on `MapperService` moves either to individual mapper implementations (FieldAliasMapper, CompletionFieldMapper) or into `MappingLookup`, an altered `DocumentFieldMappers` which now knows about object fields and can check for duplicate definitions, or into DocumentMapper which handles soft limit checks.	2020-08-04 14:39:20 +01:00
Julie Tibshirani	f99584c6f3	Avoid reloading _source for every inner hit. (#60632 ) Previously if an inner_hits block required _source, we would reload and parse the root document's source for every hit. This PR adds a shared SourceLookup to the inner hits context that allows inner hits to reuse parsed source if it's already available. This matches our approach for sharing the root document ID. Relates to #32818.	2020-08-03 17:12:27 -07:00
Julie Tibshirani	c7bfb5de41	Add search `fields` parameter to support high-level field retrieval. (#60258 ) This feature adds a new `fields` parameter to the search request, which consults both the document `_source` and the mappings to fetch fields in a consistent way. The PR merges the `field-retrieval` feature branch. Addresses #49028 and #55363.	2020-07-28 10:58:20 -07:00
Jake Landis	92ce41cfaf	[7.x] Introduce javaRestTest source set/task and convert modules (#59939 ) (#60026 ) Introduce a javaRestTest source set and task to complement the yamlRestTest. javaRestTest differs in that the code is sourced from Java and may have different dependencies and setup requirements for the test clusters. This also allows the tests to run in parallel in different cluster instances to prevent any cross test contamination between the two types of tests. Also included in this PR: all :modules no longer use the integTest task. The tests are now driven by test, yamlRestTest, javaRestTest, and internalClusterTest. Since only :modules (and :rest-api-spec) have been converted to yamlRestTest we can now disable the integTest task if either yamlRestTest or javaRestTest have been applied. Once all projects are converted, we can delete the integTest task. related: #56841 related: #59444	2020-07-28 08:39:11 -05:00
Nik Everett	6f6076e208	Drop some params from IndexFieldData.Builder (backport of #59934 ) (#59972 ) We never used the `IndexSettings` parameter and we only used the `MappedFieldType` parameter to get the name of the field which we already know everywhere where we build the `IFD.Builder`. This allows us to drop a fair bit of ceremony from a couple of tests.	2020-07-21 10:28:59 -04:00
Jake Landis	665b7b7bd8	Convert modules to use yamlRestTest (#59089 ) (#59446 ) This commit moves the modules REST tests to the newly introduced yamlRestTest source set. A few tests have also been re-named to include the correct IT suffix. Without changing the names, the testing conventions task would fail, since the YAML tests are no longer present to pacify the convention. These tests have moved to the internalClusterTest source set. related: #56841	2020-07-13 13:53:05 -05:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Nik Everett	a29d3515a2	Improve cardinality measure used to build aggs (#56533 ) (#59107 ) This makes a `parentCardinality` available to every `Aggregator`'s ctor so it can make intelligent choices about how it collects bucket values. This replaces `collectsFromSingleBucket` and is similar to it but: 1. It supports `NONE`, `ONE`, and `MANY` values and is generally extensible if we decide we can use more precise counts. 2. It is more accurate. `collectsFromSingleBucket` assumed that all sub-aggregations live under multi-bucket aggregations. This is normally true but `parentCardinality` is properly carried forward for single bucket aggregations like `filter` and for multi-bucket aggregations configured in single-bucket form, like `range` with a single range. While I was touching every aggregation I renamed `doCreateInternal` to `createMapped` because that seemed like a much better name and it was right there, next to the change I was already making. Relates to #56487 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-07-08 08:42:23 -04:00
Jake Landis	604c6dd528	7.x - Create plugin for yamlTest task (#56841 ) (#59090 ) This commit creates a new Gradle plugin to provide a separate task name and source set for running YAML based REST tests. The only project converted to use the new plugin in this PR is distribution/archives/integ-test-zip, for which the testing has been moved to :rest-api-spec since it makes the most sense and it avoids a small but awkward change to the distribution plugin. The remaining cases in modules, plugins, and x-pack will be handled in followups. This plugin is distinctly different from the plugin introduced in #55896 since the YAML REST tests are intended to be black box tests over HTTP. As such they should not (by default) have access to the classpath for that which they are testing. The YAML based REST tests will be moved to separate source sets (yamlRestTest). Which source set is the target for the test resources depends on whether this new plugin is applied. If it is not applied, it will default to the test source set. Further, this introduces a breaking change for plugin developers that use the YAML testing framework. They will now need to either use the new source set and matching task, or configure the rest resources to use the old "test" source set that matches the old integTest task. (The former should be preferred). As part of this change (which is also breaking for plugin developers) the rest resources plugin has been removed from the build plugin and now requires either explicit application or application via the new YAML REST test plugin. Plugin developers should be able to fix the breaking changes to the YAML tests by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests under a yamlRestTest folder (instead of test)	2020-07-06 14:16:26 -05:00
Nik Everett	2965c7fe12	Fix bug in parent and child aggregators when parent field not defined (#57089 ) (#59074 ) Adding null check for ParentJoinFieldMapper in ChildrenAggregationBuilder.joinFieldResolveConfig Closes #42997 Co-authored-by: ParthPunkster <parthjain.pj1994@gmail.com>	2020-07-06 10:59:47 -04:00
Alan Woodward	3ba16e0f39	Move MappedFieldType#getSearchAnalyzer and #getSearchQuoteAnalyzer to TextSearchInfo (#58830 ) Analyzers are specific to text searching, and so should be in TextSearchInfo rather than on the generic MappedFieldType. Backport of #58639	2020-07-01 14:52:14 +01:00
Alan Woodward	8ebd341710	Add text search information to MappedFieldType (#58230 ) (#58432 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 14:37:26 +01:00
Alan Woodward	ca2d12d039	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:53:54 +01:00
Alan Woodward	12a3f6dfca	MappedFieldType should not extend FieldType (#58160 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-16 16:56:43 +01:00
Mark Tozzi	36f551bdb4	Make ValuesSourceConfig behave like a config object (#57762 ) (#58012 )	2020-06-11 17:23:55 -04:00
Nik Everett	0a2bd10758	Save memory when parent and child are not on top (#57892 ) (#57944 ) Reworks the `parent` and `child` aggregations when they are not at the top level, using the optimization from #55873. Instead of wrapping all non-top-level `parent` and `child` aggregators we now handle being a child aggregator in the aggregator, specifically by recording which global ordinals show up in the parent and then checking if they match the child.	2020-06-10 16:25:10 -04:00
Nik Everett	928794cd61	Make parent and child aggregator more obvious (#57490 ) (#57553 ) Pulls the way that the `ParentJoinAggregator` collects global ordinals into a strategy object so it is a little simpler to reason about and it'll be simpler to save memory by removing `asMultiBucketAggregator` in the future. Relates to #56487	2020-06-02 16:22:38 -04:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Alan Woodward	d6b79bcd95	Remove Mapper.updateFieldType() (#57151 ) When we had multiple mapping types, an update to a field in one type had to be propagated to the same field in all other types. This was done using the Mapper.updateFieldType() method, called at the end of a merge. However, now that we only have a single type per index, this method is unnecessary and can be removed. Relates to #41059 Backport of #56986	2020-05-27 09:21:24 +01:00
Alan Woodward	18bfbeda29	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915 ) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:43:13 +01:00
Alan Woodward	d33d13f2be	Simplify generics on Mapper.Builder (#56747 ) Mapper.Builder currently has some complex generics on it to allow fluent builder construction. However, the second parameter, a return type from the build() method, is unnecessary, as we can use covariant return types. This commit removes this second generic parameter.	2020-05-15 12:14:49 +01:00
Mark Tozzi	b718193a01	Clean up DocValuesIndexFieldData (#56372 ) (#56684 )	2020-05-14 12:42:37 -04:00
Nik Everett	2f38aeb5e2	Save memory when numeric terms agg is not top (#55873 ) (#56454 ) Right now all implementations of the `terms` agg allocate a new `Aggregator` per bucket. This uses a bunch of memory. Exactly how much isn't clear but each `Aggregator` ends up making its own objects to read doc values which have non-trivial buffers. And it forces all of its sub-aggregations to do the same. We allocate a new `Aggregator` per bucket for two reasons: 1. We didn't have an appropriate data structure to track the sub-ordinals of each parent bucket. 2. You can only make a single call to `runDeferredCollections(long...)` per `Aggregator` which was the only way to delay collection of sub-aggregations. This change switches the method that builds aggregation results from building them one at a time to building all of the results for the entire aggregator at the same time. It also adds a fairly simplistic data structure to track the sub-ordinals for `long`-keyed buckets. It uses both of those to power numeric `terms` aggregations and removes the per-bucket allocation of their `Aggregator`. This fairly substantially reduces memory consumption of numeric `terms` aggregations that are not the "top level", especially when those aggregations contain many sub-aggregations. It also is a pretty big speed up, especially when the aggregation is under a non-selective aggregation like the `date_histogram`. I picked numeric `terms` aggregations because those have the simplest implementation. At least, I could kind of fit it in my head. And I haven't fully understood the "bytes"-based terms aggregations, but I imagine I'll be able to make similar optimizations to them in follow up changes.	2020-05-08 20:38:53 -04:00
Julie Tibshirani	e852bb29b7	Simplify signature of FieldMapper#parseCreateField. (#56144 ) `FieldMapper#parseCreateField` accepts the parse context, plus a list of fields as an output parameter. These fields are immediately added to the document through `ParseContext#doc()`. This commit simplifies the signature by removing the list of fields, and having the mappers add the fields directly to `ParseContext#doc()`. I think this is nicer for implementors, because previously fields could be added either through the list, or the context (through `add`, `addWithKey`, etc.)	2020-05-06 11:12:09 -07:00
Igor Motov	d8f9df771d	Expose agg usage in Feature Usage API (#55732 ) (#56048 ) Counts usage of the aggs and exposes them on the _nodes/usage/. Closes #53746	2020-04-30 12:53:36 -04:00
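
Commit c19f67ce30 above ("Support longs in BitArray") describes letting callers index a bit array with a `long` so they no longer have to assert that the value fits in an `int`. The sketch below only illustrates that idea. `LongBitArraySketch` and its methods are hypothetical names, and the backing `long[]` is an assumption made for the example; this is not the actual `BitArray` class from the codebase.

```java
import java.util.Arrays;

/** Hypothetical illustration of a bit array addressed by long indices. */
final class LongBitArraySketch {
    // Each long word stores 64 bits; the array grows on demand.
    private long[] words = new long[1];

    /** Sets the bit at the given non-negative index. */
    void set(long index) {
        int word = wordIndex(index);
        if (word >= words.length) {
            // Grow to at least the needed size, doubling to amortize copies.
            words = Arrays.copyOf(words, Math.max(word + 1, words.length * 2));
        }
        // A long shift uses only the low 6 bits of the shift distance,
        // so this selects the bit position within the word.
        words[word] |= 1L << index;
    }

    /** Returns true if the bit at the given index has been set. */
    boolean get(long index) {
        int word = wordIndex(index);
        return word < words.length && (words[word] & (1L << index)) != 0;
    }

    private static int wordIndex(long index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative: " + index);
        }
        // Throws ArithmeticException if the word index itself overflows an int.
        return Math.toIntExact(index >>> 6);
    }

    public static void main(String[] args) {
        LongBitArraySketch bits = new LongBitArraySketch();
        long bucketOrd = 100_000L; // a long value, passed without an int cast
        bits.set(bucketOrd);
        System.out.println(bits.get(bucketOrd));     // true
        System.out.println(bits.get(bucketOrd + 1)); // false
    }
}
```

The caller passes the `long` straight through, and the range check lives in one place inside the data structure instead of at every call site.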


155 Commits