Indexing ids in binary form should help with indexing speed since we would
have to compare fewer bytes upon sorting, should help with memory usage of
the live version map since keys will be shorter, and might help with disk
usage depending on how efficient the terms dictionary is at compressing
terms.
Since we can only expect base64 ids in the auto-generated case, this PR tries
to use an encoding that makes the binary id equal to the base64-decoded id in
the majority of cases (253 out of 256). It also specializes numeric ids, since
this seems to be common when content that is stored in Elasticsearch comes
from another database that uses eg. auto-increment ids.
Another option could be to require base64 ids all the time. It would make things
simpler but I'm not sure users would welcome this requirement.
This PR should bring some benefits, but I expect it to be mostly useful when
coupled with something like #24615.
Closes#18154
QueryParseContext is currently only used as a wrapper for an XContentParser, so
this change removes it entirely and changes the appropriate APIs that use it so
far to only accept a parser instead.
Currently QueryParseContext is only a thin wrapper around an XContentParser that
adds little functionality of its own. I provides helpers for long deprecated
field names which can be removed and two helper methods that can be made static
and moved to other classes. This is a first step in helping to remove
QueryParseContext entirely.
This removes the remaining usage of `mapping.single_type` from the parent join
module and moves it's bwc test to the mixed cluster tests
Relates to #24961
Relates to #20257
This change cleans up remaining tests to not use index.mapping.single_type=false
but instead where applicable use a single type or markt the index as created
with a pre 6.x version.
Yet, there is still on leftover in the client tests that needs special attention.
See `org.elasticsearch.client.SearchIT`
Relates to #24961
* Add a section named "relation" in the ParentJoinFieldMapper
This commit puts the parent/child definition in an inner section named "relation".
Mapping for the parent-join will look like this:
```
"join_field": {
"type": "join"
"relations":
"parent": "child"
}
}
```
This commit adds a NamedXContentProvider interface that can
be implemented by plugins or modules using Java's SPI feature
in order to provide additional NamedXContent parsers to external
applications like the Java High Level Rest Client.
The `scorerSupplier` API allows to give a hint to queries in order to let them
know that they will be consumed in a random-access fashion. We should use this
for aggregations, function_score and matched queries.
This change moves the parent_id query to the parent-join module and handles the case when only the parent-join field can be declared on an index (index with single type on).
If single type is off it uses the legacy parent join field mapper and switch to the new one otherwise (default in 6).
Relates #20257
This change ensures that there is a single parent-join field defined per mapping.
The verification is done through the addition of a special field mapper (MetaJoinFieldMapper) with a unique name (_parent_join) that is registered to the mapping service
when the first parent-join field is defined. If a new parent-join is added, this field mapper will clash with the new one and the update will fail.
This change also simplifies the parent join fetch sub phase by retrieving the parent-join field without iterating on all fields in the mapping.
* Introduce ParentJoinFieldMapper, a field mapper that creates parent/child relation within documents of the same index
This change adds a new field mapper named ParentJoinFieldMapper. This mapper is a replacement for the ParentFieldMapper but instead of using the types in the mapping
it uses an internal field to materialize parent/child relation within a single index.
This change also adds a fetch sub phase that automatically retrieves the join name (parent or child name) and the parent id for child documents in the response hit fields.
The compatibility with `has_parent`, `has_child` queries and `children` agg will be added in a follow up.
Relates #20257
Removes the need for the `_UNRELEASED` suffix on versions by detecting if a version should be unreleased or not based on the versions around it. This should make it simpler to automate the task of adding a new version label.
This commit moves the handling of nested and parent/child inner hits to specialized classes that can be defined outside of ES core.
InnerHitBuilderContext is now used by the parent query (nested or hasChild, ...) to build the sub context from the InnerHitBuilder definition.
BWC is also ensured so that nodes in previous versions can still send/receive inner hits to/from this version.
Relates #20257
This change removes the field data specialization needed for the parent field and replaces it with
a simple DocValuesIndexFieldData. The underlying global ordinals are retrieved via a new function called
IndexOrdinalsFieldData#getOrdinalMap.
The children aggregation is also modified to use a simple WithOrdinals value source rather than the deleted WithOrdinals.Parent.
Relates #20257
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
* Add parent-join module
This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.
Relates #20257