The search_after parameter provides a way to efficiently paginate from one page to the next. This parameter accepts an array of sort values, those values are then used by the searcher to sort the top hits from the first document that is greater to the sort values.
This parameter must be used in conjunction with the sort parameter, it must contain exactly the same number of values than the number of fields to sort on.
NOTE: A field with one unique value per document should be used as the last element of the sort specification. Otherwise the sort order for documents that have the same sort values would be undefined. The recommended way is to use the field `_uuid` which is certain to contain one unique value for each document.
Fixes#8192
Doc values can now only be enabled by setting `doc_values: true` in the
mappings. Removing this feature also means that we can now fail mapping updates
that try to disable doc values.
Doc values currently default to `true` if the field is indexed and not analyzed.
So setting `index:no` automatically disables doc values, which is not explicit
in the documentation.
This commit makes doc values default to true for numerics, booleans regardless
of whether they are indexed. Not indexed strings still don't have doc values,
since we can't know whether it is rather a text or keyword field. This
potential source of confusion should go away when we split `string` into `text`
and `keyword`.
RescoreBuilder: Add parsing and creating of RescoreSearchContext
Adding the ability to parse from xContent to the rescore builder. Also making RescoreBuilder an abstract base class that encapsulates the window_size setting, with QueryRescoreBuilder as its only implementation at the moment.
Relates to #15559
Merge feature/ingest branch into master branch.
This adds the ingest feature to ES that allows to preprocess document before indexing on an ingest node.
By default a node is an ingest node. Documents are preprocessed via a pipeline. A pipeline consists
out of one or more processors Each processor makes one or more modifications to a document processed.
There are many types of processors available out-of-the-box that are designed to make a specific change to a document being processed. In a cluster many pipeline can be configured via dedicated pipeline APIs. An new option on the bulk
and index APIs allows to control what pipeline is picked for preprocessing. If no pipeline is specified then the ingest
feature is skipped and no preprocessing takes place.
- move ingest plugin docs to core reference docs
- move geoip processor docs to plugins/ingest-geoip.asciidoc
- add missing options tables for some processors
- add description of pipeline definition
- add description of processor definitions including common parameters
like "tag" and "on_failure"
With this commit we deprecate the widely misunderstood
fuzzy query but will still allow the fuzziness
parameter in match queries and suggesters.
Relates to #15760
This commit modifies the default setting for standard output in the
systemd configuration to the journal instead of /dev/null. This is to
address a user pain point where Elasticsearch would fail to start but
the error message would be sent to standard output and therefore
/dev/null leading to difficult-to-debug situations.
* Cleaned up MapperService#searchFilter(...) and moved it DefaultSearchContext, since it that class was the only user. As part of the cleanup percolate query documents are no longer excluded from the search response.
* Removed resolveClosestNestedObjectMapper(...) method as it was no longer used.
* Removed DocumentTypeListener infrastructure. Before it was used by the percolator and parent/child, but these features no longer use it.
Closes#15924
* Remove remaining 1.x bwc logic.
* Stop storing stored fields and indexed terms. The _parent field's only purpose is to support joins between parent and child type and only storing doc values is sufficient.
* In the mapping the parent field mapper is now known under '{parent}#{child}' key, because this is the field the parent/child join uses too.
* Added new sub fetch phase to lookup that _parent field from doc values field if that is required (before this was fetched from stored _parent field)
* Removed the ability to query directly on `_parent` in the query dsl. Instead the `{parent}#{child}` field should be used. Under the hood a doc values query is used instead of a term query, because only doc values fields are stored now.
* Added a new `parent_id` query to easily query child documents with a specific parent id without having to know what join field to use
* Also in aggregations `_parent` field can't be used any more and `{parent}#{child}` field name should be used instead to aggregate directly on the _parent join field.
This commit modifies the load_average in the node stats API response
to be an object containing the one-minute, five-minute and
fifteen-minute load averages as fields (if those values are
available). Additionally, this commit modifies the cat nodes API
response to format the one-minute, five-minute and fifteen-minute load
averages as null if any of the respective values are not available.
Site plugins used to be used for things like kibana and marvel, but
there is no longer a need since kibana (and marvel as a kibana plugin)
uses node.js. This change removes site plugins, as well as the flag for
jvm plugins. Now all plugins are jvm plugins.
The indexing buffer on a node (default: 10% of the JVM heap) is now a "shared pool" across all shards on that node. This way, shards doing intense indexing can use much more than other shards doing only light indexing, and only once the sum of all indexing buffers across all shards exceeds the node's indexing buffer will we ask shards to move recently indexed documents to segments on disk.