[[breaking_60_search_changes]]
=== Search and Query DSL changes
==== Changes to queries
* The `collect_payloads` parameter of the `span_near` query has been removed. Payloads will be
loaded when needed.
* Queries on boolean fields now strictly parse boolean-like values. This means
only the strings `"true"` and `"false"` will be parsed into their boolean
counterparts. Other strings will cause an error to be thrown.
* The `in` query (a synonym for the `terms` query) has been removed.
* The `geo_bbox` query (a synonym for the `geo_bounding_box` query) has been removed.
* The `mlt` query (a synonym for the `more_like_this` query) has been removed.
* The `fuzzy_match` and `match_fuzzy` queries (synonyms for the `match` query) have been removed.
* The `terms` query now always returns scores equal to `1` and is no longer
subject to `indices.query.bool.max_clause_count`.
* The deprecated `indices` query has been removed.
* Support for empty query objects (`{ }`) has been removed from the query DSL.
An error is thrown whenever an empty query object is provided.
* The deprecated `minimum_number_should_match` parameter in the `bool` query has
been removed. Use `minimum_should_match` instead.
* The `query_string` query now correctly parses the maximum number of
states allowed when
"https://en.wikipedia.org/wiki/Powerset_construction#Complexity[determinizing]"
a regex as `max_determinized_states` instead of the typo
`max_determined_states`.
* The `query_string` query no longer accepts `enable_position_increment`. Use
`enable_position_increments` instead.
* For `geo_distance` queries, sorting, and aggregations the `sloppy_arc` option
has been removed from the `distance_type` parameter.
* The `geo_distance_range` query, which was deprecated in 5.0, has been removed.
* The `optimize_bbox` parameter has been removed from `geo_distance` queries.
* The `ignore_malformed` and `coerce` parameters have been removed from
`geo_bounding_box`, `geo_polygon`, and `geo_distance` queries.
* The `disable_coord` parameter of the `bool` and `common_terms` queries has
been removed. If provided, it will be ignored and issue a deprecation warning.
* The `template` query has been removed. This query has been deprecated since 5.0.
* The `percolate` query's `document_type` has been deprecated. From 6.0 onward,
specifying the `document_type` parameter is no longer required.
* The `split_on_whitespace` parameter for the `query_string` query has been removed.
If provided, it will be ignored and issue a deprecation warning.
The `query_string` query now splits on operator only.
* The `use_dismax` parameter for the `query_string` query has been removed.
If provided, it will be ignored and issue a deprecation warning.
The `tie_breaker` parameter must be used instead.
* The `auto_generate_phrase_queries` parameter for the `query_string` query has been removed.
Use an explicit quoted query instead.
If provided, it will be ignored and issue a deprecation warning.
* The `all_fields` parameter for the `query_string` query has been removed.
Set `default_field` to `*` instead.
If provided, `default_field` will be automatically set to `*`.
* The `index` parameter in the terms filter, used to look up terms in a dedicated index, is
now mandatory. Previously, the index defaulted to the index the query was executed on. Now this index
must be explicitly set in the request.
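
Several of the removals above are plain renames, so stored 5.x queries can often be updated mechanically. The following is a minimal sketch, not an official migration tool: `migrate_query` and its helper are hypothetical names, and the code only handles the synonym renames, the `minimum_number_should_match` rename, the removed `disable_coord` parameter, and the rejection of empty query objects. It recurses only into `bool` clauses and does not look inside other compound queries.

```python
# Query-type synonyms removed in 6.0, mapped to their canonical names.
RENAMED_QUERIES = {
    "in": "terms",
    "geo_bbox": "geo_bounding_box",
    "mlt": "more_like_this",
    "fuzzy_match": "match",
    "match_fuzzy": "match",
}

BOOL_CLAUSES = ("must", "should", "must_not", "filter")


def migrate_query(query):
    """Return a copy of a query dict with removed 5.x names replaced."""
    if not query:
        # Empty query objects ({ }) are no longer accepted by the query DSL.
        raise ValueError("empty query objects are rejected in 6.0")
    migrated = {}
    for name, body in query.items():
        name = RENAMED_QUERIES.get(name, name)
        if name == "bool":
            body = _migrate_bool(body)
        migrated[name] = body
    return migrated


def _migrate_bool(body):
    """Rename or drop removed `bool` parameters and migrate nested clauses."""
    out = {}
    for key, value in body.items():
        if key == "disable_coord":
            continue  # removed parameter; dropping it avoids the deprecation warning
        if key == "minimum_number_should_match":
            key = "minimum_should_match"
        if key in BOOL_CLAUSES:
            # A clause may be a single query object or a list of them.
            clauses = value if isinstance(value, list) else [value]
            value = [migrate_query(clause) for clause in clauses]
        out[key] = value
    return out


old_query = {
    "bool": {
        "should": [
            {"in": {"user": ["kimchy", "elastic"]}},
            {"mlt": {"like": "some text"}},
        ],
        "minimum_number_should_match": 1,
        "disable_coord": True,
    }
}
new_query = migrate_query(old_query)
```

Parameters whose semantics changed (for example `use_dismax` or `split_on_whitespace`) are deliberately left alone, since they need a manual decision rather than a rename.
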
==== Search shards API
The search shards API no longer accepts the `type` URL parameter, which had no
effect in previous versions.
==== Changes to the Profile API
The `"time"` field showing human-readable timing output has been replaced by the `"time_in_nanos"`
field, which displays the elapsed time in nanoseconds. The `"time"` field can be turned on by adding
`"?human=true"` to the request URL. It will display a rounded, human-readable time value.
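
Clients that previously parsed the `"time"` string can compute their own units from `"time_in_nanos"`. The sketch below assumes a profile query tree in which each node carries `type`, `time_in_nanos`, and an optional `children` list; the sample data and the `iter_timings` helper are made up for illustration.

```python
def iter_timings(nodes):
    """Yield (query type, elapsed milliseconds) for every node in a profile tree."""
    for node in nodes:
        yield node["type"], node["time_in_nanos"] / 1_000_000
        yield from iter_timings(node.get("children", []))


# Hypothetical fragment of a profile response's query tree.
profile_queries = [
    {
        "type": "BooleanQuery",
        "time_in_nanos": 3_000_000,
        "children": [
            {"type": "TermQuery", "time_in_nanos": 1_500_000},
            {"type": "TermQuery", "time_in_nanos": 500_000},
        ],
    }
]

timings = list(iter_timings(profile_queries))
```
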
==== Scoring changes
===== Query normalization
Query normalization has been removed. This means that the TF-IDF similarity no
longer tries to make scores comparable across queries and that boosts are now
integrated into scores as simple multiplicative factors.
Other similarities are not affected as they did not normalize scores and
already integrated boosts into scores as multiplicative factors.
See https://issues.apache.org/jira/browse/LUCENE-7347[`LUCENE-7347`] for more
information.
===== Coordination factors
Coordination factors have been removed from the scoring formula. This means that
boolean queries no longer score based on the number of matching clauses.
Instead, they always return the sum of the scores of the matching clauses.
As a consequence, use of the TF-IDF similarity is now discouraged as this was
an important component of the quality of the scores that this similarity
produces. BM25 is recommended instead.
See https://issues.apache.org/jira/browse/LUCENE-7347[`LUCENE-7347`] for more
information.
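
The effect of dropping the coordination factor can be illustrated with a small calculation. This is a simplified model, not Lucene's implementation: it assumes the old factor was the fraction of clauses that matched, ignores query normalization, and uses hypothetical function names and scores.

```python
def score_with_coord(clause_scores, total_clauses):
    """Simplified pre-6.0 model: sum of matching clause scores times the coordination factor."""
    coord = len(clause_scores) / total_clauses
    return coord * sum(clause_scores)


def score_without_coord(clause_scores):
    """6.0 behaviour as described above: the plain sum of the matching clause scores."""
    return sum(clause_scores)


# Two of four clauses match, with illustrative scores 2.0 and 1.0.
matching_scores = [2.0, 1.0]
old_score = score_with_coord(matching_scores, total_clauses=4)
new_score = score_without_coord(matching_scores)
```

Under these assumptions a document matching half of the clauses used to be penalised by a factor of 0.5, while in 6.0 it simply receives the sum of its matching clause scores.
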
==== Fielddata on `_uid`
Fielddata on `_uid` is deprecated. It is possible to switch to `_id` instead,
but fielddata on `_id` has only escaped deprecation because it is used
for the `random_score` function. If you really need access to the id of
documents for sorting, aggregations or search scripts, the recommendation is
to duplicate the id as a field in the document.
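
Duplicating the id can be done on the client before indexing. The following sketch assumes documents are built as Python dicts; the helper name `with_id_field` and the field name `doc_id` are hypothetical choices, not names defined by the server.

```python
def with_id_field(doc_id, source, field="doc_id"):
    """Return a copy of the document source with the id duplicated as a regular field."""
    if field in source:
        raise ValueError(f"source already has a field named {field!r}")
    enriched = dict(source)
    enriched[field] = doc_id
    return enriched


source = {"title": "Migration notes"}
doc = with_id_field("42", source)
```

The duplicated field can then be used for sorting, aggregations, or scripts instead of fielddata on `_uid`.
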
==== Highlighters
The `unified` highlighter is the new default highlighter.
The offset strategy for each field is picked internally by this highlighter depending on the
type of the field (`index_options`).
It is still possible to force the highlighter to `fvh` or `plain` types.
The `postings` highlighter has been removed from Lucene and Elasticsearch.
The `unified` highlighter outputs the same highlighting when `index_options` is set
to `offsets`.
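
To keep the previous behaviour for a specific field, the highlighter type can be set explicitly in the request. The sketch below builds such a request body as a Python dict; `highlight_body` is a hypothetical helper and the field name `content` is only an example.

```python
def highlight_body(query, field, highlighter_type=None):
    """Build a search body that highlights `field`, optionally forcing a highlighter type."""
    field_options = {}
    if highlighter_type is not None:
        # "fvh" or "plain" override the default `unified` highlighter for this field.
        field_options["type"] = highlighter_type
    return {"query": query, "highlight": {"fields": {field: field_options}}}


query = {"match": {"content": "migration"}}
default_body = highlight_body(query, "content")
plain_body = highlight_body(query, "content", highlighter_type="plain")
```
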
==== `fielddata_fields`
The deprecated `fielddata_fields` option has been removed. Use `docvalue_fields` instead.
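
Request bodies that still carry `fielddata_fields` can be rewritten before they are sent. This is a minimal sketch with a hypothetical helper name, assuming the body is a Python dict and that both options hold plain lists of field names.

```python
def migrate_search_body(body):
    """Return a copy of a search body with `fielddata_fields` moved to `docvalue_fields`."""
    migrated = dict(body)
    if "fielddata_fields" in migrated:
        old_fields = migrated.pop("fielddata_fields")
        existing = list(migrated.get("docvalue_fields", []))
        # Preserve order and avoid listing the same field twice.
        migrated["docvalue_fields"] = existing + [f for f in old_fields if f not in existing]
    return migrated


old_body = {
    "query": {"match_all": {}},
    "docvalue_fields": ["timestamp"],
    "fielddata_fields": ["timestamp", "user"],
}
new_body = migrate_search_body(old_body)
```
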