187 lines
5.4 KiB
Plaintext
187 lines
5.4 KiB
Plaintext
[[breaking_20_query_dsl_changes]]
|
|
=== Query DSL changes
|
|
|
|
==== Queries and filters merged
|
|
|
|
Queries and filters have been merged -- all filter clauses are now query
|
|
clauses. Instead, query clauses can now be used in _query context_ or in
|
|
_filter context_:
|
|
|
|
Query context::
|
|
|
|
A query used in query context will calculate relevance scores and will not be
|
|
cacheable. Query context is used whenever filter context does not apply.
|
|
|
|
Filter context::
|
|
+
|
|
--
|
|
|
|
A query used in filter context will not calculate relevance scores, and will
|
|
be cacheable. Filter context is introduced by:
|
|
|
|
* the `constant_score` query
|
|
* the `must_not` and (newly added) `filter` parameter in the `bool` query
|
|
* the `filter` and `filters` parameters in the `function_score` query
|
|
* any API called `filter`, such as the `post_filter` search parameter, or in
|
|
aggregations or index aliases
|
|
--
|
|
|
|
As a result of this change, the `execution` option of the `terms` filter is now
|
|
deprecated and ignored if provided.
|
|
|
|
==== `or` and `and` now implemented via `bool`
|
|
|
|
The `or` and `and` filters previously had a different execution pattern to the
|
|
`bool` filter. It used to be important to use `and`/`or` with certain filter
|
|
clauses, and `bool` with others.
|
|
|
|
This distinction has been removed: the `bool` query is now smart enough to
|
|
handle both cases optimally. As a result of this change, the `or` and `and`
|
|
filters are now sugar syntax which are executed internally as a `bool` query.
|
|
These filters may be removed in the future.
|
|
|
|
==== `filtered` query and `query` filter deprecated
|
|
|
|
The `query` filter is deprecated as is it no longer needed -- all queries can
|
|
be used in query or filter context.
|
|
|
|
The `filtered` query is deprecated in favour of the `bool` query. Instead of
|
|
the following:
|
|
|
|
[source,js]
|
|
-------------------------
|
|
GET _search
|
|
{
|
|
"query": {
|
|
"filtered": {
|
|
"query": {
|
|
"match": {
|
|
"text": "quick brown fox"
|
|
}
|
|
},
|
|
"filter": {
|
|
"term": {
|
|
"status": "published"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
-------------------------
|
|
|
|
move the query and filter to the `must` and `filter` parameters in the `bool`
|
|
query:
|
|
|
|
[source,js]
|
|
-------------------------
|
|
GET _search
|
|
{
|
|
"query": {
|
|
"bool": {
|
|
"must": {
|
|
"match": {
|
|
"text": "quick brown fox"
|
|
}
|
|
},
|
|
"filter": {
|
|
"term": {
|
|
"status": "published"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
-------------------------
|
|
|
|
==== Filter auto-caching
|
|
|
|
It used to be possible to control which filters were cached with the `_cache`
|
|
option and to provide a custom `_cache_key`. These options are deprecated
|
|
and, if present, will be ignored.
|
|
|
|
Query clauses used in filter context are now auto-cached when it makes sense
|
|
to do so. The algorithm takes into account the frequency of use, the cost of
|
|
query execution, and the cost of building the filter.
|
|
|
|
The `terms` filter lookup mechanism no longer caches the values of the
|
|
document containing the terms. It relies on the filesystem cache instead. If
|
|
the lookup index is not too large, it is recommended to replicate it to all
|
|
nodes by setting `index.auto_expand_replicas: 0-all` in order to remove the
|
|
network overhead as well.
|
|
|
|
==== Numeric queries use IDF for scoring
|
|
|
|
Previously, term queries on numeric fields were deliberately prevented from
|
|
using the usual Lucene scoring logic and this behaviour was undocumented and,
|
|
to some, unexpected.
|
|
|
|
Single `term` queries on numeric fields now score in the same way as string
|
|
fields, using IDF and norms (if enabled).
|
|
|
|
To query numeric fields without scoring, the query clause should be used in
|
|
filter context, e.g. in the `filter` parameter of the `bool` query, or wrapped
|
|
in a `constant_score` query:
|
|
|
|
[source,js]
|
|
----------------------------
|
|
GET _search
|
|
{
|
|
"query": {
|
|
"bool": {
|
|
"must": [
|
|
{
|
|
"match": { <1>
|
|
"numeric_tag": 5
|
|
}
|
|
}
|
|
],
|
|
"filter": [
|
|
{
|
|
"match": { <2>
|
|
"count": 5
|
|
}
|
|
}
|
|
]
|
|
}
|
|
}
|
|
}
|
|
----------------------------
|
|
<1> This clause would include IDF in the relevance score calculation.
|
|
<2> This clause would have no effect on the relevance score.
|
|
|
|
==== Fuzziness and fuzzy-like-this
|
|
|
|
Fuzzy matching used to calculate the score for each fuzzy alternative, meaning
|
|
that rare misspellings would have a higher score than the more common correct
|
|
spellings. Now, fuzzy matching blends the scores of all the fuzzy alternatives
|
|
to use the IDF of the most frequently occurring alternative.
|
|
|
|
Fuzziness can no longer be specified using a percentage, but should instead
|
|
use the number of allowed edits:
|
|
|
|
* `0`, `1`, `2`, or
|
|
* `AUTO` (which chooses `0`, `1`, or `2` based on the length of the term)
|
|
|
|
The `fuzzy_like_this` and `fuzzy_like_this_field` queries used a very
|
|
expensive approach to fuzzy matching and have been removed.
|
|
|
|
==== More Like This
|
|
|
|
The More Like This (`mlt`) API and the `more_like_this_field` (`mlt_field`)
|
|
query have been removed in favor of the
|
|
<<query-dsl-mlt-query, `more_like_this`>> query.
|
|
|
|
The parameter `percent_terms_to_match` has been removed in favor of
|
|
`minimum_should_match`.
|
|
|
|
==== `limit` filter deprecated
|
|
|
|
The `limit` filter is deprecated and becomes a no-op. You can achieve similar
|
|
behaviour using the <<search-request-body,terminate_after>> parameter.
|
|
|
|
==== Java plugins registering custom queries
|
|
|
|
Java plugins that register custom queries can do so by using the
|
|
`IndicesQueriesModule#addQuery(Class<? extends QueryParser>)` method. Other
|
|
ways to register custom queries are not supported anymore.
|