[DOCS] Rewrite dis max query (#43586)

James Rodewig 2019-07-03 08:55:50 -04:00
parent cff027499a
commit e2a9a787fc
1 changed files with 47 additions and 32 deletions


@ -1,48 +1,63 @@
[[query-dsl-dis-max-query]]
=== Dis Max Query
=== Disjunction Max Query
A query that generates the union of documents produced by its
subqueries, and that scores each document with the maximum score for
that document as produced by any subquery, plus a tie breaking increment
for any additional matching subqueries.
Returns documents matching one or more wrapped queries, called query clauses or
clauses.
This is useful when searching for a word in multiple fields with
different boost factors (so that the fields cannot be combined
equivalently into a single search field). We want the primary score to
be the one associated with the highest boost, not the sum of the field
scores (as Boolean Query would give). If the query is "albino elephant"
this ensures that "albino" matching one field and "elephant" matching
another gets a higher score than "albino" matching both fields. To get
this result, use both Boolean Query and DisjunctionMax Query: for each
term a DisjunctionMaxQuery searches for it in each field, while the set
of these DisjunctionMaxQuery's is combined into a BooleanQuery.
If a returned document matches multiple query clauses, the `dis_max` query
assigns the document the highest relevance score from any matching clause, plus
a tie-breaking increment for any additional matching subqueries.
The tie breaker capability allows results that include the same term in
multiple fields to be judged better than results that include this term
in only the best of those multiple fields, without confusing this with
the better case of two different terms in the multiple fields. The
default `tie_breaker` is `0.0`.
You can use the `dis_max` query to search for a term in fields mapped with
different <<mapping-boost,boost>> factors.
This query maps to Lucene `DisjunctionMaxQuery`.
[[query-dsl-dis-max-query-ex-request]]
==== Example request
[source,js]
--------------------------------------------------
----
GET /_search
{
"query": {
"dis_max" : {
"tie_breaker" : 0.7,
"boost" : 1.2,
"queries" : [
{
"term" : { "age" : 34 }
},
{
"term" : { "age" : 35 }
}
]
{ "term" : { "title" : "Quick pets" }},
{ "term" : { "body" : "Quick pets" }}
],
"tie_breaker" : 0.7
}
}
}
--------------------------------------------------
----
// CONSOLE
[[query-dsl-dis-max-query-top-level-params]]
==== Top-level parameters for `dis_max`
`queries` (Required)::
(array of query objects) Contains one or more query clauses. Returned documents
**must match one or more** of these queries. If a document matches multiple
queries, {es} uses the highest <<query-filter-context, relevance score>>.
`tie_breaker` (Optional)::
+
--
(float) Floating point number between `0` and `1.0` used to increase the
<<query-filter-context, relevance scores>> of documents matching multiple query
clauses. Defaults to `0.0`.
You can use the `tie_breaker` value to assign higher relevance scores to
documents that contain the same term in multiple fields than to documents that
contain this term in only the best of those multiple fields, without confusing
this with the better case of two different terms in the multiple fields.
If a document matches multiple clauses, the `dis_max` query calculates the
relevance score for the document as follows:
. Take the relevance score from a matching clause with the highest score.
. Multiply the scores from all other matching clauses by the `tie_breaker` value.
. Add the highest score to the multiplied scores.
If the `tie_breaker` value is greater than `0.0`, all matching clauses count,
but the clause with the highest score counts most.
--
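The scoring steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not {es} or Lucene code: the function name `dis_max_score` and its arguments are hypothetical, the per-clause scores are assumed to be already computed, and any `boost` on the query is ignored.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Combine the scores of the matching clauses for one document.

    Hypothetical helper for illustration only; it is not part of any
    Elasticsearch client. `clause_scores` holds the relevance scores of
    the clauses that matched the document.
    """
    if not clause_scores:
        # No clause matched, so the document would not be returned at all.
        return 0.0
    # Step 1: take the score of the best matching clause.
    best = max(clause_scores)
    # Step 2: multiply the scores of all other matching clauses by tie_breaker.
    others = (sum(clause_scores) - best) * tie_breaker
    # Step 3: add the highest score to the multiplied scores.
    return best + others
```

Under these assumptions, a document matching two clauses with scores `2.0` and `1.0` gets `2.0` when `tie_breaker` is `0.0` and `2.5` when it is `0.5`. With a `tie_breaker` of `1.0`, the result is the plain sum of all matching clause scores.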