[[request-body-search-rescore]]
==== Rescoring

Rescoring can help to improve precision by reordering just the top (e.g.
100 - 500) documents returned by the
<<request-body-search-query,`query`>> and
<<request-body-search-post-filter,`post_filter`>> phases, using a
secondary (usually more costly) algorithm, instead of applying the
costly algorithm to all documents in the index.

A `rescore` request is executed on each shard before it returns its
results to be sorted by the node handling the overall search request.

Currently the rescore API has only one implementation: the query
rescorer, which uses a query to tweak the scoring. In the future,
alternative rescorers may be made available, for example, a pair-wise rescorer.

NOTE: An error will be thrown if an explicit <<request-body-search-sort,`sort`>>
(other than `_score` in descending order) is provided with a `rescore` query.

NOTE: When exposing pagination to your users, do not change
`window_size` as you step through each page (by passing different
`from` values), since that can alter the top hits, causing results to
shift confusingly as the user steps through pages.

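As a sketch of the advice above, a request for a second page might keep
`window_size` at the value used for the first page and change only `from`.
The page size of 10, the window of 50, and the `message` field are
illustrative assumptions, not recommended values:

[source,js]
--------------------------------------------------
POST /_search
{
   "from" : 10,
   "size" : 10,
   "query" : {
      "match" : {
         "message" : "the quick brown"
      }
   },
   "rescore" : {
      "window_size" : 50,
      "query" : {
         "rescore_query" : {
            "match_phrase" : {
               "message" : {
                  "query" : "the quick brown",
                  "slop" : 2
               }
            }
         }
      }
   }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

The request for the first page would be identical except for `"from" : 0`.
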

===== Query rescorer

The query rescorer executes a second query only on the top-K results
returned by the <<request-body-search-query,`query`>> and
<<request-body-search-post-filter,`post_filter`>> phases. The
number of documents examined on each shard can be controlled by
the `window_size` parameter, which defaults to 10.

By default the scores from the original query and the rescore query are
combined linearly to produce the final `_score` for each document. The
relative importance of the original query and of the rescore query can
be controlled with the `query_weight` and `rescore_query_weight`
parameters, respectively. Both default to `1`.

For example:

[source,js]
--------------------------------------------------
POST /_search
{
   "query" : {
      "match" : {
         "message" : {
            "operator" : "or",
            "query" : "the quick brown"
         }
      }
   },
   "rescore" : {
      "window_size" : 50,
      "query" : {
         "rescore_query" : {
            "match_phrase" : {
               "message" : {
                  "query" : "the quick brown",
                  "slop" : 2
               }
            }
         },
         "query_weight" : 0.7,
         "rescore_query_weight" : 1.2
      }
   }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

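To make the linear combination concrete, the following worked calculation
assumes that each score is multiplied by its weight before the two are added.
Here latexmath:[w_q] and latexmath:[w_r] stand for `query_weight` and
`rescore_query_weight` as set in the request above, while latexmath:[s_q = 2.0]
and latexmath:[s_r = 1.5] are hypothetical scores from the original query and
the rescore query for a document that matches both:

[latexmath]
++++
\begin{aligned}
s &= w_q \, s_q + w_r \, s_r \\
  &= 0.7 \times 2.0 + 1.2 \times 1.5 \\
  &= 1.4 + 1.8 = 3.2
\end{aligned}
++++

Under the same assumption, leaving both weights at their default of `1` would
give latexmath:[2.0 + 1.5 = 3.5] for this document.
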

The way the scores are combined can be controlled with the `score_mode` parameter:
[cols="<,<",options="header",]
|=======================================================================
|Score Mode |Description
|`total`    |Add the original score and the rescore query score. The default.
|`multiply` |Multiply the original score by the rescore query score. Useful
for <<query-dsl-function-score-query,`function query`>> rescores.
|`avg`      |Average the original score and the rescore query score.
|`max`      |Take the max of the original score and the rescore query score.
|`min`      |Take the min of the original score and the rescore query score.
|=======================================================================

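For instance, the request below sets `score_mode` to `max`. It is a sketch
that reuses the illustrative `message` field and query text from this page;
the choice of `max` is arbitrary. Note that `score_mode` sits inside the
`query` object of the rescorer, next to `rescore_query`:

[source,js]
--------------------------------------------------
POST /_search
{
   "query" : {
      "match" : {
         "message" : "the quick brown"
      }
   },
   "rescore" : {
      "window_size" : 50,
      "query" : {
         "score_mode" : "max",
         "rescore_query" : {
            "match_phrase" : {
               "message" : {
                  "query" : "the quick brown",
                  "slop" : 2
               }
            }
         }
      }
   }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

Assuming the weights are applied before the scores are combined, a document
whose weighted original score is 1.4 and whose weighted rescore score is 1.8
would end up with 1.8 under `max`, 1.4 under `min`, and 3.2 under `total`.
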

===== Multiple Rescores

It is also possible to execute multiple rescores in sequence:

[source,js]
--------------------------------------------------
POST /_search
{
   "query" : {
      "match" : {
         "message" : {
            "operator" : "or",
            "query" : "the quick brown"
         }
      }
   },
   "rescore" : [ {
      "window_size" : 100,
      "query" : {
         "rescore_query" : {
            "match_phrase" : {
               "message" : {
                  "query" : "the quick brown",
                  "slop" : 2
               }
            }
         },
         "query_weight" : 0.7,
         "rescore_query_weight" : 1.2
      }
   }, {
      "window_size" : 10,
      "query" : {
         "score_mode": "multiply",
         "rescore_query" : {
            "function_score" : {
               "script_score": {
                  "script": {
                     "source": "Math.log10(doc.likes.value + 2)"
                  }
               }
            }
         }
      }
   } ]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

The first rescore operates on the results of the query, the second
operates on the results of the first, and so on. The second rescore will
"see" the sorting done by the first rescore, so it is possible to use a
large window on the first rescore to pull documents into a smaller window
for the second rescore.