[[search-request-rescore]]
=== Rescoring

Rescoring can help to improve precision by reordering just the top (e.g.
100 - 500) documents returned by the
<<search-request-query,`query`>> and
<<search-request-post-filter,`post_filter`>> phases, using a
secondary (usually more costly) algorithm, instead of applying the
costly algorithm to all documents in the index.

A `rescore` request is executed on each shard before it returns its
results to be sorted by the node handling the overall search request.

Currently the rescore API has only one implementation: the query
rescorer, which uses a query to tweak the scoring. In the future,
alternative rescorers may be made available, for example, a pair-wise rescorer.

NOTE: The `rescore` phase is not executed when
<<search-request-search-type,`search_type`>> is set
to `scan` or `count`.

NOTE: When exposing pagination to your users, do not change
`window_size` as you step through each page (by passing different
`from` values), since that can alter the top hits and cause results to
shift confusingly as the user steps through pages.

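As a minimal sketch of the pagination advice above, the following Python helper builds one search body per page, varying only `from` while keeping `window_size` fixed. The helper name `build_page_request`, the page size of 10, and the window of 50 are illustrative assumptions; the request shape mirrors the query rescorer examples in this section.

```python
import json

WINDOW_SIZE = 50  # assumed value; keep it identical for every page


def build_page_request(page, page_size=10, window_size=WINDOW_SIZE):
    """Build a search body for a zero-based page number (hypothetical helper)."""
    return {
        "from": page * page_size,  # only `from` changes between pages
        "size": page_size,
        "query": {"match": {"field1": "the quick brown"}},
        "rescore": {
            "window_size": window_size,  # constant across pages
            "query": {
                "rescore_query": {
                    "match": {
                        "field1": {
                            "query": "the quick brown",
                            "type": "phrase",
                            "slop": 2,
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    # Print the bodies for the first three pages.
    for page in range(3):
        print(json.dumps(build_page_request(page)))
```

Each printed body could be sent to the `_search` endpoint in the same way as the curl examples in this section.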

==== Query rescorer

The query rescorer executes a second query only on the top-K results
returned by the <<search-request-query,`query`>> and
<<search-request-post-filter,`post_filter`>> phases. The
number of documents examined on each shard can be controlled by
the `window_size` parameter, which defaults to
<<search-request-from-size,`from` and `size`>>.

By default the scores from the original query and the rescore query are
combined linearly to produce the final `_score` for each document. The
relative importance of the original query and of the rescore query can
be controlled with the `query_weight` and `rescore_query_weight` parameters
respectively. Both default to `1`.

For example:

[source,js]
--------------------------------------------------
curl -s -XPOST 'localhost:9200/_search' -d '{
   "query" : {
      "match" : {
         "field1" : {
            "operator" : "or",
            "query" : "the quick brown",
            "type" : "boolean"
         }
      }
   },
   "rescore" : {
      "window_size" : 50,
      "query" : {
         "rescore_query" : {
            "match" : {
               "field1" : {
                  "query" : "the quick brown",
                  "type" : "phrase",
                  "slop" : 2
               }
            }
         },
         "query_weight" : 0.7,
         "rescore_query_weight" : 1.2
      }
   }
}
'
--------------------------------------------------

The way the scores are combined can be controlled with the `score_mode` parameter:

[cols="<,<",options="header",]
|=======================================================================
|Score Mode |Description
|`total` |Add the original score and the rescore query score. The default.
|`multiply` |Multiply the original score by the rescore query score. Useful
for <<query-dsl-function-score-query,`function_score` query>> rescores.
|`avg` |Average the original score and the rescore query score.
|`max` |Take the max of the original score and the rescore query score.
|`min` |Take the min of the original score and the rescore query score.
|=======================================================================

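The score modes in the table can be read as the following arithmetic. This is only a sketch: `combine_scores` is a hypothetical helper, not part of any client library, and it assumes that each score is multiplied by its weight (`query_weight`, `rescore_query_weight`) before the mode is applied, an ordering this section does not spell out.

```python
def combine_scores(original, rescore, score_mode="total",
                   query_weight=1.0, rescore_query_weight=1.0):
    """Combine two scores for one document (hypothetical helper).

    Assumption: the weights are applied before the score mode is used.
    """
    a = original * query_weight
    b = rescore * rescore_query_weight
    modes = {
        "total": lambda: a + b,
        "multiply": lambda: a * b,
        "avg": lambda: (a + b) / 2,
        "max": lambda: max(a, b),
        "min": lambda: min(a, b),
    }
    if score_mode not in modes:
        raise ValueError("unknown score_mode: %r" % score_mode)
    return modes[score_mode]()


if __name__ == "__main__":
    # Weights taken from the example request; the raw scores are made up.
    for mode in ("total", "multiply", "avg", "max", "min"):
        print(mode, combine_scores(2.0, 0.5, mode,
                                   query_weight=0.7,
                                   rescore_query_weight=1.2))
```

With both weights left at their default of `1`, `total` reduces to a plain sum of the two scores.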

==== Multiple Rescores

It is also possible to execute multiple rescores in sequence:
[source,js]
--------------------------------------------------
curl -s -XPOST 'localhost:9200/_search' -d '{
   "query" : {
      "match" : {
         "field1" : {
            "operator" : "or",
            "query" : "the quick brown",
            "type" : "boolean"
         }
      }
   },
   "rescore" : [ {
      "window_size" : 100,
      "query" : {
         "rescore_query" : {
            "match" : {
               "field1" : {
                  "query" : "the quick brown",
                  "type" : "phrase",
                  "slop" : 2
               }
            }
         },
         "query_weight" : 0.7,
         "rescore_query_weight" : 1.2
      }
   }, {
      "window_size" : 10,
      "query" : {
         "score_mode": "multiply",
         "rescore_query" : {
            "function_score" : {
               "script_score": {
                  "script": "log10(doc[\"numeric\"].value + 2)"
               }
            }
         }
      }
   } ]
}
'
--------------------------------------------------

The first rescore gets the results of the query, the second gets the
results of the first, and so on. The second rescore will "see" the sorting done
by the first rescore, so it is possible to use a large window on the first
rescore to pull documents into a smaller window for the second rescore.

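The chaining described above can be illustrated with a small in-memory simulation. It is a sketch under simplifying assumptions, not a description of the server's internals: hits are `(doc_id, score)` pairs on a single shard, each stage re-scores only the top `window_size` hits of the current ordering, re-sorts that window, and leaves the remaining hits untouched. `rescore_window` and the toy scoring functions are hypothetical.

```python
def rescore_window(hits, window_size, new_score):
    """Re-score the top `window_size` hits and re-sort them (hypothetical helper).

    hits: list of (doc_id, score) pairs, sorted by descending score.
    new_score: function (doc_id, old_score) -> combined score.
    """
    window = [(doc, new_score(doc, score)) for doc, score in hits[:window_size]]
    window.sort(key=lambda hit: hit[1], reverse=True)
    # Hits outside the window keep their position and score in this sketch.
    return window + hits[window_size:]


if __name__ == "__main__":
    # Made-up results of the original query.
    hits = [("a", 5.0), ("b", 4.0), ("c", 3.0), ("d", 2.0), ("e", 1.0)]

    # First rescore: large window, adds a made-up boost to "c" and "d".
    boost = {"c": 4.0, "d": 6.0}
    first = rescore_window(hits, 4, lambda doc, s: s + boost.get(doc, 0.0))

    # Second rescore: small window, applied to the order left by the first.
    factor = {"c": 3.0}
    second = rescore_window(first, 2, lambda doc, s: s * factor.get(doc, 1.0))

    print([doc for doc, _ in first])
    print([doc for doc, _ in second])
```

In this toy data, documents `c` and `d` start outside the top two, are lifted by the first (large-window) stage, and are therefore the only documents the second (small-window) stage gets to reorder.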