mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-07 13:38:49 +00:00
While function scores using scripts do allow explanations, they are only creatable with an expert plugin. This commit improves the situation for the newer script score query by adding the ability to set the explanation from the script itself. To set the explanation, a user would check for `explanation != null` to indicate an explanation is needed, and then call `explanation.set("some description")`.
363 lines
12 KiB
Plaintext
363 lines
12 KiB
Plaintext
[[query-dsl-script-score-query]]
|
|
=== Script score query
|
|
++++
|
|
<titleabbrev>Script score</titleabbrev>
|
|
++++
|
|
|
|
Uses a <<modules-scripting,script>> to provide a custom score for returned
|
|
documents.
|
|
|
|
The `script_score` query is useful if, for example, a scoring function is expensive and you only need to calculate the score of a filtered set of documents.
|
|
|
|
|
|
[[script-score-query-ex-request]]
|
|
==== Example request
|
|
The following `script_score` query assigns each returned document a score equal to the `likes` field value divided by `10`.
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"query" : {
|
|
"script_score" : {
|
|
"query" : {
|
|
"match": { "message": "elasticsearch" }
|
|
},
|
|
"script" : {
|
|
"source" : "doc['likes'].value / 10 "
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
|
|
[[script-score-top-level-params]]
|
|
==== Top-level parameters for `script_score`
|
|
`query`::
|
|
(Required, query object) Query used to return documents.
|
|
|
|
`script`::
|
|
+
|
|
--
|
|
(Required, <<modules-scripting-using,script object>>) Script used to compute the score of documents returned by the `query`.
|
|
|
|
IMPORTANT: Final relevance scores from the `script_score` query cannot be
|
|
negative. To support certain search optimizations, Lucene requires
|
|
scores be positive or `0`.
|
|
--
|
|
|
|
`min_score`::
|
|
(Optional, float) Documents with a <<relevance-scores,relevance score>> lower
|
|
than this floating point number are excluded from the search results.
|
|
|
|
|
|
[[script-score-query-notes]]
|
|
==== Notes
|
|
|
|
[[script-score-access-scores]]
|
|
===== Use relevance scores in a script
|
|
|
|
Within a script, you can
|
|
{ref}/modules-scripting-fields.html#scripting-score[access]
|
|
the `_score` variable which represents the current relevance score of a
|
|
document.
|
|
|
|
[[script-score-predefined-functions]]
|
|
===== Predefined functions
|
|
You can use any of the available {painless}/painless-contexts.html[painless
|
|
functions] in your `script`. You can also use the following predefined functions
|
|
to customize scoring:
|
|
|
|
* <<script-score-saturation>>
|
|
* <<script-score-sigmoid>>
|
|
* <<random-score-function>>
|
|
* <<decay-functions-numeric-fields>>
|
|
* <<decay-functions-geo-fields>>
|
|
* <<decay-functions-date-fields>>
|
|
* <<script-score-functions-vector-fields>>
|
|
|
|
We suggest using these predefined functions instead of writing your own.
|
|
These functions take advantage of efficiencies from {es}' internal mechanisms.
|
|
|
|
[[script-score-saturation]]
|
|
====== Saturation
|
|
`saturation(value,k) = value/(k + value)`
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "saturation(doc['likes'].value, 1)"
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
[[script-score-sigmoid]]
|
|
====== Sigmoid
|
|
`sigmoid(value, k, a) = value^a/ (k^a + value^a)`
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "sigmoid(doc['likes'].value, 2, 1)"
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
[[random-score-function]]
|
|
====== Random score function
|
|
`random_score` function generates scores that are uniformly distributed
|
|
from 0 up to but not including 1.
|
|
|
|
`randomScore` function has the following syntax:
|
|
`randomScore(<seed>, <fieldName>)`.
|
|
It has a required parameter - `seed` as an integer value,
|
|
and an optional parameter - `fieldName` as a string value.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "randomScore(100, '_seq_no')"
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
If the `fieldName` parameter is omitted, the internal Lucene
|
|
document ids will be used as a source of randomness. This is very efficient,
|
|
but unfortunately not reproducible since documents might be renumbered
|
|
by merges.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "randomScore(100)"
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
Note that documents that are within the same shard and have the
|
|
same value for field will get the same score, so it is usually desirable
|
|
to use a field that has unique values for all documents across a shard.
|
|
A good default choice might be to use the `_seq_no`
|
|
field, whose only drawback is that scores will change if the document is
|
|
updated since update operations also update the value of the `_seq_no` field.
|
|
|
|
|
|
[[decay-functions-numeric-fields]]
|
|
====== Decay functions for numeric fields
|
|
You can read more about decay functions
|
|
{ref}/query-dsl-function-score-query.html#function-decay[here].
|
|
|
|
* `double decayNumericLinear(double origin, double scale, double offset, double decay, double docValue)`
|
|
* `double decayNumericExp(double origin, double scale, double offset, double decay, double docValue)`
|
|
* `double decayNumericGauss(double origin, double scale, double offset, double decay, double docValue)`
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "decayNumericLinear(params.origin, params.scale, params.offset, params.decay, doc['dval'].value)",
|
|
"params": { <1>
|
|
"origin": 20,
|
|
"scale": 10,
|
|
"decay" : 0.5,
|
|
"offset" : 0
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
<1> Using `params` allows to compile the script only once, even if params change.
|
|
|
|
[[decay-functions-geo-fields]]
|
|
====== Decay functions for geo fields
|
|
|
|
* `double decayGeoLinear(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
|
|
|
|
* `double decayGeoExp(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
|
|
|
|
* `double decayGeoGauss(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)`
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "decayGeoExp(params.origin, params.scale, params.offset, params.decay, doc['location'].value)",
|
|
"params": {
|
|
"origin": "40, -70.12",
|
|
"scale": "200km",
|
|
"offset": "0km",
|
|
"decay" : 0.2
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
[[decay-functions-date-fields]]
|
|
====== Decay functions for date fields
|
|
|
|
* `double decayDateLinear(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
|
|
|
|
* `double decayDateExp(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
|
|
|
|
* `double decayDateGauss(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)`
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "decayDateGauss(params.origin, params.scale, params.offset, params.decay, doc['date'].value)",
|
|
"params": {
|
|
"origin": "2008-01-01T01:00:00Z",
|
|
"scale": "1h",
|
|
"offset" : "0",
|
|
"decay" : 0.5
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
NOTE: Decay functions on dates are limited to dates in the default format
|
|
and default time zone. Also calculations with `now` are not supported.
|
|
|
|
[[script-score-functions-vector-fields]]
|
|
====== Functions for vector fields
|
|
<<vector-functions, Functions for vector fields>> are accessible through
|
|
`script_score` query.
|
|
|
|
[[script-score-faster-alt]]
|
|
===== Faster alternatives
|
|
The `script_score` query calculates the score for
|
|
every matching document, or hit. There are faster alternative query types that
|
|
can efficiently skip non-competitive hits:
|
|
|
|
* If you want to boost documents on some static fields, use the
|
|
<<query-dsl-rank-feature-query, `rank_feature`>> query.
|
|
* If you want to boost documents closer to a date or geographic point, use the
|
|
<<query-dsl-distance-feature-query, `distance_feature`>> query.
|
|
|
|
[[script-score-function-score-transition]]
|
|
===== Transition from the function score query
|
|
We are deprecating the <<query-dsl-function-score-query, `function_score`>>
|
|
query. We recommend using the `script_score` query instead.
|
|
|
|
You can implement the following functions from the `function_score` query using
|
|
the `script_score` query:
|
|
|
|
* <<script-score>>
|
|
* <<weight>>
|
|
* <<random-score>>
|
|
* <<field-value-factor>>
|
|
* <<decay-functions>>
|
|
|
|
[[script-score]]
|
|
====== `script_score`
|
|
What you used in `script_score` of the Function Score query, you
|
|
can copy into the Script Score query. No changes here.
|
|
|
|
[[weight]]
|
|
====== `weight`
|
|
`weight` function can be implemented in the Script Score query through
|
|
the following script:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "params.weight * _score",
|
|
"params": {
|
|
"weight": 2
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
[[random-score]]
|
|
====== `random_score`
|
|
|
|
Use `randomScore` function
|
|
as described in <<random-score-function, random score function>>.
|
|
|
|
[[field-value-factor]]
|
|
====== `field_value_factor`
|
|
`field_value_factor` function can be easily implemented through script:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "Math.log10(doc['field'].value * params.factor)",
|
|
params" : {
|
|
"factor" : 5
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
|
|
For checking if a document has a missing value, you can use
|
|
`doc['field'].size() == 0`. For example, this script will use
|
|
a value `1` if a document doesn't have a field `field`:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"script" : {
|
|
"source" : "Math.log10((doc['field'].size() == 0 ? 1 : doc['field'].value()) * params.factor)",
|
|
params" : {
|
|
"factor" : 5
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
This table lists how `field_value_factor` modifiers can be implemented
|
|
through a script:
|
|
|
|
[cols="<,<",options="header",]
|
|
|=======================================================================
|
|
| Modifier | Implementation in Script Score
|
|
|
|
| `none` | -
|
|
| `log` | `Math.log10(doc['f'].value)`
|
|
| `log1p` | `Math.log10(doc['f'].value + 1)`
|
|
| `log2p` | `Math.log10(doc['f'].value + 2)`
|
|
| `ln` | `Math.log(doc['f'].value)`
|
|
| `ln1p` | `Math.log(doc['f'].value + 1)`
|
|
| `ln2p` | `Math.log(doc['f'].value + 2)`
|
|
| `square` | `Math.pow(doc['f'].value, 2)`
|
|
| `sqrt` | `Math.sqrt(doc['f'].value)`
|
|
| `reciprocal` | `1.0 / doc['f'].value`
|
|
|=======================================================================
|
|
|
|
[[decay-functions]]
|
|
====== `decay` functions
|
|
The `script_score` query has equivalent <<decay-functions, decay functions>>
|
|
that can be used in script.
|
|
|
|
include::{es-repo-dir}/vectors/vector-functions.asciidoc[]
|
|
|
|
[[score-explanation]]
|
|
====== Explain request
|
|
Using an <<search-explain, explain request>> provides an explanation of how the parts of a score were computed. The `script_score` query can add its own explanation by setting the `explanation` parameter:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
GET /twitter/_explain/0
|
|
{
|
|
"query" : {
|
|
"script_score" : {
|
|
"query" : {
|
|
"match": { "message": "elasticsearch" }
|
|
},
|
|
"script" : {
|
|
"source" : """
|
|
long likes = doc['likes'].value;
|
|
double normalizedLikes = likes / 10;
|
|
if (explanation != null) {
|
|
explanation.set('normalized likes = likes / 10 = ' + likes + ' / 10 = ' + normalizedLikes);
|
|
}
|
|
return normalizedLikes;
|
|
"""
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// TEST[setup:twitter]
|
|
|
|
Note that the `explanation` will be null when using in a normal `_search` request, so having a conditional guard is best practice.
|