CONSOLEify highlighting and function_score docs

Converts many of the partial examples into full search requests.

Relates #18160
Nik Everett 2017-04-05 17:38:34 -04:00
parent 042f7566e8
commit 048191ceb6
3 changed files with 141 additions and 40 deletions


@ -89,10 +89,8 @@ buildRestTests.expectedUnconvertedCandidates = [
'reference/mapping/types/percolator.asciidoc',
'reference/modules/scripting/security.asciidoc',
'reference/modules/cross-cluster-search.asciidoc', // this is hard to test since we need 2 clusters -- maybe we can trick it into referencing itself...
'reference/query-dsl/function-score-query.asciidoc',
'reference/search/field-stats.asciidoc',
'reference/search/profile.asciidoc',
'reference/search/request/highlighting.asciidoc',
'reference/search/request/inner-hits.asciidoc',
]


@ -27,6 +27,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> See <<score-functions>> for a list of supported functions.
@ -62,6 +63,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> Boost for the whole query.
<2> See <<score-functions>> for a list of supported functions.
@ -132,35 +134,57 @@ simple sample:
[source,js]
--------------------------------------------------
"script_score" : {
"script" : {
"lang": "painless",
"inline": "_score * doc['my_numeric_field'].value"
GET /_search
{
"query": {
"function_score": {
"query": {
"match": { "message": "elasticsearch" }
},
"script_score" : {
"script" : {
"inline": "Math.log(2 + doc['likes'].value)"
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
On top of the different scripting field values and expression, the
`_score` script parameter can be used to retrieve the score based on the
wrapped query.
Scripts are cached for faster execution. If the script has parameters
that it needs to take into account, it is preferable to reuse the same
script, and provide parameters to it:
Scripts compilation is cached for faster execution. If the script has
parameters that it needs to take into account, it is preferable to reuse the
same script, and provide parameters to it:
[source,js]
--------------------------------------------------
"script_score": {
"script": {
"lang": "painless",
"params": {
"param1": value1,
"param2": value2
},
"inline": "_score * doc['my_numeric_field'].value / Math.pow(params.param1, params.param2)"
GET /_search
{
"query": {
"function_score": {
"query": {
"match": { "message": "elasticsearch" }
},
"script_score" : {
"script" : {
"params": {
"a": 5,
"b": 1.2
},
"inline": "params.a / Math.pow(params.b, doc['likes'].value)"
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
Note that unlike the `custom_score` query, the
score of the query is multiplied with the result of the script scoring. If
@ -178,6 +202,8 @@ not. The number value is of type float.
--------------------------------------------------
"weight" : number
--------------------------------------------------
// NOTCONSOLE
// I couldn't come up with a good example for this one.
[[function-random]]
==== Random
@ -191,10 +217,19 @@ be a memory intensive operation since the values are unique.
[source,js]
--------------------------------------------------
"random_score": {
"seed" : number
GET /_search
{
"query": {
"function_score": {
"random_score": {
"seed": 10
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
[[function-field-value-factor]]
==== Field Value factor
@ -210,17 +245,26 @@ doing so would look like:
[source,js]
--------------------------------------------------
"field_value_factor": {
"field": "popularity",
"factor": 1.2,
"modifier": "sqrt",
"missing": 1
GET /_search
{
"query": {
"function_score": {
"field_value_factor": {
"field": "likes",
"factor": 1.2,
"modifier": "sqrt",
"missing": 1
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
Which will translate into the following formula for scoring:
`sqrt(1.2 * doc['popularity'].value)`
`sqrt(1.2 * doc['likes'].value)`
There are a number of options for the `field_value_factor` function:
@ -291,23 +335,36 @@ decay function is specified as
}
}
--------------------------------------------------
// NOTCONSOLE
<1> The `DECAY_FUNCTION` should be one of `linear`, `exp`, or `gauss`.
<2> The specified field must be a numeric, date, or geo-point field.
In the above example, the field is a <<geo-point,`geo_point`>> and origin can be provided in geo format. `scale` and `offset` must be given with a unit in this case. If your field is a date field, you can set `scale` and `offset` as days, weeks, and so on. Example:
In the above example, the field is a <<geo-point,`geo_point`>> and origin can
be provided in geo format. `scale` and `offset` must be given with a unit in
this case. If your field is a date field, you can set `scale` and `offset` as
days, weeks, and so on. Example:
[source,js]
--------------------------------------------------
"gauss": {
"date": {
"origin": "2013-09-17", <1>
"scale": "10d",
"offset": "5d", <2>
"decay" : 0.5 <2>
GET /_search
{
"query": {
"function_score": {
"gauss": {
"date": {
"origin": "2013-09-17", <1>
"scale": "10d",
"offset": "5d", <2>
"decay" : 0.5 <2>
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> The date format of the origin depends on the <<mapping-date-format,`format`>> defined in
your mapping. If you do not define the origin, the current time is used.
<2> The `offset` and `decay` parameters are optional.
@ -428,6 +485,7 @@ Example:
"multi_value_mode": "avg"
}
--------------------------------------------------
// NOTCONSOLE
==== Detailed example
@ -464,6 +522,7 @@ The function for `price` in this case would be
}
}
--------------------------------------------------
// NOTCONSOLE
<1> This decay function could also be `linear` or `exp`.
and for `location`:
@ -478,6 +537,7 @@ and for `location`:
}
}
--------------------------------------------------
// NOTCONSOLE
<1> This decay function could also be `linear` or `exp`.
Suppose you want to multiply these two functions on the original score,


@ -21,6 +21,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
In the above case, the `comment` field will be highlighted for each
search hit (there will be another element in each search hit, called
@ -76,12 +77,21 @@ highlighting using the postings highlighter on it:
[source,js]
--------------------------------------------------
PUT /example
{
"type_name" : {
"comment" : {"index_options" : "offsets"}
"mappings": {
"doc" : {
"properties": {
"comment" : {
"type": "text",
"index_options" : "offsets"
}
}
}
}
}
--------------------------------------------------
// CONSOLE
[NOTE]
Note that the postings highlighter is meant to perform simple query terms
@ -118,12 +128,21 @@ the index to be bigger):
[source,js]
--------------------------------------------------
PUT /example
{
"type_name" : {
"comment" : {"term_vector" : "with_positions_offsets"}
"mappings": {
"doc" : {
"properties": {
"comment" : {
"type": "text",
"term_vector" : "with_positions_offsets"
}
}
}
}
}
--------------------------------------------------
// CONSOLE
==== Unified Highlighter
@ -166,6 +185,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
==== Force highlighting on source
@ -187,6 +207,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
[[tags]]
==== Highlighting Tags
@ -212,6 +233,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
Using the fast vector highlighter there can be more tags, and the "importance"
is ordered.
@ -233,11 +255,12 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
There are also built in "tag" schemas, with currently a single schema
called `styled` with the following `pre_tags`:
[source,js]
[source,html]
--------------------------------------------------
<em class="hlt1">, <em class="hlt2">, <em class="hlt3">,
<em class="hlt4">, <em class="hlt5">, <em class="hlt6">,
@ -265,7 +288,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
==== Encoder
@ -295,6 +318,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
The `fragment_size` is ignored when using the postings highlighter, as it
outputs sentences regardless of their length.
@ -318,6 +342,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
If the `number_of_fragments` value is set to `0` then no fragments are
produced, instead the whole content of the field is returned, and of
@ -341,6 +366,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
When using `fvh` one can use `fragment_offset`
parameter to control the margin to start highlighting from.
@ -373,6 +399,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
==== Highlight query
@ -445,6 +472,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
Note that the score of text fragment in this case is calculated by the Lucene
highlighting framework. For implementation details you can check the
@ -478,6 +506,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
[[field-match]]
==== Require Field Match
@ -503,6 +532,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
[[boundary-scanners]]
==== Boundary Scanners
@ -562,6 +592,8 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
The above matches both "run with scissors" and "running with scissors"
and would highlight "running" and "scissors" but not "run". If both
phrases appear in a large document then "running with scissors" is
@ -590,6 +622,8 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
The above highlights "run" as well as "running" and "scissors" but
still sorts "running with scissors" above "run with scissors" because
the plain match ("running") is boosted.
@ -616,6 +650,7 @@ GET /_search
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
The above query wouldn't highlight "run" or "scissor" but shows that
it is just fine not to list the field to which the matches are combined
@ -639,6 +674,7 @@ There is a small amount of overhead involved with setting
}
}
--------------------------------------------------
// NOTCONSOLE
to
[source,js]
--------------------------------------------------
@ -651,6 +687,7 @@ to
}
}
--------------------------------------------------
// NOTCONSOLE
===================================================================
[[phrase-limit]]
@ -672,12 +709,18 @@ json spec objects are unordered but if you need to be explicit about the order
that fields are highlighted then you can use an array for `fields` like this:
[source,js]
--------------------------------------------------
GET /_search
{
"highlight": {
"fields": [
{"title":{ /*params*/ }},
{"text":{ /*params*/ }}
{ "title": {} },
{ "text": {} }
]
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
None of the highlighters built into Elasticsearch care about the order that the
fields are highlighted but a plugin may.
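
As a rough illustration of the array form of `fields` described above, the sketch below builds such a request body in Python. The helper name `highlight_request` is hypothetical and not part of any Elasticsearch client; it only shows that a list of single-key objects carries an explicit order, which a plain JSON object does not guarantee.

```python
import json


def highlight_request(field_names):
    """Build a search body whose highlight fields keep the given order.

    Hypothetical helper for illustration only; it performs no request.
    """
    # Each field becomes its own single-key object inside an array,
    # so the order of the array is the order of the fields.
    return {"highlight": {"fields": [{name: {}} for name in field_names]}}


body = highlight_request(["title", "text"])
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of `GET /_search`. As the text above notes, the order only matters if a highlighter plugin depends on it.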