Docs: Clarify constraints on scripted similarities. (#31076)

Scripted similarities provide a lot of flexibility but they still need to obey some rules to not confuse Lucene.
2018-06-05 08:51:00 +02:00 · f5073813ef
parent c7c0acc2c7
commit f5073813ef
1 changed file with 12 additions and 2 deletions
--- a/docs/reference/index-modules/similarity.asciidoc
+++ b/docs/reference/index-modules/similarity.asciidoc
@@ -341,7 +341,18 @@ Which yields:
 // TESTRESPONSE[s/"took": 12/"took" : $body.took/]
 // TESTRESPONSE[s/OzrdjxNtQGaqs4DmioFw9A/$body.hits.hits.0._node/]

-You might have noticed that a significant part of the script depends on
+WARNING: While scripted similarities provide a lot of flexibility, there is
+a set of rules that they need to satisfy. Failing to do so could make
+Elasticsearch silently return wrong top hits or fail with internal errors at
+search time:
+
+ - Returned scores must be positive.
+ - All other variables remaining equal, scores must not decrease when
+   `doc.freq` increases.
+ - All other variables remaining equal, scores must not increase when
+   `doc.length` increases.
+
+You might have noticed that a significant part of the above script depends on
 statistics that are the same for every document. It is possible to make the
 above slightly more efficient by providing an `weight_script` which will
 compute the document-independent part of the score and will be available
@@ -506,7 +517,6 @@ GET /index/_search?explain=true

 ////////////////////

-
 Type name: `scripted`

 [float]
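
Reviewer note: the three rules added in the WARNING block can be checked offline before a script is deployed. The following is a minimal Python sketch, not part of this commit and not an Elasticsearch API. The names `tfidf_score` and `find_violations`, the fixed collection statistics, and the sampled ranges are all assumptions made for illustration. The scoring function is only a stand-in that imitates a TF-IDF-style script, and the check reads "positive" strictly (a score of 0 is reported).

```python
import math

# Hypothetical stand-in for a scripted similarity. freq and doc_length play the
# roles of doc.freq and doc.length; the other statistics are fixed assumptions.
def tfidf_score(freq, doc_length, doc_count=1000, doc_freq=10, boost=1.0):
    tf = math.sqrt(freq)
    idf = math.log((doc_count + 1.0) / (doc_freq + 1.0)) + 1.0
    norm = 1.0 / math.sqrt(doc_length)
    return boost * tf * idf * norm

def find_violations(score, freqs, lengths):
    """Check the three rules on a grid of ascending freqs and lengths."""
    violations = []
    # Rule 1: returned scores must be positive.
    for length in lengths:
        for freq in freqs:
            if score(freq, length) <= 0:
                violations.append(("not positive", freq, length))
    # Rule 2: scores must not decrease when freq increases.
    for length in lengths:
        for low, high in zip(freqs, freqs[1:]):
            if score(high, length) < score(low, length):
                violations.append(("decreases with freq", high, length))
    # Rule 3: scores must not increase when length increases.
    for freq in freqs:
        for low, high in zip(lengths, lengths[1:]):
            if score(freq, high) > score(freq, low):
                violations.append(("increases with length", freq, high))
    return violations

FREQS = list(range(1, 30))
LENGTHS = list(range(1, 100))

# A deliberately broken scorer: it rewards longer documents.
def bad_score(freq, doc_length):
    return freq * doc_length

if __name__ == "__main__":
    print(len(find_violations(tfidf_score, FREQS, LENGTHS)))
    print(find_violations(bad_score, FREQS, LENGTHS)[0])
```

A grid check like this can only find counterexamples on the sampled values; it does not prove that a script satisfies the rules for every input.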