Document upcoming scoring changes. (#22806)

Adrien Grand 2017-01-30 11:08:49 +01:00 committed by GitHub
parent fe4043c8ff
commit dc62255ddd
1 changed files with 27 additions and 0 deletions


@@ -45,3 +45,30 @@ have any effect in previous versions.
* The `"time"` field showing human readable timing output has been replaced by the `"time_in_nanos"`
field which displays the elapsed time in nanoseconds. The `"time"` field can be turned on by adding
`"?human=true"` to the request url. It will display a rounded, human readable time value.
==== Scoring changes
==== Query normalization
Query normalization has been removed. This means that the TF-IDF similarity no
longer tries to make scores comparable across queries and that boosts are now
integrated into scores as simple multiplicative factors.
Other similarities are not affected as they did not normalize scores and
already integrated boosts into scores as multiplicative factors.
See https://issues.apache.org/jira/browse/LUCENE-7347[`LUCENE-7347`] for more
information.
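
The effect of removing query normalization can be illustrated with a minimal sketch. It is not Lucene's implementation: the function names are hypothetical, the per-clause "raw" scores are made-up numbers, and the normalization factor is simplified to `1 / sqrt(sum of squared boosts)`, ignoring the IDF term that the real TF-IDF formula also included.

```python
import math


def query_norm(boosts):
    # Simplified stand-in for the old normalization factor:
    # 1 / sqrt(sum of squared clause weights). Real weights also
    # included IDF; only boosts are used here (assumption).
    return 1.0 / math.sqrt(sum(b * b for b in boosts))


def score_with_query_norm(raw_scores, boosts):
    # Old TF-IDF behaviour (simplified): boosted clause scores are
    # rescaled by a factor that itself depends on the boosts.
    return query_norm(boosts) * sum(b * s for b, s in zip(boosts, raw_scores))


def score_without_query_norm(raw_scores, boosts):
    # New behaviour: each boost is a plain multiplicative factor.
    return sum(b * s for b, s in zip(boosts, raw_scores))


raw = [1.0, 2.0]  # hypothetical per-clause scores for one document

# Doubling every boost left the normalized score unchanged...
old_a = score_with_query_norm(raw, [1.0, 1.0])
old_b = score_with_query_norm(raw, [2.0, 2.0])

# ...whereas without normalization the score simply doubles.
new_a = score_without_query_norm(raw, [1.0, 1.0])
new_b = score_without_query_norm(raw, [2.0, 2.0])
```

In this sketch `old_a` and `old_b` are equal, while `new_b` is twice `new_a`, which is what "simple multiplicative factors" means in practice.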
==== Coordination factors
Coordination factors have been removed from the scoring formula. This means that
boolean queries no longer score based on the number of matching clauses.
Instead, they always return the sum of the scores of the matching clauses.
As a consequence, use of the TF-IDF similarity is now discouraged, because
coordination factors were an important component of the quality of the scores
that this similarity produces. BM25 is recommended instead.
See https://issues.apache.org/jira/browse/LUCENE-7347[`LUCENE-7347`] for more
information.
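
The difference can be sketched as follows. This is a simplified illustration rather than Lucene's code: the function names are hypothetical, the clause scores are made-up numbers, and the coordination factor is taken to be the classic `matching clauses / total clauses` ratio, with `None` marking a clause that did not match.

```python
def score_with_coord(clause_scores):
    # Old behaviour (simplified): the sum of the matching clause
    # scores is multiplied by the fraction of clauses that matched.
    matching = [s for s in clause_scores if s is not None]
    if not matching:
        return 0.0
    coord = len(matching) / len(clause_scores)
    return coord * sum(matching)


def score_without_coord(clause_scores):
    # New behaviour: just the sum of the matching clause scores.
    return sum(s for s in clause_scores if s is not None)


# Hypothetical document matching 2 of the 4 clauses of a boolean query.
scores = [1.0, None, 2.0, None]

old_score = score_with_coord(scores)     # 0.5 * (1.0 + 2.0)
new_score = score_without_coord(scores)  # 1.0 + 2.0
```

Under these assumptions a document matching only half of the clauses used to have its score halved. With coordination factors removed, the number of matching clauses influences the score only through the clause scores that are summed.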