From d018a0008ebab74ab21ed28fdcfc75c0b4d4d3a4 Mon Sep 17 00:00:00 2001
From: Menno Oudshoorn
Date: Tue, 6 Mar 2018 15:37:18 +0100
Subject: [PATCH] Add a usage example of the JLH score (#28905)

Adds a usage example of the JLH score used in significant terms aggregation.
All other methods to calculate significance score have such an example

Closes #28513
---
 .../bucket/significantterms-aggregation.asciidoc | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc b/docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc
index 2de6a0bbb2f..b6595c0d05c 100644
--- a/docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc
+++ b/docs/reference/aggregations/bucket/significantterms-aggregation.asciidoc
@@ -327,6 +327,15 @@ However, the `size` and `shard size` settings covered in the next section provid
 ==== Parameters

 ===== JLH score
+The JLH score can be used as a significance score by adding the parameter
+
+[source,js]
+--------------------------------------------------
+
+ "jlh": {
+ }
+--------------------------------------------------
+// NOTCONSOLE
 The scores are derived from the doc frequencies in _foreground_ and _background_ sets. The _absolute_ change in popularity (foregroundPercent - backgroundPercent) would favor common terms whereas the _relative_ change in popularity (foregroundPercent/ backgroundPercent) would favor rare terms. Rare vs common is essentially a precision vs recall balance and so the absolute and relative changes are multiplied to provide a sweet spot between precision and recall.
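
Review note: the JLH description in this hunk (absolute change in popularity multiplied by relative change) may be easier to follow with a small calculation. The sketch below is only an illustration of that sentence, not the Elasticsearch implementation. The function name `jlh_score`, its parameter names, and the choice to return 0 for empty sets, a zero background frequency, or a non-positive change are all assumptions made for this example.

```python
def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """Illustrative JLH-style score: absolute change times relative change.

    fg_count / fg_total: docs containing the term / docs in the foreground set.
    bg_count / bg_total: docs containing the term / docs in the background set.
    All names and edge-case choices here are assumptions for illustration.
    """
    # Assumed edge cases: without counts there is nothing to compare.
    if fg_total == 0 or bg_total == 0 or bg_count == 0:
        return 0.0

    foreground_percent = fg_count / fg_total
    background_percent = bg_count / bg_total

    # Absolute change in popularity (favors common terms).
    absolute_change = foreground_percent - background_percent
    if absolute_change <= 0:
        # Assumed: terms that are not more popular in the foreground score 0.
        return 0.0

    # Relative change in popularity (favors rare terms).
    relative_change = foreground_percent / background_percent

    # Multiply the two to balance precision against recall.
    return absolute_change * relative_change


if __name__ == "__main__":
    # Common term: 50% of the foreground vs 25% of the background.
    print(jlh_score(50, 100, 2500, 10000))  # 0.5
    # Rare term: 5% of the foreground vs 0.1% of the background.
    print(jlh_score(5, 100, 10, 10000))
```

In this made-up data the common term has the larger absolute change (0.25 against 0.049), yet the rare term ends up with the higher score (about 2.45 against 0.5) because its relative change is 50x rather than 2x. That is the balance between rare and common terms that the paragraph describes.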