From a672a2a2d47e3b6a25eae0bec569f1c1fa64c762 Mon Sep 17 00:00:00 2001
From: James Rodewig
Date: Fri, 17 Jul 2020 10:57:00 -0400
Subject: [PATCH] [DOCS] Move highlighting docs to separate page (#59768) (#59781)

Moves the highlighting docs from the deprecated 'Request Body Search' chapter to the new subpage of the 'Run a search chapter' section. No substantive changes were made to the content.
---
 .../metrics/tophits-aggregation.asciidoc | 2 +-
 docs/reference/how-to/general.asciidoc | 2 +-
 .../mapping/fields/source-field.asciidoc | 2 +-
 docs/reference/query-dsl/terms-query.asciidoc | 2 +-
 docs/reference/redirects.asciidoc | 10 ++++-
 docs/reference/search/request-body.asciidoc | 5 ++-
 .../request/highlighters-internal.asciidoc | 12 +++---
 .../search/request/highlighting.asciidoc | 40 +++++++++----------
 .../search/request/inner-hits.asciidoc | 2 +-
 docs/reference/search/run-a-search.asciidoc | 2 +
 10 files changed, 46 insertions(+), 33 deletions(-)

diff --git a/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc b/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc
index 8fff7365aa7..fc24a3527b5 100644
--- a/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc
+++ b/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc
@@ -17,7 +17,7 @@ One or more bucket aggregators determines by which properties a result set get s
 The top_hits aggregation returns regular search hits, because of this many per hit features can be supported:
-* <>
+* <>
 * <>
 * <>
 * <>
diff --git a/docs/reference/how-to/general.asciidoc b/docs/reference/how-to/general.asciidoc
index 9633c7fe843..4921d7dcb27 100644
--- a/docs/reference/how-to/general.asciidoc
+++ b/docs/reference/how-to/general.asciidoc
@@ -27,7 +27,7 @@ needs to fetch the `_id` of the document in all cases, and the cost of getting
 this field is bigger for large documents due to how the filesystem cache works.
 Indexing this document can use an amount of memory that is a multiplier of the
 original size of the document. Proximity search (phrase queries for instance)
-and <> also become more expensive
+and <> also become more expensive
 since their cost directly depends on the size of the original document. It is
 sometimes useful to reconsider what the unit of information should be.
diff --git a/docs/reference/mapping/fields/source-field.asciidoc b/docs/reference/mapping/fields/source-field.asciidoc
index ded8fc3a6b4..270d62076c1 100644
--- a/docs/reference/mapping/fields/source-field.asciidoc
+++ b/docs/reference/mapping/fields/source-field.asciidoc
@@ -35,7 +35,7 @@ available then a number of features are not supported:
 * The <>, <>, and <> APIs.
-* On the fly <>.
+* On the fly <>.
 * The ability to reindex from one Elasticsearch index to another, either to change mappings or analysis, or to upgrade an index to a new major
diff --git a/docs/reference/query-dsl/terms-query.asciidoc b/docs/reference/query-dsl/terms-query.asciidoc
index e4647f80df6..12e07378f1d 100644
--- a/docs/reference/query-dsl/terms-query.asciidoc
+++ b/docs/reference/query-dsl/terms-query.asciidoc
@@ -67,7 +67,7 @@ increases the relevance score.
 [[query-dsl-terms-query-highlighting]]
 ===== Highlighting `terms` queries
-<> is best-effort only. {es} may not
+<> is best-effort only. {es} may not
 return highlight results for `terms` queries depending on:
 * Highlighter type
diff --git a/docs/reference/redirects.asciidoc b/docs/reference/redirects.asciidoc
index 42995356324..7644d9dbe58 100644
--- a/docs/reference/redirects.asciidoc
+++ b/docs/reference/redirects.asciidoc
@@ -70,7 +70,7 @@ See <>.
 [role="exclude",id="search-request-highlighting"]
 === Highlight parameter for request body search API
-See <>.
+See <>.
 [role="exclude",id="search-request-index-boost"]
 === Index boost parameter for request body search API
@@ -926,4 +926,12 @@ See <>.
 ==== Doc value fields
 See <>.
+
+[role="exclude",id="request-body-search-highlighting"]
+==== Highlighting
+See <>.
+
+[role="exclude",id="highlighter-internal-work"]
+==== How highlighters work internally
+See <>.
 ////
diff --git a/docs/reference/search/request-body.asciidoc b/docs/reference/search/request-body.asciidoc
index 23bb98de343..3a9ab104cb9 100644
--- a/docs/reference/search/request-body.asciidoc
+++ b/docs/reference/search/request-body.asciidoc
@@ -112,7 +112,10 @@ include::request/docvalue-fields.asciidoc[]
 include::request/collapse.asciidoc[]
-include::request/highlighting.asciidoc[]
+[[request-body-search-highlighting]]
+==== Highlighting
+
+See <>.
 include::request/index-boost.asciidoc[]
diff --git a/docs/reference/search/request/highlighters-internal.asciidoc b/docs/reference/search/request/highlighters-internal.asciidoc
index 11534a01aa2..cad08c9ece4 100644
--- a/docs/reference/search/request/highlighters-internal.asciidoc
+++ b/docs/reference/search/request/highlighters-internal.asciidoc
@@ -1,5 +1,5 @@
-[[highlighter-internal-work]]
-==== How highlighters work internally
+[[how-highlighters-work-internally]]
+=== How highlighters work internally
 Given a query and a text (the content of a document field), the goal of a highlighter is to find the best text fragments for the query, and highlight
@@ -10,7 +10,7 @@ address several questions:
 - How to find the best fragments among all fragments?
 - How to highlight the query terms in a fragment?
-===== How to break a text into fragments?
+==== How to break a text into fragments?
 Relevant settings: `fragment_size`, `fragmenter`, `type` of highlighter, `boundary_chars`, `boundary_max_scan`, `boundary_scanner`, `boundary_scanner_locale`.
@@ -28,7 +28,7 @@ fragments by utilizing Java's `BreakIterator`.
 This ensures that a fragment is a valid sentence as long as `fragment_size` allows for this.
-===== How to find the best fragments?
+==== How to find the best fragments?
 Relevant settings: `number_of_fragments`.
 To find the best, most relevant, fragments, a highlighter needs to score
@@ -61,7 +61,7 @@ an in-memory index from the text.
 Unified highlighter uses the BM25 scoring mode to score fragments.
-===== How to highlight the query terms in a fragment?
+==== How to highlight the query terms in a fragment?
 Relevant settings: `pre-tags`, `post-tags`.
 The goal is to highlight only those terms that participated in generating the 'hit' on the document.
@@ -78,7 +78,7 @@ fragments in some raw form, and then populate them with actual text.
 A highlighter uses `pre-tags`, `post-tags` to encode highlighted terms.
-===== An example of the work of the unified highlighter
+==== An example of the work of the unified highlighter
 Let's look in more details how unified highlighter works.
diff --git a/docs/reference/search/request/highlighting.asciidoc b/docs/reference/search/request/highlighting.asciidoc
index 8ab97f0ecf2..8406a771217 100644
--- a/docs/reference/search/request/highlighting.asciidoc
+++ b/docs/reference/search/request/highlighting.asciidoc
@@ -1,5 +1,5 @@
-[[request-body-search-highlighting]]
-==== Highlighting
+[[highlighting]]
+=== Highlighting
 Highlighters enable you to get highlighted snippets from one or more fields in your search results so you can show users where the query matches are.
@@ -41,7 +41,7 @@ highlighter). You can specify the highlighter `type` you want to use for each field.
 [[unified-highlighter]]
-===== Unified highlighter
+==== Unified highlighter
 The `unified` highlighter uses the Lucene Unified Highlighter. This highlighter breaks the text into sentences and uses the BM25 algorithm to score individual sentences as if they were documents in the corpus. It also supports
@@ -49,7 +49,7 @@ accurate phrase and multi-term (fuzzy, prefix, regex) highlighting. This is the default highlighter.
 [[plain-highlighter]]
-===== Plain highlighter
+==== Plain highlighter
 The `plain` highlighter uses the standard Lucene highlighter. It attempts to reflect the query matching logic in terms of understanding word importance and any word positioning criteria in phrase queries.
@@ -64,7 +64,7 @@ If you want to highlight a lot of fields in a lot of documents with complex queries, we recommend using the `unified` highlighter on `postings` or `term_vector` fields.
 [[fast-vector-highlighter]]
-===== Fast vector highlighter
+==== Fast vector highlighter
 The `fvh` highlighter uses the Lucene Fast Vector highlighter. This highlighter can be used on fields with `term_vector` set to `with_positions_offsets` in the mapping. The fast vector highlighter:
@@ -83,7 +83,7 @@ The `fvh` highlighter does not support span queries. If you need support for span queries, try an alternative highlighter, such as the `unified` highlighter.
 [[offsets-strategy]]
-===== Offsets Strategy
+==== Offsets strategy
 To create meaningful search snippets from the terms being queried, the highlighter needs to know the start and end character offsets of each word in the original text. These offsets can be obtained from:
@@ -116,7 +116,7 @@ limited to 1000000. This default limit can be changed for a particular index with the index setting `index.highlight.max_analyzed_offset`.
 [[highlighting-settings]]
-===== Highlighting Settings
+==== Highlighting settings
 Highlighting settings can be set on a global level and overridden at the field level.
@@ -215,7 +215,7 @@ order:: Sorts highlighted fragments by score when set to `score`. By default,
 fragments will be output in the order they appear in the field (order: `none`).
 Setting this option to `score` will output the most relevant fragments first.
 Each highlighter applies its own logic to compute relevancy scores. See
-the document <>
+the document <>
 for more details how different highlighters find the best fragments.
 phrase_limit:: Controls the number of matching phrases in a document that are
@@ -254,7 +254,7 @@ type:: The highlighter to use: `unified`, `plain`, or `fvh`. Defaults to
 `unified`.
 [[highlighting-examples]]
-===== Highlighting Examples
+==== Highlighting examples
 * <>
 * <>
@@ -270,7 +270,7 @@ type:: The highlighter to use: `unified`, `plain`, or `fvh`. Defaults to
 [[override-global-settings]]
 [float]
-==== Override global settings
+=== Override global settings
 You can specify highlighter settings globally and selectively override them for individual fields.
@@ -298,7 +298,7 @@ GET /_search
 [float]
 [[specify-highlight-query]]
-==== Specify a highlight query
+=== Specify a highlight query
 You can specify a `highlight_query` to take additional information into account when highlighting. For example, the following query includes both the search
@@ -367,7 +367,7 @@ GET /_search
 [float]
 [[set-highlighter-type]]
-==== Set highlighter type
+=== Set highlighter type
 The `type` field allows to force a specific highlighter type. The allowed values are: `unified`, `plain` and `fvh`.
@@ -391,7 +391,7 @@ GET /_search
 [[configure-tags]]
 [float]
-==== Configure highlighting tags
+=== Configure highlighting tags
 By default, the highlighting will wrap highlighted text in `` and ``. This can be controlled by setting `pre_tags` and `post_tags`,
@@ -457,7 +457,7 @@ GET /_search
 [float]
 [[highlight-source]]
-==== Highlight on source
+=== Highlight on source
 Forces the highlighting to highlight fields based on the source even if fields are stored separately. Defaults to `false`.
@@ -481,7 +481,7 @@ GET /_search
 [[highlight-all]]
 [float]
-==== Highlight in all fields
+=== Highlight in all fields
 By default, only fields that contains a query match are highlighted. Set `require_field_match` to `false` to highlight all fields.
@@ -505,7 +505,7 @@ GET /_search
 [[matched-fields]]
 [float]
-==== Combine matches on multiple fields
+=== Combine matches on multiple fields
 WARNING: This is only supported by the `fvh` highlighter
@@ -639,7 +639,7 @@ to
 [[explicit-field-order]]
 [float]
-==== Explicitly order highlighted fields
+=== Explicitly order highlighted fields
 Elasticsearch highlights the fields in the order that they are sent, but per the JSON spec, objects are unordered. If you need to be explicit about the order in which fields are highlighted specify the `fields` as an array:
@@ -666,7 +666,7 @@ fields are highlighted but a plugin might.
 [float]
 [[control-highlighted-frags]]
-==== Control highlighted fragments
+=== Control highlighted fragments
 Each field highlighted can control the size of the highlighted fragment in characters (defaults to `100`), and the maximum number of fragments
@@ -763,7 +763,7 @@ GET /_search
 [float]
 [[highlight-postings-list]]
-==== Highlight using the postings list
+=== Highlight using the postings list
 Here is an example of setting the `comment` field in the index mapping to allow for highlighting using the postings:
@@ -803,7 +803,7 @@ PUT /example
 [float]
 [[specify-fragmenter]]
-==== Specify a fragmenter for the plain highlighter
+=== Specify a fragmenter for the plain highlighter
 When using the `plain` highlighter, you can choose between the `simple` and `span` fragmenters:
diff --git a/docs/reference/search/request/inner-hits.asciidoc b/docs/reference/search/request/inner-hits.asciidoc
index 21f2b06b46e..5f7f3bdc9ea 100644
--- a/docs/reference/search/request/inner-hits.asciidoc
+++ b/docs/reference/search/request/inner-hits.asciidoc
@@ -70,7 +70,7 @@ Inner hits support the following options:
 Inner hits also supports the following per document features:
-* <>
+* <>
 * <>
 * <>
 * <>
diff --git a/docs/reference/search/run-a-search.asciidoc b/docs/reference/search/run-a-search.asciidoc
index 21a4a248faf..e8f44ce810b 100644
--- a/docs/reference/search/run-a-search.asciidoc
+++ b/docs/reference/search/run-a-search.asciidoc
@@ -291,3 +291,5 @@ GET /*/_search
 include::request/from-size.asciidoc[]
 include::search-fields.asciidoc[]
+
+include::request/highlighting.asciidoc[]