[DOCS] Merge search topic and overview pages (#60459) (#60479)

James Rodewig 2020-07-30 16:45:18 -04:00 committed by GitHub
parent 134b69d3aa
commit 0022d316bb
GPG Key ID: 4AEE18F83AFDEB23
9 changed files with 123 additions and 103 deletions

@@ -86,7 +86,7 @@ GET my-index-000001/_search
{ref}/search-request-body.html#request-body-search-script-fields[script field]
to return the `_size` field in the search response.
<5> Uses a
{ref}/run-a-search.html#docvalue-fields[doc value
{ref}/search-your-data.html#docvalue-fields[doc value
field] to return the `_size` field in the search response. Doc value fields are
useful if
{ref}/modules-scripting-security.html#allowed-script-types-setting[inline

@@ -24,7 +24,7 @@ include::indices/index-templates.asciidoc[]
include::data-streams/data-streams.asciidoc[]
include::search/index.asciidoc[]
include::search/search-your-data.asciidoc[]
include::query-dsl.asciidoc[]

@@ -914,6 +914,16 @@ The `xpack.sql.enabled` setting has been deprecated. SQL access is always enable
See <<index-templates>>.
[role="exclude",id="run-a-search"]
=== Run a search
See <<run-an-es-search>>.
[role="exclude",id="how-highlighters-work-internally"]
=== How highlighters work internally
See <<how-es-highlighters-work-internally>>.
////
[role="exclude",id="search-request-body"]
=== Request body search
@@ -941,14 +951,17 @@ See <<paginate-search-results>>.
[role="exclude",id="request-body-search-highlighting"]
==== Highlighting
See <<highlighting>>.
[role="exclude",id="highlighter-internal-work"]
==== How highlighters work internally
See <<how-highlighters-work-internally>>.
See <<how-es-highlighters-work-internally>>.
[role="exclude",id="request-body-search-sort"]
==== Sort
See <<sort-search-results>>.
[role="exclude",id="request-body-search-source-filtering"]

@@ -1,41 +0,0 @@
[[search-your-data]]
= Search your data
[partintro]
--
[[search-query]]
A _search query_, or _query_, is a request for information about data in
{es} data streams or indices.
You can think of a query as a question, written in a way {es} understands.
Depending on your data, you can use a query to get answers to questions like:
* What pages on my website contain a specific word or phrase?
* What processes on my server take longer than 500 milliseconds to respond?
* What users on my network ran `regsvr32.exe` within the last week?
* How many of my products have a price greater than $20?
A _search_ consists of one or more queries that are combined and sent to {es}.
Documents that match a search's queries are returned in the _hits_, or
_search results_, of the response.
A search may also contain additional information used to better process its
queries. For example, a search may be limited to a specific index or only return
a specific number of results.
[discrete]
[[search-toc]]
=== In this section
* <<run-a-search>>
* <<near-real-time>>
* <<modules-cross-cluster-search>>
* <<async-search-intro>>
--
include::run-a-search.asciidoc[]
include::{es-repo-dir}/search/near-real-time.asciidoc[]
include::{es-repo-dir}/async-search.asciidoc[]
include::{es-repo-dir}/modules/cross-cluster-search.asciidoc[]

@@ -1,5 +1,5 @@
[[collapse-search-results]]
=== Collapse search results
== Collapse search results
You can use the `collapse` parameter to collapse search results based
on field values. The collapsing is done by selecting only the top sorted
@@ -35,8 +35,9 @@ The field used for collapsing must be a single valued <<keyword, `keyword`>> or
NOTE: The collapsing is applied to the top hits only and does not affect aggregations.
[discrete]
[[expand-collapse-results]]
==== Expand collapse results
=== Expand collapse results
It is also possible to expand each collapsed top hits with the `inner_hits` option.
@@ -118,8 +119,9 @@ The default is based on the number of data nodes and the default search thread p
WARNING: `collapse` cannot be used in conjunction with <<request-body-search-scroll, scroll>>,
<<request-body-search-rescore, rescore>> or <<request-body-search-search-after, search after>>.
[discrete]
[[second-level-of-collapsing]]
==== Second level of collapsing
=== Second level of collapsing
Second level of collapsing is also supported and is applied to `inner_hits`.
For example, the following request finds the top scored tweets for

@@ -1,5 +1,6 @@
[[how-highlighters-work-internally]]
=== How highlighters work internally
[discrete]
[[how-es-highlighters-work-internally]]
== How highlighters work internally
Given a query and a text (the content of a document field), the goal of a
highlighter is to find the best text fragments for the query, and highlight
@@ -10,7 +11,8 @@ address several questions:
- How to find the best fragments among all fragments?
- How to highlight the query terms in a fragment?
==== How to break a text into fragments?
[discrete]
=== How to break a text into fragments?
Relevant settings: `fragment_size`, `fragmenter`, `type` of highlighter,
`boundary_chars`, `boundary_max_scan`, `boundary_scanner`, `boundary_scanner_locale`.
@@ -27,8 +29,8 @@ Unified or FVH highlighters do a better job of breaking up a text into
fragments by utilizing Java's `BreakIterator`. This ensures that a fragment
is a valid sentence as long as `fragment_size` allows for this.
==== How to find the best fragments?
[discrete]
=== How to find the best fragments?
Relevant settings: `number_of_fragments`.
To find the best, most relevant, fragments, a highlighter needs to score
@@ -60,8 +62,8 @@ if they are available. Otherwise, similar to Plain Highlighter, it has to create
an in-memory index from the text. Unified highlighter uses the BM25 scoring model
to score fragments.
==== How to highlight the query terms in a fragment?
[discrete]
=== How to highlight the query terms in a fragment?
Relevant settings: `pre-tags`, `post-tags`.
The goal is to highlight only those terms that participated in generating the 'hit' on the document.
@@ -77,8 +79,8 @@ fragments in some raw form, and then populate them with actual text.
A highlighter uses `pre-tags`, `post-tags` to encode highlighted terms.
==== An example of the work of the unified highlighter
[discrete]
=== An example of the work of the unified highlighter
Let's look in more details how unified highlighter works.

@@ -1,5 +1,5 @@
[[highlighting]]
=== Highlighting
== Highlighting
Highlighters enable you to get highlighted snippets from one or more fields
in your search results so you can show users where the query matches are.
@@ -40,16 +40,18 @@ GET /_search
highlighter). You can specify the highlighter `type` you want to use
for each field.
[discrete]
[[unified-highlighter]]
==== Unified highlighter
=== Unified highlighter
The `unified` highlighter uses the Lucene Unified Highlighter. This
highlighter breaks the text into sentences and uses the BM25 algorithm to score
individual sentences as if they were documents in the corpus. It also supports
accurate phrase and multi-term (fuzzy, prefix, regex) highlighting. This is the
default highlighter.
[discrete]
[[plain-highlighter]]
==== Plain highlighter
=== Plain highlighter
The `plain` highlighter uses the standard Lucene highlighter. It attempts to
reflect the query matching logic in terms of understanding word importance and
any word positioning criteria in phrase queries.
@@ -63,8 +65,9 @@ This is repeated for every field and every document that needs to be highlighted
If you want to highlight a lot of fields in a lot of documents with complex
queries, we recommend using the `unified` highlighter on `postings` or `term_vector` fields.
[discrete]
[[fast-vector-highlighter]]
==== Fast vector highlighter
=== Fast vector highlighter
The `fvh` highlighter uses the Lucene Fast Vector highlighter.
This highlighter can be used on fields with `term_vector` set to
`with_positions_offsets` in the mapping. The fast vector highlighter:
@@ -82,8 +85,9 @@ This highlighter can be used on fields with `term_vector` set to
The `fvh` highlighter does not support span queries. If you need support for
span queries, try an alternative highlighter, such as the `unified` highlighter.
[discrete]
[[offsets-strategy]]
==== Offsets strategy
=== Offsets strategy
To create meaningful search snippets from the terms being queried,
the highlighter needs to know the start and end character offsets of each word
in the original text. These offsets can be obtained from:
@@ -115,8 +119,9 @@ To protect against this, the maximum number of text characters that will be anal
limited to 1000000. This default limit can be changed
for a particular index with the index setting `index.highlight.max_analyzed_offset`.
[discrete]
[[highlighting-settings]]
==== Highlighting settings
=== Highlighting settings
Highlighting settings can be set on a global level and overridden at
the field level.
@@ -215,7 +220,7 @@ order:: Sorts highlighted fragments by score when set to `score`. By default,
fragments will be output in the order they appear in the field (order: `none`).
Setting this option to `score` will output the most relevant fragments first.
Each highlighter applies its own logic to compute relevancy scores. See
the document <<how-highlighters-work-internally, How highlighters work internally>>
the document <<how-es-highlighters-work-internally, How highlighters work internally>>
for more details how different highlighters find the best fragments.
phrase_limit:: Controls the number of matching phrases in a document that are
@@ -253,8 +258,9 @@ schema defines the following `pre_tags` and defines `post_tags` as
type:: The highlighter to use: `unified`, `plain`, or `fvh`. Defaults to
`unified`.
[discrete]
[[highlighting-examples]]
==== Highlighting examples
=== Highlighting examples
* <<override-global-settings, Override global settings>>
* <<specify-highlight-query, Specify a highlight query>>
@@ -270,7 +276,7 @@ type:: The highlighter to use: `unified`, `plain`, or `fvh`. Defaults to
[[override-global-settings]]
[discrete]
=== Override global settings
== Override global settings
You can specify highlighter settings globally and selectively override them for
individual fields.
@@ -298,7 +304,7 @@ GET /_search
[discrete]
[[specify-highlight-query]]
=== Specify a highlight query
== Specify a highlight query
You can specify a `highlight_query` to take additional information into account
when highlighting. For example, the following query includes both the search
@@ -367,7 +373,7 @@ GET /_search
[discrete]
[[set-highlighter-type]]
=== Set highlighter type
== Set highlighter type
The `type` field allows to force a specific highlighter type.
The allowed values are: `unified`, `plain` and `fvh`.
@@ -391,7 +397,7 @@ GET /_search
[[configure-tags]]
[discrete]
=== Configure highlighting tags
== Configure highlighting tags
By default, the highlighting will wrap highlighted text in `<em>` and
`</em>`. This can be controlled by setting `pre_tags` and `post_tags`,
@@ -457,7 +463,7 @@ GET /_search
[discrete]
[[highlight-source]]
=== Highlight on source
== Highlight on source
Forces the highlighting to highlight fields based on the source even if fields
are stored separately. Defaults to `false`.
@@ -481,7 +487,7 @@ GET /_search
[[highlight-all]]
[discrete]
=== Highlight in all fields
== Highlight in all fields
By default, only fields that contains a query match are highlighted. Set
`require_field_match` to `false` to highlight all fields.
@@ -505,7 +511,7 @@ GET /_search
[[matched-fields]]
[discrete]
=== Combine matches on multiple fields
== Combine matches on multiple fields
WARNING: This is only supported by the `fvh` highlighter
@@ -639,7 +645,7 @@ to
[[explicit-field-order]]
[discrete]
=== Explicitly order highlighted fields
== Explicitly order highlighted fields
Elasticsearch highlights the fields in the order that they are sent, but per the
JSON spec, objects are unordered. If you need to be explicit about the order
in which fields are highlighted specify the `fields` as an array:
@@ -666,7 +672,7 @@ fields are highlighted but a plugin might.
[discrete]
[[control-highlighted-frags]]
=== Control highlighted fragments
== Control highlighted fragments
Each field highlighted can control the size of the highlighted fragment
in characters (defaults to `100`), and the maximum number of fragments
@@ -763,7 +769,7 @@ GET /_search
[discrete]
[[highlight-postings-list]]
=== Highlight using the postings list
== Highlight using the postings list
Here is an example of setting the `comment` field in the index mapping to
allow for highlighting using the postings:
@@ -803,7 +809,7 @@ PUT /example
[discrete]
[[specify-fragmenter]]
=== Specify a fragmenter for the plain highlighter
== Specify a fragmenter for the plain highlighter
When using the `plain` highlighter, you can choose between the `simple` and
`span` fragmenters:

@@ -1,5 +1,5 @@
[[sort-search-results]]
=== Sort search results
== Sort search results
Allows you to add one or more sorts on specific fields. Each sort can be
reversed as well. The sort is defined on a per field level, with special
@@ -48,12 +48,14 @@ NOTE: `_doc` has no real use-case besides being the most efficient sort order.
So if you don't care about the order in which documents are returned, then you
should sort by `_doc`. This especially helps when <<request-body-search-scroll,scrolling>>.
==== Sort Values
[discrete]
=== Sort Values
The sort values for each document returned are also returned as part of
the response.
==== Sort Order
[discrete]
=== Sort Order
The `order` option can have the following values:
@@ -64,7 +66,8 @@ The `order` option can have the following values:
The order defaults to `desc` when sorting on the `_score`, and defaults
to `asc` when sorting on anything else.
==== Sort mode option
[discrete]
=== Sort mode option
Elasticsearch supports sorting by array or multi-valued fields. The `mode` option
controls what array value is picked for sorting the document it belongs
@@ -84,7 +87,8 @@ The default sort mode in the ascending sort order is `min` -- the lowest value
is picked. The default sort mode in the descending order is `max` --
the highest value is picked.
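As a rough illustration of the defaults described above, the following Python sketch picks the value used to sort a document whose field holds several values. The helper name `pick_sort_value` and the sample documents are hypothetical, and the code only mimics the documented behavior for the `min`, `max`, and `avg` modes; it is not how Elasticsearch implements sorting.

```python
from statistics import mean

# Hypothetical helper: mirrors the documented defaults, not Elasticsearch internals.
def pick_sort_value(values, order="asc", mode=None):
    """Return the single value used to sort a document with a multi-valued field."""
    if mode is None:
        # Documented defaults: `min` for ascending order, `max` for descending order.
        mode = "min" if order == "asc" else "max"
    pickers = {"min": min, "max": max, "avg": mean}
    return pickers[mode](values)

# Made-up documents, each with one multi-valued numeric field.
docs = {"a": [20, 4], "b": [10, 11], "c": [7]}

ascending = sorted(docs, key=lambda d: pick_sort_value(docs[d], "asc"))
descending = sorted(docs, key=lambda d: pick_sort_value(docs[d], "desc"), reverse=True)
```

Document `a` comes first in both orders here, because its lowest value (4) wins the ascending sort and its highest value (20) wins the descending sort.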
===== Sort mode example usage
[discrete]
==== Sort mode example usage
In the example below the field price has multiple prices per document.
In this case the result hits will be sorted by price ascending based on
@@ -109,7 +113,8 @@ POST /_search
}
--------------------------------------------------
==== Sorting numeric fields
[discrete]
=== Sorting numeric fields
For numeric fields it is also possible to cast the values from one type
to another using the `numeric_type` option.
@@ -226,8 +231,9 @@ POST /index_long,index_double/_search
To avoid overflow, the conversion to `date_nanos` cannot be applied on dates before
1970 and after 2262 as nanoseconds are represented as longs.
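The 2262 upper bound follows from the size of a signed 64-bit long. The Python sketch below works through that arithmetic, assuming nanoseconds are counted from the Unix epoch (1970-01-01 UTC); the constant names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_LONG = 2**63 - 1  # largest value of a signed 64-bit long

# Python's datetime has microsecond resolution, so the last three
# nanosecond digits are dropped before adding to the epoch.
latest = EPOCH + timedelta(microseconds=MAX_LONG // 1000)
print(latest.isoformat())
```

The printed timestamp falls in April 2262, which is why later dates cannot be represented as a nanosecond count in a long.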
[discrete]
[[nested-sorting]]
==== Sorting within nested objects.
=== Sorting within nested objects.
Elasticsearch also supports sorting by
fields that are inside one or more nested objects. The sorting by nested
@@ -259,7 +265,8 @@ favor of the options documented above.
============================================
===== Nested sorting examples
[discrete]
==== Nested sorting examples
In the below example `offer` is a field of type `nested`.
The nested `path` needs to be specified; otherwise, Elasticsearch doesn't know on what nested level sort values need to be captured.
@@ -337,7 +344,8 @@ POST /_search
Nested sorting is also supported when sorting by
scripts and sorting by geo distance.
==== Missing Values
[discrete]
=== Missing Values
The `missing` parameter specifies how docs which are missing
the sort field should be treated: The `missing` value can be
@@ -363,7 +371,8 @@ GET /_search
NOTE: If a nested inner object doesn't match with
the `nested_filter` then a missing value is used.
==== Ignoring Unmapped Fields
[discrete]
=== Ignoring Unmapped Fields
By default, the search request will fail if there is no mapping
associated with a field. The `unmapped_type` option allows you to ignore
@@ -388,8 +397,9 @@ If any of the indices that are queried doesn't have a mapping for `price`
then Elasticsearch will handle it as if there was a mapping of type
`long`, with all documents in this index having no value for this field.
[discrete]
[[geo-sorting]]
==== Geo Distance Sorting
=== Geo Distance Sorting
Allow to sort by `_geo_distance`. Here is an example, assuming `pin.location` is a field of type `geo_point`:
@@ -444,7 +454,8 @@ have values for the field that is used for distance computation.
The following formats are supported in providing the coordinates:
===== Lat Lon as Properties
[discrete]
==== Lat Lon as Properties
[source,console]
--------------------------------------------------
@@ -468,7 +479,8 @@ GET /_search
}
--------------------------------------------------
===== Lat Lon as String
[discrete]
==== Lat Lon as String
Format in `lat,lon`.
@@ -491,7 +503,8 @@ GET /_search
}
--------------------------------------------------
===== Geohash
[discrete]
==== Geohash
[source,console]
--------------------------------------------------
@@ -512,7 +525,8 @@ GET /_search
}
--------------------------------------------------
===== Lat Lon as Array
[discrete]
==== Lat Lon as Array
Format in `[lon, lat]`, note, the order of lon/lat here in order to
conform with http://geojson.org/[GeoJSON].
@@ -536,8 +550,8 @@ GET /_search
}
--------------------------------------------------
==== Multiple reference points
[discrete]
=== Multiple reference points
Multiple geo points can be passed as an array containing any `geo_point` format, for example
@@ -565,8 +579,8 @@ and so forth.
The final distance for a document will then be `min`/`max`/`avg` (defined via `mode`) distance of all points contained in the document to all points given in the sort request.
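The combination rule in the sentence above can be sketched in Python as follows. This is only an illustration: the haversine formula, the Earth radius constant, and the function names are assumptions made for this sketch, not the distance computation Elasticsearch uses internally.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import mean

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def sort_distance(doc_points, query_points, mode="min"):
    """Reduce every document-point to query-point distance to one sort value."""
    distances = [haversine_km(d, q) for d in doc_points for q in query_points]
    return {"min": min, "max": max, "avg": mean}[mode](distances)

# Made-up coordinates: a document with two points, a request with two points.
doc_points = [(40.0, -70.0), (41.0, -71.0)]
query_points = [(40.0, -70.0), (42.0, -71.0)]

closest = sort_distance(doc_points, query_points, "min")
average = sort_distance(doc_points, query_points, "avg")
farthest = sort_distance(doc_points, query_points, "max")
```

With two document points and two request points, four distances are computed and `mode` decides which single number represents the document in the sort.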
==== Script Based Sorting
[discrete]
=== Script Based Sorting
Allow to sort based on custom scripts, here is an example:
@@ -593,8 +607,8 @@ GET /_search
}
--------------------------------------------------
==== Track Scores
[discrete]
=== Track Scores
When sorting on a field, scores are not computed. By setting
`track_scores` to true, scores will still be computed and tracked.
@@ -615,7 +629,8 @@ GET /_search
}
--------------------------------------------------
==== Memory Considerations
[discrete]
=== Memory Considerations
When sorting, the relevant sorted field values are loaded into memory.
This means that per shard, there should be enough memory to contain

@@ -1,11 +1,35 @@
[[run-a-search]]
[[search-your-data]]
= Search your data
[[search-query]]
A _search query_, or _query_, is a request for information about data in
{es} data streams or indices.
You can think of a query as a question, written in a way {es} understands.
Depending on your data, you can use a query to get answers to questions like:
* What processes on my server take longer than 500 milliseconds to respond?
* What users on my network ran `regsvr32.exe` within the last week?
* How many of my products have a price greater than $20?
* What pages on my website contain a specific word or phrase?
A _search_ consists of one or more queries that are combined and sent to {es}.
Documents that match a search's queries are returned in the _hits_, or
_search results_, of the response.
A search may also contain additional information used to better process its
queries. For example, a search may be limited to a specific index or only return
a specific number of results.
[discrete]
[[run-an-es-search]]
== Run a search
You can use the <<search-search,search API>> to search data stored in
{es} data streams or indices.
The API can run two types of searches, depending on how you provide
<<search-query,queries>>:
queries:
<<run-uri-search,URI searches>>::
Queries are provided through a query parameter. URI searches tend to be
@@ -269,11 +293,10 @@ GET /*/_search
----
include::request/from-size.asciidoc[]
include::search-fields.asciidoc[]
include::request/collapse.asciidoc[]
include::request/highlighting.asciidoc[]
include::request/sort.asciidoc[]
include::{es-repo-dir}/async-search.asciidoc[]
include::{es-repo-dir}/modules/cross-cluster-search.asciidoc[]
include::{es-repo-dir}/search/near-real-time.asciidoc[]