[DOCS] Prune `Search your data` content (#61303) (#61462)

Changes:

* Removes narrative around URI searches. These aren't commonly used in production. The `q` param is already covered in the search API docs: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-search.html#search-api-query-params-q
* Adds a common options section that highlights narrative docs for query DSL, aggregations, multi-index search, search fields, pagination, sorting, and async search.
* Adds a `Search shard routing` page. Moves narrative docs for adaptive replica selection, preference, routing, and shard limits to that section.
* Moves search timeout and cancellation content to the `Search your data` page.
* Creates a `Search multiple data streams and indices` page. Moves related narrative docs for multi-target syntax searches and `indices_boost` to that page.
* Removes narrative examples for the `search_type` parameters. Moves documentation for this parameter to the search API docs.
2020-08-24 09:31:53 -04:00 · 2020-08-24 09:31:53 -04:00 · da89ff87bb
parent 0d8d0f423c
commit da89ff87bb
9 changed files with 522 additions and 464 deletions
--- a/docs/java-rest/high-level/document/multi-get.asciidoc
+++ b/docs/java-rest/high-level/document/multi-get.asciidoc
@@ -65,7 +65,7 @@ include-tagged::{doc-tests-file}[{api}-request-item-extras]
 <2> Version
 <3> Version type
-{ref}/search-your-data.html#search-preference[`preference`],
+{ref}/search-search.html#search-preference[`preference`],
 {ref}/docs-get.html#realtime[`realtime`]
 and
 {ref}/docs-get.html#get-refresh[`refresh`] can be set on the main request but
--- a/docs/reference/search.asciidoc
+++ b/docs/reference/search.asciidoc
@@ -1,156 +1,51 @@
 [[search]]
 == Search APIs
 Search APIs are used to search and aggregate data stored in {es} indices and
 data streams. For an overview and related tutorials, see <<search-your-data>>.
 Most search APIs support <<multi-index,multi-target syntax>>, with the
-exception of the <<search-explain>> endpoints.
+exception of the <<search-explain,explain API>>.
 [discrete]
-[[search-routing]]
+[[core-search-apis]]
-=== Routing
+=== Core search
-When executing a search, Elasticsearch will pick the "best" copy of the data
+* <<search-search>>
-based on the <<search-adaptive-replica,adaptive replica selection>> formula.
+* <<search-multi-search>>
-Which shards will be searched on can also be controlled by providing the
+* <<async-search>>
-`routing` parameter.
+* <<scroll-api>>
-
+* <<clear-scroll-api>>
-For example, the following indexing request routes documents to shard `1`:
+* <<search-suggesters>>
 [source,console]
 --------------------------------------------------
 POST /my-index-000001/_doc?routing=1
 {
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
 }
 --------------------------------------------------
 Later, you can use the `routing` parameter in a search request to search only
 the specified shard. The following search request hits only shard `1`.
 [source,console]
 --------------------------------------------------
 POST /my-index-000001/_search?routing=1
 {
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "some query string here"
        }
      },
      "filter": {
        "term": { "user.id": "kimchy" }
      }
    }
  }
 }
 --------------------------------------------------
 // TEST[continued]
 The `routing` parameter accepts multiple values as a comma-separated string.
 The search then hits only the shards that match those routing values.
 [discrete]
-[[search-adaptive-replica]]
+[[search-testing-apis]]
-=== Adaptive Replica Selection
+=== Search testing
-By default, Elasticsearch will use what is called adaptive replica selection.
+* <<search-explain>>
-This allows the coordinating node to send the request to the copy deemed "best"
+* <<search-field-caps>>
-based on a number of criteria:
+* <<search-profile>>
-
+* <<search-rank-eval>>
- Response time of past requests between the coordinating node and the node
+* <<search-shards>>
-  containing the copy of the data
+* <<search-validate>>
 - Time past search requests took to execute on the node containing the data
 - The queue size of the search threadpool on the node containing the data
 This can be turned off by changing the dynamic cluster setting
 `cluster.routing.use_adaptive_replica_selection` from `true` to `false`:
 [source,console]
 --------------------------------------------------
 PUT /_cluster/settings
 {
  "transient": {
    "cluster.routing.use_adaptive_replica_selection": false
  }
 }
 --------------------------------------------------
 If adaptive replica selection is turned off, searches are sent to the
 index/indices shards in a round robin fashion between all copies of the data
 (primaries and replicas).
 [discrete]
-[[stats-groups]]
+[[search-template-apis]]
-=== Stats Groups
+=== Search templates
-A search can be associated with stats groups, which maintains a
+* <<search-template>>
-statistics aggregation per group. It can later be retrieved using the
+* <<multi-search-template>>
 <<indices-stats,indices stats>> API
 specifically. For example, the following search request body associates
 the request with two different groups:
 [source,console]
 --------------------------------------------------
 POST /_search
 {
  "query" : {
    "match_all" : {}
  },
  "stats" : ["group1", "group2"]
 }
 --------------------------------------------------
 // TEST[setup:my_index]
 [discrete]
-[[global-search-timeout]]
+[[eql-search-apis]]
-=== Global Search Timeout
+=== EQL search
-Individual searches can have a timeout as part of the
+For an overview of EQL and related tutorials, see <<eql>>.
 <<search-request-body>>. Since search requests can originate from many
 sources, Elasticsearch has a dynamic cluster-level setting for a global
 search timeout that applies to all search requests that do not set a
 timeout in the request body. These requests will be cancelled after
 the specified time using the mechanism described in the following section on
 <<global-search-cancellation>>. Therefore the same caveats about timeout
 responsiveness apply.
-The setting key is `search.default_search_timeout` and can be set using the
+* <<eql-search-api>>
-<<cluster-update-settings>> endpoints. The default value is no global timeout.
+* <<get-async-eql-search-api>>
-Setting this value to `-1` resets the global search timeout to no timeout.
+* <<delete-async-eql-search-api>>
 [discrete]
 [[global-search-cancellation]]
 === Search Cancellation
 Searches can be cancelled using the standard <<task-cancellation,task cancellation>>
 mechanism and are also automatically cancelled when the HTTP connection used to
 perform the request is closed by the client. It is essential that the HTTP
 client sending requests closes connections whenever requests time out or are
 aborted.
 [discrete]
 [[search-concurrency-and-parallelism]]
 === Search concurrency and parallelism
 By default Elasticsearch doesn't reject any search requests based on the number
 of shards the request hits. While Elasticsearch will optimize the search
 execution on the coordinating node, a large number of shards can have a
 significant impact on CPU and memory usage. It is usually a better idea to organize
 data in such a way that there are fewer larger shards. In case you would like to
 configure a soft limit, you can update the `action.search.shard_count.limit`
 cluster setting in order to reject search requests that hit too many shards.
 The request parameter `max_concurrent_shard_requests` can be used to control the
 maximum number of concurrent shard requests the search API will execute per node
 for the request. This parameter should be used to protect a single request from
 overloading a cluster (e.g., a default request will hit all indices in a cluster
 which could cause shard request rejections if the number of shards per node is
 high). The default value is `5`.
 include::search/search.asciidoc[]
--- a/docs/reference/search/request/index-boost.asciidoc
+++ b/docs/reference/search/request/index-boost.asciidoc
@@ -1,37 +0,0 @@
 [discrete]
 [[index-boost]]
 === Index boost
 When searching multiple indices, you can use the `indices_boost` parameter to
 boost results from one or more specified indices. This is useful when hits
 coming from one index matter more than hits coming from another index.
 [source,console]
 --------------------------------------------------
 GET /_search
 {
  "indices_boost": [
    { "my-index-000001": 1.4 },
    { "my-index-000002": 1.3 }
  ]
 }
 --------------------------------------------------
 // TEST[s/^/PUT my-index-000001\nPUT my-index-000002\n/]
 You can also specify it as an array to control the order of boosts.
 [source,console]
 --------------------------------------------------
 GET /_search
 {
  "indices_boost": [
    { "my-alias":  1.4 },
    { "my-index*": 1.3 }
  ]
 }
 --------------------------------------------------
 // TEST[s/^/PUT my-index-000001\nPUT my-index-000001\/_alias\/my-alias\n/]
 This is important when you use aliases or wildcard expressions.
 If multiple matches are found, the first match will be used.
 For example, if an index is included in both `my-alias` and `my-index*`, a boost value of `1.4` is applied.
--- a/docs/reference/search/request/preference.asciidoc
+++ b/docs/reference/search/request/preference.asciidoc
@@ -1,80 +0,0 @@
 [discrete]
 [[search-preference]]
 === Preference
 You can use the `preference` parameter to control the shard copies on which a search runs. By
 default, Elasticsearch selects from the available shard copies in an
 unspecified order, taking the <<shard-allocation-awareness,allocation awareness>> and
 <<search-adaptive-replica,adaptive replica selection>> configuration into
 account. However, it may sometimes be desirable to try and route certain
 searches to certain sets of shard copies.
 A possible use case would be to make use of per-copy caches like the
 <<shard-request-cache,request cache>>. Doing this, however, runs contrary to the
 idea of search parallelization and can create hotspots on certain nodes because
 the load might not be evenly distributed anymore.
 The `preference` is a query string parameter which can be set to:
 [horizontal]
 `_only_local`::
 	The operation will be executed only on shards allocated to the local
 	node.
 `_local`::
 	The operation will be executed on shards allocated to the local node if
 	possible, and will fall back to other shards if not.
 `_prefer_nodes:abc,xyz`::
 	The operation will be executed on nodes with one of the provided node
 	ids (`abc` or `xyz` in this case) if possible. If suitable shard copies
 	exist on more than one of the selected nodes then the order of
 	preference between these copies is unspecified.
 `_shards:2,3`::
 	Restricts the operation to the specified shards. (`2` and `3` in this
 	case).  This preference can be combined with other preferences but it
 	has to appear first: `_shards:2,3|_local`
 `_only_nodes:abc*,x*yz,...`::
 	Restricts the operation to nodes specified according to the
 	<<cluster,node specification>>. If suitable shard copies exist on more
 	than one of the selected nodes then the order of preference between
 	these copies is unspecified.
 Custom (string) value::
 	Any value that does not start with `_`. If two searches both give the same
 	custom string value for their preference and the underlying cluster state
 	does not change then the same ordering of shards will be used for the
 	searches. This does not guarantee that the exact same shards will be used
 	each time: the cluster state, and therefore the selected shards, may change
 	for a number of reasons including shard relocations and shard failures, and
 	nodes may sometimes reject searches causing fallbacks to alternative nodes.
 	However, in practice the ordering of shards tends to remain stable for long
 	periods of time. A good candidate for a custom preference value is something
 	like the web session id or the user name.
 For instance, use the user's session ID `xyzabc123` as follows:
 [source,console]
 ------------------------------------------------
 GET /_search?preference=xyzabc123
 {
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
 }
 ------------------------------------------------
 This can be an effective strategy to increase usage of e.g. the request cache for
 unique users running similar searches repeatedly by always hitting the same cache, while
 requests of different users are still spread across all shard copies.
 NOTE: The `_only_local` preference guarantees only to use shard copies on the
 local node, which is sometimes useful for troubleshooting. All other options do
 not _fully_ guarantee that any particular shard copies are used in a search,
 and on a changing index this may mean that repeated searches may yield
 different results if they are executed on different shard copies which are in
 different refresh states.
--- a/docs/reference/search/request/search-type.asciidoc
+++ b/docs/reference/search/request/search-type.asciidoc
@@ -1,78 +0,0 @@
 [discrete]
 [[search-type]]
 === Search type
 There are different execution paths that can be done when executing a
 distributed search. The distributed search operation needs to be
 scattered to all the relevant shards and then all the results are
 gathered back. When doing scatter/gather type execution, there are
 several ways to do that, specifically with search engines.
 One of the questions when executing a distributed search is how many
 results to retrieve from each shard. For example, if we have 10 shards,
 the 1st shard might hold the most relevant results from 0 till 10, with
 other shards results ranking below it. For this reason, when executing a
 request, we will need to get results from 0 till 10 from all shards,
 sort them, and then return the results if we want to ensure correct
 results.
 Another question, which relates to the search engine, is the fact that each
 shard stands on its own. When a query is executed on a specific shard,
 it does not take into account term frequencies and other search engine
 information from the other shards. If we want to support accurate
 ranking, we would need to first gather the term frequencies from all
 shards to calculate global term frequencies, then execute the query on
 each shard using these global frequencies.
 Also, because of the need to sort the results, getting back a large
 document set, or even scrolling it, while maintaining the correct sorting
 behavior can be a very expensive operation. For large result set
 scrolling, it is best to sort by `_doc` if the order in which documents
 are returned is not important.
 Elasticsearch is very flexible and lets you control the type of search
 to execute on a *per search request* basis. The type can be configured
 by setting the *search_type* parameter in the query string. The types
 are:
 [discrete]
 [[query-then-fetch]]
 ==== Query Then Fetch
 Parameter value: *query_then_fetch*.
 The request is processed in two phases. In the first phase, the query
 is forwarded to *all involved shards*. Each shard executes the search
 and generates a sorted list of results, local to that shard. Each
 shard returns *just enough information* to the coordinating node
 to allow it to merge and re-sort the shard level results into a globally
 sorted set of results, of maximum length `size`. 
 During the second phase, the coordinating node requests the document
 content (and highlighted snippets, if any) from *only the relevant
 shards*.
 [source,console]
 --------------------------------------------------
 GET my-index-000001/_search?search_type=query_then_fetch
 --------------------------------------------------
 // TEST[setup:my_index]
 NOTE: This is the default setting, if you do not specify a `search_type`
      in your request.
 [discrete]
 [[dfs-query-then-fetch]]
 ==== Dfs, Query Then Fetch
 Parameter value: *dfs_query_then_fetch*.
 Same as "Query Then Fetch", except for an initial scatter phase which
 goes and computes the distributed term frequencies for more accurate
 scoring.
 [source,console]
 --------------------------------------------------
 GET my-index-000001/_search?search_type=dfs_query_then_fetch
 --------------------------------------------------
 // TEST[setup:my_index]
--- a/docs/reference/search/search-multiple-indices.asciidoc
+++ b/docs/reference/search/search-multiple-indices.asciidoc
@@ -0,0 +1,117 @@
 [[search-multiple-indices]]
 == Search multiple data streams and indices
 To search multiple data streams and indices, add them as comma-separated values
 in the <<search-search,search API>>'s request path.
 The following request searches the `my-index-000001` and `my-index-000002`
 indices.
 [source,console]
 ----
 GET /my-index-000001,my-index-000002/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 // TEST[s/^/PUT my-index-000002\n/]
 You can also search multiple data streams and indices using an index pattern.
 The following request targets the `my-index-*` index pattern. The request
 searches any data streams or indices in the cluster that start with `my-index-`.
 [source,console]
 ----
 GET /my-index-*/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 To search all data streams and indices in a cluster, omit the target from the
 request path. Alternatively, you can use `_all` or `*`.
 The following requests are equivalent and search all data streams and indices in
 the cluster.
 [source,console]
 ----
 GET /_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 GET /_all/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 GET /*/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 [discrete]
 [[index-boost]]
 === Index boost
 When searching multiple indices, you can use the `indices_boost` parameter to
 boost results from one or more specified indices. This is useful when hits
 coming from some indices matter more than hits from others.
 NOTE: You cannot use `indices_boost` with data streams.
 [source,console]
 --------------------------------------------------
 GET /_search
 {
  "indices_boost": [
    { "my-index-000001": 1.4 },
    { "my-index-000002": 1.3 }
  ]
 }
 --------------------------------------------------
 // TEST[s/^/PUT my-index-000001\nPUT my-index-000002\n/]
 Index aliases and index patterns can also be used:
 [source,console]
 --------------------------------------------------
 GET /_search
 {
  "indices_boost": [
    { "my-alias":  1.4 },
    { "my-index*": 1.3 }
  ]
 }
 --------------------------------------------------
 // TEST[s/^/PUT my-index-000001\nPUT my-index-000001\/_alias\/my-alias\n/]
 If multiple matches are found, the first match will be used. For example, if an
 index is included in `my-alias` and matches the `my-index*` pattern, a boost value
 of `1.4` is applied.
--- a/docs/reference/search/search-shard-routing.asciidoc
+++ b/docs/reference/search/search-shard-routing.asciidoc
@@ -0,0 +1,184 @@
 [[search-shard-routing]]
 == Search shard routing
 To protect against hardware failure and increase search capacity, {es} can store
 copies of an index's data across multiple shards on multiple nodes. When running
 a search request, {es} selects a node containing a copy of the index's data and
 forwards the search request to that node's shards. This process is known as
 _search shard routing_ or _routing_.
 [discrete]
 [[search-adaptive-replica]]
 === Adaptive replica selection
 By default, {es} uses _adaptive replica selection_ to route search requests.
 This method selects an eligible node using <<allocation-awareness,allocation
 awareness>> and the following criteria:
 * Response time of prior requests between the coordinating node
 and the eligible node
 * How long the eligible node took to run previous searches
 * Queue size of the eligible node's `search` <<modules-threadpool,threadpool>>
 Adaptive replica selection is designed to decrease search latency. However, you
 can disable adaptive replica selection by setting
 `cluster.routing.use_adaptive_replica_selection` to `false` using the
 <<cluster-update-settings,cluster settings API>>. If disabled, {es} routes
 search requests using a round-robin method, which may result in slower searches.
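 For example, a request along the following lines disables adaptive replica
 selection. This is only a sketch: the choice of a `transient` rather than a
 `persistent` setting is an assumption, so pick whichever suits your cluster.
 [source,console]
 ----
 PUT /_cluster/settings
 {
  "transient": {
    "cluster.routing.use_adaptive_replica_selection": false
  }
 }
 ----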
 [discrete]
 [[shard-and-node-preference]]
 === Set a preference
 By default, adaptive replica selection chooses from all eligible nodes and
 shards. However, you may only want data from a local node or want to route
 searches to a specific node based on its hardware. Or you may want to send
 repeated searches to the same shard to take advantage of caching.
 To limit the set of nodes and shards eligible for a search request, use
 the search API's <<search-preference,`preference`>> query parameter.
 For example, the following request searches `my-index-000001` with a
 `preference` of `_local`. This restricts the search to shards on the
 local node. If the local node contains no shard copies of the index's data, the
 request uses adaptive replica selection to route the search to another eligible
 node as a fallback.
 [source,console]
 ----
 GET /my-index-000001/_search?preference=_local
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 You can also use the `preference` parameter to route searches to specific shards
 based on a provided string. If the cluster state and selected shards
 do not change, searches using the same `preference` string are routed to the
 same shards in the same order.
 We recommend using a unique `preference` string, such as a user name or web
 session ID. This string cannot start with a `_`.
 TIP: You can use this option to serve cached results for frequently used and
 resource-intensive searches. If the shard's data doesn't change, repeated
 searches with the same `preference` string retrieve results from the same
 <<shard-request-cache,shard request cache>>. For time-series use cases, such as
 logging, data in older indices is rarely updated and can be served directly from
 this cache.
 The following request searches `my-index-000001` with a `preference` string of
 `my-custom-shard-string`.
 [source,console]
 ----
 GET /my-index-000001/_search?preference=my-custom-shard-string
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 NOTE: If the cluster state or selected shards change, the same `preference`
 string may not route searches to the same shards in the same order. This can
 occur for a number of reasons, including shard relocations and shard failures. A
 node can also reject a search request, which {es} would re-route to another
 node.
 [discrete]
 [[search-routing]]
 === Use a routing value
 When you index a document, you can specify an optional
 <<mapping-routing-field,routing value>>, which routes the document to a
 specific shard.
 For example, the following indexing request routes a document using
 `my-routing-value`.
 [source,console]
 ----
 POST /my-index-000001/_doc?routing=my-routing-value
 {
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
 }
 ----
 You can use the same routing value in the search API's `routing` query
 parameter. This ensures the search runs on the same shard used to index the
 document.
 [source,console]
 ----
 GET /my-index-000001/_search?routing=my-routing-value
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 You can also provide multiple comma-separated routing values:
 [source,console]
 ----
 GET /my-index-000001/_search?routing=my-routing-value,my-routing-value-2
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 [discrete]
 [[search-concurrency-and-parallelism]]
 === Search concurrency and parallelism
 By default, {es} doesn't reject search requests based on the number of shards
 the request hits. However, hitting a large number of shards can significantly
 increase CPU and memory usage.
 TIP: For tips on preventing indices with large numbers of shards, see
 <<avoid-oversharding>>.
 You can use the `max_concurrent_shard_requests` query parameter to control the
 maximum number of concurrent shard requests a search request can run per node.
 This prevents a single request from overloading a cluster. The parameter
 defaults to a maximum of `5`.
 [source,console]
 ----
 GET /my-index-000001/_search?max_concurrent_shard_requests=3
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 You can also use the `action.search.shard_count.limit` cluster setting to set a
 search shard limit and reject requests that hit too many shards. You can
 configure `action.search.shard_count.limit` using the
 <<cluster-update-settings,cluster settings API>>.
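 As a rough sketch, the following request sets such a limit. The value `1000`
 and the use of a `persistent` setting are illustrative assumptions, not
 recommendations.
 [source,console]
 ----
 PUT /_cluster/settings
 {
  "persistent": {
    "action.search.shard_count.limit": 1000
  }
 }
 ----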
--- a/docs/reference/search/search-your-data.asciidoc
+++ b/docs/reference/search/search-your-data.asciidoc
@@ -24,55 +24,30 @@ a specific number of results.
 [[run-an-es-search]]
 == Run a search
-You can use the <<search-search,search API>> to search data stored in
+You can use the <<search-search,search API>> to search and
-{es} data streams or indices.
+<<search-aggregations,aggregate>> data stored in {es} data streams or indices.
 The API's `query` request body parameter accepts queries written in
 <<query-dsl,Query DSL>>.
-The API can run two types of searches, depending on how you provide
+The following request searches `my-index-000001` using a
-queries:
+<<query-dsl-match-query,`match`>> query. This query matches documents with a
-
+`user.id` value of `kimchy`.
 <<run-uri-search,URI searches>>::
  Queries are provided through a query parameter. URI searches tend to be
  simpler and best suited for testing.
 <<run-request-body-search,Request body searches>>::
  Queries are provided through the JSON body of the API request. These queries
  are written in <<query-dsl,Query DSL>>. We recommend using request body
  searches in most production use cases.
 [WARNING]
 ====
 If you specify a query in both the URI and request body, the search API request
 runs only the URI query.
 ====
 [discrete]
 [[run-uri-search]]
 === Run a URI search
 You can use the search API's <<search-api-query-params-q,`q` query string
 parameter>> to run a search in the request's URI. The `q` parameter only accepts
 queries written in Lucene's <<query-string-syntax,query string syntax>>.
 The following URI search matches documents with a `user.id` value of `kimchy`.
 [source,console]
 ----
-GET /my-index-000001/_search?q=user.id:kimchy
+GET /my-index-000001/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
-The API returns the following response.
+The API response returns the top 10 documents matching the query in the
-
+`hits.hits` property.
 By default, the `hits.hits` property returns the top 10 documents matching the
 query. To retrieve more documents, see <<paginate-search-results>>.
 The response sorts documents in `hits.hits` by `_score`, a
 <<relevance-scores,relevance score>> that measures how well each document
 matches the query.
 The `hits.hits` property also includes the <<mapping-source-field,`_source`>> for
 each matching document. To retrieve only a subset of the `_source` or other
 fields, see <<search-fields>>.
 [source,console-result]
 ----
@@ -126,20 +101,84 @@ fields, see <<search-fields>>.
 // TESTRESPONSE[s/"_id": "kxWFcnMByiguvud1Z8vC"/"_id": "$body.hits.hits.0._id"/]
 [discrete]
-[[run-request-body-search]]
+[[common-search-options]]
-=== Run a request body search
+=== Common search options
-You can use the search API's <<request-body-search-query,`query` request
+You can use the following options to customize your searches.
 body parameter>> to provide a query as a JSON object, written in
 <<query-dsl,Query DSL>>.
-The following request body search uses the <<query-dsl-match-query,`match`>>
+*Query DSL* +
-query to match documents with a `user.id` value of `kimchy`.
+<<query-dsl,Query DSL>> supports a variety of query types you can mix and match
 to get the results you want. Query types include:
 * <<query-dsl-bool-query,Boolean>> and other <<compound-queries,compound
 queries>>, which let you combine queries and match results based on multiple
 criteria
 * <<term-level-queries,Term-level queries>> for filtering and finding exact matches
 * <<full-text-queries,Full text queries>>, which are commonly used in search
 engines
 * <<geo-queries,Geo>> and <<shape-queries,spatial queries>>
 *Aggregations* +
 You can use <<search-aggregations,search aggregations>> to get statistics and
 other analytics for your search results. Aggregations help you answer questions
 like:
 * What's the average response time for my servers?
 * What are the top IP addresses hit by users on my network?
 * What is the total transaction revenue by customer?
 *Search multiple data streams and indices* +
 You can use comma-separated values and grep-like index patterns to search
 several data streams and indices in the same request. You can even boost search
 results from specific indices. See <<search-multiple-indices>>.
 *Paginate search results* +
 By default, searches return only the top 10 matching hits. To retrieve
 more or fewer documents, see <<paginate-search-results>>.
 *Retrieve selected fields* +
 The search response's `hits.hits` property includes the full document
 <<mapping-source-field,`_source`>> for each hit. To retrieve only a subset of
 the `_source` or other fields, see <<search-fields>>.
 *Sort search results* +
 By default, search hits are sorted by `_score`, a <<relevance-scores,relevance
 score>> that measures how well each document matches the query. To customize the
 calculation of these scores, use the
 <<query-dsl-script-score-query,`script_score`>> query. To sort search hits by
 other field values, see <<sort-search-results>>.
 *Run an async search* +
 {es} searches are designed to run on large volumes of data quickly, often
 returning results in milliseconds. For this reason, searches are
 _synchronous_ by default. The search request waits for complete results before
 returning a response.
 However, complete results can take longer for searches across
 <<frozen-indices,frozen indices>> or <<modules-cross-cluster-search,multiple
 clusters>>.
 To avoid long waits, you can run an _asynchronous_, or _async_, search
 instead. An <<async-search-intro,async search>> lets you retrieve partial
 results for a long-running search now and get complete results later.
 [discrete]
 [[search-timeout]]
 === Search timeout
 By default, search requests don't time out. The request waits for complete
 results before returning a response.
 While <<async-search-intro,async search>> is designed for long-running
 searches, you can also use the `timeout` parameter to specify a duration you'd
 like to wait for a search to complete. If no response is received before this
 period ends, the request fails and returns an error.
 [source,console]
 ----
 GET /my-index-000001/_search
 {
  "timeout": "2s",
  "query": {
    "match": {
      "user.id": "kimchy"
@@ -149,88 +188,23 @@ GET /my-index-000001/_search
 ----
 // TEST[setup:my_index]
 To set a cluster-wide default timeout for all search requests, configure
 `search.default_search_timeout` using the <<cluster-update-settings,cluster
 settings API>>. This global timeout duration is used if no `timeout` argument is
 passed in the request. If the global search timeout expires before the search
 request finishes, the request is cancelled using <<task-cancellation,task
 cancellation>>. The `search.default_search_timeout` setting defaults to `-1` (no
 timeout).
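 For illustration, a request like the following sets a global search timeout.
 The `30s` duration and the `persistent` scope are assumed values chosen for
 this sketch only.
 [source,console]
 ----
 PUT /_cluster/settings
 {
  "persistent": {
    "search.default_search_timeout": "30s"
  }
 }
 ----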
 [discrete]
-[[search-multiple-indices]]
+[[global-search-cancellation]]
-=== Search multiple data streams and indices
+=== Search cancellation
-To search multiple data streams and indices, add them as comma-separated values
+You can cancel a search request using the <<task-cancellation,task management
-in the search API request path.
+API>>. {es} also automatically cancels a search request when your client's HTTP
 connection closes. We recommend you set up your client to close HTTP connections
 when a search request is aborted or times out.
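 As a sketch of manual cancellation, you can first list running search tasks
 and then cancel one of them by ID. `<task-id>` is a placeholder for a task ID
 taken from the first response.
 [source,console]
 ----
 GET /_tasks?actions=*search&detailed
 POST /_tasks/<task-id>/_cancel
 ----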
 The following request searches the `my-index-000001` and `my-index-000002`
 indices.
 [source,console]
 ----
 GET /my-index-000001,my-index-000002/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 // TEST[s/^/PUT my-index-000002\n/]
 You can also search multiple data streams and indices using a wildcard (`*`)
 pattern.
 The following request targets the wildcard pattern `my-index-*`. The request
 searches any data streams or indices in the cluster that start with `my-index-`.
 [source,console]
 ----
 GET /my-index-*/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 To search all data streams and indices in a cluster, omit the target from the
 request path. Alternatively, you can use `_all` or `*`.
 The following requests are equivalent and search all data streams and indices in the cluster.
 [source,console]
 ----
 GET /_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 GET /_all/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 GET /*/_search
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----
 // TEST[setup:my_index]
 include::request/index-boost.asciidoc[]
 include::request/preference.asciidoc[]
 include::request/search-type.asciidoc[]
 include::request/track-total-hits.asciidoc[]
 include::quickly-check-for-matching-docs.asciidoc[]
@ -243,4 +217,6 @@ include::paginate-search-results.asciidoc[]
 include::request/inner-hits.asciidoc[]
 include::search-fields.asciidoc[]
 include::{es-repo-dir}/modules/cross-cluster-search.asciidoc[]
 include::search-multiple-indices.asciidoc[]
 include::search-shard-routing.asciidoc[]
 include::request/sort.asciidoc[]
--- a/docs/reference/search/search.asciidoc
+++ b/docs/reference/search/search.asciidoc
@ -129,9 +129,44 @@ When unspecified, the pre-filter phase is executed if any of these conditions is
  - The request targets one or more read-only indices.
  - The primary sort of the query targets an indexed field.
 [[search-preference]]
 `preference`::
-(Optional, string) Specifies the node or shard the operation should be
+(Optional, string)
-performed on. Random by default.
+Nodes and shards used for the search. By default, {es} selects from eligible
 nodes and shards using <<search-adaptive-replica,adaptive replica selection>>,
 accounting for <<shard-allocation-awareness,allocation awareness>>.
 +
 .Valid values for `preference`
 [%collapsible%open]
 ====
 `_only_local`::
 Run the search only on shards on the local node.
 `_local`::
 If possible, run the search on shards on the local node. If not, select shards
 using the default method.
 `_only_nodes:<node-id>,<node-id>`::
 Run the search on only the specified node IDs. If suitable shards exist on more
 than one selected node, use shards on those nodes using the default method. If
 none of the specified nodes are available, select shards from any available node
 using the default method.
 `_prefer_nodes:<node-id>,<node-id>`::
 If possible, run the search on the specified node IDs. If not, select shards
 using the default method.
 `_shards:<shard>,<shard>`::
 Run the search only on the specified shards. This value can be combined with
 other `preference` values, but this value must come first. For example:
 `_shards:2,3|_local`
 `<custom-string>`::
 Any string that does not start with `_`. If the cluster state and selected
 shards do not change, searches using the same `<custom-string>` value are routed
 to the same shards in the same order.
 ====
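 +
 For example, the following sketch routes a search using a custom string. The
 value `my-custom-shard-string` is a hypothetical name chosen for illustration.
 Any string that does not start with `_` behaves the same way.
 +
 [source,console]
 ----
 GET /my-index-000001/_search?preference=my-custom-shard-string
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----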
 [[search-api-query-params-q]]
 include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=search-q]
@ -164,7 +199,28 @@ Period to retain the <<scroll-search-context,search context>> for scrolling. See
 By default, this value cannot exceed `1d` (24 hours). You can change
 this limit using the `search.max_keep_alive` cluster-level setting.
-include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=search_type]
+[[search-type]]
 `search_type`::
 (Optional, string)
 How {wikipedia}/Tf–idf[distributed term frequencies] are calculated for
 <<relevance-scores,relevance scoring>>.
 +
 .Valid values for `search_type`
 [%collapsible%open]
 ====
 `query_then_fetch`::
 (Default)
 Distributed term frequencies are calculated locally for each shard running the
 search. We recommend this option for faster searches with potentially less
 accurate scoring.
 [[dfs-query-then-fetch]]
 `dfs_query_then_fetch`::
 Distributed term frequencies are calculated globally, using information gathered
 from all shards running the search. While this option increases the accuracy of
 scoring, it adds a round-trip to each shard, which can result in slower
 searches.
 ====
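 +
 For example, the following sketch requests globally calculated term
 frequencies for a single search. The index name and query are illustrative.
 +
 [source,console]
 ----
 GET /my-index-000001/_search?search_type=dfs_query_then_fetch
 {
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
 }
 ----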
 `seq_no_primary_term`::
 (Optional, boolean) If `true`, returns sequence number and primary term of the
@ -284,7 +340,7 @@ You can specify items in the array as a string or object.
 See <<docvalue-fields>>.
 +
 .Properties of `docvalue_fields` objects
-[%collapsible]
+[%collapsible%open]
 ====
 `field`::
 (Required, string)
@ -326,6 +382,24 @@ As an alternative to deep paging, we recommend using
 <<search-after,`search_after`>> parameter.
 --
 `indices_boost`::
 (Optional, array of objects)
 Boosts the <<relevance-scores,`_score`>> of documents from specified indices.
 +
 .Properties of `indices_boost` objects
 [%collapsible%open]
 ====
 `<index>: <boost-value>`::
 (Required, float)
 `<index>` is the name of the index or index alias. Wildcard (`*`) expressions
 are supported.
 +
 `<boost-value>` is the factor by which scores are multiplied.
 +
 A boost value greater than `1.0` increases the score. A boost value between
 `0` and `1.0` decreases the score.
 ====
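 +
 For example, the following sketch boosts scores for documents from two
 indices. The index names and the boost values `1.4` and `1.3` are assumptions
 chosen for illustration.
 +
 [source,console]
 ----
 GET /_search
 {
  "indices_boost": [
    { "my-index-000001": 1.4 },
    { "my-index-000002": 1.3 }
  ]
 }
 ----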
 [[search-api-min-score]]
 `min_score`::
 (Optional, float)
@ -409,6 +483,13 @@ exclude fields from this subset using the `excludes` property.
 =====
 ====
 [[stats-groups]]
 `stats`::
 (Optional, array of strings)
 Stats groups to associate with the search. Each group maintains a statistics
 aggregation for its associated searches. You can retrieve these stats using the
 <<indices-stats,indices stats API>>.
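 +
 For example, the following sketch associates a search with two stats groups
 and then retrieves search statistics for those groups. The group names
 `group1` and `group2` are hypothetical.
 +
 [source,console]
 ----
 GET /my-index-000001/_search
 {
  "query": {
    "match_all": {}
  },
  "stats": [ "group1", "group2" ]
 }
 GET /my-index-000001/_stats/search?groups=group1,group2
 ----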
 include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=terminate_after]
 +
 Defaults to `0`, which does not terminate query execution early.