OpenSearch/docs/reference/search/request-body.asciidoc

146 lines
3.9 KiB
Plaintext
Raw Normal View History

[[search-request-body]]
=== Request Body Search
Specifies search criteria as request body parameters.
[source,console]
--------------------------------------------------
GET /twitter/_search
{
"query" : {
"term" : { "user" : "kimchy" }
}
}
--------------------------------------------------
// TEST[setup:twitter]
[[search-request-body-api-request]]
==== {api-request-title}
`GET /<target>/_search
{
"query": {<parameters>}
}`
[[search-request-body-api-desc]]
==== {api-description-title}
The search request can be executed with a search DSL, which includes the
<<query-dsl,Query DSL>>, within its body.
[[search-request-body-api-path-params]]
==== {api-path-parms-title}
`<target>`::
(Optional, string)
Comma-separated list or wildcard (`*`) expression of data streams, indices,
and index aliases used to limit the request.
+
To search all data streams and indices in a cluster, omit this parameter or use
`_all` or `*`.
[[search-request-body-api-request-body]]
==== {api-request-body-title}
See the search API's <<search-search-api-request-body,request body parameters>>.
==== Fast check for any matching docs
NOTE: `terminate_after` is always applied **after** the `post_filter` and stops
the query as well as the aggregation executions when enough hits have been
collected on the shard. Though the doc count on aggregations may not reflect
the `hits.total` in the response since aggregations are applied **before** the
post filtering.
In case we only want to know if there are any documents matching a
specific query, we can set the `size` to `0` to indicate that we are not
interested in the search results. Also we can set `terminate_after` to `1`
to indicate that the query execution can be terminated whenever the first
matching document was found (per shard).
[source,console]
--------------------------------------------------
GET /_search?q=message:number&size=0&terminate_after=1
--------------------------------------------------
// TEST[setup:twitter]
The response will not contain any hits as the `size` was set to `0`. The
`hits.total` will be either equal to `0`, indicating that there were no
matching documents, or greater than `0` meaning that there were at least
as many documents matching the query when it was early terminated.
Also if the query was terminated early, the `terminated_early` flag will
be set to `true` in the response.
[source,console-result]
--------------------------------------------------
{
"took": 3,
"timed_out": false,
"terminated_early": true,
"_shards": {
"total": 1,
"successful": 1,
Add a shard filter search phase to pre-filter shards based on query rewriting (#25658) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range ie. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is not document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 16:19:20 -04:00
"skipped" : 0,
"failed": 0
},
"hits": {
"total" : {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": []
}
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 3/"took": $body.took/]
The `took` time in the response contains the milliseconds that this request
took for processing, beginning quickly after the node received the query, up
until all search related work is done and before the above JSON is returned
to the client. This means it includes the time spent waiting in thread pools,
executing a distributed search across the whole cluster and gathering all the
results.
include::request/docvalue-fields.asciidoc[]
include::request/collapse.asciidoc[]
include::request/highlighting.asciidoc[]
include::request/index-boost.asciidoc[]
include::request/inner-hits.asciidoc[]
include::request/min-score.asciidoc[]
include::request/named-queries-and-filters.asciidoc[]
include::request/post-filter.asciidoc[]
include::request/preference.asciidoc[]
include::request/rescore.asciidoc[]
include::request/script-fields.asciidoc[]
include::request/scroll.asciidoc[]
include::request/search-after.asciidoc[]
include::request/search-type.asciidoc[]
include::request/sort.asciidoc[]
include::request/source-filtering.asciidoc[]
include::request/stored-fields.asciidoc[]
include::request/track-total-hits.asciidoc[]