OpenSearch/docs/reference/search/uri-request.asciidoc

134 lines
4.8 KiB
Plaintext
Raw Normal View History

[[search-uri-request]]
== URI Search
A search request can be executed purely using a URI by providing request
parameters. Not all search options are exposed when executing a search
using this mode, but it can be handy for quick "curl tests". Here is an
example:
[source,js]
--------------------------------------------------
GET twitter/_search?q=user:kimchy
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
And here is a sample response:
[source,js]
--------------------------------------------------
{
"timed_out": false,
"took": 62,
"_shards":{
"total" : 1,
"successful" : 1,
Add a shard filter search phase to pre-filter shards based on query rewriting (#25658) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range ie. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is not document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 16:19:20 -04:00
"skipped" : 0,
"failed" : 0
},
"hits":{
"total" : {
"value": 1,
"relation": "eq"
},
"max_score": 1.3862944,
"hits" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "0",
"_score": 1.3862944,
"_source" : {
"user" : "kimchy",
"date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch",
"likes": 0
}
}
]
}
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 62/"took": "$body.took"/]
[float]
=== Parameters
The parameters allowed in the URI are:
[cols="<,<",options="header",]
|=======================================================================
|Name |Description
|`q` |The query string (maps to the `query_string` query, see
<<query-dsl-query-string-query,_Query String
Query_>> for more details).
|`df` |The default field to use when no field prefix is defined within the
query.
|`analyzer` |The analyzer name to be used when analyzing the query string.
|`analyze_wildcard` |Should wildcard and prefix queries be analyzed or
not. Defaults to `false`.
|`batched_reduce_size` | The number of shard results that should be reduced
at once on the coordinating node. This value should be used as a protection
mechanism to reduce the memory overhead per search request if the potential
number of shards in the request can be large.
|`default_operator` |The default operator to be used, can be `AND` or
`OR`. Defaults to `OR`.
|`lenient` |If set to true will cause format based failures (like
providing text to a numeric field) to be ignored. Defaults to false.
|`explain` |For each hit, contain an explanation of how scoring of the
hits was computed.
|`_source`|Set to `false` to disable retrieval of the `_source` field. You can also retrieve
part of the document by using `_source_includes` & `_source_excludes` (see the <<search-request-source-filtering, request body>>
documentation for more details)
|`stored_fields` |The selective stored fields of the document to return for each hit,
comma delimited. Not specifying any value will cause no fields to return.
|`sort` |Sorting to perform. Can either be in the form of `fieldName`, or
`fieldName:asc`/`fieldName:desc`. The fieldName can either be an actual
field within the document, or the special `_score` name to indicate
sorting based on scores. There can be several `sort` parameters (order
is important).
|`track_scores` |When sorting, set to `true` in order to still track
scores and return them as part of each hit.
|`track_total_hits` |Set to `false` in order to disable the tracking
of the total number of hits that match the query.
(see <<index-modules-index-sorting,_Index Sorting_>> for more details).
Defaults to true.
|`timeout` |A search timeout, bounding the search request to be executed
within the specified time value and bail with the hits accumulated up to
that point when expired. Defaults to no timeout.
|`terminate_after` |The maximum number of documents to collect for
each shard, upon reaching which the query execution will terminate early.
If set, the response will have a boolean field `terminated_early` to
indicate whether the query execution has actually terminated_early.
Defaults to no terminate_after.
|`from` |The starting from index of the hits to return. Defaults to `0`.
|`size` |The number of hits to return. Defaults to `10`.
|`search_type` |The type of the search operation to perform. Can be
`dfs_query_then_fetch` or `query_then_fetch`.
Defaults to `query_then_fetch`. See
<<search-request-search-type,_Search Type_>> for
more details on the different types of search that can be performed.
|`allow_partial_search_results` |Set to `false` to return an overall failure if the request would produce
partial results. Defaults to true, which will allow partial results in the case of timeouts
or partial failures. This default can be controlled using the cluster-level setting
`search.default_allow_partial_results`.
|=======================================================================