OpenSearch

Commit Graph

Author	SHA1	Message	Date
Julie Tibshirani	4f4c4a8713	Add a reference on returning fields during a search. (#57500 ) This PR adds a section to the new 'run a search' reference that explains the options for returning fields. Previously each option was only listed as a separate request parameter and it was hard to know what was available.	2020-06-03 09:41:48 -07:00
James Rodewig	80faafdfc7	[DOCS] Add clear scroll API reference docs (#57367 ) (#57611 )	2020-06-03 11:58:16 -04:00
James Rodewig	a2e44a0c76	[DOCS] Refactor admons for multi-parameter options (#57491 ) (#57540 ) Several APIs support options that can be specified as a query parameter or a request body parameter. Currently, this is documented using notes, which can get rather lengthy. This replaces those multiple notes with a single note and a footnote.	2020-06-02 12:12:29 -04:00
James Rodewig	808835ac1c	[DOCS] Add scroll API reference docs (#57153 ) (#57528 ) Changes: * Adds API reference docs for the scroll API * Documents several related parameters in the search API docs	2020-06-02 10:11:12 -04:00
Julie Tibshirani	e434c481dc	Avoid unnecessary use of stored_fields in our docs. (#57488 ) Generally we don't advocate for using `stored_fields`, and we're interested in eventually removing the need for this parameter. So it's best to avoid using stored fields in our docs examples when it's not actually necessary. Individual changes: * Avoid using 'stored_fields' in our docs. * When defining script fields in top-hits, de-emphasize stored fields.	2020-06-01 17:31:42 -07:00
Lisa Cawley	db5bf92acf	[7.x][DOCS] Replace docdir attribute with es-repo-dir (#57489 ) (#57494 )	2020-06-01 16:42:53 -07:00
James Rodewig	6592c3856d	[DOCS] Fix deep paging recommendations Corrects recommendation to reference the `search_after` parameter, not API. Also corrects a typo and whitespace inconsistencies in the search docs.	2020-06-01 18:04:04 -04:00
James Rodewig	994781ff36	[DOCS] Add search pagination docs (#56785 ) (#57477 ) Reworks the `from / size` content to `Paginate search results`. Moves those docs from the request body search API page (slated for deletion) to the `Run a search` tutorial docs. Also adds some notes to the `from` and `size` param docs. Co-authored-by: debadair <debadair@elastic.co>	2020-06-01 16:43:06 -04:00
James Rodewig	ab8ae7cf25	[DOCS] Combine search API and URI search API reference docs (#55884 ) (#57469 ) The search API and URI search pages document the same `_search` API. This combines the documentation from each page under the search API docs.	2020-06-01 15:53:40 -04:00
James Rodewig	50a8779c94	[DOCS] Create top-level "Search your data" page (#56058 ) (#57463 ) Goal Create a top-level search section. This will let us clean up our search API reference docs, particularly content from [`Request body search`][0]. Changes * Creates a top-level `Search your data` page. This page is designed to house concept and tutorial docs related to search. * Creates a `Run a search` page under `Search your data`. For now, This contains a basic search tutorial. The goal is to add content from [`Request body search`][0] to this in the future. * Relocates `Long-running searches` and `Search across clusters` under `Search your data`. Increments several headings in that content. * Reorders the top-level TOC to move `Search your data` higher. Also moves the `Query DSL`, `EQL`, and `SQL access` chapters immediately after. Relates to #48194 [0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-body.html	2020-06-01 14:55:26 -04:00
James Rodewig	6934264162	[DOCS] Relocate `shard allocation` module content (#56535 ) (#57448 )	2020-06-01 13:15:08 -04:00
Nik Everett	4d5be7c817	Save memory on numeric sig terms when not top (backport of #56789 ) (#57221 ) This saves memory when running numeric significant terms which are not at the top level by merging its collection into numeric terms and relying on the optimization that we made in #55873.	2020-05-27 12:03:28 -04:00
Théophile Helleboid - chtitux	a2c6d61ed5	[DOCS] Fix typo in search API `explain` param def (#56991 ) Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-05-20 09:22:23 -04:00
James Rodewig	f6d2688de2	[DOCS] Add JS client helper links to docs (#55216 ) (#56968 ) Adds links for the Elasticsearch-js client to the bulk and scroll docs. Co-authored-by: Tomas Della Vedova <delvedor@users.noreply.github.com>	2020-05-19 16:53:22 -04:00
Nik Everett	126619ae3c	Add list of deferred aggregations to the profiler (backport of #56208 ) (#56682 ) This adds a few things to the `breakdown` of the profiler: * `histogram` aggregations now contain `total_buckets` which is the count of buckets that they collected. This could be useful when debugging a histogram inside of another bucketing agg that is fairly selective. * All bucketing aggs that can delay their sub-aggregations will now add a list of delayed sub-aggregations. This is useful because we sometimes have fairly involved logic around which sub-aggregations get delayed and this will save you from having to guess. * Aggregations wrapped in the `MultiBucketAggregatorWrapper` can't accurately add anything to the breakdown. Instead the wrapper adds a marker entry `"multi_bucket_aggregator_wrapper": true` so we can quickly pick out such aggregations when debugging. It also fixes a bug where `_count` breakdown entries were contributing to the overall `time_in_nanos`. They didn't add a large amount of time so it is unlikely that this caused a big problem, but I was there. To support the arbitrary breakdown data this reworks the profiler so that the `breakdown` can contain any data that is supported by `StreamOutput#writeGenericValue(Object)` and `XContentBuilder#value(Object)`.	2020-05-13 16:33:22 -04:00
James Rodewig	2be6d7b8b6	[DOCS] Relocate request body param docs to search API docs (#56436 ) Moves documentation for the following request body parameters to the search API reference docs: * `explain` * `query` * `seq_no_primary_term` * `version` Removes documentation for these parameters from the Request body search page[0]. [0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-body.html	2020-05-11 11:29:38 -04:00
James Rodewig	ea76b0c22b	[DOCS] Relocate search API's request body parameters (#56304 ) Changes: * Moves the document request body parameters for the search API from the Request body search page to the Search API reference page. * Relocates a search request body example from the Request body search page to the Search API reference page. * Adds a note to any duplicated query and request body parameters.	2020-05-07 11:00:03 -04:00
Luca Cavanna	ef66018119	[DOCS] Async search: clarify behaviour when submit returns final results (#55934 ) Closes #55636	2020-05-06 10:01:23 +02:00
James Rodewig	a528319827	[DOCS] Remove invalid search API query parameters (#55884 ) (#56212 ) This is a backport of #55884 with redirects removed. Changes: * Adds an abbreviated title for the search API page. * Removes the following invalid query parameters: * `analyzer` * `analyze_wildcard` * `default_operator` * `df` * `lenient` * `suggest_mode` * `suggest_size` * Replaces the URI search page's query parameter docs with a xref * Updates the headings of several examples	2020-05-05 11:10:34 -04:00
James Rodewig	922a80c3f4	[DOCS] Add collapsible sections to search API response (#55887 )	2020-05-04 16:57:10 -04:00
Luca Cavanna	8b05027bf0	[DOCS] Clarify async search response flags (#55574 ) Relates to #55572	2020-04-29 15:22:05 +02:00
James Rodewig	ddc7305ac9	[DOCS] Correct search API's timeout parm default (#55855 )	2020-04-28 09:44:50 -04:00
Adrien Grand	58c3bb5ae1	Repurpose `ignore_throttled` to be only about frozen indices. (#55047 ) (#55852 ) This has no practical impact on users since frozen indices are the only throttled indices today. However this has an impact on upcoming features that would use search throttling. Filtering out throttled indices made sense a couple years ago, but as we're now improving support for slow requests with `_async_search` and exploring ways to reduce storage costs, this feature has most likely become a trap, that we'd like to not have with upcoming features that would use search throttling. Relates #54058	2020-04-28 14:31:54 +02:00
James Rodewig	8d6f0f6a76	[DOCS] Document `max_concurrent_searches` default (#55116 )	2020-04-15 10:04:23 -04:00
Christoph Büscher	f7ea794312	[Test] Don't expect specific scores in docs tests (#54297 ) The failing suggester documentation test was expecting specific scores in the test response, which is fragile implementation details that e.g. can change with different lucene versions and generally shouldn't be done in documentation test. Instead we usually replace the float values in the output response by the ones in the actual response. Closes #54257	2020-03-27 10:27:47 +01:00
Luca Cavanna	ff269160af	Async search: rename REST parameters (#54198 ) This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069	2020-03-26 09:40:50 +01:00
Luca Cavanna	6b457abbd3	Async search: prevent users from overriding pre_filter_shard_size (#54088 ) Submit async search forces pre_filter_shard_size for the underlying search that it creates. With this commit we also prevent users from overriding such default as part of request validation.	2020-03-24 17:06:04 +01:00
Jim Ferenczi	9e3f7f4575	Add heuristics to compute pre_filter_shard_size when unspecified (#53873 ) (#54007 ) This commit changes the pre_filter_shard_size default from 128 to unspecified. This allows to apply heuristics based on the request and the target indices when deciding whether the can match phase should run or not. When unspecified, this pr runs the can match phase automatically if one of these conditions is met: * The request targets more than 128 shards. * The request contains read-only indices. * The primary sort of the query targets an indexed field. Users can opt-out from this behavior by setting the `pre_filter_shard_size` to a static value. Closes #39835	2020-03-24 02:05:15 +01:00
Luca Cavanna	932a7e3112	Backport of async search changes (#53976 ) * Get Async Search: omit _clusters section when empty (#53907) The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content. This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`. * Async search: remove version from response (#53960) The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available). That said this commit clarifies this in the docs and removes the version field from the async search response * Async Search: replicas to auto expand from 0 to 1 (#53964) This way single node clusters that are green don't go yellow once async search is used, while all the others still have one replica. * [DOCS] address timing issue in async search docs tests (#53910) The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final. With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response. Closes #53887 Closes #53891	2020-03-23 19:13:31 +01:00
Mark Vieira	0cfe6d90cc	Mute async-search test	2020-03-20 11:35:24 -07:00
Luca Cavanna	d486bdefdd	[DOCS] correct async search note The sort optimization kicks in whenever results are sorted by field.	2020-03-20 15:58:19 +01:00
Luca Cavanna	03fca61fcb	[DOCS] add docs for async search (#53675 ) Relates to #49091 Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2020-03-20 14:46:38 +01:00
Julie Tibshirani	c33afea9fb	Small corrections to stored_fields docs. (#53247 ) * Fix a reference to the 'field' option. * Remove claim about detecting script fields. * Specify that object fields will just be ignored.	2020-03-09 10:59:17 -07:00
James Rodewig	2b59f8ac34	[DOCS] Correct `hits.total.relation` response parm def (#52847 ) Fixes a partially completed definition for the `hits.total.relation` response parameter in the search API docs.	2020-03-04 08:23:34 -05:00
Josh Devins	68ba571f70	Adds recall@k metric to rank eval API (#52889 ) This change adds the recall@k metric and refactors precision@k to match the new metric. Recall@k is an important metric to use for learning to rank (LTR) use-cases. Candidate generation or first ranking phase ranking functions are often optimized for high recall, in order to generate as many relevant candidates in the top-k as possible for a second phase of ranking. Adding this metric allows tuning that base query for LTR. See: https://github.com/elastic/elasticsearch/issues/51676 Backports: https://github.com/elastic/elasticsearch/pull/52577	2020-02-27 16:04:24 +01:00
James Rodewig	98bcf06bae	[DOCS] Correct multi search API docs (#52523 ) * Adds an example request to the top of the page. * Relocates several parameters erroneously listed under "Request body" to the appropriate "Query parameters" section. * Updates the "Request body" section to better document the NDJSON structure of msearch requests.	2020-02-24 07:43:10 -05:00
Marios Trivyzas	c03f51f68f	[Docs] Clarify default value for `allow_no_indices` (#52635 ) (#52697 ) Add default value to each one of the usages of `allow_no_indices` since it differs between different APIs. Relates to: #52534 (cherry picked from commit 2eb986488ac326d6da6ab8ad0203a94e08684a36)	2020-02-24 11:57:32 +01:00
debadair	2588022b81	[DOCS] Fixed typo. (#52071 )	2020-02-07 11:04:56 -08:00
Jess	4b31ad1c0c	[Docs] Small edits to Ranking Evaluation API docs (#51116 ) Small updates to grammar, syntax, and unclear wordings.	2020-01-20 10:30:23 +01:00
Adrien Grand	31158ab3d5	Add per-field metadata. (#50333 ) This PR adds per-field metadata that can be set in the mappings and is later returned by the field capabilities API. This metadata is completely opaque to Elasticsearch but may be used by tools that index data in Elasticsearch to communicate metadata about fields with tools that then search this data. A typical example that has been requested in the past is the ability to attach a unit to a numeric field. In order to not bloat the cluster state, Elasticsearch requires that this metadata be small: - keys can't be longer than 20 chars, - values can only be numbers or strings of no more than 50 chars - no inner arrays or objects, - the metadata can't have more than 5 keys in total. Given that metadata is opaque to Elasticsearch, field capabilities don't try to do anything smart when merging metadata about multiple indices, the union of all field metadatas is returned. Here is what the meta might look like in mappings: ```json { "properties": { "latency": { "type": "long", "meta": { "unit": "ms" } } } } ``` And then in the field capabilities response: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms" ] } } } } ``` When there are no conflicts, values are arrays of size 1, but when there are conflicts, Elasticsearch includes all unique values in this array, without giving ways to know which index has which metadata value: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms", "ns" ] } } } } ``` Closes #33267	2020-01-08 16:21:18 +01:00
James Rodewig	3f7f31b6b0	[DOCS] Fix search request body links (#50500 ) PR #44238 changed several links related to the Elasticsearch search request body API. This updates several places still using outdated links or anchors. This will ultimately let us remove some redirects related to those link changes.	2019-12-26 14:31:09 -05:00
Nik Everett	01293ebad5	Fix docs typos (#50365 ) (#50464 ) Fixes a few typos in the docs. Co-authored-by: Xiang Dai <764524258@qq.com>	2019-12-23 12:38:17 -05:00
James Rodewig	27ae9a1435	[DOCS] Remove outdated file scripts reference (#50437 ) File scripts were removed in 6.0 with #24627. This removes an outdated file scripts reference from the conditional clauses section of the search templates docs.	2019-12-20 14:53:40 -05:00
Adrien Grand	87e72156ce	Upgrade to lucene 8.4.0-snapshot-662c455. (#50016 ) (#50039 ) Lucene 8.4 is about to be released so we should check it doesn't cause problems with Elasticsearch.	2019-12-10 18:04:58 +01:00
Mayya Sharipova	7cf170830c	Optimize sort on numeric long and date fields. (#49732 ) This rewrites long sort as a `DistanceFeatureQuery`, which can efficiently skip non-competitive blocks and segments of documents. Depending on the dataset, the speedups can be 2 - 10 times. The optimization can be disabled with setting the system property `es.search.rewrite_sort` to `false`. Optimization is skipped when an index has 50% or more data with the same value. Optimization is done through: 1. Rewriting sort as `DistanceFeatureQuery` which can efficiently skip non-competitive blocks and segments of documents. 2. Sorting segments according to the primary numeric sort field(#44021) This allows to skip non-competitive segments. 3. Using collector manager. When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we can not use a single collector for sorted segments. We use collectorManager, where for every segment a dedicated collector will be created. 4. Using Lucene's shared TopFieldCollector manager This collector manager is able to exchange minimum competitive score between collectors, which allows us to efficiently skip the whole segments that don't contain competitive scores. 5. When index is force merged to a single segment, #48533 interleaving old and new segments allows for this optimization as well, as blocks with non-competitive docs can be skipped. Backport for #48804 Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>	2019-11-29 15:37:40 -05:00
James Rodewig	03600e4e12	[DOCS] Document `script_score` float precision limit (#49402 ) All document scores are positive 32-bit floating point numbers. However, this wasn't previously documented. This can result in surprising behavior, such as precision loss, for users when customizing scores using the function score query. This commit updates an existing admonition in the function score query docs to document the 32-bits precision limit. It also updates the search API reference docs to note that `_score` is a 32-bit float.	2019-11-21 08:54:49 -05:00
Orhan Toy	561351d2fc	[Docs] Fix _count HTTP method (#48979 )	2019-11-12 15:45:26 +01:00
Patrick Maynard	4b85498617	[DOCS] Fix typo in search type docs (#48868 )	2019-11-11 09:38:48 -05:00
Christoph Büscher	1de49d8a70	Remove Ranking Evaluation API experimental status (#48603 ) The API has been released long enough to remove the experimental status.	2019-10-29 20:57:39 +01:00
Ian Danforth	82e25c4ac7	[Docs] Fix typo in suggesters search API doc (#48477 )	2019-10-29 09:58:05 +01:00


903 Commits