Commit Graph

973 Commits

Author SHA1 Message Date
Luca Cavanna d486bdefdd [DOCS] correct async search note
The sort optimization kicks in whenever results are sorted by field.
2020-03-20 15:58:19 +01:00
Luca Cavanna 03fca61fcb [DOCS] add docs for async search (#53675)
Relates to #49091

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2020-03-20 14:46:38 +01:00
Julie Tibshirani c33afea9fb Small corrections to stored_fields docs. (#53247)
* Fix a reference to the 'field' option.
* Remove claim about detecting script fields.
* Specify that object fields will just be ignored.
2020-03-09 10:59:17 -07:00
James Rodewig 2b59f8ac34 [DOCS] Correct `hits.total.relation` response parm def (#52847)
Fixes a partially completed definition for the `hits.total.relation`
response parameter in the search API docs.
2020-03-04 08:23:34 -05:00
Josh Devins 68ba571f70
Adds recall@k metric to rank eval API (#52889)
This change adds the recall@k metric and refactors precision@k to match
the new metric.

Recall@k is an important metric to use for learning to rank (LTR)
use-cases. Candidate generation or first ranking phase ranking functions
are often optimized for high recall, in order to generate as many
relevant candidates in the top-k as possible for a second phase of
ranking. Adding this metric allows tuning that base query for LTR.

See: https://github.com/elastic/elasticsearch/issues/51676
Backports: https://github.com/elastic/elasticsearch/pull/52577
2020-02-27 16:04:24 +01:00
James Rodewig 98bcf06bae [DOCS] Correct multi search API docs (#52523)
* Adds an example request to the top of the page.
* Relocates several parameters erroneously listed under "Request body"
to the appropriate "Query parameters" section.
* Updates the "Request body" section to better document the NDJSON
  structure of msearch requests.
2020-02-24 07:43:10 -05:00
Marios Trivyzas c03f51f68f
[Docs] Clarify default value for `allow_no_indices` (#52635) (#52697)
Add default value to each one of the usages of `allow_no_indices`
since it differs between different APIs.

Relates to: #52534

(cherry picked from commit 2eb986488ac326d6da6ab8ad0203a94e08684a36)
2020-02-24 11:57:32 +01:00
debadair 2588022b81 [DOCS] Fixed typo. (#52071) 2020-02-07 11:04:56 -08:00
Jess 4b31ad1c0c [Docs] Small edits to Ranking Evaluation API docs (#51116)
Small updates to grammar, syntax, and unclear wordings.
2020-01-20 10:30:23 +01:00
Adrien Grand 31158ab3d5
Add per-field metadata. (#50333)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices, the union of
all field metadatas is returned.

Here is how the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```

Closes #33267
2020-01-08 16:21:18 +01:00
James Rodewig 3f7f31b6b0 [DOCS] Fix search request body links (#50500)
PR #44238 changed several links related to the Elasticsearch search request body API. This updates several places still using outdated links or anchors.

This will ultimately let us remove some redirects related to those link changes.
2019-12-26 14:31:09 -05:00
Nik Everett 01293ebad5
Fix docs typos (#50365) (#50464)
Fixes a few typos in the docs.

Co-authored-by: Xiang Dai <764524258@qq.com>
2019-12-23 12:38:17 -05:00
James Rodewig 27ae9a1435 [DOCS] Remove outdated file scripts refererence (#50437)
File scripts were removed in 6.0 with #24627.

This removes an outdated file scripts reference from the conditional clauses section of the search templates docs.
2019-12-20 14:53:40 -05:00
Adrien Grand 87e72156ce
Upgrade to lucene 8.4.0-snapshot-662c455. (#50016) (#50039)
Lucene 8.4 is about to be released so we should check it doesn't cause problems
with Elasticsearch.
2019-12-10 18:04:58 +01:00
Mayya Sharipova 7cf170830c
Optimize sort on numeric long and date fields. (#49732)
This rewrites long sort as a `DistanceFeatureQuery`, which can
efficiently skip non-competitive blocks and segments of documents.
Depending on the dataset, the speedups can be 2 - 10 times.

The optimization can be disabled with setting the system property
`es.search.rewrite_sort` to `false`.

Optimization is skipped when an index has 50% or more data with
the same value.

Optimization is done through:
1. Rewriting sort as `DistanceFeatureQuery` which can
efficiently skip non-competitive blocks and segments of documents.

2. Sorting segments according to the primary numeric sort field(#44021)
This allows to skip non-competitive segments.

3. Using collector manager.
When we optimize sort, we sort segments by their min/max value.
As a collector expects to have segments in order,
we can not use a single collector for sorted segments.
We use collectorManager, where for every segment a dedicated collector
will be created.

4. Using Lucene's shared TopFieldCollector manager
This collector manager is able to exchange minimum competitive
score between collectors, which allows us to efficiently skip
the whole segments that don't contain competitive scores.

5. When index is force merged to a single segment, #48533 interleaving
old and new segments allows for this optimization as well,
as blocks with non-competitive docs can be skipped.

Backport for #48804


Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
2019-11-29 15:37:40 -05:00
James Rodewig 03600e4e12 [DOCS] Document `script_score` float precision limit (#49402)
All document scores are positive 32-bit floating point numbers. However, this
wasn't previously documented.

This can result in surprising behavior, such as precision loss, for users when
customizing scores using the function score query.

This commit updates an existing admonition in the function score query docs to
document the 32-bits precision limit. It also updates the search API reference
docs to note that `_score` is a 32-bit float.
2019-11-21 08:54:49 -05:00
Orhan Toy 561351d2fc [Docs] Fix _count HTTP method (#48979) 2019-11-12 15:45:26 +01:00
Patrick Maynard 4b85498617 [DOCS] Fix typo in search type docs (#48868) 2019-11-11 09:38:48 -05:00
Christoph Büscher 1de49d8a70 Remove Ranking Evaluation API experimental status (#48603)
The API has been released long enough to remove the experimental status.
2019-10-29 20:57:39 +01:00
Ian Danforth 82e25c4ac7 [Docs] Fix typo in suggesters search API doc (#48477) 2019-10-29 09:58:05 +01:00
James Rodewig e9c8e4f6d1 [DOCS] Fix note format in index suggestion docs (#48536) 2019-10-25 11:31:47 -04:00
Christoph Büscher 055a0800eb [Docs] Mention reserved completion suggestion characters (#48445)
We currently don't mention the three reserved characters anywhere. This change
adds a short note mentioning them

Closes #48341
2019-10-25 16:58:23 +02:00
James Rodewig 852622d970 [DOCS] Remove binary gendered language (#48362) 2019-10-23 09:37:12 -05:00
Jim Ferenczi dc39196ea4 Fix tag in the search request timeout option docs (#47776)
and add missing parentheses `search_timeout` param
2019-10-10 10:35:44 +02:00
James Rodewig c03cdb4b15 [DOCS] Correct callouts in search template docs (#47655) 2019-10-07 09:25:32 -04:00
James Rodewig fd421bd12d
[7.x] [DOCS] Add response body parms to search API docs (#47042) (#47303) 2019-09-30 13:54:06 -04:00
István Zoltán Szabó 0ab7132c47 [DOCS] Reformats Profile API (#47168)
* [DOCS] Reformats Profile API.

* [DOCS] Fixes failing docs test.
2019-09-27 11:14:14 +02:00
István Zoltán Szabó 74fd21f0b0 [DOCS] Reformats ranking evaluation API (#46974)
* [DOCS] Reformats ranking evaluation API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-25 15:01:10 +02:00
István Zoltán Szabó 83365e94ba [DOCS] Reformat suggesters page. (#47010) 2019-09-25 14:42:16 +02:00
István Zoltán Szabó fcea154f2e [DOCS] Reformats Field capabilities API (#46866)
* [DOCS] Reformats Field capabilities API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-20 11:28:19 +02:00
István Zoltán Szabó 363075cf1d [DOCS] Reformats explain API (#46857)
* [DOCS] Reformats explain API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-20 11:00:33 +02:00
James Rodewig 251dbd8522 [DOCS] Remove `lowercase_terms` parm from term suggester docs (#46879) 2019-09-19 15:56:47 -04:00
Takumasa Ochi 7a3054c5dc Fix typos in `match` in profile API (#46723)
* Replace `matches` with correct `match`
* Use present tense consistently
* Replace `metric` with correct `match`
2019-09-19 16:07:52 +02:00
István Zoltán Szabó e59be0354a [DOCS] Reformats validate API (#46389)
* [DOCS] Reformats validate API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-18 14:31:17 +02:00
István Zoltán Szabó 595bf52927 [DOCS] Reformats count API (#46377)
* [DOCS] Reformats count API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-17 09:54:19 +02:00
James Rodewig e253ee6ba6
[DOCS] Change // CONSOLE comments to [source,console] (#46440) (#46494) 2019-09-09 12:35:50 -04:00
James Rodewig f04573f8e8
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) (#46459) 2019-09-06 16:09:09 -04:00
James Rodewig bb7bff5e30
[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295) (#46418) 2019-09-06 09:22:08 -04:00
James Rodewig 1f36c4e50c
[DOCS] Replace "// CONSOLE" comments with [source,console] (#46159) (#46332) 2019-09-05 10:11:25 -04:00
István Zoltán Szabó 0f0b77b263 [DOCS] Reformats search template and multi search template APIs (#46236)
* [DOCS] Reformats search template and multi search template APIs.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 15:14:06 +02:00
István Zoltán Szabó c71d959d61 [DOCS] Reformats search shards API (#46240)
* [DOCS] Reformats search shards API
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 11:36:08 +02:00
István Zoltán Szabó 5c5af77565 [DOCS] Reformats request body search API (#46254)
* [DOCS] Reformats request body search API.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 10:53:16 +02:00
István Zoltán Szabó f2bdd392e7 [DOCS] Reformats multi search API (#46256)
* [DOCS] Reformats multi search API.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-04 10:19:43 +02:00
István Zoltán Szabó 53f70ee996 [DOCS] Reformats URI search request (#45844)
* [DOCS] Reformats URI search request.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

Co-Authored-By: debadair <debadair@elastic.co>
2019-08-30 13:45:29 +02:00
István Zoltán Szabó 4b086fbef2 Revert "[DOCS] Reformats URI search request (#45844)"
This reverts commit 7f11c32400.
2019-08-29 11:58:28 +02:00
István Zoltán Szabó 7f11c32400 [DOCS] Reformats URI search request (#45844)
* [DOCS] Reformats URI search request.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

Co-Authored-By: debadair <debadair@elastic.co>
2019-08-29 10:06:16 +02:00
James Rodewig ceb8b9bbee Change `{var}` convention to `<var>` (#45904) 2019-08-23 10:57:48 -04:00
István Zoltán Szabó 6e696296fe [DOCS] Reformats search API (#45786)
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-08-23 11:56:49 +02:00
Nathan Howard bdfd90560f Adding a warning to from-size.asciidoc
Customers occasionally discover a known behavior in Elasticsearch's pagination that does not appear to be documented. This warning is intended to educate customers of this behavior while still highlighting alternative solutions.
2019-08-22 19:03:13 -07:00
James Rodewig 33d4801213 Revert "[DOCS] Reformats search API (#45786)"
This reverts commit f6ffa00142.
2019-08-22 09:49:15 -04:00
István Zoltán Szabó f6ffa00142 [DOCS] Reformats search API (#45786)
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-08-22 15:09:09 +02:00
James Rodewig 5e44e695fd [DOCS] Add template docs to scripts. Reorder template examples. (#45817)
* [DOCS] Add template docs to scripts. Reorder template examples.

* Adds a 'Search template' section to the 'How to use scripts' chapter.
  This links to the 'Search template' chapter for detailed info and
  examples.

* Reorders and retitles several examples in the 'Search template'
  chapter. This is primarily to make examples for storing, deleting, and
  using search templates more prominent.

* Change <templatename> to <templateid>
2019-08-22 08:40:32 -04:00
Jonathan Hult 041385559c [DOCS] Fix typo in highlighting doc (#45707) 2019-08-20 07:28:05 -04:00
James Rodewig c75fd40f2c [DOCS] Add diagrams to cross-cluster search documentation (#45569) 2019-08-15 11:00:25 -04:00
Emmanuel DEMEY f8c08c537b Add snippet for the search_type query parameter (#43540) 2019-08-11 18:36:52 -04:00
Jesse Wright f19f2adbe6 [Docs] Fix typo in rank-eval.asciidoc (#44978) 2019-07-31 12:37:49 +02:00
James Rodewig a63f60b776 [DOCS] Remove heading offsets for REST APIs (#44568)
Several files in the REST APIs nav section are included using
:leveloffset: tags. This increments headings (h2 -> h3, h3 -> h4, etc.)
in those files and removes the :leveloffset: tags.

Other supporting changes:
* Alphabetizes top-level REST API nav items.
* Change 'indices APIs' heading to 'index APIs.'
* Changes 'Snapshot lifecycle management' heading to sentence case.
2019-07-19 14:36:06 -04:00
James Rodewig d46545f729 [DOCS] Update anchors and links for Elasticsearch API relocation (#44500) 2019-07-19 09:18:23 -04:00
James Rodewig 34725e20fb [DOCS] Move Elasticsearch APIs to REST APIs section. (#44238) (#44372)
Moves the following API sections under the REST APIs navigations:
- API Conventions
- Document APIs
- Search APIs
- Index APIs (previously named Indices APIs)
- cat APIs
- Cluster APIs

Other supporting changes:
- Removes the previous index APIs page under REST APIs. Adds a redirect for the removed page.
- Removes several [partintro] macros so the docs build correctly.
- Changes anchors for pages that become sections of a parent page.
- Adds several redirects for existing pages that become sections of a parent page.

This commit re-applies changes from #44238. Changes from that PR were reverted due to broken links in several repos. This commit adds redirects for those broken links.
2019-07-17 09:18:31 -04:00
Julie Tibshirani 141d09ee15 Correct a formatting mistake in the _field_caps docs. (#44303)
The 'indices' block that was recently added should appear in the top-level of
the response, as opposed to being nested under 'fields'.
2019-07-15 09:46:02 -07:00
John Murphy 8030d8f6dc [DOCS] Add `lowercase` filter to phrase suggester example so searches are case insensitive (#44186) 2019-07-11 15:27:31 -04:00
David Kyle d1280339a8
specifies which index to search in docs for various queries (#43307) (#43428)
the geo-bounding-box and phrase-suggest docs were susceptible to
failing due to other indices in the cluster. This change restricts
the queries to the index that is set up for the test.

relates to #43271.
2019-06-21 10:15:51 +01:00
Luca Cavanna 4da0fadedc [DOCS] Clarify phrase suggester docs smoothing parameter (#42947)
Closes #28512
2019-06-12 11:25:03 +02:00
Luca Cavanna e538592652 Update max_concurrent_shard_request parameter docs (#42227)
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to #31206
2019-06-12 11:25:03 +02:00
Christoph Büscher d9c582e66b [Docs] Add to preference parameter docs (#42797)
Adding notes to the existing docs about how using `preference` might increase
request cache utilization but also add warning about the downsides.

Closes #24278
2019-06-04 14:38:18 +02:00
Julie Tibshirani 3a00d08c50 Clarify that inner_hits must be used to access nested fields. (#42724)
This PR updates the docs for `docvalue_fields` and `stored_fields` to clarify
that nested fields must be accessed through `inner_hits`. It also tweaks the
nested fields documentation to make this point more visible.

Addresses #23766.
2019-05-31 10:06:11 -07:00
bellengao 380f296631 Update script-fields.asciidoc (#42490) 2019-05-27 11:48:37 +02:00
James Rodewig 53702efddd [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
James Rodewig fc4f401214 [DOCS] Document 200 http code response for partial results (#40367) 2019-04-29 08:30:17 -04:00
James Rodewig 4adf7963c7 [DOCS] Escape commas in experimental[] for Asciidoctor migration (#41578) 2019-04-26 11:35:47 -04:00
Jim Ferenczi 6184efaff6
Handle unmapped fields in _field_caps API (#34071) (#41426)
Today the `_field_caps` API returns the list of indices where a field
is present only if this field has different types within the requested indices.
However if the request is an index pattern (or an alias, or both...) there
is no way to infer the indices if the response contains only fields that have
the same type in all indices. This commit changes the response to always return
the list of indices in the response. It also adds a way to retrieve unmapped field
in a specific section per field called `unmapped`. This section is created for each field
that is present in some indices but not all if the parameter `include_unmapped` is set to
true in the request (defaults to false).
2019-04-25 18:13:48 +02:00
David Turner 411994b489 Mention the cost of tracking live docs in scrolls (#41375)
Relates #41337, in which a heap dump shows hundreds of MBs allocated on the
heap for tracking the live docs for each scroll.
2019-04-23 15:37:41 +01:00
Adrien Grand f7e590ce0d
ProfileScorer should propagate `setMinCompetitiveScore`. (#40958) (#41302)
Currently enabling profiling disables top-hits optimizations, which is
unfortunate: it would be nice to be able to notice the difference in method
counts and timings depending on whether total hit counts are requested.
2019-04-17 16:11:14 +02:00
James Rodewig 9f3fae2c59 [DOCS] Fix code block length for Asciidoctor migration (#41152) 2019-04-12 12:27:29 -04:00
Alexander Reelsen 3b9ab5da04 Fix order of request body search parameter names in documentation (#40777)
The order was random, which made it super hard to find anything. This
changes the order to be alphabetically.
2019-04-09 16:35:45 +02:00
Christoph Büscher 40638d7b28 [Docs] Delete explanation for completion suggester default analyzer choice (#36720)
The explanation given in the completion suggester documentation why we use the
"simple" analyzer as the default is no longer valid. Since we still use "simple"
as the default, we should just delete the explanation that doesn't fit anymore.

Closes #36715
2019-04-09 13:50:29 +02:00
Mayya Sharipova 1e90b29c00 Add information about the default sort mode (#40657) 2019-03-30 10:58:47 -04:00
Andy Bristol 23395a9b9f
search as you type fieldmapper (#35600)
Adds the search_as_you_type field type that acts like a text field optimized
for as-you-type search completion. It creates a couple subfields that analyze
the indexed terms as shingles, against which full terms are queried, and a
prefix subfield that analyze terms as the largest shingle size used and
edge-ngrams, against which partial terms are queried

Adds a match_bool_prefix query type that creates a boolean clause of a term
query for each term except the last, for which a boolean clause with a prefix
query is created.

The match_bool_prefix query is the recommended way of querying a search as you
type field, which will boil down to term queries for each shingle of the input
text on the appropriate shingle field, and the final (possibly partial) term
as a term query on the prefix field. This field type also supports phrase and
phrase prefix queries however
2019-03-27 13:29:13 -07:00
Jim Ferenczi 3400483af4
Add date and date_nanos conversion to the numeric_type sort option (#40199) (#40224)
This change adds an option to convert a `date` field to nanoseconds resolution
 and a `date_nanos` field to millisecond resolution when sorting.
The resolution of the sort can be set using the `numeric_type` option of the
field sort builder. The conversion is done at the shard level and is restricted
to dates from 1970 to 2262 for the nanoseconds resolution in order to avoid
numeric overflow.
2019-03-20 16:50:28 +01:00
Jim Ferenczi 5b73a1bc7d
Add an option to force the numeric type of a field sort (#38095) (#40084)
This change adds an option to the `FieldSortBuilder` that allows to transform the type
of a numeric field into another. Possible values for this option are `long` that transforms
the source field into an integer and `double` that transforms the source field into a floating point.
This new option is useful for cross-index search when the sort field is mapped differently on some
indices. For instance if a field is mapped as a floating point in one index and as an integer in another
it is possible to align the type for both indices using the `numeric_type` option:

```
{
   "sort": {
    "field": "my_field",
    "numeric_type": "double" <1>
   }
}
```

<1> Ensure that values for this field are transformed to a floating point if needed.
2019-03-18 09:32:45 +01:00
Lisa Cawley c92476f591 [DOCS] Replaces CCS terms with attributes (#40076) 2019-03-15 07:57:51 -07:00
Darren Meiss eae2c9dd5c Edits to text in Phrase Suggester doc (#38966) 2019-02-20 12:35:21 +01:00
Clinton Gormley d59ec89726 Update track-total-hits.asciidoc
Added missing `
2019-02-18 13:33:23 +01:00
Darren Meiss dc0e657091 Edits to text & formatting in Term Suggester doc (#38963) 2019-02-15 16:00:28 -05:00
Darren Meiss 56997bf53d Edits to text in Completion Suggester doc (#38980) 2019-02-15 15:47:54 -05:00
Darren Meiss 34b6383c47 Edits to text of Profile API documentation (#38742)
Minor edits of text.
2019-02-13 10:08:50 +01:00
Mayya Sharipova 6eec065353
Describe what _source.includes/excludes do (#38319) (#38794) 2019-02-12 11:09:15 -05:00
Darren Meiss f8426d9b76 Fix typos in Field-Caps documentation (#38580)
Fix typo in Field-Caps documentation

Reworded because asciidoc was formatting the ellipse/space as a numbered list.
2019-02-11 20:58:31 +01:00
Nik Everett 5d949dddfb
Docs: Drop inline callout from scroll example (#38340)
Coalesces two calls into one in a scroll example so all callouts are at
the end of the line. This is the only sort of callouts that are
supported by asciidoctor and we'd like to start building our docs with
asciidoctor.

At present we don't have any mechanism to stop folks adding more inline
callouts but we ought to be able to have one in a few weeks. For now,
though, removing these inline callouts is a step in the right direction.

Relates to #38335
2019-02-04 14:57:38 -05:00
Luca Cavanna 622fb7883b
Introduce ability to minimize round-trips in CCS (#37828)
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.

This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.

Relates to #32125
2019-01-31 15:12:14 +01:00
Adrien Grand c8af0f4bfa
Use mappings to format doc-value fields by default. (#30831)
Doc-value fields now return a value that is based on the mappings rather than
the script implementation by default.

This deprecates the special `use_field_mapping` docvalue format which was added
in #29639 only to ease the transition to 7.x and it is not necessary anymore in
7.0.
2019-01-30 10:31:51 +01:00
Jim Ferenczi 787acb14b9
Track total hits up to 10,000 by default (#37466)
This commit changes the default for the `track_total_hits` option of the search request
to `10,000`. This means that by default search requests will accurately track the total hit count
up to `10,000` documents, requests that match more than this value will set the `"total.relation"`
to `"gte"` (e.g. greater than or equals) and the `"total.value"` to `10,000` in the search response.
Scroll queries are not impacted, they will continue to count the total hits accurately.
The default is set back to `true` (accurate hit count) if `rest_total_hits_as_int` is set in the search request.
I choose `10,000` as the default because that's also the number we use to limit pagination. This means that users will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate.

Closes #33028
2019-01-25 13:45:39 +01:00
Christoph Büscher 95a6951f78
Use new bulk API endpoint in the docs (#37698)
This change switches to using the typeless bulk API endpoint in the
documentation snippets where possible
2019-01-23 09:46:28 +01:00
Boaz Leskes 52ba407931
Expose sequence number and primary terms in search responses (#37639)
Users may require the sequence number and primary terms to perform optimistic concurrency control operations. Currently, you can get the sequence number via the `docvalues_fields` API but the primary term is not accessible because it is maintained by the `SeqNoFieldMapper` and the infrastructure can't find it. 

This commit adds a dedicated sub fetch phase to return both numbers that is connected to a new `seq_no_primary_term` parameter.
2019-01-23 09:01:58 +01:00
Christoph Büscher 34f2d2ec91
Remove remaining occurances of "include_type_name=true" in docs (#37646) 2019-01-22 15:13:52 +01:00
Christoph Büscher 3a96608b3f
Remove more include_type_name and types from docs (#37601) 2019-01-18 14:11:18 +01:00
Christoph Büscher 25aac4f77f
Remove `include_type_name` in asciidoc where possible (#37568)
The "include_type_name" parameter was temporarily introduced in #37285 to facilitate
moving the default parameter setting to "false" in many places in the documentation
code snippets. Most of the places can simply be reverted without causing errors.
In this change I looked for asciidoc files that contained the
"include_type_name=true" addition when creating new indices but didn't look
likey they made use of the "_doc" type for mappings. This is mostly the case
e.g. in the analysis docs where index creating often only contains settings. I
manually corrected the use of types in some places where the docs still used an
explicit type name and not the dummy "_doc" type.
2019-01-18 09:34:11 +01:00
Julie Tibshirani 36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Josh Soref edb48321ba [DOCS] Various spelling corrections (#37046) 2019-01-07 14:44:12 +01:00
Jim Ferenczi e38cf1d0dc
Add the ability to set the number of hits to track accurately (#36357)
In Lucene 8 searches can skip non-competitive hits if the total hit count is not requested.
It is also possible to track the number of hits up to a certain threshold. This is a trade off to speed up searches while still being able to know a lower bound of the total hit count. This change adds the ability to set this threshold directly in the track_total_hits search option. A boolean value (true, false) indicates whether the total hit count should be tracked in the response. When set as an integer this option allows to compute a lower bound of the total hits while preserving the ability to skip non-competitive hits when enough matches have been collected.

Relates #33028
2019-01-04 20:36:49 +01:00
Abdullah DURSUN 8a02bacf76 Fix typo in multi-search.asciidoc (#37060) 2019-01-02 10:32:42 +01:00
lcawl 504cfb2fb1 [DOCS] Adds missing anchors for profile API 2018-12-18 15:20:19 -08:00
Julie Tibshirani 87831051dc
Deprecate types in explain requests. (#35611)
The following updates were made:
- Add a new untyped endpoint `{index}/_explain/{id}`.
- Add deprecation warnings to Rest*Action, plus tests in Rest*ActionTests.
- For each REST yml test, make sure there is one version without types, and another legacy version that retains types (called *_with_types.yml).
- Deprecate relevant methods on the Java HLRC requests/ responses.
- Update documentation (for both the REST API and Java HLRC).
2018-12-10 19:45:13 -08:00
Christoph Büscher 54f39d9852
[Docs] Add Profile API limitations (#36252)
Adding some of the limitations mentioned in #29275.

Closes #29275
2018-12-06 00:09:26 +01:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Alan Woodward 73ceaad03a
Update to lucene-8.0.0-snapshot-c78429a554 (#36212)
Includes:

* A fix for a bug in Intervals.or() (https://issues.apache.org/jira/browse/LUCENE-8586)
* The ability to disable offset mangling in WordDelimiterGraphFilter
        (https://issues.apache.org/jira/browse/LUCENE-8509)
* BM25Similarity no longer multiplies scores by k1 + 1
2018-12-05 12:43:56 +00:00
João Barbosa d27aa72b17 Added soft limit to open scroll contexts #25244 (#36009)
This change adds a soft limit to open scroll contexts that can be controlled with the dynamic cluster setting `search.max_open_scroll_context` (defaults to 500).
2018-12-03 19:57:10 +01:00
patrykk21 bb2cf7e6be [Docs] Clarify search_after behavior
Closes #34232
2018-11-30 14:30:23 +01:00
Christoph Büscher 33865211db
[Docs] Emphazise suggest behaviour with missing query part (#35393)
Add a short extra sentence that explains that a missing query part in a search
request containing a "suggest" section will mean only suggestions are returned.

Closes #31640
2018-11-28 12:01:27 +01:00
Julie Tibshirani c6a0904e0e
Deprecate types in count and msearch. (#35421)
* Deprecate types in count requests.
* Move RestCountAction to the 'search' package.
* Deprecate types in multi search requests.
* Add tests for types deprecation in the _search endpoint.
2018-11-16 13:04:43 -08:00
Julie Tibshirani 40ba4de5e6
Deprecate types in validate query requests. (#35575) 2018-11-16 08:59:04 -08:00
Jim Ferenczi 72504c2512
Do not recommend to use the _id field in search_after docs (#35370)
The documentation of `search_after` recommends to use the `_id`
field as a tiebreaker for the sort without warning against
the additional memory required. This change changes the recommandation
to use a copy of the `_id` field with doc_values enabled.
2018-11-14 10:50:31 +01:00
Jeff Hajewski d00b23c8b1 Fixes fast vector highlighter docs per issue 24318. (#34190)
The `fvh` highlighter does not support span queries. This fix updates
the docs to add a warning stating the lack of span query support for
`fvh`.
2018-11-08 11:09:03 +01:00
lipsill 6df1c9e818 Deprecate `_source_include` and `_source_exclude` url parameters (#33475)
Deprecates `_source_include` and `_source_exclude` url parameters
in favor of `_source_inclues` and `_source_excludes` because those
are consistent with the rest of Elasticsearch's APIs.

Relates to #22792
2018-10-29 12:06:38 -04:00
Stéphane Campinas 27c4d63340 document the search context is freed if the scroll is not extended (#34739)
The `fetchPhaseShouldFreeContext` returns true when there is a scroll context but the scroll parameter is null, thus freeing the search context.

183c32d4c3/server/src/main/java/org/elasticsearch/search/SearchService.java (L491)
2018-10-25 16:49:08 -04:00
Julie Tibshirani f854330e06
Make sure to use the type _doc in the REST documentation. (#34662)
* Replace custom type names with _doc in REST examples.
* Avoid using two mapping types in the percolator docs.
* Rename doc -> _doc in the main repository README.
* Also replace some custom type names in the HLRC docs.
2018-10-22 11:54:04 -07:00
Julie Tibshirani 67652b5355
Remove references to multiple types in the search documentation. (#34625) 2018-10-19 09:47:34 -07:00
eray daf88335d7 Add max_children limit to nested sort (#33587)
Add an option to `nested` sort to limit the number of children to visit when picking the sort value
of the root document. 

Closes #33592
2018-10-05 12:02:47 +02:00
Tim Heckel 3928921a1d [DOCS] Update scroll.asciidoc (#32530) 2018-09-18 17:00:22 +02:00
Dan Tennery-Spalding 3596512e6a [DOCS] Corrected several grammar errors (#33781) 2018-09-18 16:46:22 +02:00
Jim Ferenczi 4561c5ee83
Clarify context suggestions filtering and boosting (#33601)
This change clarifies the documentation of the context completion suggester
regarding filtering and boosting with contexts.
Unlike the suggester v1, filtering on multiple contexts
works as a disjunction, a suggestion matches if it contains at least one of the provided
context values and boosting selects the maximum score among the matching contexts.
This commit also adapts an old test that was written for the v1 suggester and commented out
for version 2 because the behavior changed.
2018-09-12 08:47:32 +02:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Christoph Büscher 79db16f9bb
[Docs] Add search timeout caveats (#33354)
Global search timeouts and timeouts specified in the search request body use the
same internal mechanism as search cancellation. Therefore the same caveats
apply, mostly around the responsiveness of the timeout which gets only checked
by a running search on segment boundaries by default.

Closes #31263
2018-09-03 20:56:05 +02:00
Jim Ferenczi 713c07e14d
Add early termination support to BucketCollector (#33279)
This commit adds the support to early terminate the collection of a leaf
in the aggregation framework. This change introduces a MultiBucketCollector which
handles CollectionTerminatedException exactly like the Lucene MultiCollector.
Any aggregator can now throw a CollectionTerminatedException without stopping
the collection of a sibling aggregator. This is useful for aggregators that
can infer their result without visiting all documents (e.g.: a min/max aggregation on a match_all query).
2018-09-03 09:34:35 +02:00
lipsill b7c0d2830a [Docs] Remove repeating words (#33087) 2018-08-28 13:16:43 +02:00
Ignacio Vera d7219c05a2
Search: Support of wildcard on docvalue_fields (#32980)
* Search: Support of wildcard on docvalue_fields

For consistency with stored_fields, docvalue_fields should support the use of wildcards. 
Documentation of doc values fields is updated accordingly.

See also: #26390

Closes #26299
2018-08-23 10:04:00 +02:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Simon Willnauer ffb1a5d5b7
Expose `max_concurrent_shard_requests` in `_msearch` (#33016)
Today `_msearch` doesn't allow modifying the `max_concurrent_shard_requests`
per sub search request. This change adds support for setting this parameter on
all sub-search requests in an `_msearch`.

Relates to #31877
2018-08-22 08:45:08 +02:00
markharwood 70d80a3d09
Docs enhancement: added reference to cluster-level setting `search.default_allow_partial_results` (#32810)
Closes #32809
2018-08-16 10:21:37 +01:00
Christoph Büscher c1cc0cef61
Add ERR to ranking evaluation documentation (#32314)
This change adds a section about the Expected Reciprocal Rank metric (ERR) to
the Ranking Evaluation documentation.
2018-07-24 19:58:34 +02:00
Christoph Büscher fe6bb75eb4
Rename ranking evaluation `quality_level` to `metric_score` (#32168)
The notion of "quality" is an overloaded term in the search ranking evaluation 
context. Its usually used to decribe certain levels of "good" vs. "bad" of a 
seach result with respect to the users information need. We currently report the 
result of the ranking evaluation as `quality_level` which is a bit missleading.
This changes the response parameter name to `metric_score` which fits better.
2018-07-23 22:25:02 +02:00
Christoph Büscher 5cbd9ad177
Rename ranking evaluation response section (#32166)
Currently the ranking evaluation response contains a 'unknown_docs' section 
for each search use case in the evaluation set. It contains document ids for 
results in the search hits that currently don't have a quality rating.
This change renames it to `unrated_docs`, which better reflects its purpose.
2018-07-20 11:43:46 +02:00
David Turner 380b45b965
Improve docs for search preferences (#32159)
Today it is unclear what guarantees are offered by the search preference
feature, and we claim a guarantee that is stronger than what we really offer:

> A custom value will be used to guarantee that the same shards will be used
> for the same custom value.

This commit clarifies this documentation.

Forward-port of #32098 to `master`.
2018-07-18 12:58:17 +01:00
Mayya Sharipova 80492cacfc
Add second level of field collapsing (#31808)
* Put second level collapse under inner_hits

Closes #24855
2018-07-13 11:40:03 -04:00
Christoph Büscher 450a450b2c
[Docs] Clarify accepted sort case (#31605)
Rescore only works with an explicite "sort" element if it is on descending
"_score". Even using "order" : "asc" will throw an error.
2018-07-06 10:11:36 +02:00
Christoph Büscher 5f87a84bef
[Docs] Correct default window_size (#31582) 2018-07-04 14:07:20 +02:00
Julie Tibshirani 26a927a120
Fix a formatting issue in the docvalue_fields documentation. (#31563) 2018-06-26 10:15:56 -07:00
Igor Motov 7a9d9b0abf
Add support for ignore_unmapped to geo sort (#31153)
Adds support for `ignore_unmapped` parameter in geo distance sorting,
which is functionally equivalent to specifying an `unmapped_type` in
the field sort.

Closes #28152
2018-06-07 11:11:13 -04:00
Jim Ferenczi 0f5e570184
Deprecates indexing and querying a context completion field without context (#30712)
This change deprecates completion queries and documents without context that target a
context enabled completion field. Querying without context degrades the search
performance considerably (even when the number of indexed contexts is low).
This commit targets master but the deprecation will take place in 6.x and the functionality
will be removed in 7 in a follow up.

Closes #29222
2018-05-31 16:09:48 +02:00
Adrien Grand a19df4ab3b
Add a `format` option to `docvalue_fields`. (#29639)
This commit adds the ability to configure how a docvalue field should be
formatted, so that it would be possible eg. to return a date field
formatted as the number of milliseconds since Epoch.

Closes #27740
2018-05-23 14:39:04 +02:00
Fernando Medina Corey 739bb4f0ec Fix a grammatical error in the 'search types' documentation.
Simple grammatical fix.
2018-05-22 22:09:04 -07:00
Christoph Büscher f7b5986682
[Docs] Fix script-fields snippet execution (#30693)
Currently the first snippet in the documentation test in script-fields.asciidoc
isn't executed, although it has the CONSOLE annotation. Adding a test setup
annotation to it seems to fix the problem.
2018-05-22 20:22:42 +02:00
Jason Tedor 4a4e3d70d5
Default to one shard (#30539)
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.

Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
2018-05-14 12:22:35 -04:00
Ke Li d373e1b49c Fix the search request default operation behavior doc (#29302) (#29405) 2018-05-07 14:43:45 +02:00
Julie Tibshirani 5c9f08402e
Correct an example in the top-level suggester documentation. (#30224) 2018-05-01 15:16:28 -07:00
Julie Tibshirani f5978d6d33
In the field capabilities API, remove support for providing fields in the request body. (#30185) 2018-04-27 16:14:11 -07:00
Saren Currie 0b4d2f5225 Clarify documentation of scroll_id (#29424)
* Clarify documentation of scroll_id

The Scroll API may return the same scroll ID for multiple requests due to server side state. This is not clear from the current documentation.

* Further clarify scroll ID return behaviour
2018-04-26 09:45:48 +01:00
Julie Tibshirani 32dfb65144 In the field capabilities API, deprecate support for providing fields in the request body. (#30157)
(cherry picked from commit d8d884b29d4aa7d01070484fee5de8d3db60cb25)
2018-04-25 23:01:53 -07:00
debadair 0c9baebe15
[DOCS] Added include for internal highlighters section. (#29597) 2018-04-18 16:56:09 -07:00
Mayya Sharipova bf6cfff080
[DOCS] Update highlighting docs (#28802)
- add more explanation to some highlighting parameters
- add a document describing how highlighters work internally
2018-04-18 17:41:19 -04:00