OpenSearch/docs
Simon Willnauer e81804cfa4 Add a shard filter search phase to pre-filter shards based on query rewriting (#25658)
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.

This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 22:19:20 +02:00
..
community-clients Add link to community Rust Client (#22897) 2017-06-09 14:50:51 -07:00
groovy-api Use Versions.asciidoc for groovy docs too 2017-02-04 11:42:45 +01:00
java-api Mark Log4j API dependency as non-optional 2017-06-08 16:09:34 -04:00
java-rest Adding basic search request documentation for high level client (#25651) 2017-07-12 17:06:46 +02:00
painless Include shared/attributes.asciidoc from docs master 2017-07-03 18:17:34 +02:00
perl Updated copyright years to include 2016 (#17808) 2016-04-18 12:39:23 +02:00
plugins Fixed page breaks for ICU Collation Keyword Fields 2017-07-03 17:49:28 +02:00
python Remove most of the need for `// NOTCONSOLE` 2016-09-06 10:32:54 -04:00
reference Add a shard filter search phase to pre-filter shards based on query rewriting (#25658) 2017-07-12 22:19:20 +02:00
resiliency Add note regarding out-of-sync replicas 2017-03-07 14:25:23 -05:00
ruby Updated copyright years to include 2016 (#17808) 2016-04-18 12:39:23 +02:00
src/test Tests: Clean up rest test file handling (#21392) 2017-04-18 15:07:08 -07:00
README.asciidoc Add magic $_path stash key to docs tests (#24724) 2017-05-23 15:33:48 -04:00
Versions.asciidoc Include shared/attributes.asciidoc from docs master 2017-07-03 18:17:34 +02:00
build.gradle Remove reference to field-stats docs. 2017-07-11 18:38:25 +02:00

README.asciidoc

The Elasticsearch docs are in AsciiDoc format and can be built using the
Elasticsearch documentation build process.

See: https://github.com/elastic/docs

Snippets marked with `// CONSOLE` are automatically annotated with "VIEW IN
CONSOLE" and "COPY AS CURL" in the documentation and are automatically tested
by the command `gradle :docs:check`. To test just the docs from a single page,
use e.g. `gradle :docs:check -Dtests.method=*rollover*`.

By default each `// CONSOLE` snippet runs as its own isolated test. You can
manipulate the test execution in the following ways:

* `// TEST`: Explicitly marks a snippet as a test. Snippets marked this way
are tests even if they don't have `// CONSOLE` but usually `// TEST` is used
for its modifiers:
  * `// TEST[s/foo/bar/]`: Replace `foo` with `bar` in the generated test. This
  should be used sparingly because it makes the snippet "lie". Sometimes,
  though, you can use it to make the snippet more clear more clear. Keep in
  mind the that if there are multiple substitutions then they are applied in
  the order that they are defined.
  * `// TEST[catch:foo]`: Used to expect errors in the requests. Replace `foo`
  with `request` to expect a 400 error, for example. If the snippet contains
  multiple requests then only the last request will expect the error.
  * `// TEST[continued]`: Continue the test started in the last snippet. Between
  tests the nodes are cleaned: indexes are removed, etc. This prevents that
  from happening between snippets because the two snippets are a single test.
  This is most useful when you have text and snippets that work together to
  tell the story of some use case because it merges the snippets (and thus the
  use case) into one big test.
  * `// TEST[skip:reason]`: Skip this test. Replace `reason` with the actual
  reason to skip the test. Snippets without `// TEST` or `// CONSOLE` aren't
  considered tests anyway but this is useful for explicitly documenting the
  reason why the test shouldn't be run.
  * `// TEST[setup:name]`: Run some setup code before running the snippet. This
  is useful for creating and populating indexes used in the snippet. The setup
  code is defined in `docs/build.gradle`.
  * `// TEST[warning:some warning]`: Expect the response to include a `Warning`
  header. If the response doesn't include a `Warning` header with the exact
  text then the test fails. If the response includes `Warning` headers that
  aren't expected then the test fails.
* `// TESTRESPONSE`: Matches this snippet against the body of the response of
  the last test. If the response is JSON then order is ignored. If you add
  `// TEST[continued]` to the snippet after `// TESTRESPONSE` it will continue
  in the same test, allowing you to interleave requests with responses to check.
  * `// TESTRESPONSE[s/foo/bar/]`: Substitutions. See `// TEST[s/foo/bar]` for
  how it works. These are much more common than `// TEST[s/foo/bar]` because
  they are useful for eliding portions of the response that are not pertinent
  to the documentation.
    * One interesting difference here is that you often want to match against
    the response from Elasticsearch. To do that you can reference the "body" of
    the response like this: `// TESTRESPONSE[s/"took": 25/"took": $body.took/]`.
    Note the `$body` string. This says "I don't expect that 25 number in the
    response, just match against what is in the response." Instead of writing
    the path into the response after `$body` you can write `$_path` which
    "figures out" the path. This is especially useful for making sweeping
    assertions like "I made up all the numbers in this example, don't compare
    them" which looks like `// TESTRESPONSE[s/\d+/$body.$_path/]`.
  * `// TESTRESPONSE[_cat]`: Add substitutions for testing `_cat` responses. Use
  this after all other substitutions so it doesn't make other substitutions
  difficult.
* `// TESTSETUP`: Marks this snippet as the "setup" for all other snippets in
  this file. This is a somewhat natural way of structuring documentation. You
  say "this is the data we use to explain this feature" then you add the
  snippet that you mark `// TESTSETUP` and then every snippet will turn into
  a test that runs the setup snippet first. See the "painless" docs for a file
  that puts this to good use. This is fairly similar to `// TEST[setup:name]`
  but rather than the setup defined in `docs/build.gradle` the setup is defined
  right in the documentation file.

Any place you can use json you can use elements like `$body.path.to.thing`
which is replaced on the fly with the contents of the thing at `path.to.thing`
in the last response.