OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-10 23:15:04 +00:00


Optimize the composite aggregation for match_all and range queries (#28745)

This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents because it can terminate
the collection early when the leading source value is greater than the lowest value in the queue.
Instead of collecting documents in doc_id order, this mode uses the inverted lists (or the BKD tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance, the following aggregation:

```
"composite" : {
    "sources" : [
        { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
    ],
    "size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For a composite aggregation with more than one source, the execution can terminate early as soon as one of the 10 lowest values produces enough
composite buckets. For instance, if visiting the two lowest timestamps created 10 composite buckets, the collection can terminate early because
the third lowest timestamp is guaranteed not to create a composite key that compares lower than the ones already visited.

This mode can execute if and only if:
* The leading source in the composite definition uses an indexed field of type `date` (this also works with a `date_histogram` source), `integer`, `long` or `keyword`.
* The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
* The sort order of the leading source is the natural order (ascending, since postings and numerics are sorted in ascending order only).

If these conditions are not met, this aggregation visits each document like any other aggregation.
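
The early termination described above can be illustrated with a small Python model. This is only a sketch, not the actual implementation: the function names, the `category` field, and the in-memory dictionary standing in for the inverted lists or BKD tree are all assumptions made for illustration.

```python
from collections import defaultdict


def composite_naive(docs, size):
    """Visit every document, then keep the `size` smallest composite keys."""
    keys = {(d["timestamp"], d["category"]) for d in docs}
    return sorted(keys)[:size], len(docs)


def composite_early_terminate(docs, size):
    """Visit documents grouped by ascending leading value and stop early."""
    # Hypothetical stand-in for the index structure: leading value -> documents.
    # A real index already provides this ordering without a full scan.
    by_leading = defaultdict(list)
    for d in docs:
        by_leading[d["timestamp"]].append(d)

    keys, visited = set(), 0
    for value in sorted(by_leading):
        if len(keys) >= size:
            break  # later leading values can only produce larger composite keys
        for d in by_leading[value]:
            keys.add((value, d["category"]))
            visited += 1
    return sorted(keys)[:size], visited


# 100 timestamps with 3 categories each: 300 documents in total.
docs = [{"timestamp": t, "category": c}
        for t in range(100) for c in ("a", "b", "c")]
```

With `size=5`, both functions return the same five composite keys, but the early-terminating version only counts the documents of the two lowest timestamps as visited, while the naive version visits all 300.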

2018-03-26 09:51:37 +02:00

community-clients

Docs: Link C++ client lib elasticlient (#28949)

2018-03-23 11:30:01 -04:00

groovy-api

[Docs] Unify spelling of Elasticsearch (#27567)

2017-11-29 09:44:25 +01:00

java-api

Added minimal docs for reindex api in java-api docs

2018-03-16 07:42:48 +01:00

java-rest

Docs: HighLevelRestClient#multiSearch (#29144)

2018-03-23 10:11:50 -04:00

painless

Docs: Support triple quotes (#28915)

2018-03-16 12:46:39 -04:00

perl

Updated copyright years to include 2016 (#17808)

2016-04-18 12:39:23 +02:00

plugins

Add ingest-attachment support for per document indexed_chars limit (#28977)

2018-03-14 19:07:20 +01:00

python

Update version information (#25226)

2017-08-15 15:00:11 -06:00

reference

Optimize the composite aggregation for match_all and range queries (#28745)

2018-03-26 09:51:37 +02:00

resiliency

Resilience page - Remove 6.0.0 as a target for the discovery refactoring. (#26311)

2017-08-21 18:15:24 +02:00

ruby

2016-04-18 12:39:23 +02:00

src/test

Revert shading for the low level rest client (#26367)

2017-08-25 14:13:12 -05:00

build.gradle

Rename core module to server (#28180)

2018-01-11 11:30:43 -07:00

README.asciidoc

[DOCS] Clarified readme for testing a single page

2017-08-15 15:11:12 -07:00

Versions.asciidoc

[Docs] Update links to java9 docs (#28750)

2018-02-21 10:40:18 +01:00

README.asciidoc

The Elasticsearch docs are in AsciiDoc format and can be built using the
Elasticsearch documentation build process.

See: https://github.com/elastic/docs

Snippets marked with `// CONSOLE` are automatically annotated with "VIEW IN
CONSOLE" and "COPY AS CURL" in the documentation and are automatically tested
by the command `gradle :docs:check`. To test just the docs from a single page,
use e.g. `gradle :docs:check -Dtests.method="\*rollover*"`.

By default each `// CONSOLE` snippet runs as its own isolated test. You can
manipulate the test execution in the following ways:

* `// TEST`: Explicitly marks a snippet as a test. Snippets marked this way
are tests even if they don't have `// CONSOLE` but usually `// TEST` is used
for its modifiers:
  * `// TEST[s/foo/bar/]`: Replace `foo` with `bar` in the generated test. This
  should be used sparingly because it makes the snippet "lie". Sometimes,
  though, you can use it to make the snippet clearer. Keep in mind that if
  there are multiple substitutions then they are applied in the order that
  they are defined.
  * `// TEST[catch:foo]`: Used to expect errors in the requests. Replace `foo`
  with `request` to expect a 400 error, for example. If the snippet contains
  multiple requests then only the last request will expect the error.
  * `// TEST[continued]`: Continue the test started in the last snippet. Between
  tests the nodes are cleaned: indexes are removed, etc. This prevents that
  from happening between snippets because the two snippets are a single test.
  This is most useful when you have text and snippets that work together to
  tell the story of some use case because it merges the snippets (and thus the
  use case) into one big test.
  * `// TEST[skip:reason]`: Skip this test. Replace `reason` with the actual
  reason to skip the test. Snippets without `// TEST` or `// CONSOLE` aren't
  considered tests anyway but this is useful for explicitly documenting the
  reason why the test shouldn't be run.
  * `// TEST[setup:name]`: Run some setup code before running the snippet. This
  is useful for creating and populating indexes used in the snippet. The setup
  code is defined in `docs/build.gradle`.
  * `// TEST[warning:some warning]`: Expect the response to include a `Warning`
  header. If the response doesn't include a `Warning` header with the exact
  text then the test fails. If the response includes `Warning` headers that
  aren't expected then the test fails.
* `// TESTRESPONSE`: Matches this snippet against the body of the response of
  the last test. If the response is JSON then order is ignored. If you add
  `// TEST[continued]` to the snippet after `// TESTRESPONSE` it will continue
  in the same test, allowing you to interleave requests with responses to check.
  * `// TESTRESPONSE[s/foo/bar/]`: Substitutions. See `// TEST[s/foo/bar/]` for
  how it works. These are much more common than `// TEST[s/foo/bar/]` because
  they are useful for eliding portions of the response that are not pertinent
  to the documentation.
    * One interesting difference here is that you often want to match against
    the response from Elasticsearch. To do that you can reference the "body" of
    the response like this: `// TESTRESPONSE[s/"took": 25/"took": $body.took/]`.
    Note the `$body` string. This says "I don't expect that 25 number in the
    response, just match against what is in the response." Instead of writing
    the path into the response after `$body` you can write `$_path` which
    "figures out" the path. This is especially useful for making sweeping
    assertions like "I made up all the numbers in this example, don't compare
    them" which looks like `// TESTRESPONSE[s/\d+/$body.$_path/]`.
  * `// TESTRESPONSE[_cat]`: Add substitutions for testing `_cat` responses. Use
  this after all other substitutions so it doesn't make other substitutions
  difficult.
* `// TESTSETUP`: Marks this snippet as the "setup" for all other snippets in
  this file. This is a somewhat natural way of structuring documentation. You
  say "this is the data we use to explain this feature" then you add the
  snippet that you mark `// TESTSETUP` and then every snippet will turn into
  a test that runs the setup snippet first. See the "painless" docs for a file
  that puts this to good use. This is fairly similar to `// TEST[setup:name]`
  but rather than the setup defined in `docs/build.gradle` the setup is defined
  right in the documentation file.
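
To make the ordering rule for `s/foo/bar/` substitutions concrete, here is a minimal Python sketch. It is not the code the docs build uses: the function name `apply_substitutions` and the use of Python's `re` module are assumptions made for illustration, and the real build may treat patterns and replacements differently.

```python
import re


def apply_substitutions(snippet, substitutions):
    """Apply (pattern, replacement) pairs in the order they are listed."""
    for pattern, replacement in substitutions:
        # A callable replacement keeps characters such as `$` and `\` literal.
        snippet = re.sub(pattern, lambda _m, r=replacement: r, snippet)
    return snippet


# Hypothetical response body used only for this example.
response = '{"took": 25, "hits": {"total": 3}}'
rewritten = apply_substitutions(
    response, [(r'"took": 25', '"took": $body.took')]
)

# Order matters: the second substitution sees the output of the first.
forward = apply_substitutions("foo", [("foo", "bar"), ("bar", "baz")])
backward = apply_substitutions("foo", [("bar", "baz"), ("foo", "bar")])
```

Here `forward` ends up as `baz` while `backward` ends up as `bar`, which is why the order in which substitutions are defined can change the generated test.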

Any place you can use JSON you can use elements like `$body.path.to.thing`
which is replaced on the fly with the contents of the thing at `path.to.thing`
in the last response.
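
The lookup behind `$body.path.to.thing` can be pictured as a walk through the parsed response. The following Python sketch is an illustration only: `resolve_body_path` is a hypothetical helper, and treating numeric path segments as list indexes is an assumption rather than documented behavior.

```python
import json


def resolve_body_path(body, reference):
    """Look up a `$body.a.b.c` reference in a parsed JSON response."""
    prefix = "$body."
    if not reference.startswith(prefix):
        raise ValueError(f"not a $body reference: {reference}")
    value = body
    for part in reference[len(prefix):].split("."):
        # Assumption: numeric segments index into lists, others are object keys.
        value = value[int(part)] if isinstance(value, list) else value[part]
    return value


# Hypothetical "last response" used only for this example.
last_response = json.loads(
    '{"took": 25, "hits": {"total": 3, "hits": [{"_id": "1"}]}}'
)
total = resolve_body_path(last_response, "$body.hits.total")
first_id = resolve_body_path(last_response, "$body.hits.hits.0._id")
```

In this model, `total` is the number stored under `hits.total` in the last response and `first_id` is the `_id` of the first hit.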