OpenSearch/rest-api-spec
Jim Ferenczi 5288235ca3
Optimize the composite aggregation for match_all and range queries (#28745)
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate
the collection when the leading source value is greater than the lowest value in the queue.
Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance the following aggregation:

```
"composite" : {
  "sources" : [
    { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
  ],
  "size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough
composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited.

This mode can execute iff:
 * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`.
 * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
 * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).

If these conditions are not met this aggregation visits each document like any other agg.
2018-03-26 09:51:37 +02:00
..
src/main/resources/rest-api-spec Optimize the composite aggregation for match_all and range queries (#28745) 2018-03-26 09:51:37 +02:00
.gitignore Initial commit (blank repository) 2013-05-23 17:56:22 +02:00
README.markdown Remove reference to utils for generating REST docs 2016-03-30 13:36:51 +02:00
build.gradle Build: Add pom building and associated files to rest api spec jar (#20460) 2016-09-13 14:05:52 -07:00

README.markdown

Elasticsearch REST API JSON specification

This repository contains a collection of JSON files which describe the Elasticsearch HTTP API.

Their purpose is to formalize and standardize the API, to facilitate development of libraries and integrations.

Example for the "Create Index" API:

{
  "indices.create": {
    "documentation": "http://www.elastic.co/guide/en/elasticsearch/reference/master/indices-create-index.html",
    "methods": ["PUT", "POST"],
    "url": {
      "path": "/{index}",
      "paths": ["/{index}"],
      "parts": {
        "index": {
          "type" : "string",
          "required" : true,
          "description" : "The name of the index"
        }
      },
      "params": {
        "timeout": {
          "type" : "time",
          "description" : "Explicit operation timeout"
        }
      }
    },
    "body": {
      "description" : "The configuration for the index (`settings` and `mappings`)"
    }
  }
}

The specification contains:

  • The name of the API (indices.create), which usually corresponds to the client calls
  • Link to the documentation at http://elastic.co
  • List of HTTP methods for the endpoint
  • URL specification: path, parts, parameters
  • Whether body is allowed for the endpoint or not and its description

The methods and url.paths elements list all possible HTTP methods and URLs for the endpoint; it is the responsibility of the developer to use this information for a sensible API on the target platform.

License

This software is licensed under the Apache License, version 2 ("ALv2").