OpenSearch/rest-api-spec
Nik Everett ab2c6d9696
Save memory when auto_date_histogram is not on top (backport of #57304) (#58190)
This builds an `auto_date_histogram` aggregator that natively aggregates
from many buckets and uses it when the `auto_date_histogram` used to use
`asMultiBucketAggregator` which should save a significant amount of
memory in those cases. In particular, this happens when
`auto_date_histogram` is a sub-aggregator of a multi-bucketing aggregator
like `terms` or `histogram` or `filters`. For the most part we preserve
the original implementation when `auto_date_histogram` only collects from
a single bucket.

It isn't possible to "just port the aggregator" without taking a pretty
significant performance hit because we used to rewrite all of the
buckets every time we switched to a coarser and coarser rounding
configuration. Without some major surgery to how to delay sub-aggs
we'd end up rewriting the delay list zillions of time if there are many
buckets.

The multi-bucket version of the aggregator has a "budget" of "wasted"
buckets and only rewrites all of the buckets when we exceed that budget.
Now that we don't rebucket every time we increase the rounding we can no
longer get an accurate count of the number of buckets! So instead the
aggregator uses an estimate of the number of buckets to trigger switching
to a coarser rounding. This estimate is likely to be *terrible* when
buckets are far apart compared to the rounding. So it also uses the
difference between the first and last bucket to trigger switching to a
coarser rounding. Which covers for the shortcomings of the bucket
estimation technique pretty well. It also causes the aggregator to emit
fewer buckets in cases where they'd be reduced together on the
coordinating node. This is wonderful! But probably fairly rare.

All of that does buy us some speed improvements when the aggregator is
a child of multi-bucket aggregator:
Without metrics or time zone: 25% faster
With metrics: 15% faster
With time zone: 22% faster

Relates to #56487
2020-06-17 08:48:41 -04:00
..
src/main/resources Save memory when auto_date_histogram is not on top (backport of #57304) (#58190) 2020-06-17 08:48:41 -04:00
.gitignore
README.markdown [DOCS] Note doc links should be live in REST API JSON specs (#53871) 2020-03-23 07:42:03 -04:00
build.gradle 7.x only REST specification fixes (#56736) 2020-05-26 12:33:57 -05:00

README.markdown

Elasticsearch REST API JSON specification

This repository contains a collection of JSON files which describe the Elasticsearch HTTP API.

Their purpose is to formalize and standardize the API, to facilitate development of libraries and integrations.

Example for the "Create Index" API:

{
  "indices.create": {
    "documentation":{
      "url":"https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-create-index.html"
    },
    "stability": "stable",
    "url":{
      "paths":[
        {
          "path":"/{index}",
          "method":"PUT",
          "parts":{
            "index":{
              "type":"string",
              "description":"The name of the index"
            }
          }
        }
      ]
    },
    "params": {
      "timeout": {
        "type" : "time",
        "description" : "Explicit operation timeout"
      }
    },
    "body": {
      "description" : "The configuration for the index (`settings` and `mappings`)"
    }
  }
}

The specification contains:

  • The name of the API (indices.create), which usually corresponds to the client calls

  • Link to the documentation at the http://elastic.co website.

    IMPORANT: This should be a live link. Several downstream ES clients use this link to generate their documentation. Using a broken link or linking to yet-to-be-created doc pages can break the Elastic docs build.

  • stability indicating the state of the API, has to be declared explicitly or YAML tests will fail

    • experimental highly likely to break in the near future (minor/path), no bwc guarantees. Possibly removed in the future.
    • beta less likely to break or be removed but still reserve the right to do so
    • stable No backwards breaking changes in a minor
  • Request URL: HTTP method, path and parts

  • Request parameters

  • Request body specification

NOTE If an API is stable but it response should be treated as an arbitrary map of key values please notate this as followed

{
  "api.name": {
    "stability" : "stable",
    "response": {
      "treat_json_as_key_value" : true
    }
  }
}

Backwards compatibility

The specification follows the same backward compatibility guarantees as Elasticsearch.

  • Within a Major, additions only.
  • If an item has been documented wrong it should be deprecated instead as removing these might break downstream clients.
  • Major version change, may deprecate pieces or simply remove them given enough deprecation time.

Deprecations

The specification schema allows to codify API deprecations, either for an entire API, or for specific parts of the API, such as paths or parameters.

Entire API:

{
  "api" : {
    "deprecated" : {
      "version" : "7.0.0",
      "description" : "Reason API is being deprecated"
    },
  }
}

Specific paths and their parts:

{
  "api": {
    "url": {
      "paths": [
        {
          "path":"/{index}/{type}/{id}/_create",
          "method":"PUT",
          "parts":{
            "id":{
              "type":"string",
              "description":"Document ID"
            },
            "index":{
              "type":"string",
              "description":"The name of the index"
            },
            "type":{
              "type":"string",
              "description":"The type of the document",
              "deprecated":true
            }
          },
          "deprecated":{
            "version":"7.0.0",
            "description":"Specifying types in urls has been deprecated"
          }
        }
      ]
    }
  }
}

Parameters

{
  "api": {
    "url": {
      "params": {
        "stored_fields": {
          "type": "list",
          "description" : "",
          "deprecated" : {
            "version" : "7.0.0",
            "description" : "Reason parameter is being deprecated"
          }
        }
      }
    }
  }
}

License

This software is licensed under the Apache License, version 2 ("ALv2").