[[caching-heavy-aggregations]]
== Caching heavy aggregations

Frequently used aggregations (e.g. for display on the home page of a website)
can be cached for faster responses. These cached results are the same results
that would be returned by an uncached aggregation -- you will never get stale
results.

See <<shard-request-cache>> for more details.

[[returning-only-agg-results]]
== Returning only aggregation results

There are many occasions when aggregations are required but search hits are not. In these cases, the hits can be
omitted from the response by setting `size=0`. For example:

[source,console]
--------------------------------------------------
GET /twitter/_search
{
  "size": 0,
  "aggregations": {
    "my_agg": {
      "terms": {
        "field": "text"
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:twitter]

Setting `size` to `0` avoids executing the fetch phase of the search, which makes the request more efficient.

[[agg-metadata]]
== Aggregation Metadata

You can associate a piece of metadata with an individual aggregation at request time. The metadata is returned
unchanged, inside that aggregation's result, at response time.

Consider this example where we want to associate the color blue with our `terms` aggregation.

[source,console]
--------------------------------------------------
GET /twitter/_search
{
  "size": 0,
  "aggs": {
    "titles": {
      "terms": {
        "field": "title"
      },
      "meta": {
        "color": "blue"
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:twitter]

That piece of metadata is then returned in place, inside the result of our `titles` terms aggregation:

[source,console-result]
--------------------------------------------------
{
    "aggregations": {
        "titles": {
            "meta": {
                "color" : "blue"
            },
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
            ]
        }
    },
    ...
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits"/]
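
On the client side, the echoed metadata can be read back from the parsed response. The following is a minimal
sketch, assuming the response body has already been decoded into a Python dictionary shaped like the result above;
the `aggregation_meta` helper and the sample `response` are hypothetical names used only for illustration.

```python
# Hypothetical, already-decoded response body, shaped like the result above.
response = {
    "aggregations": {
        "titles": {
            "meta": {"color": "blue"},
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [],
        }
    }
}


def aggregation_meta(response, name):
    """Return the `meta` object of a top-level aggregation, or {} if absent."""
    agg = response.get("aggregations", {}).get(name, {})
    return agg.get("meta", {})


# Look up the value that was attached to the `titles` aggregation at request time.
color = aggregation_meta(response, "titles").get("color")
print(color)
```

Returning an empty dictionary for a missing aggregation or missing `meta` keeps the lookup safe for aggregations
that were sent without metadata.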


[[returning-aggregation-type]]
== Returning the type of the aggregation

Sometimes you need to know the exact type of an aggregation in order to parse its results. The `typed_keys` parameter
changes each aggregation's name in the response so that it is prefixed by its internal type.

Consider the following <<search-aggregations-bucket-datehistogram-aggregation,`date_histogram` aggregation>> named
`tweets_over_time`, which has a <<search-aggregations-metrics-top-hits-aggregation,`top_hits` sub-aggregation>> named
`top_users`:

[source,console]
--------------------------------------------------
GET /twitter/_search?typed_keys
{
  "aggregations": {
    "tweets_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "year"
      },
      "aggregations": {
        "top_users": {
            "top_hits": {
                "size": 1
            }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:twitter]

In the response, the aggregation names are changed to `date_histogram#tweets_over_time` and
`top_hits#top_users` respectively, reflecting the internal type of each aggregation:

[source,console-result]
--------------------------------------------------
{
    "aggregations": {
        "date_histogram#tweets_over_time": { <1>
            "buckets" : [
                {
                    "key_as_string" : "2009-01-01T00:00:00.000Z",
                    "key" : 1230768000000,
                    "doc_count" : 5,
                    "top_hits#top_users" : {  <2>
                        "hits" : {
                            "total" : {
                                "value": 5,
                                "relation": "eq"
                            },
                            "max_score" : 1.0,
                            "hits" : [
                                {
                                  "_index": "twitter",
                                  "_type": "_doc",
                                  "_id": "0",
                                  "_score": 1.0,
                                  "_source": {
                                    "date": "2009-11-15T14:12:12",
                                    "message": "trying out Elasticsearch",
                                    "user": "kimchy",
                                    "likes": 0
                                  }
                                }
                            ]
                        }
                    }
                }
            ]
        }
    },
    ...
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits"/]

<1> The name `tweets_over_time` now contains the `date_histogram` prefix.
<2> The name `top_users` now contains the `top_hits` prefix.

NOTE: For some aggregations, the returned type may differ from the one provided in the request. This is the case
for the terms, significant terms, and percentiles aggregations, where the returned type also carries information
about the type of the targeted field or the algorithm used: `lterms` (for a terms aggregation on a long field),
`sigsterms` (for a significant terms aggregation on a string field), `tdigest_percentiles` (for a percentiles
aggregation based on the TDigest algorithm).
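
A client that receives typed keys has to split each key back into its type and its original name. The following is
a minimal sketch of that step, assuming the `type#name` form shown in the response above; the `split_typed_key`
helper is a hypothetical name, and splitting on the first `#` is an assumption that holds as long as type prefixes
themselves never contain `#`.

```python
def split_typed_key(key):
    """Split 'type#name' into (type, name).

    Returns (None, key) when the key carries no type prefix, for example
    when the request was sent without `typed_keys`.
    """
    prefix, sep, name = key.partition("#")
    if not sep:
        return None, key
    return prefix, name


# Keys taken from the example response above, plus one untyped key.
for key in ("date_histogram#tweets_over_time", "top_hits#top_users", "top_users"):
    agg_type, agg_name = split_typed_key(key)
    print(agg_type, agg_name)
```

Because the returned type can be more specific than the requested one (for example `lterms` rather than `terms`),
a parser would typically dispatch on the returned prefix, not on the type named in the request.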