[[caching-heavy-aggregations]]
== Caching heavy aggregations

Frequently used aggregations (e.g. for display on the home page of a website)
can be cached for faster responses. These cached results are the same results
that an uncached aggregation would return, so you will never get stale
results.

See <<shard-request-cache>> for more details.

[[returning-only-agg-results]]
== Returning only aggregation results

There are many occasions when aggregations are required but search hits are not. In these cases, you can omit the hits by
setting `size` to `0`. For example:

[source,console]
--------------------------------------------------
GET /my-index-000001/_search
{
  "size": 0,
  "aggregations": {
    "my_agg": {
      "terms": {
        "field": "user.id"
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:my_index]

Setting `size` to `0` skips the fetch phase of the search, which makes the request more efficient.
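
If you build search bodies in client code, the same idea can be sketched in Python. This is only an illustration: the helper name `aggs_only` is hypothetical and not part of any {es} client, and it works on plain dictionaries without sending a request.

```python
import copy
import json


def aggs_only(search_body):
    """Return a copy of a search body that asks for aggregations but no hits.

    Hypothetical helper for illustration; not part of any client library.
    """
    if "aggs" not in search_body and "aggregations" not in search_body:
        raise ValueError("search body defines no aggregations")
    body = copy.deepcopy(search_body)  # leave the caller's dict untouched
    body["size"] = 0                   # request no hits, only aggregation results
    return body


body = aggs_only({
    "size": 10,
    "aggregations": {"my_agg": {"terms": {"field": "user.id"}}},
})
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of the `_search` request shown above.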

[[agg-metadata]]
== Aggregation metadata

You can associate a piece of metadata with an individual aggregation at request time. The metadata is returned
unchanged with that aggregation in the response.

Consider this example, where we want to associate the color blue with our `terms` aggregation:

[source,console]
--------------------------------------------------
GET /my-index-000001/_search
{
  "size": 0,
  "aggs": {
    "titles": {
      "terms": {
        "field": "title"
      },
      "meta": {
        "color": "blue"
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:my_index]

The response then returns that piece of metadata in place for our `titles` terms aggregation:

[source,console-result]
--------------------------------------------------
{
  "aggregations": {
    "titles": {
      "meta": {
        "color": "blue"
      },
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
      ]
    }
  },
  ...
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits"/]
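
On the client side, the metadata can be read back from the parsed response. The following is a minimal sketch, assuming the response has already been decoded into a Python dictionary shaped like the one above; the function name `collect_meta` is hypothetical.

```python
def collect_meta(aggregations):
    """Map each top-level aggregation name to its `meta` object, if present.

    Hypothetical helper for illustration; it only inspects a parsed response.
    """
    return {
        name: result["meta"]
        for name, result in aggregations.items()
        if isinstance(result, dict) and "meta" in result
    }


# Parsed response, trimmed to the `aggregations` part shown above.
response = {
    "aggregations": {
        "titles": {
            "meta": {"color": "blue"},
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [],
        }
    }
}

meta_by_agg = collect_meta(response["aggregations"])
print(meta_by_agg)  # {'titles': {'color': 'blue'}}
```

Only top-level aggregations are inspected here; sub-aggregations nested inside buckets would need a recursive walk.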

[[returning-aggregation-type]]
== Returning the type of the aggregation

Sometimes you need to know the exact type of an aggregation in order to parse its results. The `typed_keys` parameter
can be used to change the aggregation's name in the response so that it is prefixed by its internal type.

Consider the following <<search-aggregations-bucket-datehistogram-aggregation,`date_histogram` aggregation>> named
`requests_over_time`, which has a <<search-aggregations-metrics-top-hits-aggregation, `top_hits` sub-aggregation>> named
`top_users`:

[source,console]
--------------------------------------------------
GET /my-index-000001/_search?typed_keys
{
  "aggregations": {
    "requests_over_time": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "year"
      },
      "aggregations": {
        "top_users": {
          "top_hits": {
            "size": 1,
            "_source": ["user.id", "http.response.bytes", "message"]
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:my_index]

In the response, the aggregation names are changed to `date_histogram#requests_over_time` and
`top_hits#top_users`, respectively, reflecting the internal type of each aggregation:

[source,console-result]
--------------------------------------------------
{
  "aggregations": {
    "date_histogram#requests_over_time": { <1>
      "buckets": [
        {
          "key_as_string": "2099-01-01T00:00:00.000Z",
          "key": 4070908800000,
          "doc_count": 5,
          "top_hits#top_users": { <2>
            "hits": {
              "total": {
                "value": 5,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "my-index-000001",
                  "_type": "_doc",
                  "_id": "0",
                  "_score": 1.0,
                  "_source": {
                    "user": { "id": "kimchy"},
                    "message": "GET /search HTTP/1.1 200 1070000",
                    "http": { "response": { "bytes": 1070000 } }
                  }
                }
              ]
            }
          }
        }
      ]
    }
  },
  ...
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits"/]

<1> The name `requests_over_time` now contains the `date_histogram` prefix.
<2> The name `top_users` now contains the `top_hits` prefix.

NOTE: For some aggregations, the returned type may differ from the one provided with the
request. This is the case for Terms, Significant Terms and Percentiles aggregations, where the returned type
also contains information about the type of the targeted field: `lterms` (for a terms aggregation on a Long field),
`sigsterms` (for a significant terms aggregation on a String field), `tdigest_percentiles` (for a percentiles
aggregation based on the TDigest algorithm).
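
A client that receives typed keys has to split them back into a type and a name. The sketch below shows one way to do that in Python on an already-parsed response; the function names `split_typed_key` and `typed_aggs` are hypothetical. It treats every key containing `#` as a typed aggregation, so it assumes that no other key in the walked structure (for example a field name inside `_source`) contains that character.

```python
def split_typed_key(key):
    """Split 'type#name' into (type, name); return (None, key) if untyped."""
    agg_type, sep, name = key.partition("#")
    return (agg_type, name) if sep else (None, key)


def typed_aggs(node, path=()):
    """Yield (path, type) pairs for every typed aggregation key in a response.

    `path` is a tuple of aggregation names from the outermost aggregation down.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            agg_type, name = split_typed_key(key)
            if agg_type is not None:
                yield path + (name,), agg_type
                yield from typed_aggs(value, path + (name,))
            else:
                yield from typed_aggs(value, path)
    elif isinstance(node, list):
        for item in node:
            yield from typed_aggs(item, path)


# Trimmed version of the `aggregations` object from the response above.
aggregations = {
    "date_histogram#requests_over_time": {
        "buckets": [
            {
                "key_as_string": "2099-01-01T00:00:00.000Z",
                "key": 4070908800000,
                "doc_count": 5,
                "top_hits#top_users": {
                    "hits": {"total": {"value": 5, "relation": "eq"}, "hits": []}
                },
            }
        ]
    }
}

# Collapsing into a dict removes the repeats produced by multiple buckets.
types = dict(typed_aggs(aggregations))
for agg_path, agg_type in types.items():
    print("/".join(agg_path), "->", agg_type)
```

Because the returned type can be field-dependent, as described in the note above, the type half of the key may be a value such as `lterms` rather than the aggregation type named in the request.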

[[indexing-aggregation-results]]
== Indexing aggregation results with {transforms}

<<transforms,{transforms-cap}>> enable you to convert existing {es} indices
into summarized indices, which provide opportunities for new insights and
analytics. You can use {transforms} to persistently index your aggregation
results into entity-centric indices.