[[caching-heavy-aggregations]]
== Caching heavy aggregations

Frequently used aggregations (e.g. for display on the home page of a website)
can be cached for faster responses. These cached results are the same results
that would be returned by an uncached aggregation -- you will never get stale
results.

See <<shard-request-cache>> for more details.

[[returning-only-agg-results]]
== Returning only aggregation results

There are many occasions when aggregations are required but search hits are not. In these cases the hits can be omitted by
setting `size=0`. For example:

[source,console]
--------------------------------------------------
GET /twitter/_search
{
    "size": 0,
    "aggregations": {
        "my_agg": {
            "terms": {
                "field": "text"
            }
        }
    }
}
--------------------------------------------------
// TEST[setup:twitter]

Setting `size` to `0` avoids executing the fetch phase of the search, making the request more efficient.

[[agg-metadata]]
== Aggregation Metadata

You can associate a piece of metadata with individual aggregations at request time that will be returned in place
at response time.

Consider this example where we want to associate the color blue with our `terms` aggregation.

[source,console]
--------------------------------------------------
GET /twitter/_search
{
    "size": 0,
    "aggs": {
        "titles": {
            "terms": {
                "field": "title"
            },
            "meta": {
                "color": "blue"
            }
        }
    }
}
--------------------------------------------------
// TEST[setup:twitter]

Then that piece of metadata will be returned in place for our `titles` terms aggregation:

[source,js]
--------------------------------------------------
{
    "aggregations": {
        "titles": {
            "meta": {
                "color" : "blue"
            },
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
            ]
        }
    },
    ...
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits"/]
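
On the client side, the metadata can be read straight from the parsed response body. The following is a minimal sketch, assuming the response has already been decoded into a Python dictionary; `collect_agg_meta` is a hypothetical helper name, not part of any Elasticsearch client, and it only looks at top-level aggregations.

```python
def collect_agg_meta(response):
    """Map each top-level aggregation name to its `meta` object, if any."""
    aggs = response.get("aggregations", {})
    return {
        name: body["meta"]
        for name, body in aggs.items()
        if isinstance(body, dict) and "meta" in body
    }


# Trimmed-down copy of the response shown above.
response = {
    "aggregations": {
        "titles": {
            "meta": {"color": "blue"},
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [],
        }
    }
}

print(collect_agg_meta(response))
```
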

[[returning-aggregation-type]]
== Returning the type of the aggregation

Sometimes you need to know the exact type of an aggregation in order to parse its results. The `typed_keys` parameter
can be used to change the aggregation's name in the response so that it will be prefixed by its internal type.

Consider the following <<search-aggregations-bucket-datehistogram-aggregation,`date_histogram` aggregation>> named
`tweets_over_time`, which has a sub <<search-aggregations-metrics-top-hits-aggregation,`top_hits` aggregation>> named
`top_users`:

[source,console]
--------------------------------------------------
GET /twitter/_search?typed_keys
{
    "aggregations": {
        "tweets_over_time": {
            "date_histogram": {
                "field": "date",
                "calendar_interval": "year"
            },
            "aggregations": {
                "top_users": {
                    "top_hits": {
                        "size": 1
                    }
                }
            }
        }
    }
}
--------------------------------------------------
// TEST[setup:twitter]

In the response, the aggregation names will be changed to `date_histogram#tweets_over_time` and
`top_hits#top_users` respectively, reflecting the internal types of each aggregation:

[source,js]
--------------------------------------------------
{
    "aggregations": {
        "date_histogram#tweets_over_time": { <1>
            "buckets" : [
                {
                    "key_as_string" : "2009-01-01T00:00:00.000Z",
                    "key" : 1230768000000,
                    "doc_count" : 5,
                    "top_hits#top_users" : { <2>
                        "hits" : {
                            "total" : {
                                "value": 5,
                                "relation": "eq"
                            },
                            "max_score" : 1.0,
                            "hits" : [
                                {
                                    "_index": "twitter",
                                    "_type": "_doc",
                                    "_id": "0",
                                    "_score": 1.0,
                                    "_source": {
                                        "date": "2009-11-15T14:12:12",
                                        "message": "trying out Elasticsearch",
                                        "user": "kimchy",
                                        "likes": 0
                                    }
                                }
                            ]
                        }
                    }
                }
            ]
        }
    },
    ...
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits"/]

<1> The name `tweets_over_time` now contains the `date_histogram` prefix.
<2> The name `top_users` now contains the `top_hits` prefix.

NOTE: For some aggregations, it is possible that the returned type is not the same as the one provided with the
request. This is the case for Terms, Significant Terms and Percentiles aggregations, where the returned type
also contains information about the type of the targeted field: `lterms` (for a terms aggregation on a Long field),
`sigsterms` (for a significant terms aggregation on a String field), `tdigest_percentiles` (for a percentiles
aggregation based on the TDigest algorithm).
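
A client that parses such a response needs to split each typed key back into its type and name. The following is a minimal sketch, assuming the response has already been decoded into a Python dictionary and that the type prefix itself never contains `#`; `typed_aggs` is a hypothetical helper name, not part of any Elasticsearch client.

```python
def typed_aggs(node):
    """Yield (agg_type, name, body) for every typed key directly under node."""
    for key, body in node.items():
        # Split on the first '#' only: "date_histogram#tweets_over_time"
        # becomes ("date_histogram", "#", "tweets_over_time").
        agg_type, sep, name = key.partition("#")
        if sep:  # keys without '#' (e.g. "key", "doc_count") are skipped
            yield agg_type, name, body


# Trimmed-down copy of the response shown above.
aggregations = {
    "date_histogram#tweets_over_time": {
        "buckets": [
            {
                "key": 1230768000000,
                "doc_count": 5,
                "top_hits#top_users": {
                    "hits": {"total": {"value": 5, "relation": "eq"}}
                },
            }
        ]
    }
}

for agg_type, name, body in typed_aggs(aggregations):
    print(agg_type, name)
    # Sub-aggregations appear inside each bucket, also with typed keys.
    for bucket in body.get("buckets", []):
        for sub_type, sub_name, _ in typed_aggs(bucket):
            print("  ", sub_type, sub_name)
```

Because the helper splits on the first `#` only, any later `#` stays in the aggregation name. The type it returns is the one reported by the response (for example `lterms`), which, as noted above, may differ from the type used in the request.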