405 lines
8.7 KiB
Plaintext
405 lines
8.7 KiB
Plaintext
[role="xpack"]
|
|
[testenv="basic"]
|
|
[[search-aggregations-metrics-top-metrics]]
|
|
=== Top Metrics Aggregation
|
|
|
|
experimental[We expect to change the response format of this aggregation as we add more features., https://github.com/elastic/elasticsearch/issues/51813]
|
|
|
|
The `top_metrics` aggregation selects metrics from the document with the largest or smallest "sort"
|
|
value. For example, This gets the value of the `v` field on the document with the largest value of `s`:
|
|
|
|
[source,console,id=search-aggregations-metrics-top-metrics-simple]
|
|
----
|
|
POST /test/_bulk?refresh
|
|
{"index": {}}
|
|
{"s": 1, "v": 3.1415}
|
|
{"index": {}}
|
|
{"s": 2, "v": 1.0}
|
|
{"index": {}}
|
|
{"s": 3, "v": 2.71828}
|
|
POST /test/_search?filter_path=aggregations
|
|
{
|
|
"aggs": {
|
|
"tm": {
|
|
"top_metrics": {
|
|
"metrics": {"field": "v"},
|
|
"sort": {"s": "desc"}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----
|
|
|
|
Which returns:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"aggregations": {
|
|
"tm": {
|
|
"top": [ {"sort": [3], "metrics": {"v": 2.718280076980591 } } ]
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE
|
|
|
|
`top_metrics` is fairly similar to <<search-aggregations-metrics-top-hits-aggregation, `top_hits`>>
|
|
in spirit but because it is more limited it is able to do its job using less memory and is often
|
|
faster.
|
|
|
|
==== `sort`
|
|
|
|
The `sort` field in the metric request functions exactly the same as the `sort` field in the
|
|
<<request-body-search-sort, search>> request except:
|
|
* It can't be used on <<binary,binary>>, <<flattened,flattened>, <<ip,ip>>,
|
|
<<keyword,keyword>>, or <<text,text>> fields.
|
|
* It only supports a single sort value.
|
|
|
|
The metrics that the aggregation returns is the first hit that would be returned by the search
|
|
request. So,
|
|
|
|
`"sort": {"s": "desc"}`:: gets metrics from the document with the highest `s`
|
|
`"sort": {"s": "asc"}`:: gets the metrics from the document with the lowest `s`
|
|
`"sort": {"_geo_distance": {"location": "35.7796, -78.6382"}}`::
|
|
gets metrics from the documents with `location` *closest* to `35.7796, -78.6382`
|
|
`"sort": "_score"`:: gets metrics from the document with the highest score
|
|
|
|
NOTE: This aggregation doesn't support any sort of "tie breaking". If two documents have
|
|
the same sort values then this aggregation could return either document's fields.
|
|
|
|
==== `metrics`
|
|
|
|
`metrics` selects the fields to of the "top" document to return.
|
|
|
|
You can return multiple metrics by providing a list:
|
|
|
|
[source,console,id=search-aggregations-metrics-top-metrics-list-of-metrics]
|
|
----
|
|
PUT /test
|
|
{
|
|
"mappings": {
|
|
"properties": {
|
|
"d": {"type": "date"}
|
|
}
|
|
}
|
|
}
|
|
POST /test/_bulk?refresh
|
|
{"index": {}}
|
|
{"s": 1, "v": 3.1415, "m": 1, "d": "2020-01-01T00:12:12Z"}
|
|
{"index": {}}
|
|
{"s": 2, "v": 1.0, "m": 6, "d": "2020-01-02T00:12:12Z"}
|
|
{"index": {}}
|
|
{"s": 3, "v": 2.71828, "m": -12, "d": "2019-12-31T00:12:12Z"}
|
|
POST /test/_search?filter_path=aggregations
|
|
{
|
|
"aggs": {
|
|
"tm": {
|
|
"top_metrics": {
|
|
"metrics": [
|
|
{"field": "v"},
|
|
{"field": "m"},
|
|
{"field": "d"}
|
|
],
|
|
"sort": {"s": "desc"}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----
|
|
|
|
Which returns:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"aggregations": {
|
|
"tm": {
|
|
"top": [ {
|
|
"sort": [3],
|
|
"metrics": {
|
|
"v": 2.718280076980591,
|
|
"m": -12,
|
|
"d": "2019-12-31T00:12:12.000Z"
|
|
}
|
|
} ]
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE
|
|
|
|
==== `size`
|
|
|
|
`top_metrics` can return the top few document's worth of metrics using the size parameter:
|
|
|
|
[source,console,id=search-aggregations-metrics-top-metrics-size]
|
|
----
|
|
POST /test/_bulk?refresh
|
|
{"index": {}}
|
|
{"s": 1, "v": 3.1415}
|
|
{"index": {}}
|
|
{"s": 2, "v": 1.0}
|
|
{"index": {}}
|
|
{"s": 3, "v": 2.71828}
|
|
POST /test/_search?filter_path=aggregations
|
|
{
|
|
"aggs": {
|
|
"tm": {
|
|
"top_metrics": {
|
|
"metrics": {"field": "v"},
|
|
"sort": {"s": "desc"},
|
|
"size": 2
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----
|
|
|
|
Which returns:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"aggregations": {
|
|
"tm": {
|
|
"top": [
|
|
{"sort": [3], "metrics": {"v": 2.718280076980591 } },
|
|
{"sort": [2], "metrics": {"v": 1.0 } }
|
|
]
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE
|
|
|
|
The default `size` is 1. The maximum default size is `10` because the aggregation's
|
|
working storage is "dense", meaning we allocate `size` slots for every bucket. `10`
|
|
is a *very* conservative default maximum and you can raise it if you need to by
|
|
changing the `top_metrics_max_size` index setting. But know that large sizes can
|
|
take a fair bit of memory, especially if they are inside of an aggregation which
|
|
makes many buckes like a large
|
|
<<search-aggregations-metrics-top-metrics-example-terms, terms aggregation>>.
|
|
|
|
[source,console]
|
|
----
|
|
PUT /test/_settings
|
|
{
|
|
"top_metrics_max_size": 100
|
|
}
|
|
----
|
|
// TEST[continued]
|
|
|
|
NOTE: If `size` is more than `1` the `top_metrics` aggregation can't be the target of a sort.
|
|
|
|
|
|
==== Examples
|
|
|
|
[[search-aggregations-metrics-top-metrics-example-terms]]
|
|
===== Use with terms
|
|
|
|
This aggregation should be quite useful inside of <<search-aggregations-bucket-terms-aggregation, `terms`>>
|
|
aggregation, to, say, find the last value reported by each server.
|
|
|
|
[source,console,id=search-aggregations-metrics-top-metrics-terms]
|
|
----
|
|
PUT /node
|
|
{
|
|
"mappings": {
|
|
"properties": {
|
|
"ip": {"type": "ip"},
|
|
"date": {"type": "date"}
|
|
}
|
|
}
|
|
}
|
|
POST /node/_bulk?refresh
|
|
{"index": {}}
|
|
{"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "v": 1}
|
|
{"index": {}}
|
|
{"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "v": 2}
|
|
{"index": {}}
|
|
{"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "v": 3}
|
|
POST /node/_search?filter_path=aggregations
|
|
{
|
|
"aggs": {
|
|
"ip": {
|
|
"terms": {
|
|
"field": "ip"
|
|
},
|
|
"aggs": {
|
|
"tm": {
|
|
"top_metrics": {
|
|
"metrics": {"field": "v"},
|
|
"sort": {"date": "desc"}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----
|
|
|
|
Which returns:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"aggregations": {
|
|
"ip": {
|
|
"buckets": [
|
|
{
|
|
"key": "192.168.0.1",
|
|
"doc_count": 2,
|
|
"tm": {
|
|
"top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 2 } } ]
|
|
}
|
|
},
|
|
{
|
|
"key": "192.168.0.2",
|
|
"doc_count": 1,
|
|
"tm": {
|
|
"top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 3 } } ]
|
|
}
|
|
}
|
|
],
|
|
"doc_count_error_upper_bound": 0,
|
|
"sum_other_doc_count": 0
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE
|
|
|
|
Unlike `top_hits`, you can sort buckets by the results of this metric:
|
|
|
|
[source,console]
|
|
----
|
|
POST /node/_search?filter_path=aggregations
|
|
{
|
|
"aggs": {
|
|
"ip": {
|
|
"terms": {
|
|
"field": "ip",
|
|
"order": {"tm.v": "desc"}
|
|
},
|
|
"aggs": {
|
|
"tm": {
|
|
"top_metrics": {
|
|
"metrics": {"field": "v"},
|
|
"sort": {"date": "desc"}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TEST[continued]
|
|
|
|
Which returns:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"aggregations": {
|
|
"ip": {
|
|
"buckets": [
|
|
{
|
|
"key": "192.168.0.2",
|
|
"doc_count": 1,
|
|
"tm": {
|
|
"top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 3 } } ]
|
|
}
|
|
},
|
|
{
|
|
"key": "192.168.0.1",
|
|
"doc_count": 2,
|
|
"tm": {
|
|
"top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"v": 2 } } ]
|
|
}
|
|
}
|
|
],
|
|
"doc_count_error_upper_bound": 0,
|
|
"sum_other_doc_count": 0
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE
|
|
|
|
===== Mixed sort types
|
|
|
|
Sorting `top_metrics` by a field that has different types across different
|
|
indices producs somewhat suprising results: floating point fields are
|
|
always sorted independantly of whole numbered fields.
|
|
|
|
[source,console,id=search-aggregations-metrics-top-metrics-mixed-sort]
|
|
----
|
|
POST /test/_bulk?refresh
|
|
{"index": {"_index": "test1"}}
|
|
{"s": 1, "v": 3.1415}
|
|
{"index": {"_index": "test1"}}
|
|
{"s": 2, "v": 1}
|
|
{"index": {"_index": "test2"}}
|
|
{"s": 3.1, "v": 2.71828}
|
|
POST /test*/_search?filter_path=aggregations
|
|
{
|
|
"aggs": {
|
|
"tm": {
|
|
"top_metrics": {
|
|
"metrics": {"field": "v"},
|
|
"sort": {"s": "asc"}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----
|
|
|
|
Which returns:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"aggregations": {
|
|
"tm": {
|
|
"top": [ {"sort": [3.0999999046325684], "metrics": {"v": 2.718280076980591 } } ]
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE
|
|
|
|
While this is better than an error it *probably* isn't what you were going for.
|
|
While it does lose some precision, you can explictly cast the whole number
|
|
fields to floating points with something like:
|
|
|
|
[source,console]
|
|
----
|
|
POST /test*/_search?filter_path=aggregations
|
|
{
|
|
"aggs": {
|
|
"tm": {
|
|
"top_metrics": {
|
|
"metrics": {"field": "v"},
|
|
"sort": {"s": {"order": "asc", "numeric_type": "double"}}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TEST[continued]
|
|
|
|
Which returns the much more expected:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"aggregations": {
|
|
"tm": {
|
|
"top": [ {"sort": [1.0], "metrics": {"v": 3.1414999961853027 } } ]
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE
|