241 lines
7.3 KiB
Plaintext
241 lines
7.3 KiB
Plaintext
[role="xpack"]
|
|
[testenv="basic"]
|
|
[[rollup-search]]
|
|
=== Rollup search
|
|
++++
|
|
<titleabbrev>Rollup search</titleabbrev>
|
|
++++
|
|
|
|
experimental[]
|
|
|
|
The Rollup Search endpoint allows searching rolled-up data using the standard query DSL. The Rollup Search endpoint
|
|
is needed because, internally, rolled-up documents utilize a different document structure than the original data. The
|
|
Rollup Search endpoint rewrites standard query DSL into a format that matches the rollup documents, then takes the response
|
|
and rewrites it back to what a client would expect given the original query.
|
|
|
|
==== Request
|
|
|
|
`GET {index}/_rollup_search`
|
|
|
|
//===== Description
|
|
|
|
==== Path Parameters
|
|
|
|
`index`::
|
|
(string) Index, indices or index-pattern to execute a rollup search against. This can include both rollup and non-rollup
|
|
indices.
|
|
|
|
Rules for the `index` parameter:
|
|
|
|
- At least one index/index-pattern must be specified. This can be either a rollup or non-rollup index. Omitting the index parameter,
|
|
or using `_all`, is not permitted
|
|
- Multiple non-rollup indices may be specified
|
|
- Only one rollup index may be specified. If more than one are supplied an exception will be thrown
|
|
- Index patterns may be used, but if they match more than one rollup index an exception will be thrown.
|
|
|
|
==== Request Body
|
|
|
|
The request body supports a subset of features from the regular Search API. It supports:
|
|
|
|
- `query` param for specifying an DSL query, subject to some limitations (see <<rollup-search-limitations>> and <<rollup-agg-limitations>>
|
|
- `aggregations` param for specifying aggregations
|
|
|
|
Functionality that is not available:
|
|
|
|
- `size`: because rollups work on pre-aggregated data, no search hits can be returned and so size must be set to zero or
|
|
omitted entirely.
|
|
- `highlighter`, `suggestors`, `post_filter`, `profile`, `explain` are similarly disallowed
|
|
|
|
|
|
==== Historical-only search example
|
|
|
|
Imagine we have an index named `sensor-1` full of raw data, and we have created a rollup job with the following configuration:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
PUT _rollup/job/sensor
|
|
{
|
|
"index_pattern": "sensor-*",
|
|
"rollup_index": "sensor_rollup",
|
|
"cron": "*/30 * * * * ?",
|
|
"page_size" :1000,
|
|
"groups" : {
|
|
"date_histogram": {
|
|
"field": "timestamp",
|
|
"interval": "1h",
|
|
"delay": "7d"
|
|
},
|
|
"terms": {
|
|
"fields": ["node"]
|
|
}
|
|
},
|
|
"metrics": [
|
|
{
|
|
"field": "temperature",
|
|
"metrics": ["min", "max", "sum"]
|
|
},
|
|
{
|
|
"field": "voltage",
|
|
"metrics": ["avg"]
|
|
}
|
|
]
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[setup:sensor_index]
|
|
|
|
This rolls up the `sensor-*` pattern and stores the results in `sensor_rollup`. To search this rolled up data, we
|
|
need to use the `_rollup_search` endpoint. However, you'll notice that we can use regular query DSL to search the
|
|
rolled-up data:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /sensor_rollup/_rollup_search
|
|
{
|
|
"size": 0,
|
|
"aggregations": {
|
|
"max_temperature": {
|
|
"max": {
|
|
"field": "temperature"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[setup:sensor_prefab_data]
|
|
// TEST[s/_rollup_search/_rollup_search?filter_path=took,timed_out,terminated_early,_shards,hits,aggregations/]
|
|
|
|
The query is targeting the `sensor_rollup` data, since this contains the rollup data as configured in the job. A `max`
|
|
aggregation has been used on the `temperature` field, yielding the following response:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"took" : 102,
|
|
"timed_out" : false,
|
|
"terminated_early" : false,
|
|
"_shards" : ... ,
|
|
"hits" : {
|
|
"total" : {
|
|
"value": 0,
|
|
"relation": "eq"
|
|
},
|
|
"max_score" : 0.0,
|
|
"hits" : [ ]
|
|
},
|
|
"aggregations" : {
|
|
"max_temperature" : {
|
|
"value" : 202.0
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE[s/"took" : 102/"took" : $body.$_path/]
|
|
// TESTRESPONSE[s/"_shards" : \.\.\. /"_shards" : $body.$_path/]
|
|
|
|
The response is exactly as you'd expect from a regular query + aggregation; it provides some metadata about the request
|
|
(`took`, `_shards`, etc), the search hits (which is always empty for rollup searches), and the aggregation response.
|
|
|
|
Rollup searches are limited to functionality that was configured in the rollup job. For example, we are not able to calculate
|
|
the average temperature because `avg` was not one of the configured metrics for the `temperature` field. If we try
|
|
to execute that search:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET sensor_rollup/_rollup_search
|
|
{
|
|
"size": 0,
|
|
"aggregations": {
|
|
"avg_temperature": {
|
|
"avg": {
|
|
"field": "temperature"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[continued]
|
|
// TEST[catch:/illegal_argument_exception/]
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"error" : {
|
|
"root_cause" : [
|
|
{
|
|
"type" : "illegal_argument_exception",
|
|
"reason" : "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
|
|
"stack_trace": ...
|
|
}
|
|
],
|
|
"type" : "illegal_argument_exception",
|
|
"reason" : "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
|
|
"stack_trace": ...
|
|
},
|
|
"status": 400
|
|
}
|
|
----
|
|
// TESTRESPONSE[s/"stack_trace": \.\.\./"stack_trace": $body.$_path/]
|
|
|
|
==== Searching both historical rollup and non-rollup data
|
|
|
|
The Rollup Search API has the capability to search across both "live", non-rollup data as well as the aggregated rollup
|
|
data. This is done by simply adding the live indices to the URI:
|
|
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET sensor-1,sensor_rollup/_rollup_search <1>
|
|
{
|
|
"size": 0,
|
|
"aggregations": {
|
|
"max_temperature": {
|
|
"max": {
|
|
"field": "temperature"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[continued]
|
|
// TEST[s/_rollup_search/_rollup_search?filter_path=took,timed_out,terminated_early,_shards,hits,aggregations/]
|
|
<1> Note the URI now searches `sensor-1` and `sensor_rollup` at the same time
|
|
|
|
When the search is executed, the Rollup Search endpoint will do two things:
|
|
|
|
1. The original request will be sent to the non-rollup index unaltered
|
|
2. A rewritten version of the original request will be sent to the rollup index.
|
|
|
|
When the two responses are received, the endpoint will then rewrite the rollup response and merge the two together.
|
|
During the merging process, if there is any overlap in buckets between the two responses, the buckets from the non-rollup
|
|
index will be used.
|
|
|
|
The response to the above query will look as expected, despite spanning rollup and non-rollup indices:
|
|
|
|
[source,js]
|
|
----
|
|
{
|
|
"took" : 102,
|
|
"timed_out" : false,
|
|
"terminated_early" : false,
|
|
"_shards" : ... ,
|
|
"hits" : {
|
|
"total" : {
|
|
"value": 0,
|
|
"relation": "eq"
|
|
},
|
|
"max_score" : 0.0,
|
|
"hits" : [ ]
|
|
},
|
|
"aggregations" : {
|
|
"max_temperature" : {
|
|
"value" : 202.0
|
|
}
|
|
}
|
|
}
|
|
----
|
|
// TESTRESPONSE[s/"took" : 102/"took" : $body.$_path/]
|
|
// TESTRESPONSE[s/"_shards" : \.\.\. /"_shards" : $body.$_path/] |