OpenSearch/docs/reference/rollup/apis/rollup-caps.asciidoc
Zachary Tong 6ae6f57d39
[7.x Backport] Force selection of calendar or fixed intervals (#41906)
The date_histogram accepts an interval which can be either a calendar
interval (DST-aware, leap seconds, arbitrary length of months, etc) or
fixed interval (strict multiples of SI units). Unfortunately this is inferred
by first trying to parse as a calendar interval, then falling back to fixed
if that fails.

This leads to a confusing arrangement where `1d` == calendar, but
`2d` == fixed.  And if you want a day of fixed time, you have to
specify `24h` (i.e. the next smallest unit).  This arrangement is very
error-prone for users.

This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
Calendar only accepts calendar intervals, fixed accepts any combination of
units (meaning `1d` can be used to specify `24h` in fixed time), and both
are mutually exclusive.

The old interval behavior is deprecated and will throw a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the
old dual-purpose interval will be removed.

The change applies to both REST and Java clients.
2019-05-20 12:07:29 -04:00


[role="xpack"]
[testenv="basic"]
[[rollup-get-rollup-caps]]
=== Get rollup job capabilities API
++++
<titleabbrev>Get rollup caps</titleabbrev>
++++

experimental[]

This API returns the capabilities of any rollup jobs that have been configured
for a specific index or index pattern.

This API is useful because a rollup job is often configured to roll up only a
subset of fields from the source index. Furthermore, only certain aggregations
can be configured for various fields, leading to a limited subset of
functionality depending on that configuration.

This API enables you to inspect an index and determine:

1. Does this index have associated rollup data somewhere in the cluster?
2. If yes to the first question, what fields were rolled up, what aggregations
can be performed, and where does the data live?

==== Request
`GET _rollup/data/{index}`

//===== Description

==== Path Parameters

`index`::
  (string) Index, indices or index pattern to return rollup capabilities for. `_all` may be used to fetch
  rollup capabilities from all jobs.

==== Request Body

There is no request body for the Get Rollup Caps API.

==== Authorization

You must have `monitor`, `monitor_rollup`, `manage` or `manage_rollup` cluster privileges to use this API.
For more information, see
{xpack-ref}/security-privileges.html[Security Privileges].

==== Examples

Imagine we have an index named `sensor-1` full of raw data. We know that the data will grow over time, so there
will be a `sensor-2`, `sensor-3`, etc. Let's create a Rollup job that targets the index pattern `sensor-*` to accommodate
this future scaling:

[source,js]
--------------------------------------------------
PUT _rollup/job/sensor
{
  "index_pattern": "sensor-*",
  "rollup_index": "sensor_rollup",
  "cron": "*/30 * * * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "field": "timestamp",
      "fixed_interval": "1h",
      "delay": "7d"
    },
    "terms": {
      "fields": ["node"]
    }
  },
  "metrics": [
    {
      "field": "temperature",
      "metrics": ["min", "max", "sum"]
    },
    {
      "field": "voltage",
      "metrics": ["avg"]
    }
  ]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:sensor_index]

We can then retrieve the rollup capabilities of that index pattern (`sensor-*`) via the following command:

[source,js]
--------------------------------------------------
GET _rollup/data/sensor-*
--------------------------------------------------
// CONSOLE
// TEST[continued]

Which will yield the following response:

[source,js]
----
{
  "sensor-*" : {
    "rollup_jobs" : [
      {
        "job_id" : "sensor",
        "rollup_index" : "sensor_rollup",
        "index_pattern" : "sensor-*",
        "fields" : {
          "node" : [
            {
              "agg" : "terms"
            }
          ],
          "temperature" : [
            {
              "agg" : "min"
            },
            {
              "agg" : "max"
            },
            {
              "agg" : "sum"
            }
          ],
          "timestamp" : [
            {
              "agg" : "date_histogram",
              "time_zone" : "UTC",
              "fixed_interval" : "1h",
              "delay": "7d"
            }
          ],
          "voltage" : [
            {
              "agg" : "avg"
            }
          ]
        }
      }
    ]
  }
}
----
// TESTRESPONSE

The response contains information that is similar to the original Rollup configuration, but formatted
differently. First, there are some housekeeping details: the Rollup job's ID, the index that holds the rolled-up data,
and the index pattern that the job was targeting.

Next it shows a list of fields that contain data eligible for rollup searches. Here we see four fields: `node`, `temperature`,
`timestamp` and `voltage`. Each of these fields lists the aggregations that are possible. For example, you can use a `min`, `max`
or `sum` aggregation on the `temperature` field, but only a `date_histogram` on `timestamp`.

Note that the `rollup_jobs` element is an array; there can be multiple, independent jobs configured for a single index
or index pattern. Each of these jobs may have different configurations, so the API returns a list of all the various
configurations available.

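
As an illustration of reading this structure on the client side, the following Python sketch walks a parsed
capabilities response and lists, per job, which aggregations each field supports. It uses only the standard library
and an abbreviated copy of the example response; the helper names `aggs_by_job` and `supports` are hypothetical and
are not part of any client library.

[source,python]
----
import json

# Abbreviated copy of the example response for the `sensor-*` pattern.
RESPONSE = json.loads("""
{
  "sensor-*": {
    "rollup_jobs": [
      {
        "job_id": "sensor",
        "rollup_index": "sensor_rollup",
        "index_pattern": "sensor-*",
        "fields": {
          "node": [{"agg": "terms"}],
          "temperature": [{"agg": "min"}, {"agg": "max"}, {"agg": "sum"}],
          "timestamp": [
            {"agg": "date_histogram", "time_zone": "UTC",
             "fixed_interval": "1h", "delay": "7d"}
          ],
          "voltage": [{"agg": "avg"}]
        }
      }
    ]
  }
}
""")


def aggs_by_job(response):
    """Map each job ID to {field name: [aggregation names]}."""
    summary = {}
    for caps in response.values():
        # "rollup_jobs" is an array: several jobs may cover one pattern.
        for job in caps["rollup_jobs"]:
            summary[job["job_id"]] = {
                field: [entry["agg"] for entry in entries]
                for field, entries in job["fields"].items()
            }
    return summary


def supports(response, job_id, field, agg):
    """Return True if the given job rolled up `field` with aggregation `agg`."""
    return agg in aggs_by_job(response).get(job_id, {}).get(field, [])
----

With the example data, `supports(RESPONSE, "sensor", "temperature", "max")` is true, while
`supports(RESPONSE, "sensor", "timestamp", "avg")` is false, because `timestamp` was only rolled up as a `date_histogram`.
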
We could also retrieve the same information with a request to `_all`:

[source,js]
--------------------------------------------------
GET _rollup/data/_all
--------------------------------------------------
// CONSOLE
// TEST[continued]

But note that if we use the concrete index name (`sensor-1`), we'll retrieve no rollup capabilities:

[source,js]
--------------------------------------------------
GET _rollup/data/sensor-1
--------------------------------------------------
// CONSOLE
// TEST[continued]
[source,js]
----
{
}
----
// TESTRESPONSE

Why is this? The original rollup job was configured against a specific index pattern (`sensor-*`), not a concrete index
(`sensor-1`). So while the index belongs to the pattern, the rollup job is only valid across the entirety of the pattern,
not just one of the indices it matches. For that reason, the Rollup Capabilities API only returns information based
on the originally configured index name or pattern.
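
This behavior can be pictured with a toy model in Python. The sketch is an illustration under stated assumptions, not
the server's implementation: it stores capabilities keyed by the exact string each job was configured with, and answers
a request only on an exact match or for `_all`. The names `CONFIGURED_JOBS` and `get_rollup_caps` are hypothetical, and
comma-separated index lists are deliberately left out.

[source,python]
----
# Toy model: capabilities keyed by the index name or pattern exactly as
# it was written in each job's configuration (hypothetical data).
CONFIGURED_JOBS = {
    "sensor-*": [{"job_id": "sensor", "rollup_index": "sensor_rollup"}],
}


def get_rollup_caps(requested):
    """Return capabilities for `requested`, mimicking the documented behavior."""
    if requested == "_all":
        # `_all` returns the capabilities of every configured job.
        return {
            pattern: {"rollup_jobs": jobs}
            for pattern, jobs in CONFIGURED_JOBS.items()
        }
    # The lookup compares strings and does not expand wildcards, so a
    # concrete index that merely matches a pattern finds nothing.
    if requested in CONFIGURED_JOBS:
        return {requested: {"rollup_jobs": CONFIGURED_JOBS[requested]}}
    return {}
----

In this model, `get_rollup_caps("sensor-*")` and `get_rollup_caps("_all")` both return the `sensor` job, while
`get_rollup_caps("sensor-1")` returns an empty object, matching the responses in the examples.
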