[Docs] Add quickstart and limitation documentation for Rollups

Original commit: elastic/x-pack-elasticsearch@cb4aaa0992
Zachary Tong 2018-03-30 20:43:33 +00:00
parent e8a6c9f5d1
commit 574ce84885
6 changed files with 408 additions and 10 deletions


@ -143,7 +143,7 @@ The `date_histogram` group has several parameters:
`time_zone`::
Defines what time_zone the rollup documents are stored as. Unlike raw data, which can shift timezones on the fly, rolled documents have
to be stored with a specific timezone. By default, rollup documents are stored in `UTC`, but this can be changed with the `time_zone`
parameter.
===== Terms


@ -20,8 +20,13 @@ and rewrites it back to what a client would expect given the original query.
`index`::
(string) Index, indices or index-pattern to execute a rollup search against. This can include both rollup and non-rollup
indices.
Rules for the `index` parameter:
- At least one index/index-pattern must be specified. This can be either a rollup or non-rollup index. Omitting the index parameter,
or using `_all`, is not permitted.
- Multiple non-rollup indices may be specified.
- Only one rollup index may be specified. If more than one is supplied, an exception will be thrown.
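For example, a search that mixes several non-rollup indices with a single rollup index is valid. A sketch, borrowing the `sensor`
example indices from <<rollup-getting-started,Getting Started>>:
[source,js]
--------------------------------------------------
GET /sensor-2017-01-01,sensor-2017-01-02,sensor_rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "max_temperature": {
      "max": {
        "field": "temperature"
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE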
==== Request Body


@ -18,7 +18,6 @@ for analysis, but at a fraction of the storage cost of raw data.
* <<rollup-getting-started,Getting Started>>
* <<rollup-api-quickref, API Quick Reference>>
* <<rollup-understanding-groups,Understanding Rollup Grouping>>
* <<rollup-timezones,Dealing with Timezones>>
* <<rollup-search-limitations,Limitations of Rollup Search>>
@ -28,5 +27,4 @@ include::overview.asciidoc[]
include::api-quickref.asciidoc[]
include::rollup-getting-started.asciidoc[]
include::understanding-groups.asciidoc[]
include::timezones.asciidoc[]
include::rollup-search-limitations.asciidoc[]


@ -1,4 +1,293 @@
[[rollup-getting-started]]
== Getting Started
To use the Rollup feature, you need to create one or more "Rollup Jobs". These jobs run continuously in the background
and roll up the index or indices that you specify, placing the rolled documents in a secondary index (also of your choosing).
Imagine you have a series of daily indices that hold sensor data (`sensor-2017-01-01`, `sensor-2017-01-02`, etc.). A sample document might
look like this:
[source,js]
--------------------------------------------------
{
"timestamp": 1516729294000,
"temperature": 200,
"voltage": 5.2,
"node": "a"
}
--------------------------------------------------
// NOTCONSOLE
[float]
=== Creating a Rollup Job
We'd like to roll up these documents into hourly summaries, which will allow us to generate reports and dashboards at any time interval
of one hour or greater. A rollup job might look like this:
[source,js]
--------------------------------------------------
PUT _xpack/rollup/job/sensor
{
"index_pattern": "sensor-*",
"rollup_index": "sensor_rollup",
"cron": "*/30 * * * * ?",
"size" :1000,
"groups" : {
"date_histogram": {
"field": "timestamp",
"interval": "1h",
"delay": "7d"
},
"terms": {
"fields": ["node"]
}
},
"metrics": [
{
"field": "temperature",
"metrics": ["min", "max", "sum"]
},
{
"field": "voltage",
"metrics": ["avg"]
}
]
}
--------------------------------------------------
// CONSOLE
We give the job the ID of "sensor" (in the URL: `PUT _xpack/rollup/job/sensor`), and tell it to roll up the index pattern `"sensor-*"`.
This job will find and rollup any index that matches that pattern. Rollup summaries are then stored in the `"sensor_rollup"` index.
The `cron` parameter controls when and how often the job activates. When a rollup job's cron schedule triggers, it will begin rolling up
from where it left off after the last activation. So if you configure the cron to run every 30 seconds, the job will process the last 30
seconds worth of data that was indexed into the `sensor-*` indices.
If instead the cron was configured to run once a day at midnight, the job would process the last 24 hours worth of data. The choice is largely
a matter of preference, based on how "realtime" you want the rollups to be, and whether you wish to process continuously or move the work to off-peak hours.
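For instance, if you would rather run the job once a day at midnight, the `cron` parameter could be changed accordingly. A sketch,
using the same Quartz-style cron syntax as the job configuration above:
[source,js]
--------------------------------------------------
"cron": "0 0 0 * * ?"
--------------------------------------------------
// NOTCONSOLE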
Next, we define a set of `groups` and `metrics`. The metrics are fairly straightforward: we want to save the min/max/sum of the `temperature`
field, and the average of the `voltage` field.
The groups are a little more interesting. Essentially, we are defining the dimensions that we wish to pivot on at a later date when
querying the data. The grouping in this job allows us to use `date_histogram` aggregations on the `timestamp` field, rolled up at hourly intervals.
It also allows us to run terms aggregations on the `node` field.
.Date histogram interval vs cron schedule
**********************************
You'll note that the job's cron is configured to run every 30 seconds, but the date_histogram is configured to
rollup at hourly intervals. How do these relate?
The date_histogram controls the granularity of the saved data. Data will be rolled up into hourly intervals, and you will be unable
to query with finer granularity. The cron simply controls when the process looks for new data to roll up. Every 30 seconds it will see
if there is a new hour's worth of data and roll it up. If not, the job goes back to sleep.
Often, it doesn't make sense to define such a small cron (30s) on a large interval (1h), because the majority of the activations will
simply go back to sleep. But there's nothing wrong with it either; the job will do the right thing.
**********************************
For more details about the job syntax, see <<rollup-job-config>>.
After you execute the above command and create the job, you'll receive the following response:
[source,js]
----
{
"acknowledged": true
}
----
// TESTRESPONSE
[float]
=== Starting the Job
After the job is created, it will be sitting in an inactive state. Jobs need to be started before they begin processing data (this allows
you to stop them later as a way to temporarily pause, without deleting the configuration).
To start the job, execute this command:
[source,js]
--------------------------------------------------
POST _xpack/rollup/job/sensor/_start
--------------------------------------------------
// CONSOLE
// TEST[setup:sensor_rollup_job]
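Should you need to pause the job later (to temporarily halt processing without deleting the configuration), there is a corresponding
`_stop` endpoint. A sketch:
[source,js]
--------------------------------------------------
POST _xpack/rollup/job/sensor/_stop
--------------------------------------------------
// NOTCONSOLE
A stopped job keeps its configuration and its position in the data, so starting it again should resume processing where it left off.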
[float]
=== Searching the Rolled Results
After the job has run and processed some data, we can use the <<rollup-search>> endpoint to do some searching. The Rollup feature is designed
so that you can use the same Query DSL syntax that you are accustomed to... it just happens to run on the rolled up data instead.
For example, take this query:
[source,js]
--------------------------------------------------
GET /sensor_rollup/_rollup_search
{
"size": 0,
"aggregations": {
"max_temperature": {
"max": {
"field": "temperature"
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:sensor_prefab_data]
It's a simple aggregation that calculates the maximum of the `temperature` field. But you'll notice that it is being sent to the `sensor_rollup`
index instead of the raw `sensor-*` indices. And you'll also notice that it is using the `_rollup_search` endpoint. Otherwise the syntax
is exactly as you'd expect.
If you were to execute that query, you'd receive a result that looks like a normal aggregation response:
[source,js]
----
{
"took" : 102,
"timed_out" : false,
"terminated_early" : false,
"_shards" : ... ,
"hits" : {
"total" : 0,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"max_temperature" : {
"value" : 202.0
}
}
}
----
// TESTRESPONSE[s/"took" : 102/"took" : $body.$_path/]
// TESTRESPONSE[s/"_shards" : \.\.\. /"_shards" : $body.$_path/]
The only notable difference is that Rollup search results have zero `hits`, because we aren't really searching the original, live data any
more. Otherwise it's identical syntax.
There are a few interesting takeaways here. Firstly, even though the data was rolled up with hourly intervals and partitioned by
node name, the query we ran is just calculating the max temperature across all documents. The `groups` that were configured in the job
are not mandatory elements of a query; they are just extra dimensions you can partition on. Second, the request and response syntax
is nearly identical to normal DSL, making it easy to integrate into dashboards and applications.
Finally, we can use those grouping fields we defined to construct a more complicated query:
[source,js]
--------------------------------------------------
GET /sensor_rollup/_rollup_search
{
"size": 0,
"aggregations": {
"timeline": {
"date_histogram": {
"field": "timestamp",
"interval": "7d"
},
"aggs": {
"nodes": {
"terms": {
"field": "node"
},
"aggs": {
"max_temperature": {
"max": {
"field": "temperature"
}
},
"avg_voltage": {
"avg": {
"field": "voltage"
}
}
}
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:sensor_prefab_data]
Which returns a corresponding response:
[source,js]
----
{
"took" : 93,
"timed_out" : false,
"terminated_early" : false,
"_shards" : ... ,
"hits" : {
"total" : 0,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"timeline" : {
"meta" : { },
"buckets" : [
{
"key_as_string" : "2018-01-18T00:00:00.000Z",
"key" : 1516233600000,
"doc_count" : 6,
"nodes" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "a",
"doc_count" : 2,
"max_temperature" : {
"value" : 202.0
},
"avg_voltage" : {
"value" : 5.1499998569488525
}
},
{
"key" : "b",
"doc_count" : 2,
"max_temperature" : {
"value" : 201.0
},
"avg_voltage" : {
"value" : 5.700000047683716
}
},
{
"key" : "c",
"doc_count" : 2,
"max_temperature" : {
"value" : 202.0
},
"avg_voltage" : {
"value" : 4.099999904632568
}
}
]
}
}
]
}
}
}
----
// TESTRESPONSE[s/"took" : 93/"took" : $body.$_path/]
// TESTRESPONSE[s/"_shards" : \.\.\. /"_shards" : $body.$_path/]
In addition to being more complicated (date histogram and a terms aggregation, plus an additional average metric), you'll notice
the date_histogram uses a `7d` interval instead of `1h`.
[float]
=== Conclusion
This quickstart should have provided a concise overview of the core functionality that Rollup exposes. There are more tips and things
to consider when setting up Rollups, which you can find throughout the rest of this section. You may also explore the <<rollup-api-quickref,REST API>>
for an overview of what is available.


@ -1,4 +1,114 @@
[[rollup-search-limitations]]
== Rollup Search Limitations
While we feel the Rollup feature is extremely flexible, the nature of summarizing data means there will be some limitations. Once
live data is thrown away, you will always lose some flexibility.
This page highlights the major limitations so that you are aware of them.
[float]
=== Only one Rollup index per search
When using the <<rollup-search>> endpoint, the `index` parameter accepts one or more indices. These can be a mix of regular, non-rollup
indices and rollup indices. However, only one rollup index can be specified. The exact list of rules for the `index` parameter is as
follows:
- At least one index/index-pattern must be specified. This can be either a rollup or non-rollup index. Omitting the index parameter,
or using `_all`, is not permitted.
- Multiple non-rollup indices may be specified.
- Only one rollup index may be specified. If more than one is supplied, an exception will be thrown.
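To illustrate, assume a second, hypothetical rollup index named `other_rollup` exists alongside `sensor_rollup`. The first request
below follows the rules, while the second would throw an exception:
[source,js]
--------------------------------------------------
GET /sensor-2017-01-01,sensor-2017-01-02,sensor_rollup/_rollup_search <1>
GET /sensor_rollup,other_rollup/_rollup_search <2>
--------------------------------------------------
// NOTCONSOLE
<1> Valid: multiple non-rollup indices plus a single rollup index
<2> Invalid: two rollup indices in the same search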
This limitation is driven by the logic that decides which jobs are the "best" for any given query. If you have ten jobs stored in a single
index, which cover the source data with varying degrees of completeness and different intervals, the query needs to determine which set
of jobs to actually search. Incorrect decisions can lead to inaccurate aggregation results (e.g. over-counting doc counts, or bad metrics).
Needless to say, this is a technically challenging piece of code.
To help simplify the problem, we have limited search to just one rollup index at a time (which may contain multiple jobs). In the future we
may be able to open this up to multiple rollup jobs.
[float]
=== Can only aggregate what's been stored
A perhaps obvious limitation, but rollups can only aggregate on data that has been stored in the rollups. If you don't configure the
rollup job to store metrics about the `price` field, you won't be able to use the `price` field in any query or aggregation.
For example, the `temperature` field in the following query has been stored in a rollup job... but not with an `avg` metric. Which means
the usage of `avg` here is not allowed:
[source,js]
--------------------------------------------------
GET sensor_rollup/_rollup_search
{
"size": 0,
"aggregations": {
"avg_temperature": {
"avg": {
"field": "temperature"
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
// TEST[catch:/illegal_argument_exception/]
The response will tell you that the field and aggregation were not possible, because no rollup jobs were found which contained them:
[source,js]
----
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
"stack_trace": ...
}
],
"type" : "illegal_argument_exception",
"reason" : "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
"stack_trace": ...
},
"status": 400
}
----
// TESTRESPONSE[s/"stack_trace": \.\.\./"stack_trace": $body.$_path/]
[float]
=== Interval Granularity
Rollups are stored at a certain granularity, as defined by the `date_histogram` group in the configuration. If data is rolled up at hourly
intervals, the <<rollup-search>> API can aggregate on any time interval hourly or greater. Intervals that are less than an hour will throw
an exception, since the data simply doesn't exist for finer granularities.
Because the RollupSearch endpoint can "upsample" intervals, there is no need to configure jobs with multiple intervals (hourly, daily, etc.).
It's recommended to just configure a single job with the smallest granularity that is needed, and allow the search endpoint to upsample
as needed.
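For example, against the hourly `sensor` job from <<rollup-getting-started,Getting Started>>, a daily `date_histogram` is a valid
upsampled request, while anything finer than `1h` would throw an exception. A sketch:
[source,js]
--------------------------------------------------
GET /sensor_rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "daily_timeline": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "1d" <1>
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
<1> `1d` is coarser than the stored `1h` interval, so the endpoint can upsample. An interval such as `30m` would be rejected.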
That said, if multiple jobs are present in a single rollup index with varying intervals, the search endpoint will identify and use the job(s)
with the largest interval to satisfy the search request.
[float]
=== Limited querying components
The Rollup functionality allows a `query` in the search request, but only with a limited subset of components. The queries currently allowed are:
- Term Query
- Terms Query
- Range Query
- MatchAll Query
- Any compound query (Boolean, Boosting, ConstantScore, etc)
Furthermore, these queries can only use fields that were also saved in the rollup job. If you wish to filter on a keyword `hostname` field,
that field must have been configured in the rollup job under a `terms` grouping.
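For example, a `bool` query combining a term filter and a range filter, both on fields configured as groups in the example `sensor`
job, might look like the following sketch:
[source,js]
--------------------------------------------------
GET /sensor_rollup/_rollup_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "node": "a" } }, <1>
        { "range": { "timestamp": { "gte": 1514764800000 } } } <2>
      ]
    }
  },
  "aggregations": {
    "max_temperature": {
      "max": {
        "field": "temperature"
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
<1> `node` was saved in the job under a `terms` grouping
<2> `timestamp` was saved in the job under the `date_histogram` grouping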
If you attempt to use an unsupported query, or the query references a field that wasn't configured in the rollup job, an exception will be
thrown. We expect the list of supported queries to grow over time as more are implemented.
[float]
=== Timezones
Rollup documents are stored in the timezone of the `date_histogram` group configuration in the job. If no timezone is specified, the default
is to roll up timestamps in `UTC`.
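As a hypothetical example, a job that stores its rollups in the `America/New_York` timezone would set the `time_zone` parameter in the
`date_histogram` group. A configuration fragment, not a complete job:
[source,js]
--------------------------------------------------
"groups": {
  "date_histogram": {
    "field": "timestamp",
    "interval": "1h",
    "time_zone": "America/New_York" <1>
  }
}
--------------------------------------------------
// NOTCONSOLE
<1> Hypothetical timezone choice; any TZ database identifier should work here
Because rolled documents cannot shift timezones on the fly the way raw data can, queries against this index are effectively fixed to the configured timezone.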


@ -1,4 +0,0 @@
[[rollup-timezones]]
== Dealing with Timezones
todo