[DOCS] Group rollup and transform content (#46882)
parent d4d1182677
commit a815f8b930
@@ -0,0 +1,16 @@
[[data-rollup-transform]]
= Roll up or transform your data

[partintro]
--

{es} offers the following methods for manipulating your data:

* <<xpack-rollup,Rolling up your historical data>>
+
include::rollup/index.asciidoc[tag=rollup-intro]
* {stack-ov}/ml-dataframes.html[Transforming your data]

--

include::rollup/index.asciidoc[]
@@ -50,10 +50,10 @@ include::sql/index.asciidoc[]

include::monitoring/index.asciidoc[]

include::rollup/index.asciidoc[]

include::frozen-indices.asciidoc[]

include::data-rollup-transform.asciidoc[]

include::high-availability.asciidoc[]

include::security/index.asciidoc[]
@@ -1,7 +1,10 @@
[role="xpack"]
[testenv="basic"]
[[rollup-api-quickref]]
== API Quick Reference
=== {rollup-cap} API quick reference
++++
<titleabbrev>API quick reference</titleabbrev>
++++

experimental[]

@@ -15,7 +18,7 @@ Most rollup endpoints have the following base:

[float]
[[rollup-api-jobs]]
=== /job/
==== /job/

* {ref}/rollup-put-job.html[PUT /_rollup/job/<job_id+++>+++]: Create a {rollup-job}
* {ref}/rollup-get-job.html[GET /_rollup/job]: List {rollup-jobs}
@@ -26,13 +29,13 @@ Most rollup endpoints have the following base:

[float]
[[rollup-api-data]]
=== /data/
==== /data/

* {ref}/rollup-get-rollup-caps.html[GET /_rollup/data/<index_pattern+++>/_rollup_caps+++]: Get Rollup Capabilities
* {ref}/rollup-get-rollup-index-caps.html[GET /<index_name+++>/_rollup/data/+++]: Get Rollup Index Capabilities

[float]
[[rollup-api-index]]
=== /<index_name>/
==== /<index_name>/

* {ref}/rollup-search.html[GET /<index_name>/_rollup_search]: Search rollup data
@@ -1,10 +1,7 @@
[role="xpack"]
[testenv="basic"]
[[xpack-rollup]]
= Rolling up historical data

[partintro]
--
== Rolling up historical data

experimental[]

@@ -12,20 +9,20 @@ Keeping historical data around for analysis is extremely useful but often avoide
archiving massive amounts of data. Retention periods are thus driven by financial realities rather than by the
usefulness of extensive historical data.

The Rollup feature in {xpack} provides a means to summarize and store historical data so that it can still be used
for analysis, but at a fraction of the storage cost of raw data.
// tag::rollup-intro[]
The {stack} {rollup-features} provide a means to summarize and store historical
data so that it can still be used for analysis, but at a fraction of the storage
cost of raw data.
// end::rollup-intro[]


* <<rollup-overview, Overview>>
* <<rollup-getting-started,Getting Started>>
* <<rollup-api-quickref, API Quick Reference>>
* <<rollup-understanding-groups,Understanding Rollup Grouping>>
* <<rollup-overview,Overview>>
* <<rollup-getting-started,Getting started>>
* <<rollup-api-quickref, API quick reference>>
* <<rollup-understanding-groups,Understanding rollup grouping>>
* <<rollup-agg-limitations,Rollup aggregation limitations>>
* <<rollup-search-limitations,Rollup Search limitations>>
* <<rollup-search-limitations,Rollup search limitations>>


--

include::overview.asciidoc[]
include::api-quickref.asciidoc[]
include::rollup-getting-started.asciidoc[]
@@ -1,7 +1,10 @@
[role="xpack"]
[testenv="basic"]
[[rollup-overview]]
== Overview
=== {rollup-cap} overview
++++
<titleabbrev>Overview</titleabbrev>
++++

experimental[]

@@ -23,7 +26,7 @@ reading often diminishes with time. It's not useless -- it could easily contrib
value often leads to deletion rather than paying the fixed storage cost.

[float]
=== Rollup store historical data at reduced granularity
==== Rollup stores historical data at reduced granularity

That's where Rollup comes into play. The Rollup functionality summarizes old, high-granularity data into a reduced
granularity format for long-term storage. By "rolling" the data up into a single summary document, historical data
@@ -39,7 +42,7 @@ automates this process of summarizing historical data.
Details about setting up and configuring Rollup are covered in <<rollup-put-job,Create Job API>>

[float]
=== Rollup uses standard query DSL
==== Rollup uses standard query DSL

The Rollup feature exposes a new search endpoint (`/_rollup_search` vs the standard `/_search`) which knows how to search
over rolled-up data. Importantly, this endpoint accepts 100% normal {es} Query DSL. Your application does not need to learn
@@ -53,7 +56,7 @@ But if your queries, aggregations and dashboards only use the available function
data is trivial.

[float]
=== Rollup merges "live" and "rolled" data
==== Rollup merges "live" and "rolled" data

A useful feature of Rollup is the ability to query both "live", realtime data in addition to historical "rolled" data
in a single query.
@@ -67,7 +70,7 @@ It will take the results from both data sources and merge them together. If the
"rolled" data, live data is preferred to increase accuracy.

[float]
=== Rollup is multi-interval aware
==== Rollup is multi-interval aware

Finally, Rollup is capable of intelligently utilizing the best interval available. If you've worked with summarizing
features of other products, you'll find that they can be limiting. If you configure rollups at daily intervals... your
@@ -1,7 +1,7 @@
[role="xpack"]
[testenv="basic"]
[[rollup-agg-limitations]]
== Rollup Aggregation Limitations
=== {rollup-cap} aggregation limitations

experimental[]

@@ -9,7 +9,7 @@ There are some limitations to how fields can be rolled up / aggregated. This pa
you are aware of them.

[float]
=== Limited aggregation components
==== Limited aggregation components

The Rollup functionality allows fields to be grouped with the following aggregations:

@@ -1,7 +1,10 @@
[role="xpack"]
[testenv="basic"]
[[rollup-getting-started]]
== Getting Started
=== Getting started with {rollups}
++++
<titleabbrev>Getting started</titleabbrev>
++++

experimental[]

@@ -23,7 +26,7 @@ look like this:
// NOTCONSOLE

[float]
=== Creating a Rollup Job
==== Creating a rollup job

We'd like to rollup these documents into hourly summaries, which will allow us to generate reports and dashboards with any time interval
one hour or greater. A rollup job might look like this:
@@ -103,7 +106,7 @@ After you execute the above command and create the job, you'll receive the follo
----

[float]
=== Starting the job
==== Starting the job

After the job is created, it will be sitting in an inactive state. Jobs need to be started before they begin processing data (this allows
you to stop them later as a way to temporarily pause, without deleting the configuration).
@@ -117,7 +120,7 @@ POST _rollup/job/sensor/_start
// TEST[setup:sensor_rollup_job]

[float]
=== Searching the Rolled results
==== Searching the rolled results

After the job has run and processed some data, we can use the <<rollup-search>> endpoint to do some searching. The Rollup feature is designed
so that you can use the same Query DSL syntax that you are accustomed to... it just happens to run on the rolled up data instead.
@@ -292,7 +295,7 @@ In addition to being more complicated (date histogram and a terms aggregation, p
the date_histogram uses a `7d` interval instead of `60m`.

[float]
=== Conclusion
==== Conclusion

This quickstart should have provided a concise overview of the core functionality that Rollup exposes. There are more tips and things
to consider when setting up Rollups, which you can find throughout the rest of this section. You may also explore the <<rollup-api-quickref,REST API>>
@@ -1,7 +1,7 @@
[role="xpack"]
[testenv="basic"]
[[rollup-search-limitations]]
== Rollup Search Limitations
=== {rollup-cap} search limitations

experimental[]

@@ -11,7 +11,7 @@ live data is thrown away, you will always lose some flexibility.
This page highlights the major limitations so that you are aware of them.

[float]
=== Only one Rollup index per search
==== Only one {rollup} index per search

When using the <<rollup-search>> endpoint, the `index` parameter accepts one or more indices. These can be a mix of regular, non-rollup
indices and rollup indices. However, only one rollup index can be specified. The exact list of rules for the `index` parameter are as
@@ -33,7 +33,7 @@ may be able to open this up to multiple rollup jobs.

[float]
[[aggregate-stored-only]]
=== Can only aggregate what's been stored
==== Can only aggregate what's been stored

A perhaps obvious limitation, but rollups can only aggregate on data that has been stored in the rollups. If you don't configure the
rollup job to store metrics about the `price` field, you won't be able to use the `price` field in any query or aggregation.
@@ -81,7 +81,7 @@ The response will tell you that the field and aggregation were not possible, bec
// TESTRESPONSE[s/"stack_trace": \.\.\./"stack_trace": $body.$_path/]

[float]
=== Interval Granularity
==== Interval granularity

Rollups are stored at a certain granularity, as defined by the `date_histogram` group in the configuration. This means you
can only search/aggregate the rollup data with an interval that is greater-than or equal to the configured rollup interval.
@@ -111,7 +111,7 @@ That said, if multiple jobs are present in a single rollup index with varying in
with the largest interval to satisfy the search request.

[float]
=== Limited querying components
==== Limited querying components

The Rollup functionality allows `query`'s in the search request, but with a limited subset of components. The queries currently allowed are:

@@ -128,7 +128,7 @@ If you attempt to use an unsupported query, or the query references a field that
thrown. We expect the list of support queries to grow over time as more are implemented.

[float]
=== Timezones
==== Timezones

Rollup documents are stored in the timezone of the `date_histogram` group configuration in the job. If no timezone is specified, the default
is to rollup timestamps in `UTC`.
@@ -1,7 +1,7 @@
[role="xpack"]
[testenv="basic"]
[[rollup-understanding-groups]]
== Understanding Groups
=== Understanding groups

experimental[]

@@ -121,7 +121,7 @@ Ultimately, when configuring `groups` for a job, think in terms of how you might
then include those in the config. Because Rollup Search allows any order or combination of the grouped fields, you just need to decide
if a field is useful for aggregating later, and how you might wish to use it (terms, histogram, etc)

=== Grouping Limitations with heterogeneous indices
==== Grouping limitations with heterogeneous indices

There was previously a limitation in how Rollup could handle indices that had heterogeneous mappings (multiple, unrelated/non-overlapping
mappings). The recommendation at the time was to configure a separate job per data "type". For example, you might configure a separate
@@ -192,7 +192,7 @@ PUT _rollup/job/combined
--------------------------------------------------
// NOTCONSOLE

=== Doc counts and overlapping jobs
==== Doc counts and overlapping jobs

There was previously an issue with document counts on "overlapping" job configurations, driven by the same internal implementation detail.
If there were two Rollup jobs saving to the same index, where one job is a "subset" of another job, it was possible that document counts