From 3635bd741ced2a334d5aae4ce836fb62c259926a Mon Sep 17 00:00:00 2001
From: Andrei Dan
Date: Mon, 15 Jun 2020 15:16:14 +0100
Subject: [PATCH] [DOCS] Make ILM documentation data stream aware (#58035) (#58110)

Co-authored-by: James Rodewig
Co-authored-by: Lee Hinman
(cherry picked from commit 25cbbe56dd29fbee2efe8040e9c8b92d168cb670)
Signed-off-by: Andrei Dan
---
 docs/reference/glossary.asciidoc              |  11 +-
 .../ilm/actions/ilm-rollover.asciidoc         |  15 +-
 .../actions/ilm-searchable-snapshot.asciidoc  |  12 +
 .../reference/ilm/actions/ilm-shrink.asciidoc |  15 +-
 docs/reference/ilm/ilm-tutorial.asciidoc      | 306 ++++++++++++++----
 docs/reference/ilm/index-rollover.asciidoc    |  20 +-
 6 files changed, 301 insertions(+), 78 deletions(-)

diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc
index ceff43a0a6e..c9ba09ed2e2 100644
--- a/docs/reference/glossary.asciidoc
+++ b/docs/reference/glossary.asciidoc
@@ -268,10 +268,15 @@ shard will never be started on the same node as its primary shard.
 --
 // tag::rollover-def[]
 // tag::rollover-def-short[]
-Redirect an alias to begin writing to a new index when the existing index reaches a certain age, number of docs, or size.
+
+Creates a new index for a rollover target when the existing index reaches a certain size, number of docs, or age.
+A rollover target can be either an <> or a <>.
 // end::rollover-def-short[]
-The new index is automatically configured according to any matching <>.
-For example, if you're indexing log data, you might use rollover to create daily or weekly indices.
+
+The new index is automatically configured according to any matching <> or
+respectively, a <> if the rollover target is a
+<>.
+For example, if you're indexing log data, you might use rollover to create daily or weekly indices.
 See the {ref}/indices-rollover-index.html[rollover index API].
 // end::rollover-def[]
 --
diff --git a/docs/reference/ilm/actions/ilm-rollover.asciidoc b/docs/reference/ilm/actions/ilm-rollover.asciidoc
index 57c8debca1f..fe39d3610a0 100644
--- a/docs/reference/ilm/actions/ilm-rollover.asciidoc
+++ b/docs/reference/ilm/actions/ilm-rollover.asciidoc
@@ -4,15 +4,20 @@
 
 Phases allowed: hot.
 
-Rolls an alias over to a new index when the existing index meets one of the rollover conditions.
+Rolls over a target to a new index when the existing index meets one of the rollover conditions.
 
-IMPORTANT: If the rollover action is used on a <>,
+IMPORTANT: If the rollover action is used on a <>,
 policy execution waits until the leader index rolls over (or is
-<>),
-then converts the follower index into a regular index with the
+<>),
+then converts the follower index into a regular index with the
 <>.
 
-For a managed index to be rolled over:
+A rollover target can be a <> or an <>.
+When targeting a data stream, the new index becomes the data stream's
+<> and its generation is incremented.
+
+To roll over an <>, the alias and its write index
+must meet the following conditions:
 
 * The index name must match the pattern '^.*-\\d+$', for example (`my_index-000001`).
 * The `index.lifecycle.rollover_alias` must be configured as the alias to roll over.
diff --git a/docs/reference/ilm/actions/ilm-searchable-snapshot.asciidoc b/docs/reference/ilm/actions/ilm-searchable-snapshot.asciidoc
index 7f2fc7b477a..20aa7ae2f1d 100644
--- a/docs/reference/ilm/actions/ilm-searchable-snapshot.asciidoc
+++ b/docs/reference/ilm/actions/ilm-searchable-snapshot.asciidoc
@@ -6,6 +6,18 @@ Phases allowed: cold.
 
 Takes a snapshot of the managed index in the configured repository
 and mounts it as a searchable snapshot.
+If the managed index is part of a <>,
+the mounted index replaces the original index in the data stream.
+
+[NOTE]
+This action cannot be performed on a data stream's write index. Attempts to do
+so will fail. To convert the index to a searchable snapshot, first
+<> the data stream. This
+creates a new write index. Because the index is no longer the stream's write
+index, the action can then convert it to a searchable snapshot.
+Using a policy that makes use of the <> action
+in the hot phase will avoid this situation and the need for a manual rollover for future
+managed indices.
 
 By default, this snapshot is deleted by the <> in the delete phase.
 To keep the snapshot, set `delete_searchable_snapshot` to `false` in the delete action.
diff --git a/docs/reference/ilm/actions/ilm-shrink.asciidoc b/docs/reference/ilm/actions/ilm-shrink.asciidoc
index d13e8bf41cf..8800136abf5 100644
--- a/docs/reference/ilm/actions/ilm-shrink.asciidoc
+++ b/docs/reference/ilm/actions/ilm-shrink.asciidoc
@@ -21,6 +21,19 @@ policy execution waits until the leader index rolls over (or is
 then converts the follower index into a regular index with the
 <> before performing the shrink operation.
 
+If the managed index is part of a <>,
+the shrunken index replaces the original index in the data stream.
+
+[NOTE]
+This action cannot be performed on a data stream's write index. Attempts to do
+so will fail. To shrink the index, first
+<> the data stream. This
+creates a new write index. Because the index is no longer the stream's write
+index, the action can resume shrinking it.
+Using a policy that makes use of the <> action
+in the hot phase will avoid this situation and the need for a manual rollover for future
+managed indices.
+
 [[ilm-shrink-options]]
 ==== Shrink options
 `number_of_shards`::
@@ -48,4 +61,4 @@ PUT _ilm/policy/my_policy
     }
   }
 }
---------------------------------------------------
\ No newline at end of file
+--------------------------------------------------
diff --git a/docs/reference/ilm/ilm-tutorial.asciidoc b/docs/reference/ilm/ilm-tutorial.asciidoc
index fc488204f1b..578a1d9ea00 100644
--- a/docs/reference/ilm/ilm-tutorial.asciidoc
+++ b/docs/reference/ilm/ilm-tutorial.asciidoc
@@ -11,26 +11,35 @@ This tutorial demonstrates how to use {ilm}
 ({ilm-init}) to manage indices that contain time-series data.
 
 When you continuously index timestamped documents into {es},
-you typically use an index alias so you can periodically roll over to a new index.
+you typically use a <> so you can periodically roll over to a
+new index.
 This enables you to implement a hot-warm-cold architecture to meet your performance
 requirements for your newest data, control costs over time, enforce retention policies,
 and still get the most out of your data.
 
-To automate rollover and management of time-series indices with {ilm-init}, you:
+TIP: Data streams are best suited for
+<> use cases. If you need to frequently
+update or delete existing documents across multiple indices, we recommend
+using an index alias and index template instead. You can still use ILM to
+manage and rollover the alias's indices. Skip to
+<>.
+
+To automate rollover and management of a data stream with {ilm-init}, you:
 
 . <> that defines the appropriate
-phases and actions.
-. <> to apply the policy to each new index.
-. <> as the initial write index.
-. <>
+phases and actions.
+. <> to create the data stream and
+apply the ILM policy and the indices settings and mappings configurations for the backing
+indices.
+. <>
 as expected.
 
 For an introduction to rolling indices, see <>.
 
-IMPORTANT: When you enable {ilm} for {beats} or the {ls} {es} output plugin,
-lifecycle policies are set up automatically.
-You do not need bootstrap the initial index or take any other actions.
-You can modify the default policies through
+IMPORTANT: When you enable {ilm} for {beats} or the {ls} {es} output plugin,
+lifecycle policies are set up automatically.
+You do not need to take any other actions.
+You can modify the default policies through
 {kibana-ref}/example-using-index-lifecycle-policy.html[{kib} Management]
 or the {ilm-init} APIs.
 
@@ -89,13 +98,217 @@ For the complete list of actions that {ilm} can perform, see <>.
 
 [discrete]
 [[ilm-gs-apply-policy]]
-=== Create an index template to apply the lifecycle policy
+=== Create a composable template to create the data stream and apply the lifecycle policy
+
+To set up a data stream, first create a composable template to specify the lifecycle policy. Because
+the template is for a data stream, it must also include a `data_stream` definition.
+
+For example, you might create a `timeseries_template` to use for a future data stream
+named `timeseries`.
+
+To enable the {ilm-init} to manage the data stream, the template configures one {ilm-init} setting:
+
+* `index.lifecycle.name` specifies the name of the lifecycle policy to apply to the data stream.
+
+You can use the {kib} Create template wizard to add the template.
+This wizard invokes the put _index_template API to create the <>
+with the options you specify.
+
+The underlying request looks like this:
+
+[source,console]
+-----------------------
+PUT _index_template/timeseries_template
+{
+  "index_patterns": ["timeseries"], <1>
+  "data_stream": {
+    "timestamp_field": "@timestamp" <2>
+  },
+  "template": {
+    "settings": {
+      "number_of_shards": 1,
+      "number_of_replicas": 1,
+      "index.lifecycle.name": "timeseries_policy" <3>
+    },
+    "mappings": {
+      "properties": {
+        "@timestamp": {
+          "type": "date" <4>
+        }
+      }
+    }
+  }
+}
+-----------------------
+// TEST[continued]
+
+<1> Apply the template when a document is indexed into the `timeseries` target.
+<2> Identifies the timestamp field for the data source. This field must be present
+in all documents indexed into the `timeseries` data stream.
+<3> The name of the {ilm-init} policy used to manage the data stream.
+<4> A <> or <> field mapping for the
+timestamp field specified in the `timestamp_field` property
+
+You can also invoke this API directly to add templates.
+
+[discrete]
+[[ilm-gs-create-the-data-stream]]
+=== Create the data stream
+
+To get things started, index a document into the name or wildcard pattern defined
+in the `index_patterns` of the <>. As long
+as an existing data stream, index, or index alias does not already use the name, the index
+request automatically creates a corresponding data stream with a single backing index.
+{es} automatically indexes the request's documents into this backing index, which also
+acts as the stream's <>.
+
+For example, the following request creates the `timeseries` data stream and the first generation
+backing index called `.ds-timeseries-000001`.
+
+[source,console]
+-----------------------
+POST timeseries/_doc
+{
+  "message": "logged the request",
+  "@timestamp": "1591890611"
+}
+
+-----------------------
+// TEST[continued]
+
+When a rollover condition in the lifecycle policy is met, the `rollover` action:
+
+* Creates the second generation backing index, named `.ds-timeseries-000002`.
+Because it is a backing index of the `timeseries` data stream, the configuration from the `timeseries_template` composable template is applied to the new index.
+* As it is the latest generation index of the `timeseries` data stream, the newly created
+backing index `.ds-timeseries-000002` becomes the data stream's write index.
+
+This process repeats each time a rollover condition is met.
+You can search across all of the data stream's backing indices, managed by the `timeseries_policy`,
+with the `timeseries` data stream name.
+Write operations are routed to the current write index. Read operations will be handled by all
+backing indices.
+
+[discrete]
+[[ilm-gs-check-progress]]
+=== Check lifecycle progress
+
+To get status information for managed indices, you use the {ilm-init} explain API.
+This lets you find out things like:
+
+* What phase an index is in and when it entered that phase.
+* The current action and what step is being performed.
+* If any errors have occurred or progress is blocked.
+
+For example, the following request gets information about the `timeseries` data stream's
+backing indices:
+
+[source,console]
+--------------------------------------------------
+GET .ds-timeseries-*/_ilm/explain
+--------------------------------------------------
+// TEST[continued]
+
+The following response shows the data stream's first generation backing index is waiting for the `hot`
+phase's `rollover` action.
+It remains in this state and {ilm-init} continues to call `check-rollover-ready` until a rollover condition
+is met.
+
+// [[36818c6d9f434d387819c30bd9addb14]]
+[source,console-result]
+--------------------------------------------------
+{
+  "indices": {
+    ".ds-timeseries-000001": {
+      "index": ".ds-timeseries-000001",
+      "managed": true,
+      "policy": "timeseries_policy", <1>
+      "lifecycle_date_millis": 1538475653281,
+      "age": "30s", <2>
+      "phase": "hot",
+      "phase_time_millis": 1538475653317,
+      "action": "rollover",
+      "action_time_millis": 1538475653317,
+      "step": "check-rollover-ready", <3>
+      "step_time_millis": 1538475653317,
+      "phase_execution": {
+        "policy": "timeseries_policy",
+        "phase_definition": { <4>
+          "min_age": "0ms",
+          "actions": {
+            "rollover": {
+              "max_size": "50gb",
+              "max_age": "30d"
+            }
+          }
+        },
+        "version": 1,
+        "modified_date_in_millis": 1539609701576
+      }
+    }
+  }
+}
+--------------------------------------------------
+// TESTRESPONSE[skip:no way to know if we will get this response immediately]
+
+<1> The policy used to manage the index
+<2> The age of the index
+<3> The step {ilm-init} is performing on the index
+<4> The definition of the current phase (the `hot` phase)
+
+//////////////////////////
+
+[source,console]
+--------------------------------------------------
+DELETE /_data_stream/timeseries
+--------------------------------------------------
+// TEST[continued]
+
+//////////////////////////
+
+
+//////////////////////////
+
+[source,console]
+--------------------------------------------------
+DELETE /_index_template/timeseries_template
+--------------------------------------------------
+// TEST[continued]
+
+//////////////////////////
+
+[discrete]
+[[manage-time-series-data-without-data-streams]]
+=== Manage time-series data without data streams
+
+Even though <> are a convenient way to scale
+and manage time-series data, they are designed to be append-only. We recognise there
+might be use-cases where data needs to be updated or deleted in place and the
+data streams don't support delete and update requests directly,
+so the index APIs would need to be used directly on the data stream's backing indices.
+
+In these cases, you can use an index alias to manage indices containing the time-series data
+and periodically roll over to a new index.
+
+To automate rollover and management of time-series indices with {ilm-init} using an index
+alias, you:
+
+. Create a lifecycle policy that defines the appropriate phases and actions.
+See <> above.
+. <> to apply the policy to each new index.
+. <> as the initial write index.
+. <>
+as expected.
+
+[discrete]
+[[ilm-gs-alias-apply-policy]]
+=== Create a legacy index template to apply the lifecycle policy
 
 To automatically apply a lifecycle policy to the new write index on rollover,
 specify the policy in the index template used to create new indices.
 
 For example, you might create a `timeseries_template` that is applied to new indices
-whose names match the `timeseries-*` index pattern.
+whose names match the `timeseries-*` index pattern.
 
 To enable automatic rollover, the template configures two {ilm-init} settings:
 
@@ -104,8 +317,8 @@ that match the index pattern.
 * `index.lifecycle.rollover_alias` specifies the index alias to be rolled over
 when the rollover action is triggered for an index.
 
-You can use the {kib} Create template wizard to add the template.
-This wizard invokes the put template API to create the template with the options you specify.
+You can use the {kib} Create template wizard to add the template.
+This wizard invokes the put template API to create the template with the options you specify.
 
 The underlying request looks like this:
 
@@ -143,8 +356,8 @@ DELETE /_template/timeseries_template
 //////////////////////////
 
 [discrete]
-[[ilm-gs-bootstrap]]
-=== Bootstrap the initial time-series index
+[[ilm-gs-alias-bootstrap]]
+=== Bootstrap the initial time-series index with a write index alias
 
 To get things started, you need to bootstrap an initial index and
 designate it as the write index for the rollover alias specified in your index template.
@@ -178,17 +391,13 @@ You can search across all of the indices managed by the `timeseries_policy` with
 Write operations are routed to the current write index.
 
 [discrete]
-[[ilm-gs-check-progress]]
+[[ilm-gs-alias-check-progress]]
 === Check lifecycle progress
 
-To get status information for managed indices, you use the {ilm-init} explain API.
-This lets you find out things like:
-
-* What phase an index is in and when it entered that phase.
-* The current action and what step is being performed.
-* If any errors have occurred or progress is blocked.
-
-For example, the following request gets information about the `timeseries` indices:
+Retrieving the status information for managed indices is very similar to the data stream case.
+See the data stream <> for more information.
+The only difference is the indices namespace, so retrieving the progress will entail the following
+api call:
 
 [source,console]
 --------------------------------------------------
@@ -196,48 +405,11 @@ GET timeseries-*/_ilm/explain
 --------------------------------------------------
 // TEST[continued]
 
-The response below shows that the bootstrap index is waiting in the `hot` phase's `rollover` action.
-It remains in this state and {ilm-init} continues to call `attempt-rollover`
-until the rollover conditions are met.
+//////////////////////////
 
-// [[36818c6d9f434d387819c30bd9addb14]]
-[source,console-result]
+[source,console]
 --------------------------------------------------
-{
-  "indices": {
-    "timeseries-000001": {
-      "index": "timeseries-000001",
-      "managed": true,
-      "policy": "timeseries_policy", <1>
-      "lifecycle_date_millis": 1538475653281,
-      "age": "30s", <2>
-      "phase": "hot",
-      "phase_time_millis": 1538475653317,
-      "action": "rollover",
-      "action_time_millis": 1538475653317,
-      "step": "attempt-rollover", <3>
-      "step_time_millis": 1538475653317,
-      "phase_execution": {
-        "policy": "timeseries_policy",
-        "phase_definition": { <4>
-          "min_age": "0ms",
-          "actions": {
-            "rollover": {
-              "max_size": "50gb",
-              "max_age": "30d"
-            }
-          }
-        },
-        "version": 1,
-        "modified_date_in_millis": 1539609701576
-      }
-    }
-  }
-}
+DELETE /timeseries-000001
 --------------------------------------------------
-// TESTRESPONSE[skip:no way to know if we will get this response immediately]
-
-<1> The policy used to manage the index
-<2> The age of the index
-<3> The step {ilm-init} is performing on the index
-<4> The definition of the current phase (the `hot` phase)
+// TEST[continued]
+//////////////////////////
diff --git a/docs/reference/ilm/index-rollover.asciidoc b/docs/reference/ilm/index-rollover.asciidoc
index f8680251585..633127d3835 100644
--- a/docs/reference/ilm/index-rollover.asciidoc
+++ b/docs/reference/ilm/index-rollover.asciidoc
@@ -12,7 +12,23 @@ Using rolling indices enables you to:
 * Shift older, less frequently accessed data to less expensive _cold_ nodes,
 * Delete data according to your retention policies by removing entire indices.
 
-Rollover relies on three things:
+We recommend using <> to manage time-series
+data. Data streams automatically track the write index while keeping configuration to a minimum.
+
+Each data stream requires a <> that contains:
+
+* A name or wildcard (`*`) pattern for the data stream.
+
+* The data stream's timestamp field. This field must be mapped as a
+  <> or <> field datatype and must be
+  included in every document indexed to the data stream.
+
+ * The mappings and settings applied to each backing index when it's created.
+
+Data streams are designed for append-only data, where the data stream name
+can be used as the operations (read, write, rollover, shrink etc.) target.
+If your use case requires data to be updated in place, you can instead manage your time-series data using <>. However, there are a few more configuration steps and
+concepts:
 
 * An _index template_ that specifies the settings for each new index in the series.
 You optimize this configuration for ingestion, typically using as many shards as you have hot nodes.
@@ -35,4 +51,4 @@ subsequent updates are written to the new index.
 TIP: Rolling over to a new index based on size, document count, or age is preferable
 to time-based rollovers. Rolling over at an arbitrary time often results in
 many small indices, which can have a negative impact on performance and
-resource usage.
\ No newline at end of file
+resource usage.
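
The naming rule this patch documents (index names must match `^.*-\d+$`, and `.ds-timeseries-000001` is followed by `.ds-timeseries-000002`) can be sketched in a few lines of Python. This only illustrates the convention and is not Elasticsearch's implementation: `next_rollover_name` is a hypothetical helper, and the fixed six-digit padding is an assumption read off the example names in the docs.

```python
import re

# Pattern quoted in the ilm-rollover.asciidoc hunk: the name must end in "-<digits>".
ROLLOVER_NAME = re.compile(r"^(.*)-(\d+)$")


def next_rollover_name(index_name: str, width: int = 6) -> str:
    """Return the name the next index in a rollover series would get.

    Hypothetical helper for illustration only. The padding width of 6 is
    assumed from the example names (my_index-000001, .ds-timeseries-000001).
    """
    match = ROLLOVER_NAME.match(index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end in '-<digits>'")
    prefix, counter = match.groups()
    # Increment the numeric suffix and zero-pad it to the assumed width.
    return f"{prefix}-{int(counter) + 1:0{width}d}"


if __name__ == "__main__":
    print(next_rollover_name(".ds-timeseries-000001"))
    print(next_rollover_name("timeseries-000001"))
```

A name without a numeric suffix, such as `timeseries`, raises `ValueError` in this sketch, mirroring the documented precondition for rolling over an index alias.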