[DOCS] Fix data stream docs (#59818) (#60010)

This commit is contained in:
James Rodewig 2020-07-21 17:04:13 -04:00 committed by GitHub
parent 04c68ba740
commit 401e12dc2b
9 changed files with 296 additions and 432 deletions


@ -89,10 +89,9 @@ To add a mapping for a new field to a data stream, follow these steps:
. Update the index template used by the data stream. This ensures the new
field mapping is added to future backing indices created for the stream.
+
.*Example*
[%collapsible]
====
`logs_data_stream` is an existing index template used by the `logs` data stream.
--
For example, `logs_data_stream` is an existing index template used by the `logs`
data stream.
The following <<indices-templates,put index template>> request adds a mapping
for a new field, `message`, to the template.
@ -115,15 +114,13 @@ PUT /_index_template/logs_data_stream
}
----
<1> Adds a mapping for the new `message` field.
====
--
. Use the <<indices-put-mapping,put mapping API>> to add the new field mapping
to the data stream. By default, this adds the mapping to the stream's existing
backing indices, including the write index.
+
.*Example*
[%collapsible]
====
--
The following put mapping API request adds the new `message` field mapping to
the `logs` data stream.
@ -138,14 +135,12 @@ PUT /logs/_mapping
}
}
----
====
--
+
To add the mapping only to the stream's write index, set the put mapping API's
`write_index_only` query parameter to `true`.
+
.*Example*
[%collapsible]
====
--
The following put mapping request adds the new `message` field mapping only to
the `logs` stream's write index. The new field mapping is not added to the
stream's other backing indices.
@ -161,7 +156,7 @@ PUT /logs/_mapping?write_index_only=true
}
}
----
====
--
[discrete]
[[change-existing-field-mapping-in-a-data-stream]]
@ -175,10 +170,9 @@ existing field, follow these steps:
. Update the index template used by the data stream. This ensures the updated
field mapping is added to future backing indices created for the stream.
+
.*Example*
[%collapsible]
====
`logs_data_stream` is an existing index template used by the `logs` data stream.
--
For example, `logs_data_stream` is an existing index template used by the `logs`
data stream.
The following <<indices-templates,put index template>> request changes the
argument for the `host.ip` field's <<ignore-malformed,`ignore_malformed`>>
@ -207,15 +201,13 @@ PUT /_index_template/logs_data_stream
}
----
<1> Changes the `host.ip` field's `ignore_malformed` value to `true`.
====
--
. Use the <<indices-put-mapping,put mapping API>> to apply the mapping changes
to the data stream. By default, this applies the changes to the stream's
existing backing indices, including the write index.
+
.*Example*
[%collapsible]
====
--
The following <<indices-put-mapping,put mapping API>> request targets the `logs`
data stream. The request changes the argument for the `host.ip` field's
`ignore_malformed` mapping parameter to `true`.
@ -236,14 +228,12 @@ PUT /logs/_mapping
}
}
----
====
--
+
To apply the mapping changes only to the stream's write index, set the put mapping API's
`write_index_only` query parameter to `true`.
+
.*Example*
[%collapsible]
====
--
The following put mapping request changes the `host.ip` field's mapping only for
the `logs` stream's write index. The change is not applied to the stream's other
backing indices.
@ -264,7 +254,7 @@ PUT /logs/_mapping?write_index_only=true
}
}
----
====
--
Except for supported mapping parameters, we don't recommend you change the
mapping or field data type of existing fields, even in a data stream's matching
@ -285,10 +275,9 @@ follow these steps:
. Update the index template used by the data stream. This ensures the setting is
applied to future backing indices created for the stream.
+
.*Example*
[%collapsible]
====
`logs_data_stream` is an existing index template used by the `logs` data stream.
--
For example, `logs_data_stream` is an existing index template used by the `logs`
data stream.
The following <<indices-templates,put index template>> request changes the
template's `index.refresh_interval` index setting to `30s` (30 seconds).
@ -307,15 +296,13 @@ PUT /_index_template/logs_data_stream
}
----
<1> Changes the `index.refresh_interval` setting to `30s` (30 seconds).
====
--
. Use the <<indices-update-settings,update index settings API>> to update the
index setting for the data stream. By default, this applies the setting to
the stream's existing backing indices, including the write index.
+
.*Example*
[%collapsible]
====
--
The following update index settings API request updates the
`index.refresh_interval` setting for the `logs` data stream.
@ -328,7 +315,7 @@ PUT /logs/_settings
}
}
----
====
--
[discrete]
[[change-static-index-setting-for-a-data-stream]]
@ -342,10 +329,8 @@ To apply a new static setting to future backing indices, update the index
template used by the data stream. The setting is automatically applied to any
backing index created after the update.
.*Example*
[%collapsible]
====
`logs_data_stream` is an existing index template used by the `logs` data stream.
For example, `logs_data_stream` is an existing index template used by the `logs`
data stream.
The following <<indices-templates,put index template API>> request adds new
`sort.field` and `sort.order` index settings to the template.
@ -366,7 +351,6 @@ PUT /_index_template/logs_data_stream
----
<1> Adds the `sort.field` index setting.
<2> Adds the `sort.order` index setting.
====
If needed, you can <<manually-roll-over-a-data-stream,roll over the data
stream>> to immediately apply the setting to the data stream's write index. This
@ -400,10 +384,7 @@ stream will contain data from your existing stream.
You can use the resolve index API to check if the name or pattern matches any
existing indices, index aliases, or data streams. If so, you should consider
using another name or pattern.
+
.*Example*
[%collapsible]
====
--
The following resolve index API request checks for any existing indices, index
aliases, or data streams that start with `new_logs`. If not, the `new_logs*`
wildcard pattern can be used to create a new data stream.
@ -425,7 +406,7 @@ this pattern.
}
----
// TESTRESPONSE[s/"data_streams": \[ \]/"data_streams": $body.data_streams/]
====
--
. Create or update an index template. This template should contain the
mappings and settings you'd like to apply to the new data stream's backing
@ -439,10 +420,8 @@ should also contain your previously chosen name or wildcard pattern in the
TIP: If you are only adding or changing a few things, we recommend you create a
new template by copying an existing one and modifying it as needed.
+
.*Example*
[%collapsible]
====
`logs_data_stream` is an existing index template used by the
--
For example, `logs_data_stream` is an existing index template used by the
`logs` data stream.
The following <<indices-templates,put index template API>> request creates
@ -480,7 +459,7 @@ PUT /_index_template/new_logs_data_stream
<1> Changes the `@timestamp` field mapping to the `date_nanos` field data type.
<2> Adds the `sort.field` index setting.
<3> Adds the `sort.order` index setting.
====
--
. Use the <<indices-create-data-stream,create data stream API>> to manually
create the new data stream. The name of the data stream must match the name or
@ -501,9 +480,7 @@ contains both new and old data. To prevent premature data loss, you would need
to retain such a backing index until you are ready to delete its newest data.
====
+
.*Example*
[%collapsible]
====
--
The following create data stream API request targets `new_logs`, which matches
the wildcard pattern for the `new_logs_data_stream` template. Because no
existing index or data stream uses this name, this request creates the
@ -514,7 +491,7 @@ existing index or data stream uses this name, this request creates the
PUT /_data_stream/new_logs
----
// TEST[s/new_logs/new_logs_two/]
====
--
. If you do not want to mix new and old data in your new data stream, pause the
indexing of new documents. While mixing old and new data is safe, it could
@ -526,9 +503,7 @@ rollover>>, reduce the {ilm-init} poll interval. This ensures the current write
index doesn't grow too large while waiting for the rollover check. By default,
{ilm-init} checks rollover conditions every 10 minutes.
+
.*Example*
[%collapsible]
====
--
The following <<cluster-update-settings,update cluster settings API>> request
lowers the `indices.lifecycle.poll_interval` setting to `1m` (one minute).
@ -541,7 +516,7 @@ PUT /_cluster/settings
}
}
----
====
--
. Reindex your data to the new data stream using an `op_type` of `create`.
+
@ -551,9 +526,7 @@ individual backing indices as the source. You can use the
<<indices-get-data-stream,get data stream API>> to retrieve a list of backing
indices.
+
.*Example*
[%collapsible]
====
--
You plan to reindex data from the `logs` data stream into the newly created
`new_logs` data stream. However, you want to submit a separate reindex request
for each backing index in the `logs` data stream, starting with the oldest
@ -622,14 +595,12 @@ POST /_reindex
}
}
----
====
--
+
You can also use a query to reindex only a subset of documents with each
request.
+
.*Example*
[%collapsible]
====
--
The following <<docs-reindex,reindex API>> request copies documents from the
`logs` data stream to the `new_logs` data stream. The request uses a
<<query-dsl-range-query,`range` query>> to only reindex documents with a
@ -656,15 +627,13 @@ POST /_reindex
}
}
----
====
--
. If you previously changed your {ilm-init} poll interval, change it back to its
original value when reindexing is complete. This prevents unnecessary load on
the master node.
+
.*Example*
[%collapsible]
====
--
The following update cluster settings API request resets the
`indices.lifecycle.poll_interval` setting to its default value, 10 minutes.
@ -677,7 +646,7 @@ PUT /_cluster/settings
}
}
----
====
--
. Resume indexing using the new data stream. Searches on this stream will now
query your new data and the reindexed data.
@ -685,9 +654,7 @@ query your new data and the reindexed data.
. Once you have verified that all reindexed data is available in the new
data stream, you can safely remove the old stream.
+
.*Example*
[%collapsible]
====
--
The following <<indices-delete-data-stream,delete data stream API>> request
deletes the `logs` data stream. This request also deletes the stream's backing
indices and any data they contain.
@ -696,4 +663,4 @@ indices and any data they contain.
----
DELETE /_data_stream/logs
----
====
--


@ -1,152 +0,0 @@
[role="xpack"]
[[data-streams-overview]]
== Data streams overview
++++
<titleabbrev>Overview</titleabbrev>
++++
A data stream consists of one or more _backing indices_. Backing indices are
<<index-hidden,hidden>>, auto-generated indices used to store a stream's
documents.
image::images/data-streams/data-streams-diagram.svg[align="center"]
The creation of a data stream requires a matching
<<indices-templates,index template>>. This template acts as a blueprint for
the stream's backing indices. It contains:
* A name or wildcard (`*`) pattern for the data stream.
* An optional mapping for the data stream's `@timestamp` field.
+
A `@timestamp` field must be included in every document indexed to the data
stream. This field must be mapped as a <<date,`date`>> or
<<date_nanos,`date_nanos`>> field data type. If no mapping is specified in the
index template, the `date` field data type with default options is used.
* The mappings and settings applied to each backing index when it's created.
The same index template can be used to create multiple data streams.
See <<set-up-a-data-stream>>.
[discrete]
[[data-streams-generation]]
=== Generation
Each data stream tracks its _generation_: a six-digit, zero-padded integer
that acts as a cumulative count of the data stream's backing indices. This count
includes any deleted indices for the stream. The generation is incremented
whenever a new backing index is added to the stream.
When a backing index is created, the index is named using the following
convention:
[source,text]
----
.ds-<data-stream>-<generation>
----
.*Example*
[%collapsible]
====
The `web_server_logs` data stream has a generation of `34`. The most recently
created backing index for this data stream is named
`.ds-web_server_logs-000034`.
====
Because the generation increments with each new backing index, backing indices
with a higher generation contain more recent data. Backing indices with a lower
generation contain older data.
A backing index's name can change after its creation due to a
<<indices-shrink-index,shrink>>, <<snapshots-restore-snapshot,restore>>, or
other operations.
[discrete]
[[data-stream-write-index]]
=== Write index
When a read request is sent to a data stream, it routes the request to all its
backing indices. For example, a search request sent to a data stream would query
all its backing indices.
image::images/data-streams/data-streams-search-request.svg[align="center"]
However, the most recently created backing index is the data stream's only
_write index_. The data stream routes all indexing requests for new documents to
this index.
image::images/data-streams/data-streams-index-request.svg[align="center"]
You cannot add new documents to a stream's other backing indices, even by
sending requests directly to the index. This means you cannot submit the
following requests directly to any backing index except the write index:
* An <<docs-index_,index API>> request with an
<<docs-index-api-op_type,`op_type`>> of `create`. The `op_type` parameter
defaults to `create` when adding new documents.
* A <<docs-bulk,bulk API>> request using a `create` action
Because it's the only index capable of ingesting new documents, you cannot
perform operations on a write index that might hinder indexing. These
prohibited operations include:
* <<indices-clone-index,Clone>>
* <<indices-close,Close>>
* <<indices-delete-index,Delete>>
* <<freeze-index-api,Freeze>>
* <<indices-shrink-index,Shrink>>
* <<indices-split-index,Split>>
[discrete]
[[data-streams-rollover]]
=== Rollover
When a data stream is created, one backing index is automatically created.
Because this single index is also the most recently created backing index, it
acts as the stream's write index.
A <<indices-rollover-index,rollover>> creates a new backing index for a data
stream. This new backing index becomes the stream's write index, replacing
the current one, and increments the stream's generation.
In most cases, we recommend using <<index-lifecycle-management,{ilm}
({ilm-init})>> to automate rollovers for data streams. This lets you
automatically roll over the current write index when it meets specified
criteria, such as a maximum age or size.
However, you can also use the <<indices-rollover-index,rollover API>> to
manually perform a rollover. See <<manually-roll-over-a-data-stream>>.
[discrete]
[[data-streams-append-only]]
=== Append-only
For most time-series use cases, existing data is rarely, if ever, updated.
Because of this, data streams are designed to be append-only.
You can send <<add-documents-to-a-data-stream,indexing requests for new
documents>> directly to a data stream. However, you cannot send the following
requests for existing documents directly to a data stream:
* An <<docs-index_,index API>> request with an
<<docs-index-api-op_type,`op_type`>> of `index`. The `op_type` parameter
defaults to `index` for existing documents.
* A <<docs-bulk,bulk API>> request using the `delete`, `index`, or `update`
action.
* A <<docs-delete,delete API>> request
Instead, you can use the <<docs-update-by-query,update by query>> and
<<docs-delete-by-query,delete by query>> APIs to update or delete existing
documents in a data stream. See <<update-delete-docs-in-a-data-stream>>.
Alternatively, you can update or delete a document by submitting requests to the
backing index containing the document. See
<<update-delete-docs-in-a-backing-index>>.
TIP: If you frequently update or delete existing documents,
we recommend using an <<indices-add-alias,index alias>> and
<<indices-templates,index template>> instead of a data stream. You can still
use <<index-lifecycle-management,{ilm-init}>> to manage indices for the alias.


@ -21,16 +21,15 @@ A data stream is designed to give you the best of both worlds:
* The storage, scalability, and cost-saving benefits of multiple indices
You can submit indexing and search requests directly to a data stream. The
stream automatically routes the requests to a collection of hidden,
auto-generated indices that store the stream's data.
stream automatically routes the requests to a collection of hidden
_backing indices_ that store the stream's data.
You can use an <<indices-templates,index template>> and
<<index-lifecycle-management,{ilm} ({ilm-init})>> to automate the management of
these hidden indices. You can use {ilm-init} to spin up new indices, allocate
indices to different hardware, delete old indices, and take other automatic
actions based on age or size criteria you set. This lets you seamlessly scale
your data storage based on your budget, performance, resiliency, and retention
needs.
You can use <<index-lifecycle-management,{ilm} ({ilm-init})>> to automate the
management of these backing indices. {ilm-init} lets you automatically spin up
new backing indices, allocate indices to different hardware, delete old indices,
and take other automatic actions based on age or size criteria you set. Use data
streams and {ilm-init} to seamlessly scale your data storage based on your
budget, performance, resiliency, and retention needs.
[discrete]
@ -47,16 +46,142 @@ We recommend using data streams if you:
[discrete]
[[data-streams-toc]]
== In this section
[[backing-indices]]
== Backing indices
A data stream consists of one or more _backing indices_. Backing indices are
<<index-hidden,hidden>>, auto-generated indices used to store a stream's
documents.
* <<data-streams-overview>>
* <<set-up-a-data-stream>>
* <<use-a-data-stream>>
* <<data-streams-change-mappings-and-settings>>
image::images/data-streams/data-streams-diagram.svg[align="center"]
To create backing indices, each data stream requires a matching
<<indices-templates,index template>>. This template acts as a blueprint for the
stream's backing indices. It contains:
* The mappings and settings applied to each backing index when it's created.
* A name or wildcard (`*`) pattern that matches the data stream's name.
* A `data_stream` object with an empty body (`{ }`). This object indicates the
template is used for data streams.
A `@timestamp` field must be included in every document indexed to the data
stream. This field can be mapped as a <<date,`date`>> or
<<date_nanos,`date_nanos`>> field data type in the stream's matching index
template. If no mapping is specified in the template, the `date` field data type
with default options is used.
The same index template can be used to create multiple data streams.
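As a minimal sketch (the template and stream names here are hypothetical), a matching index template containing only the required parts might look like:

[source,console]
----
PUT /_index_template/my_logs_template
{
  "index_patterns": [ "my_logs*" ],
  "data_stream": { }
}
----

Any data stream whose name matches `my_logs*` would then use this template's mappings and settings, including the default `date` mapping for `@timestamp`, when creating its backing indices.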
include::data-streams-overview.asciidoc[]
[discrete]
[[data-streams-generation]]
== Generation
Each data stream tracks its _generation_: a six-digit, zero-padded integer
that acts as a cumulative count of the data stream's backing indices. This count
includes any deleted indices for the stream. The generation is incremented
whenever a new backing index is added to the stream.
When a backing index is created, the index is named using the following
convention:
[source,text]
----
.ds-<data-stream>-<generation>
----
For example, the `web_server_logs` data stream has a generation of `34`. The
most recently created backing index for this data stream is named
`.ds-web_server_logs-000034`.
Because the generation increments with each new backing index, backing indices
with a higher generation contain more recent data. Backing indices with a lower
generation contain older data.
A backing index's name can change after its creation due to a
<<indices-shrink-index,shrink>>, <<snapshots-restore-snapshot,restore>>, or
other operations. However, renaming a backing index does not detach it from a
data stream.
[discrete]
[[data-stream-read-requests]]
== Read requests
When a read request is sent to a data stream, it routes the request to all its
backing indices. For example, a search request sent to a data stream would query
all its backing indices.
image::images/data-streams/data-streams-search-request.svg[align="center"]
[discrete]
[[data-stream-write-index]]
== Write index
The most recently created backing index is the data stream's only
_write index_. The data stream routes all indexing requests for new documents to
this index.
image::images/data-streams/data-streams-index-request.svg[align="center"]
You cannot add new documents to a stream's other backing indices, even by
sending requests directly to the index.
Because it's the only index capable of ingesting new documents, you cannot
perform operations on a write index that might hinder indexing. These
prohibited operations include:
* <<indices-clone-index,Clone>>
* <<indices-close,Close>>
* <<indices-delete-index,Delete>>
* <<freeze-index-api,Freeze>>
* <<indices-shrink-index,Shrink>>
* <<indices-split-index,Split>>
[discrete]
[[data-streams-rollover]]
== Rollover
When a data stream is created, one backing index is automatically created.
Because this single index is also the most recently created backing index, it
acts as the stream's write index.
A <<indices-rollover-index,rollover>> creates a new backing index for a data
stream. This new backing index becomes the stream's write index, replacing
the current one, and increments the stream's generation.
In most cases, we recommend using <<index-lifecycle-management,{ilm}
({ilm-init})>> to automate rollovers for data streams. This lets you
automatically roll over the current write index when it meets specified
criteria, such as a maximum age or size.
However, you can also use the <<indices-rollover-index,rollover API>> to
manually perform a rollover. See <<manually-roll-over-a-data-stream>>.
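For instance, assuming a hypothetical data stream named `my_logs`, a manual rollover is a single request:

[source,console]
----
POST /my_logs/_rollover
----

This creates a new backing index, makes it the stream's write index, and increments the stream's generation.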
[discrete]
[[data-streams-append-only]]
== Append-only
For most time-series use cases, existing data is rarely, if ever, updated.
Because of this, data streams are designed to be append-only.
You can send <<add-documents-to-a-data-stream,indexing requests for new
documents>> directly to a data stream. However, you cannot send update or
deletion requests for existing documents directly to a data stream.
Instead, you can use the <<docs-update-by-query,update by query>> and
<<docs-delete-by-query,delete by query>> APIs to update or delete existing
documents in a data stream. See <<update-docs-in-a-data-stream-by-query>> and <<delete-docs-in-a-data-stream-by-query>>.
If needed, you can update or delete a document by submitting requests to the
backing index containing the document. See
<<update-delete-docs-in-a-backing-index>>.
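As a sketch, assuming a hypothetical `my_logs` stream whose documents contain a `user.id` field, an update by query request might look like:

[source,console]
----
POST /my_logs/_update_by_query
{
  "query": {
    "match": { "user.id": "l7gk7f82" }
  },
  "script": {
    "source": "ctx._source.user.id = params.new_id",
    "params": { "new_id": "XgdX0NoX" }
  }
}
----

The query selects the existing documents to change, and the script rewrites them in place across the stream's backing indices.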
TIP: If you frequently update or delete existing documents,
we recommend using an <<indices-add-alias,index alias>> and
<<indices-templates,index template>> instead of a data stream. You can still
use <<index-lifecycle-management,{ilm-init}>> to manage indices for the alias.
include::set-up-a-data-stream.asciidoc[]
include::use-a-data-stream.asciidoc[]
include::change-mappings-and-settings.asciidoc[]


@ -52,12 +52,9 @@ To use {ilm-init} with a data stream, you must
should contain the automated actions to take on backing indices and the
triggers for such actions.
TIP: While optional, we recommend using {ilm-init} to scale data streams in
production.
TIP: While optional, we recommend using {ilm-init} to manage the backing indices
associated with a data stream.
.*Example*
[%collapsible]
====
The following <<ilm-put-lifecycle,create lifecycle policy API>> request
configures the `logs_policy` lifecycle policy.
@ -89,83 +86,37 @@ PUT /_ilm/policy/logs_policy
}
}
----
====
[discrete]
[[create-a-data-stream-template]]
=== Create an index template for a data stream
Each data stream requires an <<indices-templates,index template>>. The data
stream uses this template to create its backing indices.
A data stream uses an index template to configure its backing indices. A
template for a data stream must specify:
An index template for a data stream must contain:
* An index pattern that matches the name of the stream.
* A name or wildcard (`*`) pattern for the data stream in the `index_patterns`
property.
+
You can use the resolve index API to check if the name or pattern
matches any existing indices, index aliases, or data streams. If so, you should
consider using another name or pattern.
+
.*Example*
[%collapsible]
====
The following resolve index API request checks for any existing indices, index
aliases, or data streams that start with `logs`. If not, the `logs*`
wildcard pattern can be used to create a new data stream.
* An empty `data_stream` object that indicates the template is used for data
streams.
[source,console]
----
GET /_resolve/index/logs*
----
// TEST[continued]
* The mappings and settings for the stream's backing indices.
The API returns the following response, indicating no existing targets match
this pattern.
Every document indexed to a data stream must have a `@timestamp` field. This
field can be mapped as a <<date,`date`>> or <<date_nanos,`date_nanos`>> field
data type by the stream's index template. This mapping can include other
<<mapping-params,mapping parameters>>, such as <<mapping-date-format,`format`>>.
If the template does not specify a mapping, the `@timestamp` field is mapped as
a `date` field with default options.
[source,console-result]
----
{
"indices" : [ ],
"aliases" : [ ],
"data_streams" : [ ]
}
----
====
* A `data_stream` object with an empty body (`{ }`).
The template can also contain:
* An optional field mapping for the `@timestamp` field. Both the <<date,`date`>> and
<<date_nanos,`date_nanos`>> field data types are supported. If no mapping is specified,
a <<date,`date`>> field data type with default options is used.
+
This mapping can include other <<mapping-params,mapping parameters>>, such as
<<mapping-date-format,`format`>>.
+
IMPORTANT: Carefully consider the `@timestamp` field's mapping, including
its <<mapping-params,mapping parameters>>.
Once the stream is created, you can only update the `@timestamp` field's mapping
by reindexing the data stream. See
<<data-streams-use-reindex-to-change-mappings-settings>>.
* If you intend to use {ilm-init}, the
<<configure-a-data-stream-ilm-policy,lifecycle policy>> in the
`index.lifecycle.name` setting.
You can also specify other mappings and settings you'd like to apply to the
stream's backing indices.
We recommend using {ilm-init} to manage a data stream's backing indices. Specify
the name of the lifecycle policy with the `index.lifecycle.name` setting.
TIP: We recommend you carefully consider which mappings and settings to include
in this template before creating a data stream. Later changes to the mappings or
settings of a stream's backing indices may require reindexing. See
<<data-streams-change-mappings-and-settings>>.
.*Example*
[%collapsible]
====
The following <<indices-templates,put index template API>> request
configures the `logs_data_stream` template.
@ -187,7 +138,7 @@ PUT /_index_template/logs_data_stream
----
// TEST[continued]
The following template maps `@timestamp` as a `date_nanos` field.
Alternatively, the following template maps `@timestamp` as a `date_nanos` field.
[source,console]
----
@ -211,7 +162,6 @@ PUT /_index_template/logs_data_stream
<1> Maps `@timestamp` as a `date_nanos` field. You can include other supported
mapping parameters in this field mapping.
====
NOTE: You cannot delete an index template that's in use by a data stream.
This would prevent the data stream from creating new backing indices.
@ -220,24 +170,26 @@ This would prevent the data stream from creating new backing indices.
[[create-a-data-stream]]
=== Create a data stream
With an index template, you can create a data stream using one of two
methods:
You can create a data stream using one of two methods:
* Submit an <<add-documents-to-a-data-stream,indexing request>> to a target
* <<index-documents-to-create-a-data-stream>>
* <<manually-create-a-data-stream>>
[discrete]
[[index-documents-to-create-a-data-stream]]
==== Index documents to create a data stream
You can automatically generate a data stream using an indexing request. Submit
an <<add-documents-to-a-data-stream,indexing request>> to a target
matching the name or wildcard pattern defined in the template's `index_patterns`
property.
+
--
If the indexing request's target doesn't exist, {es} creates the data stream and
uses the target name as the name for the stream.
NOTE: Data streams support only specific types of indexing requests. See
<<add-documents-to-a-data-stream>>.
[[index-documents-to-create-a-data-stream]]
.*Example: Index documents to create a data stream*
[%collapsible]
====
The following <<docs-index_,index API>> request targets `logs`, which matches
the wildcard pattern for the `logs_data_stream` template. Because no existing
index or data stream uses this name, this request creates the `logs` data stream
@ -278,18 +230,16 @@ new `logs` data stream.
}
----
// TESTRESPONSE[s/"_id": "qecQmXIBT4jB8tq1nG0j"/"_id": $body._id/]
====
--
* Use the <<indices-create-data-stream,create data stream API>> to manually
create a data stream. The name of the data stream must match the
name or wildcard pattern defined in the template's `index_patterns` property.
+
--
.*Example: Manually create a data stream*
[%collapsible]
====
The following <<indices-create-data-stream,create data stream API>> request
[discrete]
[[manually-create-a-data-stream]]
==== Manually create a data stream
You can use the <<indices-create-data-stream,create data stream API>> to
manually create a data stream. The name of the data stream must match the name
or wildcard pattern defined in the template's `index_patterns` property.
The following create data stream request
targets `logs_alt`, which matches the wildcard pattern for the
`logs_data_stream` template. Because no existing index or data stream uses this
name, this request creates the `logs_alt` data stream.
@ -299,8 +249,6 @@ name, this request creates the `logs_alt` data stream.
PUT /_data_stream/logs_alt
----
// TEST[continued]
====
--
[discrete]
[[get-info-about-a-data-stream]]
@ -320,9 +268,6 @@ template
This is also a handy way to verify that a recently created data stream exists.
.*Example*
[%collapsible]
====
The following get data stream API request retrieves information about the
`logs` data stream.
@ -377,7 +322,6 @@ contains information about the stream's write index, `.ds-logs-000002`.
<1> Last item in the `indices` array for the `logs` data stream. This item
contains information about the stream's current write index, `.ds-logs-000002`.
====
[discrete]
[[secure-a-data-stream]]
@ -393,9 +337,6 @@ data. See <<data-stream-privileges>>.
You can use the <<indices-delete-data-stream,delete data stream API>> to delete
a data stream and its backing indices.
.*Example*
[%collapsible]
====
The following delete data stream API request deletes the `logs` data stream. This
request also deletes the stream's backing indices and any data they contain.
@ -404,7 +345,6 @@ request also deletes the stream's backing indices and any data they contain.
DELETE /_data_stream/logs
----
// TEST[continued]
====
////
[source,console]


@ -11,7 +11,8 @@ the following:
* <<manually-roll-over-a-data-stream>>
* <<open-closed-backing-indices>>
* <<reindex-with-a-data-stream>>
* <<update-delete-docs-in-a-data-stream>>
* <<update-docs-in-a-data-stream-by-query>>
* <<delete-docs-in-a-data-stream-by-query>>
* <<update-delete-docs-in-a-backing-index>>
////
@ -55,18 +56,34 @@ DELETE /_index_template/*
[[add-documents-to-a-data-stream]]
=== Add documents to a data stream
You can add documents to a data stream using the following requests:
You can add documents to a data stream using two types of indexing requests:
* <<data-streams-individual-indexing-requests>>
* <<data-streams-bulk-indexing-requests>>
Adding a document to a data stream adds the document to the stream's current
<<data-stream-write-index,write index>>.
You cannot add new documents to a stream's other backing indices, even by
sending requests directly to the index.
[discrete]
[[data-streams-individual-indexing-requests]]
==== Individual indexing requests
You can use an <<docs-index_,index API>> request with an
<<docs-index-api-op_type,`op_type`>> of `create` to add individual documents
to a data stream.
NOTE: The `op_type` parameter defaults to `create` when adding new documents.
.*Example: Index API request*
[%collapsible]
====
The following index API request adds a new document to the `logs` data
stream.
[source,console]
----
POST /logs/_doc/
{
  "@timestamp": "2020-12-07T11:06:07.000Z",
  "user": {
    "id": "8a4f500d"
  },
"message": "Login successful"
}
----
====
IMPORTANT: You cannot add new documents to a data stream using the index API's
`PUT /<target>/_doc/<_id>` request format. To specify a document ID, use the
`PUT /<target>/_create/<_id>` format instead.
[discrete]
[[data-streams-bulk-indexing-requests]]
==== Bulk indexing requests
You can use the <<docs-bulk,bulk API>> to add multiple documents to a data
stream in a single request. Each action in the bulk request must use the
`create` action.
NOTE: Data streams do not support other bulk actions, such as `index`.
.*Example: Bulk API request*
[%collapsible]
====
The following bulk API request adds several new documents to
the `logs` data stream. Note that only the `create` action is used.
[source,console]
----
PUT /logs/_bulk?refresh
{"create":{ }}
{ "@timestamp": "2020-12-09T11:07:08.000Z", "user": { "id": "l7gk7f82" }, "message": "Logout successful" }
----
====
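A bulk request body is newline-delimited JSON: each document is represented by a `create` action line followed by a source line. As a rough illustration (a hypothetical Python helper, not part of any official Elasticsearch client), you could assemble such a body like this:

```python
import json

def build_bulk_body(documents):
    """Build an NDJSON bulk body in which every action is `create`,
    the only bulk action data streams accept."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"create": {}}))  # action line
        lines.append(json.dumps(doc))             # document source line
    return "\n".join(lines) + "\n"                # a bulk body must end with a newline

docs = [
    {"@timestamp": "2020-12-08T11:04:05.000Z", "user": {"id": "vlb44hny"}, "message": "Login attempt failed"},
    {"@timestamp": "2020-12-09T11:07:08.000Z", "user": {"id": "l7gk7f82"}, "message": "Logout successful"},
]
body = build_bulk_body(docs)
```

A body like this would be sent as `PUT /logs/_bulk?refresh` with the `Content-Type: application/x-ndjson` header.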
[discrete]
[[data-streams-index-with-an-ingest-pipeline]]
==== Index with an ingest pipeline
You can use an <<ingest,ingest pipeline>> with an indexing request to
pre-process data before it's indexed to a data stream.
.*Example: Ingest pipeline*
[%collapsible]
====
The following <<put-pipeline-api,put pipeline API>> request creates the
`lowercase_message_field` ingest pipeline. The pipeline uses the
<<lowercase-processor,`lowercase` ingest processor>> to change the `message`
////
[source,console]
----
DELETE /_ingest/pipeline/lowercase_message_field
----
// TEST[continued]
////
====
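For intuition only, the effect of the `lowercase` processor on the `message` field can be sketched in Python. This mimics the transform the pipeline applies before indexing; it is not the processor's implementation, and the helper name is hypothetical:

```python
def lowercase_message(doc):
    """Mimic a `lowercase` processor configured for the `message` field."""
    if isinstance(doc.get("message"), str):
        # Return a copy with the lowercased field; other fields pass through.
        return {**doc, "message": doc["message"].lower()}
    return doc

doc = {"user": {"id": "8a4f500d"}, "message": "LOGIN Successful"}
print(lowercase_message(doc)["message"])  # login successful
```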
[discrete]
[[search-a-data-stream]]
The following search APIs support data streams:

////
* <<eql-search-api, EQL search>>
////
The following <<search-search,search API>> request searches the `logs` data
stream for documents with a timestamp between today and yesterday that also have
a `message` value of `login successful`.
[source,console]
----
GET /logs/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "message": "login successful"
        }
      },
      "filter": {
        "range": {
          "@timestamp": {
            "gte": "now-1d/d",
            "lt": "now/d"
          }
        }
      }
    }
  }
}
----
You can use a comma-separated list or wildcard (`*`) expression to search
multiple data streams, indices, and index aliases in the same request.
The following request searches the `logs` and `logs_alt` data streams, which are
specified as a comma-separated list in the request path.
[source,console]
----
GET /logs,logs_alt/_search
----
[discrete]
[[get-stats-for-a-data-stream]]
manually perform a rollover. This can be useful if you want to
<<data-streams-change-mappings-and-settings,apply mapping or setting changes>>
to the stream's write index after updating a data stream's template.
The following <<indices-rollover-index,rollover API>> request submits a manual
rollover request for the `logs` data stream.
[source,console]
----
POST /logs/_rollover/
----
[discrete]
[[open-closed-backing-indices]]
You may <<indices-close,close>> one or more of a data stream's backing indices
as part of its {ilm-init} lifecycle or another workflow. A closed backing index
cannot be searched, even for searches targeting its data stream. You also can't
<<update-docs-in-a-data-stream-by-query,update>> or
<<delete-docs-in-a-data-stream-by-query,delete>> documents in a closed index.
You can re-open individual backing indices by sending an
<<indices-open-close,open request>> directly to the index.
You can also conveniently re-open all closed backing indices for a data stream
by sending an open request directly to the stream.
The following <<cat-indices,cat indices>> API request retrieves the status for
the `logs` data stream's backing indices.
[source,txt]
----
index           status
.ds-logs-000003 open
----
// TESTRESPONSE[non_json]
[discrete]
[[reindex-with-a-data-stream]]
TIP: If you only want to update the mappings or settings of a data stream's
write index, we recommend you update the <<create-a-data-stream-template,data
stream's template>> and perform a <<manually-roll-over-a-data-stream,rollover>>.
The following reindex request copies documents from the `archive` index alias to
the existing `logs` data stream. Because the destination is a data stream, the
request's `op_type` is `create`.
[source,console]
----
POST /_reindex
{
  "source": {
    "index": "archive"
  },
  "dest": {
    "index": "logs",
    "op_type": "create"
  }
}
----
// TEST[continued]
You can also reindex documents from a data stream to an index, index
alias, or data stream.
The following reindex request copies documents from the `logs` data stream
to the existing `archive` index alias. Because the destination is not a data
stream, the `op_type` does not need to be specified.
[source,console]
----
POST /_reindex
{
  "source": {
    "index": "logs"
  },
  "dest": {
    "index": "archive"
  }
}
----
// TEST[continued]
[discrete]
[[update-docs-in-a-data-stream-by-query]]
=== Update documents in a data stream by query
You cannot send indexing or update requests for existing documents directly to a
data stream. These prohibited requests include:
* An <<docs-index_,index API>> request with an
<<docs-index-api-op_type,`op_type`>> of `index`. The `op_type` parameter
defaults to `index` for existing documents.
* A <<docs-bulk,bulk API>> request using the `index` or `update`
action.
Instead, you can use the <<docs-update-by-query,update by query API>> to update
documents in a data stream that match a provided query.
The following update by query request updates documents in the `logs` data
stream with a `user.id` of `l7gk7f82`. The request uses a
<<modules-scripting-using,script>> to assign matching documents a new `user.id`
value of `XgdX0NoX`.
[source,console]
----
POST /logs/_update_by_query
{
  "query": {
    "match": {
      "user.id": "l7gk7f82"
    }
  },
  "script": {
    "source": "ctx._source.user.id = params.new_id",
    "params": {
      "new_id": "XgdX0NoX"
    }
}
}
----
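The script in the request runs once per matching document. Its per-document effect is equivalent to this Python sketch (an illustration only, not the Painless engine; the helper name is hypothetical):

```python
def apply_update(docs, old_id, new_id):
    """Assign new_id to every document whose user.id equals old_id."""
    updated = 0
    for doc in docs:
        if doc.get("user", {}).get("id") == old_id:
            doc["user"]["id"] = new_id  # same effect as the update script
            updated += 1
    return updated

docs = [
    {"user": {"id": "l7gk7f82"}, "message": "Logout successful"},
    {"user": {"id": "8a4f500d"}, "message": "Login successful"},
]
count = apply_update(docs, "l7gk7f82", "XgdX0NoX")
```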
[discrete]
[[delete-docs-in-a-data-stream-by-query]]
=== Delete documents in a data stream by query
You cannot send document deletion requests directly to a data stream. These
prohibited requests include:
* A <<docs-delete,delete API>> request
* A <<docs-bulk,bulk API>> request using the `delete` action.
Instead, you can use the <<docs-delete-by-query,delete by query API>> to delete
documents in a data stream that match a provided query.
The following delete by query request deletes documents in the `logs` data
stream with a `user.id` of `vlb44hny`.
[source,console]
----
POST /logs/_delete_by_query
{
  "query": {
    "match": {
      "user.id": "vlb44hny"
    }
}
}
----
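Delete by query removes every document that matches the query; on the matching set, the effect can be sketched as (hypothetical helper, for illustration only):

```python
def delete_by_user_id(docs, user_id):
    """Drop every document whose user.id matches, keeping the rest."""
    return [d for d in docs if d.get("user", {}).get("id") != user_id]

docs = [
    {"user": {"id": "vlb44hny"}, "message": "Login attempt failed"},
    {"user": {"id": "l7gk7f82"}, "message": "Logout successful"},
]
remaining = delete_by_user_id(docs, "vlb44hny")
```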
[discrete]
[[update-delete-docs-in-a-backing-index]]
If you want to update a document, you must also get its current sequence
number and primary term.
You can use a <<search-a-data-stream,search request>> to retrieve this
information.
The following search request retrieves documents in the `logs` data stream with
a `user.id` of `yWIumJd7`. By default, this search returns the document ID and
backing index for any matching documents.
<2> Document ID for the document
<3> Current sequence number for the document
<4> Primary term for the document
You can use an <<docs-index_,index API>> request to update an individual
document. To prevent an accidental overwrite, this request must include valid
`if_seq_no` and `if_primary_term` arguments.
The following index API request updates an existing document in the `logs` data
stream. The request targets document ID `bfspvnIBr7VVZlfp2lqX` in the
`.ds-logs-000003` backing index.
[source,console]
----
PUT /.ds-logs-000003/_doc/bfspvnIBr7VVZlfp2lqX?if_seq_no=0&if_primary_term=1
{
  "@timestamp": "2020-12-07T11:06:07.000Z",
  "user": {
    "id": "8a4f500d"
  },
"message": "Login successful"
}
----
You use the <<docs-delete,delete API>> to delete individual documents. Deletion
requests do not require a sequence number or primary term.
The following delete API request deletes an existing document in the `logs` data
stream. The request targets document ID `bfspvnIBr7VVZlfp2lqX` in the
`.ds-logs-000003` backing index.
[source,console]
----
DELETE /.ds-logs-000003/_doc/bfspvnIBr7VVZlfp2lqX
----
You can use the <<docs-bulk,bulk API>> to delete or update multiple documents in
one request using `delete`, `index`, or `update` actions.
@ -732,9 +724,6 @@ If the action type is `index`, the action must include valid
<<bulk-optimistic-concurrency-control,`if_seq_no` and `if_primary_term`>>
arguments.
The following bulk API request uses an `index` action to update an existing
document in the `logs` data stream.
[source,console]
----
PUT /_bulk?refresh
{ "index": { "_index": ".ds-logs-000003", "_id": "bfspvnIBr7VVZlfp2lqX", "if_seq_no": 0, "if_primary_term": 1 } }
{ "@timestamp": "2020-12-07T11:06:07.000Z", "user": { "id": "8a4f500d" }, "message": "Login successful" }
----


NOTE: <<data-streams,Data streams>> support only the `create` action. To update
or delete a document in a data stream, you must target the backing index
containing the document. See <<update-delete-docs-in-a-backing-index>>.
`update` expects that the partial doc, upsert,
and script and its options are specified on the next line.


NOTE: You cannot send deletion requests directly to a data stream. To delete a
document in a data stream, you must target the backing index containing the
document. See <<update-delete-docs-in-a-backing-index>>.
[float]
[[optimistic-concurrency-control-delete]]


Adds a JSON document to the specified data stream or index and makes
it searchable. If the target is an index and the document already exists,
the request updates the document and increments its version.
NOTE: You cannot use the index API to send update requests for existing
documents to a data stream. See <<update-docs-in-a-data-stream-by-query>>
and <<update-delete-docs-in-a-backing-index>>.
[[docs-index-api-request]]


to control access to a data stream. Any role or user granted privileges to a
data stream is automatically granted the same privileges to its backing
indices.
`logs` is a data stream that consists of two backing indices: `.ds-logs-000001`
and `.ds-logs-000002`.
////
[source,console]
----
DELETE /_index_template/*
----
// TEST[continued]
////
[[index-alias-privileges]]
==== Index alias privileges