2020-06-10 14:03:46 -04:00
|
|
|
[[set-up-a-data-stream]]
|
|
|
|
== Set up a data stream
|
|
|
|
|
|
|
|
To set up a data stream, follow these steps:
|
|
|
|
|
|
|
|
. Check the <<data-stream-prereqs, prerequisites>>.
|
|
|
|
. <<configure-a-data-stream-ilm-policy>>.
|
|
|
|
. <<create-a-data-stream-template>>.
|
|
|
|
. <<create-a-data-stream>>.
|
2020-06-15 09:48:58 -04:00
|
|
|
. <<get-info-about-a-data-stream>> to verify it exists.
|
2020-06-10 14:03:46 -04:00
|
|
|
|
|
|
|
After you set up a data stream, you can <<use-a-data-stream, use the data
|
|
|
|
stream>> for indexing, searches, and other supported operations.
|
|
|
|
|
2020-06-15 08:52:43 -04:00
|
|
|
If you no longer need it, you can <<delete-a-data-stream,delete a data stream>>
|
|
|
|
and its backing indices.
|
|
|
|
|
2020-06-10 14:03:46 -04:00
|
|
|
[discrete]
|
|
|
|
[[data-stream-prereqs]]
|
|
|
|
=== Prerequisites
|
|
|
|
|
|
|
|
* {es} data streams are intended for time-series data only. Each document
|
|
|
|
indexed to a data stream must contain a shared timestamp field.
|
|
|
|
+
|
|
|
|
TIP: Data streams work well with most common log formats. While no schema is
|
|
|
|
required to use data streams, we recommend the {ecs-ref}[Elastic Common Schema
|
|
|
|
(ECS)].
|
|
|
|
|
2020-06-11 11:32:09 -04:00
|
|
|
* Data streams are designed to be <<data-streams-append-only,append-only>>.
|
|
|
|
While you can index new documents directly to a data stream, you cannot use a
|
|
|
|
data stream to directly update or delete individual documents. To update or
|
|
|
|
delete specific documents in a data stream, submit a <<docs-delete,delete>> or
|
|
|
|
<<docs-update,update>> API request to the backing index containing the document.
|
2020-06-10 14:03:46 -04:00
|
|
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
[[configure-a-data-stream-ilm-policy]]
|
|
|
|
=== Optional: Configure an {ilm-init} lifecycle policy for a data stream
|
|
|
|
|
|
|
|
You can use <<index-lifecycle-management,{ilm} ({ilm-init})>> to automatically
|
|
|
|
manage a data stream's backing indices. For example, you could use {ilm-init}
|
|
|
|
to:
|
|
|
|
|
|
|
|
* Spin up a new write index for the data stream when the current one reaches a
|
|
|
|
certain size or age.
|
|
|
|
* Move older backing indices to slower, less expensive hardware.
|
|
|
|
* Delete stale backing indices to enforce data retention standards.
|
|
|
|
|
|
|
|
To use {ilm-init} with a data stream, you must
|
|
|
|
<<set-up-lifecycle-policy,configure a lifecycle policy>>. This lifecycle policy
|
|
|
|
should contain the automated actions to take on backing indices and the
|
|
|
|
triggers for such actions.
|
|
|
|
|
|
|
|
TIP: While optional, we recommend using {ilm-init} to scale data streams in
|
|
|
|
production.
|
|
|
|
|
|
|
|
.*Example*
|
|
|
|
[%collapsible]
|
|
|
|
====
|
|
|
|
The following <<ilm-put-lifecycle,create lifecycle policy API>> request
|
|
|
|
configures the `logs_policy` lifecycle policy.
|
|
|
|
|
|
|
|
The `logs_policy` policy uses the <<ilm-rollover,`rollover` action>> to create a
|
2020-06-11 11:32:09 -04:00
|
|
|
new <<data-stream-write-index,write index>> for the data stream when the current
|
|
|
|
one reaches 25GB in size. The policy also deletes backing indices 30 days after
|
|
|
|
their rollover.
|
2020-06-10 14:03:46 -04:00
|
|
|
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
PUT /_ilm/policy/logs_policy
|
|
|
|
{
|
|
|
|
"policy": {
|
|
|
|
"phases": {
|
|
|
|
"hot": {
|
|
|
|
"actions": {
|
|
|
|
"rollover": {
|
|
|
|
"max_size": "25GB"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"delete": {
|
|
|
|
"min_age": "30d",
|
|
|
|
"actions": {
|
|
|
|
"delete": {}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
----
|
|
|
|
====
|
|
|
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
[[create-a-data-stream-template]]
|
2020-06-26 11:52:58 -04:00
|
|
|
=== Create an index template for a data stream
|
2020-06-10 14:03:46 -04:00
|
|
|
|
2020-06-26 11:52:58 -04:00
|
|
|
Each data stream requires an <<indices-templates,index template>>. The data
|
2020-06-10 14:03:46 -04:00
|
|
|
stream uses this template to create its backing indices.
|
|
|
|
|
2020-06-26 11:52:58 -04:00
|
|
|
Index templates for data streams must contain:
|
2020-06-10 14:03:46 -04:00
|
|
|
|
|
|
|
* A name or wildcard (`*`) pattern for the data stream in the `index_patterns`
|
2020-06-16 16:28:42 -04:00
|
|
|
property.
|
|
|
|
+
|
|
|
|
You can use the resolve index API to check if the name or pattern
|
|
|
|
matches any existing indices, index aliases, or data streams. If so, you should
|
|
|
|
consider using another name or pattern.
|
|
|
|
+
|
|
|
|
.*Example*
|
|
|
|
[%collapsible]
|
|
|
|
====
|
|
|
|
The following resolve index API request checks for any existing indices, index
|
|
|
|
aliases, or data streams that start with `logs`. If not, the `logs*`
|
|
|
|
wildcard pattern can be used to create a new data stream.
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
GET /_resolve/index/logs*
|
|
|
|
----
|
|
|
|
// TEST[continued]
|
|
|
|
|
|
|
|
The API returns the following response, indicating no existing targets match
|
|
|
|
this pattern.
|
|
|
|
|
|
|
|
[source,console-result]
|
|
|
|
----
|
|
|
|
{
|
|
|
|
"indices" : [ ],
|
|
|
|
"aliases" : [ ],
|
|
|
|
"data_streams" : [ ]
|
|
|
|
}
|
|
|
|
----
|
|
|
|
====
|
2020-06-10 14:03:46 -04:00
|
|
|
|
|
|
|
* A `data_stream` definition containing the `timestamp_field` property.
|
|
|
|
This timestamp field must be included in every document indexed to the data
|
|
|
|
stream.
|
|
|
|
|
|
|
|
* A <<date,`date`>> or <<date_nanos,`date_nanos`>> field mapping for the
|
2020-06-24 17:21:26 -04:00
|
|
|
timestamp field specified in the `timestamp_field` property.
|
|
|
|
+
|
|
|
|
IMPORTANT: Carefully consider the timestamp field's mapping, including
|
|
|
|
<<mapping-params,mapping parameters>> such as <<mapping-date-format,`format`>>.
|
|
|
|
Once the stream is created, you can only update the timestamp field's mapping by
|
|
|
|
reindexing the data stream. See
|
|
|
|
<<data-streams-use-reindex-to-change-mappings-settings>>.
|
2020-06-10 14:03:46 -04:00
|
|
|
|
|
|
|
* If you intend to use {ilm-init}, you must specify the
|
2020-06-15 08:52:43 -04:00
|
|
|
<<configure-a-data-stream-ilm-policy,lifecycle policy>> in the
|
2020-06-10 14:03:46 -04:00
|
|
|
`index.lifecycle.name` setting.
|
|
|
|
|
|
|
|
You can also specify other mappings and settings you'd like to apply to the
|
|
|
|
stream's backing indices.
|
|
|
|
|
2020-06-16 16:20:32 -04:00
|
|
|
TIP: We recommend you carefully consider which mappings and settings to include
|
|
|
|
in this template before creating a data stream. Later changes to the mappings or
|
|
|
|
settings of a stream's backing indices may require reindexing. See
|
|
|
|
<<data-streams-change-mappings-and-settings>>.
|
|
|
|
|
2020-06-10 14:03:46 -04:00
|
|
|
.*Example*
|
|
|
|
[%collapsible]
|
|
|
|
====
|
2020-06-26 11:52:58 -04:00
|
|
|
The following <<indices-templates,put index template API>> request
|
2020-06-10 14:03:46 -04:00
|
|
|
configures the `logs_data_stream` template.
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
PUT /_index_template/logs_data_stream
|
|
|
|
{
|
|
|
|
"index_patterns": [ "logs*" ],
|
|
|
|
"data_stream": {
|
|
|
|
"timestamp_field": "@timestamp"
|
|
|
|
},
|
|
|
|
"template": {
|
|
|
|
"mappings": {
|
|
|
|
"properties": {
|
|
|
|
"@timestamp": {
|
|
|
|
"type": "date"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"settings": {
|
|
|
|
"index.lifecycle.name": "logs_policy"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
----
|
|
|
|
// TEST[continued]
|
|
|
|
====
|
|
|
|
|
2020-06-26 11:52:58 -04:00
|
|
|
NOTE: You cannot delete an index template that's in use by a data stream.
|
2020-06-23 09:01:17 -04:00
|
|
|
This would prevent the data stream from creating new backing indices.
|
|
|
|
|
2020-06-10 14:03:46 -04:00
|
|
|
[discrete]
|
|
|
|
[[create-a-data-stream]]
|
|
|
|
=== Create a data stream
|
|
|
|
|
2020-06-26 11:52:58 -04:00
|
|
|
With an index template, you can create a data stream using one of two
|
2020-06-10 14:03:46 -04:00
|
|
|
methods:
|
|
|
|
|
|
|
|
* Submit an <<add-documents-to-a-data-stream,indexing request>> to a target
|
|
|
|
matching the name or wildcard pattern defined in the template's `index_patterns`
|
|
|
|
property.
|
|
|
|
+
|
|
|
|
--
|
|
|
|
If the indexing request's target doesn't exist, {es} creates the data stream and
|
|
|
|
uses the target name as the name for the stream.
|
|
|
|
|
|
|
|
NOTE: Data streams support only specific types of indexing requests. See
|
|
|
|
<<add-documents-to-a-data-stream>>.
|
|
|
|
|
2020-06-16 16:20:32 -04:00
|
|
|
[[index-documents-to-create-a-data-stream]]
|
2020-06-10 14:03:46 -04:00
|
|
|
.*Example: Index documents to create a data stream*
|
|
|
|
[%collapsible]
|
|
|
|
====
|
|
|
|
The following <<docs-index_,index API>> request targets `logs`, which matches
|
|
|
|
the wildcard pattern for the `logs_data_stream` template. Because no existing
|
|
|
|
index or data stream uses this name, this request creates the `logs` data stream
|
|
|
|
and indexes the document to it.
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
POST /logs/_doc/
|
|
|
|
{
|
|
|
|
"@timestamp": "2020-12-06T11:04:05.000Z",
|
|
|
|
"user": {
|
|
|
|
"id": "vlb44hny"
|
|
|
|
},
|
|
|
|
"message": "Login attempt failed"
|
|
|
|
}
|
|
|
|
----
|
|
|
|
// TEST[continued]
|
|
|
|
|
|
|
|
The API returns the following response. Note the `_index` property contains
|
|
|
|
`.ds-logs-000001`, indicating the document was indexed to the write index of the
|
|
|
|
new `logs` data stream.
|
|
|
|
|
|
|
|
[source,console-result]
|
|
|
|
----
|
|
|
|
{
|
|
|
|
"_index": ".ds-logs-000001",
|
|
|
|
"_id": "qecQmXIBT4jB8tq1nG0j",
|
|
|
|
"_type": "_doc",
|
|
|
|
"_version": 1,
|
|
|
|
"result": "created",
|
|
|
|
"_shards": {
|
|
|
|
"total": 2,
|
|
|
|
"successful": 1,
|
|
|
|
"failed": 0
|
|
|
|
},
|
|
|
|
"_seq_no": 0,
|
|
|
|
"_primary_term": 1
|
|
|
|
}
|
|
|
|
----
|
|
|
|
// TESTRESPONSE[s/"_id": "qecQmXIBT4jB8tq1nG0j"/"_id": $body._id/]
|
|
|
|
====
|
|
|
|
--
|
|
|
|
|
|
|
|
* Use the <<indices-create-data-stream,create data stream API>> to manually
|
|
|
|
create a data stream. The name of the data stream must match the
|
|
|
|
name or wildcard pattern defined in the template's `index_patterns` property.
|
|
|
|
+
|
|
|
|
--
|
|
|
|
.*Example: Manually create a data stream*
|
|
|
|
[%collapsible]
|
|
|
|
====
|
|
|
|
The following <<indices-create-data-stream,create data stream API>> request
|
|
|
|
targets `logs_alt`, which matches the wildcard pattern for the
|
|
|
|
`logs_data_stream` template. Because no existing index or data stream uses this
|
|
|
|
name, this request creates the `logs_alt` data stream.
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
PUT /_data_stream/logs_alt
|
|
|
|
----
|
|
|
|
// TEST[continued]
|
|
|
|
====
|
|
|
|
--
|
|
|
|
|
|
|
|
////
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
DELETE /_data_stream/logs
|
|
|
|
|
|
|
|
DELETE /_data_stream/logs_alt
|
|
|
|
|
|
|
|
DELETE /_index_template/logs_data_stream
|
|
|
|
|
|
|
|
DELETE /_ilm/policy/logs_policy
|
|
|
|
----
|
|
|
|
// TEST[continued]
|
2020-06-15 08:52:43 -04:00
|
|
|
////
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
[[get-info-about-a-data-stream]]
|
|
|
|
=== Get information about a data stream
|
|
|
|
|
|
|
|
You can use the <<indices-get-data-stream,get data stream API>> to get
|
|
|
|
information about one or more data streams, including:
|
|
|
|
|
|
|
|
* The timestamp field
|
|
|
|
* The current backing indices, which is returned as an array. The last item in
|
|
|
|
the array contains information about the stream's current write index.
|
|
|
|
* The current generation
|
|
|
|
|
|
|
|
This is also handy way to verify that a recently created data stream exists.
|
|
|
|
|
|
|
|
.*Example*
|
|
|
|
[%collapsible]
|
|
|
|
====
|
|
|
|
The following get data stream API request retrieves information about any data
|
|
|
|
streams starting with `logs`.
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
GET /_data_stream/logs*
|
|
|
|
----
|
|
|
|
// TEST[skip: shard failures]
|
|
|
|
|
|
|
|
The API returns the following response, which includes information about the
|
|
|
|
`logs` data stream. Note the `indices` property contains an array of the
|
|
|
|
stream's current backing indices. The last item in this array contains
|
|
|
|
information for the `logs` stream's write index, `.ds-logs-000002`.
|
|
|
|
|
|
|
|
[source,console-result]
|
|
|
|
----
|
|
|
|
[
|
|
|
|
{
|
|
|
|
"name": "logs",
|
|
|
|
"timestamp_field": "@timestamp",
|
|
|
|
"indices": [
|
|
|
|
{
|
|
|
|
"index_name": ".ds-logs-000001",
|
|
|
|
"index_uuid": "DXAE-xcCQTKF93bMm9iawA"
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"index_name": ".ds-logs-000002",
|
|
|
|
"index_uuid": "Wzxq0VhsQKyPxHhaK3WYAg"
|
|
|
|
}
|
|
|
|
],
|
|
|
|
"generation": 2
|
|
|
|
}
|
|
|
|
]
|
|
|
|
----
|
|
|
|
// TESTRESPONSE[skip:unable to assert responses with top level array]
|
|
|
|
====
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
[[delete-a-data-stream]]
|
|
|
|
=== Delete a data stream
|
|
|
|
|
|
|
|
You can use the <<indices-delete-data-stream,delete data stream API>> to delete
|
|
|
|
a data stream and its backing indices.
|
|
|
|
|
|
|
|
.*Example*
|
|
|
|
[%collapsible]
|
|
|
|
====
|
|
|
|
The following delete data stream API request deletes the `logs` data stream. This
|
|
|
|
request also deletes the stream's backing indices and any data they contain.
|
|
|
|
|
|
|
|
////
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
PUT /_index_template/logs_data_stream
|
|
|
|
{
|
|
|
|
"index_patterns": [ "logs*" ],
|
|
|
|
"data_stream": {
|
|
|
|
"timestamp_field": "@timestamp"
|
|
|
|
},
|
|
|
|
"template": {
|
|
|
|
"mappings": {
|
|
|
|
"properties": {
|
|
|
|
"@timestamp": {
|
|
|
|
"type": "date"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
PUT /_data_stream/logs
|
|
|
|
----
|
|
|
|
////
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
DELETE /_data_stream/logs
|
|
|
|
----
|
|
|
|
// TEST[continued]
|
|
|
|
====
|
|
|
|
|
|
|
|
////
|
|
|
|
[source,console]
|
|
|
|
----
|
|
|
|
DELETE /_index_template/logs_data_stream
|
|
|
|
----
|
|
|
|
// TEST[continued]
|
|
|
|
////
|