[[set-up-a-data-stream]]
== Set up a data stream

To set up a data stream, follow these steps:

. Check the <<data-stream-prereqs, prerequisites>>.
. <<configure-a-data-stream-ilm-policy>>.
. <<create-a-data-stream-template>>.
. <<create-a-data-stream>>.
. <<get-info-about-a-data-stream>> to verify it exists.

After you set up a data stream, you can <<use-a-data-stream, use the data
stream>> for indexing, searches, and other supported operations.

If you no longer need it, you can <<delete-a-data-stream,delete a data stream>>
and its backing indices.

[discrete]
[[data-stream-prereqs]]
=== Prerequisites

* {es} data streams are intended for time-series data only. Each document
indexed to a data stream must contain a shared timestamp field.
+
TIP: Data streams work well with most common log formats. While no schema is
required to use data streams, we recommend the {ecs-ref}[Elastic Common Schema
(ECS)].

* Data streams are designed to be <<data-streams-append-only,append-only>>.
While you can index new documents directly to a data stream, you cannot use a
data stream to directly update or delete individual documents. To update or
delete specific documents in a data stream, submit a <<docs-delete,delete>> or
<<docs-update,update>> API request to the backing index containing the document.
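+
.*Example*
[%collapsible]
====
As a minimal sketch of the point above, the following requests target a backing
index directly rather than the data stream. The sketch assumes a backing index
named `.ds-logs-000001` that contains a document with the ID
`qecQmXIBT4jB8tq1nG0j`. Both values, and the updated `message` text, are
placeholders for illustration. Substitute the backing index and document ID
from your own stream.

[source,console]
----
POST /.ds-logs-000001/_update/qecQmXIBT4jB8tq1nG0j
{
  "doc": {
    "message": "Login attempt failed (reviewed)"
  }
}

DELETE /.ds-logs-000001/_doc/qecQmXIBT4jB8tq1nG0j
----
// TEST[skip:illustrative sketch with placeholder index and ID]
====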


[discrete]
[[configure-a-data-stream-ilm-policy]]
=== Optional: Configure an {ilm-init} lifecycle policy for a data stream

You can use <<index-lifecycle-management,{ilm} ({ilm-init})>> to automatically
manage a data stream's backing indices. For example, you could use {ilm-init}
to:

* Spin up a new write index for the data stream when the current one reaches a
certain size or age.
* Move older backing indices to slower, less expensive hardware.
* Delete stale backing indices to enforce data retention standards.

To use {ilm-init} with a data stream, you must
<<set-up-lifecycle-policy,configure a lifecycle policy>>. This lifecycle policy
should contain the automated actions to take on backing indices and the
triggers for such actions.

TIP: While optional, we recommend using {ilm-init} to scale data streams in
production.

.*Example*
[%collapsible]
====
The following <<ilm-put-lifecycle,create lifecycle policy API>> request
configures the `logs_policy` lifecycle policy.

The `logs_policy` policy uses the <<ilm-rollover,`rollover` action>> to create a
new <<data-stream-write-index,write index>> for the data stream when the current
one reaches 25GB in size. The policy also deletes backing indices 30 days after
their rollover.

[source,console]
----
PUT /_ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "25GB"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----
====
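
.*Example: Roll over by size or age and move older indices*
[%collapsible]
====
The following request is a minimal sketch of a policy that combines all three
uses listed earlier in this section. It rolls over on size or age, moves older
backing indices to different nodes, and then deletes them. The policy name
`logs_policy_tiered`, the `7d` thresholds, and the `data: warm` node attribute
are assumptions made for illustration. The `allocate` action only relocates
indices if some of your nodes are configured with a matching custom attribute.

[source,console]
----
PUT /_ilm/policy/logs_policy_tiered
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "25GB",
            "max_age": "7d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "require": {
              "data": "warm"
            }
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----
// TEST[skip:illustrative sketch that assumes a custom node attribute]
====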


[discrete]
[[create-a-data-stream-template]]
=== Create a composable template for a data stream

Each data stream requires a <<indices-templates,composable template>>. The data
stream uses this template to create its backing indices.

Composable templates for data streams must contain:

* A name or wildcard (`*`) pattern for the data stream in the `index_patterns`
property.
+
You can use the resolve index API to check if the name or pattern
matches any existing indices, index aliases, or data streams. If so, you should
consider using another name or pattern.
+
.*Example*
[%collapsible]
====
The following resolve index API request checks for any existing indices, index
aliases, or data streams that start with `logs`. If none exist, the `logs*`
wildcard pattern can be used to create a new data stream.

[source,console]
----
GET /_resolve/index/logs*
----
// TEST[continued]

The API returns the following response, indicating no existing targets match
this pattern.

[source,console-result]
----
{
  "indices" : [ ],
  "aliases" : [ ],
  "data_streams" : [ ]
}
----
====

* A `data_stream` definition containing the `timestamp_field` property.
This timestamp field must be included in every document indexed to the data
stream.

* A <<date,`date`>> or <<date_nanos,`date_nanos`>> field mapping for the
timestamp field specified in the `timestamp_field` property.
+
IMPORTANT: Carefully consider the timestamp field's mapping, including
<<mapping-params,mapping parameters>> such as <<mapping-date-format,`format`>>.
Once the stream is created, you can only update the timestamp field's mapping by
reindexing the data stream. See
<<data-streams-use-reindex-to-change-mappings-settings>>.

* If you intend to use {ilm-init}, you must specify the
<<configure-a-data-stream-ilm-policy,lifecycle policy>> in the
`index.lifecycle.name` setting.

You can also specify other mappings and settings you'd like to apply to the
stream's backing indices.

TIP: We recommend you carefully consider which mappings and settings to include
in this template before creating a data stream. Later changes to the mappings or
settings of a stream's backing indices may require reindexing. See
<<data-streams-change-mappings-and-settings>>.

.*Example*
[%collapsible]
====
The following <<indices-templates,put composable template API>> request
configures the `logs_data_stream` template.

[source,console]
----
PUT /_index_template/logs_data_stream
{
  "index_patterns": [ "logs*" ],
  "data_stream": {
    "timestamp_field": "@timestamp"
  },
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        }
      }
    },
    "settings": {
      "index.lifecycle.name": "logs_policy"
    }
  }
}
----
// TEST[continued]
====
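
.*Example: Set an explicit timestamp format*
[%collapsible]
====
Because the timestamp field's mapping is hard to change after the stream is
created, it can help to state the `format` explicitly in the template. The
following request is a sketch of that idea, not a recommended configuration.
The template name `metrics_data_stream`, the `metrics*` pattern, and the chosen
format string are assumptions for illustration. With this mapping, documents
would need to supply `@timestamp` either as an ISO 8601 date or as seconds
since the epoch.

[source,console]
----
PUT /_index_template/metrics_data_stream
{
  "index_patterns": [ "metrics*" ],
  "data_stream": {
    "timestamp_field": "@timestamp"
  },
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_second"
        }
      }
    }
  }
}
----
// TEST[skip:illustrative sketch with a hypothetical template]
====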

NOTE: You cannot delete a composable template that's in use by a data stream.
This would prevent the data stream from creating new backing indices.

[discrete]
[[create-a-data-stream]]
=== Create a data stream

With a composable template, you can create a data stream using one of two
methods:

* Submit an <<add-documents-to-a-data-stream,indexing request>> to a target
matching the name or wildcard pattern defined in the template's `index_patterns`
property.
+
--
If the indexing request's target doesn't exist, {es} creates the data stream and
uses the target name as the name for the stream.

NOTE: Data streams support only specific types of indexing requests. See
<<add-documents-to-a-data-stream>>.

[[index-documents-to-create-a-data-stream]]
.*Example: Index documents to create a data stream*
[%collapsible]
====
The following <<docs-index_,index API>> request targets `logs`, which matches
the wildcard pattern for the `logs_data_stream` template. Because no existing
index or data stream uses this name, this request creates the `logs` data stream
and indexes the document to it.

[source,console]
----
POST /logs/_doc/
{
  "@timestamp": "2020-12-06T11:04:05.000Z",
  "user": {
    "id": "vlb44hny"
  },
  "message": "Login attempt failed"
}
----
// TEST[continued]

The API returns the following response. Note the `_index` property contains
`.ds-logs-000001`, indicating the document was indexed to the write index of the
new `logs` data stream.

[source,console-result]
----
{
  "_index": ".ds-logs-000001",
  "_id": "qecQmXIBT4jB8tq1nG0j",
  "_type": "_doc",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
----
// TESTRESPONSE[s/"_id": "qecQmXIBT4jB8tq1nG0j"/"_id": $body._id/]
====
--

* Use the <<indices-create-data-stream,create data stream API>> to manually
create a data stream. The name of the data stream must match the
name or wildcard pattern defined in the template's `index_patterns` property.
+
--
.*Example: Manually create a data stream*
[%collapsible]
====
The following <<indices-create-data-stream,create data stream API>> request
targets `logs_alt`, which matches the wildcard pattern for the
`logs_data_stream` template. Because no existing index or data stream uses this
name, this request creates the `logs_alt` data stream.

[source,console]
----
PUT /_data_stream/logs_alt
----
// TEST[continued]
====
--

////
[source,console]
----
DELETE /_data_stream/logs

DELETE /_data_stream/logs_alt

DELETE /_index_template/logs_data_stream

DELETE /_ilm/policy/logs_policy
----
// TEST[continued]
////

[discrete]
[[get-info-about-a-data-stream]]
=== Get information about a data stream

You can use the <<indices-get-data-stream,get data stream API>> to get
information about one or more data streams, including:

* The timestamp field
* The current backing indices, which are returned as an array. The last item in
the array contains information about the stream's current write index.
* The current generation

This is also a handy way to verify that a recently created data stream exists.

.*Example*
[%collapsible]
====
The following get data stream API request retrieves information about any data
streams starting with `logs`.

[source,console]
----
GET /_data_stream/logs*
----
// TEST[skip: shard failures]

The API returns the following response, which includes information about the
`logs` data stream. Note the `indices` property contains an array of the
stream's current backing indices. The last item in this array contains
information for the `logs` stream's write index, `.ds-logs-000002`.

[source,console-result]
----
[
  {
    "name": "logs",
    "timestamp_field": "@timestamp",
    "indices": [
      {
        "index_name": ".ds-logs-000001",
        "index_uuid": "DXAE-xcCQTKF93bMm9iawA"
      },
      {
        "index_name": ".ds-logs-000002",
        "index_uuid": "Wzxq0VhsQKyPxHhaK3WYAg"
      }
    ],
    "generation": 2
  }
]
----
// TESTRESPONSE[skip:unable to assert responses with top level array]
====
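
.*Example: Roll over a data stream manually*
[%collapsible]
====
The response in the previous example lists two backing indices and a generation
of `2`, which is consistent with the stream having rolled over once. As a
sketch, assuming the `logs` data stream exists and you want to trigger a
rollover yourself rather than wait for an {ilm-init} policy, you could send a
rollover request that targets the stream by name. A later get data stream
request should then show an additional backing index as the last item in the
`indices` array.

[source,console]
----
POST /logs/_rollover/

GET /_data_stream/logs
----
// TEST[skip:illustrative sketch that assumes the logs data stream exists]
====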

[discrete]
[[delete-a-data-stream]]
=== Delete a data stream

You can use the <<indices-delete-data-stream,delete data stream API>> to delete
a data stream and its backing indices.

.*Example*
[%collapsible]
====
The following delete data stream API request deletes the `logs` data stream. This
request also deletes the stream's backing indices and any data they contain.

////
[source,console]
----
PUT /_index_template/logs_data_stream
{
  "index_patterns": [ "logs*" ],
  "data_stream": {
    "timestamp_field": "@timestamp"
  },
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        }
      }
    }
  }
}

PUT /_data_stream/logs
----
////

[source,console]
----
DELETE /_data_stream/logs
----
// TEST[continued]
====

////
[source,console]
----
DELETE /_index_template/logs_data_stream
----
// TEST[continued]
////