OpenSearch/docs/reference/data-streams/data-streams-overview.asciidoc

[[data-streams-overview]]
== Data streams overview
++++
<titleabbrev>Overview</titleabbrev>
++++

A data stream consists of one or more _backing indices_. Backing indices are
<<index-hidden,hidden>>, auto-generated indices used to store a stream's
documents.

image::images/data-streams/data-streams-diagram.svg[align="center"]

The creation of a data stream requires a matching
<<indices-templates,index template>>. This template acts as a blueprint for
the stream's backing indices. It contains:

* A name or wildcard (`*`) pattern for the data stream.

* The data stream's _timestamp field_. This field must be mapped as a
  <<date,`date`>> or <<date_nanos,`date_nanos`>> field datatype and must be
  included in every document indexed to the data stream.

* The mappings and settings applied to each backing index when it's created.

The same index template can be used to create multiple data streams.
See <<set-up-a-data-stream>>.

[discrete]
[[data-streams-generation]]
=== Generation

Each data stream tracks its _generation_: a six-digit, zero-padded integer
that acts as a cumulative count of the data stream's backing indices. This count
includes any deleted indices for the stream. The generation is incremented
whenever a new backing index is added to the stream.

When a backing index is created, the index is named using the following
convention:

[source,text]
----
.ds-<data-stream>-<generation>
----

.*Example*
[%collapsible]
====
The `web_server_logs` data stream has a generation of `34`. The most recently
created backing index for this data stream is named
`.ds-web_server_logs-000034`.
====

Because the generation increments with each new backing index, backing indices
with a higher generation contain more recent data. Backing indices with a lower
generation contain older data.

A backing index's name can change after its creation due to a
<<indices-shrink-index,shrink>>, <<snapshots-restore-snapshot,restore>>, or
other operations.

[discrete]
[[data-stream-write-index]]
=== Write index

When a read request is sent to a data stream, it routes the request to all its
backing indices. For example, a search request sent to a data stream would query
all its backing indices.

image::images/data-streams/data-streams-search-request.svg[align="center"]

However, the most recently created backing index is the data stream’s only
_write index_. The data stream routes all indexing requests for new documents to
this index.

image::images/data-streams/data-streams-index-request.svg[align="center"]

You cannot add new documents to a stream's other backing indices, even by
sending requests directly to the index. This means you cannot submit the
following requests directly to any backing index except the write index:

* An <<docs-index_,index API>> request with an
  <<docs-index-api-op_type,`op_type`>> of `create`. The `op_type` parameter
  defaults to `create` when adding new documents.
* A <<docs-bulk,bulk API>> request using a `create` action

Because it's the only index capable of ingesting new documents, you cannot
perform operations on a write index that might hinder indexing. These
prohibited operations include:

* <<indices-clone-index,Clone>>
* <<indices-close,Close>>
* <<indices-delete-index,Delete>>
* <<freeze-index-api,Freeze>>
* <<indices-shrink-index,Shrink>>
* <<indices-split-index,Split>>

[discrete]
[[data-streams-rollover]]
=== Rollover

When a data stream is created, one backing index is automatically created.
Because this single index is also the most recently created backing index, it
acts as the stream's write index.

A <<indices-rollover-index,rollover>> creates a new backing index for a data
stream. This new backing index becomes the stream's write index, replacing
the current one, and increments the stream's generation.

In most cases, we recommend using <<index-lifecycle-management,{ilm}
({ilm-init})>> to automate rollovers for data streams. This lets you
automatically roll over the current write index when it meets specified
criteria, such as a maximum age or size.

However, you can also use the <<indices-rollover-index,rollover API>> to
manually perform a rollover. See <<manually-roll-over-a-data-stream>>.

[discrete]
[[data-streams-append-only]]
=== Append-only

For most time-series use cases, existing data is rarely, if ever, updated.
Because of this, data streams are designed to be append-only.

You can send <<add-documents-to-a-data-stream,indexing requests for new
documents>> directly to a data stream. However, you cannot send the following
requests for existing documents directly to a data stream:

* An <<docs-index_,index API>> request with an
  <<docs-index-api-op_type,`op_type`>> of `index`. The `op_type` parameter
  defaults to `index` for existing documents.

* A <<docs-bulk,bulk API>> request using the `delete`, `index`, or `update`
  action.

* A <<docs-delete,delete API>> request

Instead, you can use the <<docs-update-by-query,update by query>> and
<<docs-delete-by-query,delete by query>> APIs to update or delete existing
documents in a data stream. See <<update-delete-docs-in-a-data-stream>>.

Alternatively, you can update or delete a document by submitting requests to the
backing index containing the document. See
<<update-delete-docs-in-a-backing-index>>.

TIP: If you frequently update or delete existing documents,
we recommend using an <<indices-add-alias,index alias>> and
<<indices-templates,index template>> instead of a data stream. You can still
use <<index-lifecycle-management,{ilm-init}>> to manage indices for the alias.