Changes:

* Updates 'Data streams' intro page to focus on problem solution and benefits.
* Adds 'Data streams overview' page to cover conceptual information, based on existing content in the 'Data streams' intro.
* Adds diagrams for data streams and search/indexing request examples.
* Moves API jump list and API docs to a new 'Data streams APIs' section. Links to these APIs will be available through tutorials.
* Adds xrefs to existing docs for concepts like generation, write index, and append-only.
Commit 6fc8317f07 (parent 4e738f60f8)
14
docs/reference/data-streams/data-stream-apis.asciidoc
Normal file
@@ -0,0 +1,14 @@
[[data-stream-apis]]
== Data stream APIs

The following APIs are available for managing data streams:

* To get information about data streams, use the <<indices-get-data-stream, get data stream API>>.
* To delete data streams, use the <<indices-delete-data-stream, delete data stream API>>.
* To manually create a data stream, use the <<indices-create-data-stream, create data stream API>>.

include::{es-repo-dir}/indices/create-data-stream.asciidoc[]

include::{es-repo-dir}/indices/get-data-stream.asciidoc[]

include::{es-repo-dir}/indices/delete-data-stream.asciidoc[]
142
docs/reference/data-streams/data-streams-overview.asciidoc
Normal file
@@ -0,0 +1,142 @@
[[data-streams-overview]]
== Data streams overview
++++
<titleabbrev>Overview</titleabbrev>
++++

A data stream consists of one or more _backing indices_. Backing indices are
<<index-hidden,hidden>>, automatically-generated indices used to store a
stream's documents.

image::images/data-streams/data-streams-diagram.svg[align="center"]

The creation of a data stream requires an associated
<<indices-templates,composable template>>. This template acts as a blueprint for
the stream's backing indices. It contains:

* A name or wildcard (`*`) pattern for the data stream.

* The data stream's _timestamp field_. This field must be mapped as a
<<date,`date`>> or <<date_nanos,`date_nanos`>> field datatype and must be
included in every document indexed to the data stream.

* The mappings and settings applied to each backing index when it's created.

The same composable template can be used to create multiple data streams.
See <<set-up-a-data-stream>>.

[discrete]
[[data-streams-generation]]
=== Generation

Each data stream tracks its _generation_: a six-digit, zero-padded integer
that acts as a cumulative count of the data stream's backing indices. This count
includes any deleted indices for the stream. The generation is incremented
whenever a new backing index is added to the stream.

When a backing index is created, the index is named using the following
convention:

[source,text]
----
.ds-<data-stream>-<generation>
----

.*Example*
[%collapsible]
====
The `web_server_logs` data stream has a generation of `34`. The most recently
created backing index for this data stream is named
`.ds-web_server_logs-000034`.
====

Because the generation increments with each new backing index, backing indices
with a higher generation contain more recent data. Backing indices with a lower
generation contain older data.
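
The naming convention above can be sketched in a few lines of Python. This is
only an illustration of the string format described in this section, not code
from {es}; the function name `backing_index_name` is hypothetical.

```python
def backing_index_name(data_stream: str, generation: int) -> str:
    """Format a name following the .ds-<data-stream>-<generation> convention.

    Hypothetical helper for illustration only. The generation is rendered
    as a six-digit, zero-padded integer, as described in the text.
    """
    return f".ds-{data_stream}-{generation:06d}"


# Mirrors the example in the text: generation 34 of `web_server_logs`.
print(backing_index_name("web_server_logs", 34))
```

Because the generation is zero-padded to a fixed width, sorting these names
lexicographically also orders them by generation, at least while the generation
fits in six digits.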

A backing index's name can change after its creation due to a
<<indices-shrink-index,shrink>>, <<snapshots-restore-snapshot,restore>>, or
other operations.

[discrete]
[[data-stream-write-index]]
=== Write index

When a read request is sent to a data stream, the stream routes the request to
all its backing indices. For example, a search request sent to a data stream
queries all its backing indices.

image::images/data-streams/data-streams-search-request.svg[align="center"]

However, the most recently created backing index is the data stream’s only
_write index_. The data stream routes all indexing requests for new documents to
this index.

image::images/data-streams/data-streams-index-request.svg[align="center"]
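
The read and write routing described above can be modeled with a short Python
sketch. It is a conceptual model only, not how {es} implements routing. The
function names are hypothetical, and the sketch assumes the list of backing
indices is ordered from oldest to most recently created.

```python
from typing import List


def read_targets(backing_indices: List[str]) -> List[str]:
    """A read request (such as a search) is routed to every backing index."""
    return list(backing_indices)


def write_target(backing_indices: List[str]) -> str:
    """Indexing requests for new documents go only to the write index.

    Assumption: the list is ordered oldest to newest, so the write index
    (the most recently created backing index) is the last element.
    """
    if not backing_indices:
        raise ValueError("a data stream has at least one backing index")
    return backing_indices[-1]


indices = [".ds-logs-000001", ".ds-logs-000002", ".ds-logs-000003"]
print(read_targets(indices))
print(write_target(indices))
```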

You cannot add new documents to a stream's other backing indices, even by
sending requests directly to the index. This means you cannot submit the
following requests directly to any backing index except the write index:

* An <<docs-index_,Index API>> request with an
<<docs-index-api-op_type,`op_type`>> of `create`. The `op_type` parameter
defaults to `create` when adding new documents.
* A <<docs-bulk,Bulk API>> request using a `create` action

Because it's the only index capable of ingesting new documents, you cannot
perform operations on a write index that might hinder indexing. These
prohibited operations include:

* <<indices-close,Closing the write index>>
* <<indices-delete-index,Deleting the write index>>
* <<freeze-index-api,Freezing the write index>>
* <<indices-shrink-index,Shrinking the write index>>

[discrete]
[[data-streams-rollover]]
=== Rollover

When a data stream is created, one backing index is automatically created.
Because this single index is also the most recently created backing index, it
acts as the stream's write index.

A <<indices-rollover-index,rollover>> creates a new backing index for a data
stream. This new backing index becomes the stream's write index, replacing
the current one, and increments the stream's generation.
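
As a rough mental model, the interaction between rollover, generation, and the
write index can be sketched in Python. The class `DataStreamModel` and its
methods are hypothetical and exist only for this illustration; they are not
part of any {es} client or API.

```python
from typing import List


class DataStreamModel:
    """Toy model of a data stream's generation and backing indices."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.generation = 0
        self.backing_indices: List[str] = []
        # Creating a stream automatically creates its first backing index.
        self.rollover()

    @property
    def write_index(self) -> str:
        # The most recently created backing index is the only write index.
        return self.backing_indices[-1]

    def rollover(self) -> str:
        """Add a new backing index, which becomes the write index."""
        self.generation += 1
        new_index = f".ds-{self.name}-{self.generation:06d}"
        self.backing_indices.append(new_index)
        return new_index

    def delete_backing_index(self, index: str) -> None:
        """Remove an older index; deleting the write index is prohibited."""
        if index == self.write_index:
            raise ValueError("cannot delete the write index")
        self.backing_indices.remove(index)
        # The generation is cumulative, so it is not decremented here.


stream = DataStreamModel("logs")
stream.rollover()
stream.delete_backing_index(".ds-logs-000001")
stream.rollover()
print(stream.generation, stream.write_index, stream.backing_indices)
```

The sketch keeps the generation as a counter that only ever increases, which is
why a deleted backing index still counts toward it.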

In most cases, we recommend using <<index-lifecycle-management,{ilm}
({ilm-init})>> to automate rollovers for data streams. This lets you
automatically roll over the current write index when it meets specified
criteria, such as a maximum age or size.

However, you can also use the <<indices-rollover-index,rollover API>> to
manually perform a rollover. See <<manually-roll-over-a-data-stream>>.

[discrete]
[[data-streams-append-only]]
=== Append-only

For most time-series use cases, existing data is rarely, if ever, updated.
Because of this, data streams are designed to be append-only. This means you can
send indexing requests for new documents directly to a data stream. However, you
cannot send update or deletion requests for existing documents to a data stream.

To update or delete specific documents in a data stream, submit one of the
following requests to the backing index containing the document:

* An <<docs-index_,Index API>> request with an
<<docs-index-api-op_type,`op_type`>> of `index`.
These requests must include valid <<optimistic-concurrency-control,`if_seq_no`
and `if_primary_term`>> arguments.

* A <<docs-bulk,Bulk API>> request using the `delete`, `index`, or `update`
action. If the action type is `index`, the action must include valid
<<bulk-optimistic-concurrency-control,`if_seq_no` and `if_primary_term`>>
arguments.

* A <<docs-delete,Delete API>> request
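
The three request forms listed above can be summarized as a small decision
function. The following Python sketch is an illustration of the rules as stated
in this section, not validation logic taken from {es}. The function name and
its string arguments are hypothetical, and the sketch only checks that the
concurrency arguments are present, not that their values are valid.

```python
from typing import Optional


def matches_listed_request(
    api: str,
    action: str,
    if_seq_no: Optional[int] = None,
    if_primary_term: Optional[int] = None,
) -> bool:
    """Check a request sent to a backing index against the list above.

    `api` is "index", "bulk", or "delete"; `action` is the `op_type` for
    the Index API or the action name for the Bulk API (assumed encoding).
    """
    has_concurrency_args = if_seq_no is not None and if_primary_term is not None
    if api == "index":
        # Index API: op_type must be `index`, with both arguments present.
        return action == "index" and has_concurrency_args
    if api == "bulk":
        # Bulk API: `index` actions need both arguments; `delete` and
        # `update` actions do not.
        if action == "index":
            return has_concurrency_args
        return action in ("delete", "update")
    if api == "delete":
        return action == "delete"
    return False


print(matches_listed_request("index", "index", if_seq_no=3, if_primary_term=1))
print(matches_listed_request("bulk", "index"))
```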

TIP: If you need to frequently update or delete existing documents across
multiple indices, we recommend using an <<indices-add-alias,index alias>> and
<<indices-templates,index template>> instead of a data stream. You can still
use <<index-lifecycle-management,{ilm-init}>> to manage the indices.
@@ -1,130 +1,60 @@
[[data-streams]]
= Data streams
++++
<titleabbrev>Data streams</titleabbrev>
++++

[partintro]
--
You can use data streams to index time-based data that's continuously generated.
A data stream groups indices from the same time-based data source.
A data stream tracks its indices, known as _backing indices_, using an ordered
list.
A _data stream_ is a convenient, scalable way to ingest, search, and manage
continuously generated time-series data.

A data stream's backing indices are <<index-hidden,hidden>>.
While all backing indices handle read requests, the most recently created
backing index is the data stream's only write index. A data stream only
accepts <<docs-index_,index requests>> with `op_type` set to `create`. To update
or delete specific documents in a data stream, submit a <<docs-delete,delete>>
or <<docs-update,update>> API request to the backing index containing the
document.
Time-series data, such as logs, tends to grow over time. While storing an entire
time series in a single {es} index is simpler, it is often more efficient and
cost-effective to store large volumes of data across multiple, time-based
indices. Multiple indices let you move indices containing older, less frequently
queried data to less expensive hardware and delete indices when they're no
longer needed, reducing overhead and storage costs.

To create a data stream, set up a <<indices-templates,composable index
template>> containing:
A data stream is designed to give you the best of both worlds:

* A name or wildcard pattern for the data stream in the `index_patterns` property.
* A `data_stream` definition that contains the `timestamp_field` property.
The `timestamp_field` must be the primary timestamp field
for the data source. This field must be included in every
document indexed to the data stream.
* The simplicity of a single, named resource you can use for requests
related
* The storage, scalability, and cost-saving benefits of multiple indices

When you index one or more documents to a not-yet-existent target matching
the template's name or pattern, {es} automatically creates the corresponding
data stream. You can also manually create a data stream using the
<<indices-create-data-stream,create data stream API>>. However, a composable
template for the stream is still required.
You can submit indexing and search requests directly to a data stream. The
stream automatically routes the requests to a collection of hidden,
auto-generated indices that store the stream's data.

You can use a <<indices-templates,composable template>> and
<<index-lifecycle-management,{ilm} ({ilm-init})>> to automate the management of
these hidden indices. You can use {ilm-init} to spin up new indices, allocate
indices to different hardware, delete old indices, and take other automatic
actions based on age or size criteria you set. This lets you seamlessly scale
your data storage based on your budget, performance, resiliency, and retention
needs.

You can use the <<indices-rollover-index,rollover API>> to roll a data stream
over to a new index when the current write index meets specified criteria, such
as a maximum age or size. A rollover creates a new backing index and updates the
data stream's list of backing indices. This new index then becomes the stream's
new write index. See <<rollover-data-stream-ex>>.

[discrete]
[[create-data-stream]]
== Create a data stream
[[when-to-use-data-streams]]
== When to use data streams

Create a composable template with a `data_stream` definition:
We recommend using data streams if you:

[source,console]
-----------------------------------
PUT /_index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "data_stream": {
    "timestamp_field": "@timestamp"
  }
}
-----------------------------------
* Use {es} to ingest, search, and manage large volumes of time-series data
* Want to scale and reduce costs by using {ilm-init} to automate the management
of your indices
* Index large volumes of time-series data in {es} but rarely delete or update
individual documents

Start indexing data to a target matching the composable template's wildcard
pattern:

[source,console]
----
POST /logs-foobar/_doc
{
  "@timestamp": "2050-11-15T14:12:12",
  ...
}
----
// TEST[continued]
// TEST[s/,//]
// TEST[s/\.\.\.//]

Response:

[source,console-result]
--------------------------------------------------
{
    "_shards" : {
        "total" : 2,
        "failed" : 0,
        "successful" : 1
    },
    "_index" : ".ds-logs-foobar-000001",
    "_type" : "_doc",
    "_id" : "W0tpsmIBdwcYyG50zbta",
    "_version" : 1,
    "_seq_no" : 0,
    "_primary_term" : 1,
    "result": "created"
}
--------------------------------------------------
// TESTRESPONSE[s/W0tpsmIBdwcYyG50zbta/$body._id/]

Or create a data stream using the create data stream API:

[source,console]
--------------------------------------------------
PUT /_data_stream/logs-barbaz
--------------------------------------------------
// TEST[continued]

////
[source,console]
-----------------------------------
DELETE /_data_stream/logs-foobar
DELETE /_data_stream/logs-barbaz
DELETE /_index_template/logs_template
-----------------------------------
// TEST[continued]
////

[discrete]
[[data-streams-apis]]
== Data stream APIs

The following APIs are available for managing data streams:

* To get information about data streams, use the <<indices-get-data-stream, get data stream API>>.
* To delete data streams, use the <<indices-delete-data-stream, delete data stream API>>.
* To manually create a data stream, use the <<indices-create-data-stream, create data stream API>>.

[discrete]
[[data-streams-toc]]
== In this section

* <<data-streams-overview>>
* <<set-up-a-data-stream>>
* <<use-a-data-stream>>
--


include::data-streams-overview.asciidoc[]
include::set-up-a-data-stream.asciidoc[]
include::use-a-data-stream.asciidoc[]
@@ -22,11 +22,11 @@ TIP: Data streams work well with most common log formats. While no schema is
required to use data streams, we recommend the {ecs-ref}[Elastic Common Schema
(ECS)].

* Data streams are designed to be append-only. While you can index new documents
directly to a data stream, you cannot use a data stream to directly update or
delete individual documents. To update or delete specific documents in a data
stream, submit a <<docs-delete,delete>> or <<docs-update,update>> API request to
the backing index containing the document.
* Data streams are designed to be <<data-streams-append-only,append-only>>.
While you can index new documents directly to a data stream, you cannot use a
data stream to directly update or delete individual documents. To update or
delete specific documents in a data stream, submit a <<docs-delete,delete>> or
<<docs-update,update>> API request to the backing index containing the document.


[discrete]
@@ -57,8 +57,9 @@ The following <<ilm-put-lifecycle,create lifecycle policy API>> request
configures the `logs_policy` lifecycle policy.

The `logs_policy` policy uses the <<ilm-rollover,`rollover` action>> to create a
new write index for the data stream when the current one reaches 25GB in size.
The policy also deletes backing indices 30 days after their rollover.
new <<data-stream-write-index,write index>> for the data stream when the current
one reaches 25GB in size. The policy also deletes backing indices 30 days after
their rollover.

[source,console]
----
@@ -144,7 +144,8 @@ GET /logs/_search
=== Manually roll over a data stream

A rollover creates a new backing index for a data stream. This new backing index
becomes the stream's new write index and increments the stream's generation.
becomes the stream's <<data-stream-write-index,write index>> and increments
the stream's <<data-streams-generation,generation>>.

In most cases, we recommend using <<index-lifecycle-management,{ilm-init}>> to
automate rollovers for data streams. This lets you automatically roll over the
File diff suppressed because one or more lines are too long (image, size after: 34 KiB)
File diff suppressed because one or more lines are too long (image, size after: 43 KiB)
File diff suppressed because one or more lines are too long (image, size after: 45 KiB)
@@ -2,7 +2,7 @@
== Index APIs

Index APIs are used to manage individual indices,
index settings, data streams, aliases, mappings, and index templates.
index settings, aliases, mappings, and index templates.

[float]
[[index-management]]
@@ -31,13 +31,6 @@ index settings, data streams, aliases, mappings, and index templates.
* <<indices-get-field-mapping>>
* <<indices-types-exists>>

[float]
[[data-stream-management]]
=== Data stream management:
* <<indices-create-data-stream>>
* <<indices-delete-data-stream>>
* <<indices-get-data-stream>>

[float]
[[alias-management]]
=== Alias management:
@@ -165,9 +158,3 @@ include::indices/apis/unfreeze.asciidoc[]
include::indices/aliases.asciidoc[]

include::indices/update-settings.asciidoc[]

include::indices/create-data-stream.asciidoc[]

include::indices/get-data-stream.asciidoc[]

include::indices/delete-data-stream.asciidoc[]
@@ -47,6 +47,7 @@ endif::[]
include::{es-repo-dir}/cat.asciidoc[]
include::{es-repo-dir}/cluster.asciidoc[]
include::{es-repo-dir}/ccr/apis/ccr-apis.asciidoc[]
include::{es-repo-dir}/data-streams/data-stream-apis.asciidoc[]
include::{es-repo-dir}/docs.asciidoc[]
include::{es-repo-dir}/ingest/apis/enrich/index.asciidoc[]
include::{es-repo-dir}/graph/explore.asciidoc[]