diff --git a/docs/reference/glossary.asciidoc b/docs/reference/glossary.asciidoc
index c580f0a2109..7a9a3596e39 100644
--- a/docs/reference/glossary.asciidoc
+++ b/docs/reference/glossary.asciidoc
@@ -417,6 +417,22 @@ This value can be overridden by specifying a `routing` value at index time, or a
 <> in the <>.
 
+[[glossary-searchable-snapshot]] searchable snapshot ::
+// tag::searchable-snapshot-def[]
+A <> of an index that has been mounted as a
+<> and can be
+searched as if it were a regular index.
+// end::searchable-snapshot-def[]
+
+[[glossary-searchable-snapshot-index]] searchable snapshot index ::
+// tag::searchable-snapshot-index-def[]
+An <> whose data is stored in a <> that resides in a separate
+<> such as AWS S3. Searchable snapshot indices do not need
+<> shards for resilience, since their data is
+reliably stored outside the cluster.
+// end::searchable-snapshot-index-def[]
+
 [[glossary-shard]] shard ::
+
 --
@@ -449,9 +465,11 @@ See the {ref}/indices-shrink-index.html[shrink index API].
 
 [[glossary-snapshot]] snapshot ::
 // tag::snapshot-def[]
-A backup taken from a running {es} cluster.
-A snapshot can include backups of an entire cluster or only data streams and
-indices you specify.
+Captures the state of the whole cluster or of particular indices or data
+streams at a particular point in time. Snapshots provide a backup of a running
+cluster, ensuring you can restore your data in the event of a failure. You can
+also mount indices or data streams from snapshots as read-only
+{ref}/glossary.html#glossary-searchable-snapshot-index[searchable snapshots].
 // end::snapshot-def[]
 
 [[glossary-snapshot-lifecycle-policy]] snapshot lifecycle policy ::
diff --git a/docs/reference/searchable-snapshots/index.asciidoc b/docs/reference/searchable-snapshots/index.asciidoc
new file mode 100644
index 00000000000..21f7aa22fb5
--- /dev/null
+++ b/docs/reference/searchable-snapshots/index.asciidoc
@@ -0,0 +1,99 @@
+[[searchable-snapshots]]
+== {search-snaps-cap}
+
+beta::[]
+
+{search-snaps-cap} let you reduce your operating costs by using
+<> for resiliency rather than maintaining
+<> within a cluster. When you mount an index from a
+snapshot as a {search-snap}, {es} copies the index shards to local storage
+within the cluster. This ensures that search performance is comparable to
+searching any other index, and minimizes the need to access the snapshot
+repository. Should a node fail, shards of a {search-snap} index are
+automatically recovered from the snapshot repository.
+
+This can result in significant cost savings. With {search-snaps}, you may be
+able to halve your cluster size without increasing the risk of data loss or
+reducing the amount of data you can search. Because {search-snaps} rely on the
+same snapshot mechanism you use for backups, they have a minimal impact on your
+snapshot repository storage costs.
+
+[discrete]
+[[using-searchable-snapshots]]
+=== Using {search-snaps}
+
+Searching a {search-snap} index is the same as searching any other index.
+Search performance is comparable to regular indices because the shard data is
+copied onto nodes in the cluster when the {search-snap} is mounted.
+
+By default, {search-snap} indices have no replicas. The underlying snapshot
+provides resilience and the query volume is expected to be low enough that a
+single shard copy will be sufficient. However, if you need to support a higher
+query volume, you can add replicas by adjusting the `index.number_of_replicas`
+index setting.
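+
+For example, you could add a replica to a mounted index with the index
+settings API (using `my-index` as a placeholder for the mounted index name):
+
+[source,console]
+----
+PUT /my-index/_settings
+{
+  "index.number_of_replicas": 1
+}
+----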
+
+If a node fails and {search-snap} shards need to be restored from the snapshot,
+there is a brief window of time while {es} allocates the shards to other nodes
+where the cluster health will not be `green`. Searches that hit these shards
+will fail or return partial results until they are reallocated.
+
+You typically manage {search-snaps} through {ilm-init}. The
+<> action automatically converts
+an index to a {search-snap} when it reaches the `cold` phase. You can also make
+indices in existing snapshots searchable by manually mounting them as
+{search-snaps} with the <> API.
+
+To mount an index from a snapshot that contains multiple indices, we recommend
+creating a <> of the snapshot that contains only the
+index you want to search, and mounting the clone. You cannot delete a snapshot
+if it has any mounted indices, so creating a clone enables you to manage the
+lifecycle of the backup snapshot independently of any {search-snaps}.
+
+You can control the allocation of the shards of {search-snap} indices using the
+same mechanisms as for regular indices. For example, you could use
+<> to restrict {search-snap} shards to a subset of
+your nodes.
+
+We recommend that you <> indices to a single
+segment per shard before taking a snapshot that will be mounted as a
+{search-snap} index. Each read from a snapshot repository takes time and costs
+money, and the fewer segments there are the fewer reads are needed to restore
+the snapshot.
+
+[TIP]
+====
+{search-snaps-cap} are ideal for managing a large archive of historical data.
+Historical information is typically searched less frequently than recent data
+and therefore may not need the performance benefits of replicas.
+
+For more complex or time-consuming searches, you can use <> with
+{search-snaps}.
+====
+
+[discrete]
+[[how-searchable-snapshots-work]]
+=== How {search-snaps} work
+
+When an index is mounted from a snapshot, {es} allocates its shards to data
+nodes within the cluster.
+The data nodes then automatically restore the shard
+data from the repository onto local storage. Once the restore process
+completes, these shards respond to searches using the data held in local
+storage and do not need to access the repository. This avoids incurring the
+cost or performance penalty associated with reading data from the repository.
+
+If a node holding one of these shards fails, {es} automatically allocates it to
+another node, and that node restores the shard data from the repository. No
+replicas are needed, and no complicated monitoring or orchestration is
+necessary to restore lost shards.
+
+{es} restores {search-snap} shards in the background and you can search them
+even if they have not been fully restored. If a search hits a {search-snap}
+shard before it has been fully restored, {es} eagerly retrieves the data needed
+for the search. If a shard is freshly allocated to a node and still warming up,
+some searches will be slower. However, searches typically access a very small
+fraction of the total shard data so the performance penalty is typically small.
+
+Replicas of {search-snap} shards are restored by copying data from the
+snapshot repository. In contrast, replicas of regular indices are restored by
+copying data from the primary.
diff --git a/docs/reference/snapshot-restore/index.asciidoc b/docs/reference/snapshot-restore/index.asciidoc
index 8286c732768..1b30afdbce9 100644
--- a/docs/reference/snapshot-restore/index.asciidoc
+++ b/docs/reference/snapshot-restore/index.asciidoc
@@ -112,3 +112,5 @@ include::restore-snapshot.asciidoc[]
 include::monitor-snapshot-restore.asciidoc[]
 include::delete-snapshot.asciidoc[]
 include::../slm/index.asciidoc[]
+include::../searchable-snapshots/index.asciidoc[]
+
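As a sketch of the clone-and-mount workflow recommended in the new
`docs/reference/searchable-snapshots/index.asciidoc` page, the following console
requests first clone a single index out of a multi-index snapshot and then
mount the clone (the repository, snapshot, and index names here are
placeholders, not names used elsewhere in these docs):

[source,console]
----
PUT /_snapshot/my_repository/my_snapshot/_clone/my_snapshot_clone
{
  "indices": "my-index"
}

POST /_snapshot/my_repository/my_snapshot_clone/_mount?wait_for_completion=true
{
  "index": "my-index"
}
----

Because the mounted index references the clone rather than the original
snapshot, the backup snapshot can later be deleted independently of the
{search-snap}.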