[[searchable-snapshots]]
== {search-snaps-cap}

beta::[]

{search-snaps-cap} let you reduce your operating costs by using
<<snapshot-restore, snapshots>> for resiliency rather than maintaining
<<scalability,replica shards>> within a cluster. When you mount an index from a
snapshot as a {search-snap}, {es} copies the index shards to local storage
within the cluster. This ensures that search performance is comparable to
searching any other index, and minimizes the need to access the snapshot
repository. Should a node fail, shards of a {search-snap} index are
automatically recovered from the snapshot repository.

This can result in significant cost savings. With {search-snaps}, you may be
able to halve your cluster size without increasing the risk of data loss or
reducing the amount of data you can search. Because {search-snaps} rely on the
same snapshot mechanism you use for backups, they have a minimal impact on your
snapshot repository storage costs.

[discrete]
[[using-searchable-snapshots]]
=== Using {search-snaps}

Searching a {search-snap} index is the same as searching any other index.
Search performance is comparable to that of regular indices because the shard
data is copied onto nodes in the cluster when the {search-snap} is mounted.
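
For example, assuming an index has been mounted under the hypothetical name
`my-mounted-index` and has a hypothetical `message` field, an ordinary search
request against it needs no special handling:

[source,console]
----
GET /my-mounted-index/_search
{
  "query": {
    "match": {
      "message": "error"
    }
  }
}
----
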
By default, {search-snap} indices have no replicas. The underlying snapshot
provides resilience and the query volume is expected to be low enough that a
single shard copy will be sufficient. However, if you need to support a higher
query volume, you can add replicas by adjusting the `index.number_of_replicas`
index setting.
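
For example, to add one replica to a mounted index, a sketch using the
hypothetical index name `my-mounted-index`:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
----
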
If a node fails and {search-snap} shards need to be restored from the snapshot,
there is a brief window of time, while {es} allocates the shards to other
nodes, during which the cluster health will not be `green`. Searches that hit
these shards will fail or return partial results until the shards are
reallocated.

You typically manage {search-snaps} through {ilm-init}. The
<<ilm-searchable-snapshot, searchable snapshots>> action automatically converts
an index to a {search-snap} when it reaches the `cold` phase. You can also make
indices in existing snapshots searchable by manually mounting them as
{search-snaps} with the <<searchable-snapshots-api-mount-snapshot, mount
snapshot>> API.
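
For instance, a sketch of an {ilm-init} policy that converts indices to
{search-snaps} in the `cold` phase might look like the following; the policy
and repository names are placeholders:

[source,console]
----
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "my_repository"
          }
        }
      }
    }
  }
}
----

To mount an index manually instead, a sketch of a mount snapshot request,
again with placeholder repository, snapshot, and index names:

[source,console]
----
POST /_snapshot/my_repository/my_snapshot/_mount?wait_for_completion=true
{
  "index": "my_index",
  "renamed_index": "my_mounted_index"
}
----

Specifying `renamed_index` keeps the mounted index's name distinct from the
original index that was snapshotted.
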
To mount an index from a snapshot that contains multiple indices, we recommend
creating a <<clone-snapshot-api, clone>> of the snapshot that contains only the
index you want to search, and mounting the clone. You cannot delete a snapshot
if it has any mounted indices, so creating a clone enables you to manage the
lifecycle of the backup snapshot independently of any {search-snaps}.
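
As a sketch, assuming a multi-index snapshot `my_snapshot` in the repository
`my_repository`, you might clone only the index you want to search and then
mount the clone (all names here are placeholders):

[source,console]
----
PUT /_snapshot/my_repository/my_snapshot/_clone/my_snapshot_clone
{
  "indices": "my_index"
}

POST /_snapshot/my_repository/my_snapshot_clone/_mount?wait_for_completion=true
{
  "index": "my_index"
}
----
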
You can control the allocation of the shards of {search-snap} indices using the
same mechanisms as for regular indices. For example, you could use
<<shard-allocation-filtering>> to restrict {search-snap} shards to a subset of
your nodes.
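
For example, if some of your nodes carry a hypothetical custom attribute such
as `box_type: cold`, you could use an allocation filter to keep a mounted
index's shards on those nodes:

[source,console]
----
PUT /my-mounted-index/_settings
{
  "index.routing.allocation.require.box_type": "cold"
}
----

This assumes the target nodes were started with `node.attr.box_type: cold` in
their configuration.
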
We recommend that you <<indices-forcemerge, force-merge>> indices to a single
segment per shard before taking a snapshot that will be mounted as a
{search-snap} index. Each read from a snapshot repository takes time and costs
money, and the fewer segments there are, the fewer reads are needed to restore
the snapshot.
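
For example, to merge a hypothetical index `my_index` down to a single segment
per shard before snapshotting it:

[source,console]
----
POST /my_index/_forcemerge?max_num_segments=1
----
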
[TIP]
====
{search-snaps-cap} are ideal for managing a large archive of historical data.
Historical information is typically searched less frequently than recent data
and therefore may not need the performance benefits of replicas.

For more complex or time-consuming searches, you can use <<async-search>> with
{search-snaps}, as sketched in the example after this tip.
====
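
A minimal sketch of such an async search, using placeholder index and field
names:

[source,console]
----
POST /my-mounted-index/_async_search?wait_for_completion_timeout=2s&keep_on_completion=true
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-5y"
      }
    }
  }
}
----

The response includes an `id` that you can pass to the get async search API
(`GET /_async_search/<id>`) to retrieve results as they become available.
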
[discrete]
[[how-searchable-snapshots-work]]
=== How {search-snaps} work

When an index is mounted from a snapshot, {es} allocates its shards to data
nodes within the cluster. The data nodes then automatically restore the shard
data from the repository onto local storage. Once the restore process
completes, these shards respond to searches using the data held in local
storage and do not need to access the repository. This avoids incurring the
cost or performance penalty associated with reading data from the repository.

If a node holding one of these shards fails, {es} automatically allocates it to
another node, and that node restores the shard data from the repository. No
replicas are needed, and no complicated monitoring or orchestration is
necessary to restore lost shards.

{es} restores {search-snap} shards in the background and you can search them
even if they have not been fully restored. If a search hits a {search-snap}
shard before it has been fully restored, {es} eagerly retrieves the data needed
for the search. If a shard is freshly allocated to a node and still warming up,
some searches will be slower. However, searches typically access a very small
fraction of the total shard data, so the performance penalty is usually small.

Replicas of {search-snap} shards are restored by copying data from the
snapshot repository. In contrast, replicas of regular indices are restored by
copying data from the primary.