Today we describe snapshots as "incremental", but their incrementality is a rather different beast from that of, e.g., incremental filesystem backups. With traditional backups you take a large and relatively infrequent "full" backup and then a sequence of smaller "incremental" ones. The whole sequence is required for a restore, so it must be kept around until at least the next full backup. In contrast, Elasticsearch snapshots are logically independent and each can be deleted without affecting the integrity of the others. This distinction frequently causes confusion amongst newer users, so this commit clarifies what we mean by "incremental" in the docs.
[[snapshot-restore]]
= Snapshot and restore

[partintro]
--

// tag::snapshot-intro[]
A _snapshot_ is a backup taken from a running {es} cluster.
You can take snapshots of an entire cluster, including all its data streams and
indices. You can also take snapshots of only specific data streams or indices in
the cluster.

You must
<<snapshots-register-repository, register a snapshot repository>>
before you can <<snapshots-take-snapshot, create snapshots>>.

Snapshots can be stored in either local or remote repositories.
Remote repositories can reside on Amazon S3, HDFS, Microsoft Azure,
Google Cloud Storage,
and other platforms supported by a {plugins}/repository.html[repository plugin].

{es} takes snapshots incrementally: the snapshotting process copies to the
repository only the data that an earlier snapshot has not already copied there,
avoiding unnecessary duplication of work or storage space. This means you can
safely take snapshots very frequently with minimal overhead. However, snapshots
are also logically independent: deleting a snapshot does not affect the
integrity of any other snapshot.
// end::snapshot-intro[]
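
The combination of incremental copying and logical independence can be
illustrated with a toy model. The sketch below is not how {es} lays out a
repository: the `ToyRepository` class, its method names, and the idea of
identifying immutable files by name are all assumptions made for illustration.
It only shows that a snapshot copies files the repository lacks, and that
deleting one snapshot leaves every other snapshot fully restorable.

```python
# Toy model only: class and method names are hypothetical and do not
# reflect the real Elasticsearch repository format.
class ToyRepository:
    def __init__(self):
        self.blobs = {}      # blob name -> contents stored in the repository
        self.snapshots = {}  # snapshot name -> set of blob names it references

    def take_snapshot(self, name, files):
        """Copy only files not already stored; return how many were copied."""
        copied = 0
        for blob_name, contents in files.items():
            if blob_name not in self.blobs:
                self.blobs[blob_name] = contents
                copied += 1
        self.snapshots[name] = set(files)
        return copied

    def delete_snapshot(self, name):
        """Drop a snapshot, then drop blobs no remaining snapshot references."""
        del self.snapshots[name]
        still_referenced = set().union(*self.snapshots.values())
        for blob_name in list(self.blobs):
            if blob_name not in still_referenced:
                del self.blobs[blob_name]

    def restore(self, name):
        """Return the complete file set recorded by one snapshot."""
        return {blob: self.blobs[blob] for blob in self.snapshots[name]}


repo = ToyRepository()
first = repo.take_snapshot("snap-1", {"seg_a": "A", "seg_b": "B"})   # copies both files
second = repo.take_snapshot("snap-2", {"seg_b": "B", "seg_c": "C"})  # copies only seg_c
repo.delete_snapshot("snap-1")      # removes seg_a, keeps the shared seg_b
restored = repo.restore("snap-2")   # snap-2 is still complete
```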

// tag::restore-intro[]
You can <<snapshots-restore-snapshot,restore snapshots>> to a running cluster.
By default, a restore includes all data streams and indices in the snapshot.
However, you can choose to restore only the cluster state or specific data
streams or indices from a snapshot.
// end::restore-intro[]
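
As a rough sketch of a selective restore, the following Python builds a request
for the restore endpoint without sending it. The cluster address
`http://localhost:9200`, the repository name `my_repository`, the snapshot name
`my_snapshot`, the index names, and the helper `build_restore_request` are all
placeholders chosen for illustration.

```python
import json
import urllib.request

# Assumption: placeholder address of a locally running cluster.
CLUSTER_URL = "http://localhost:9200"


def build_restore_request(repository, snapshot, indices=None, include_global_state=False):
    """Build (but do not send) a restore request for the given snapshot."""
    body = {"include_global_state": include_global_state}
    if indices is not None:
        # Restrict the restore to specific data streams or indices.
        body["indices"] = ",".join(indices)
    return urllib.request.Request(
        url=f"{CLUSTER_URL}/_snapshot/{repository}/{snapshot}/_restore",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical names; omit `indices` to restore everything in the snapshot.
request = build_restore_request("my_repository", "my_snapshot", indices=["index_1", "index_2"])
# Against a live cluster this could be sent with urllib.request.urlopen(request).
```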

You can use
<<getting-started-snapshot-lifecycle-management, {slm}>>
to automatically take and manage snapshots.

// tag::backup-warning[]
WARNING: You cannot back up an {es} cluster by simply copying
the data directories of all of its nodes. {es} may be making changes to
the contents of its data directories while it is running; copying its data
directories cannot be expected to capture a consistent picture of their contents.
If you try to restore a cluster from such a backup, it may fail and report
corruption and/or missing files. Alternatively, it may appear to have succeeded
even though it silently lost some of its data. The only reliable way to back up a
cluster is by using the snapshot and restore functionality.

// end::backup-warning[]

[discrete]
[[snapshot-restore-version-compatibility]]
=== Version compatibility

IMPORTANT: Version compatibility refers to the underlying Lucene index
compatibility. Follow the <<setup-upgrade,Upgrade documentation>>
when migrating between versions.

A snapshot contains a copy of the on-disk data structures that comprise an
index or a data stream's backing indices. This means that snapshots can only be
restored to versions of {es} that can read the indices.

The following table indicates snapshot compatibility between versions. The first
column denotes the base version that you can restore snapshots from.

// tag::snapshot-compatibility-matrix[]
[cols="6"]
|===
| 5+^h| Cluster version
^h| Snapshot version ^| 2.x ^| 5.x ^| 6.x ^| 7.x ^| 8.x
^| *1.x* -> ^|{yes-icon} ^|{no-icon} ^|{no-icon} ^|{no-icon} ^|{no-icon}
^| *2.x* -> ^|{yes-icon} ^|{yes-icon} ^|{no-icon} ^|{no-icon} ^|{no-icon}
^| *5.x* -> ^|{no-icon} ^|{yes-icon} ^|{yes-icon} ^|{no-icon} ^|{no-icon}
^| *6.x* -> ^|{no-icon} ^|{no-icon} ^|{yes-icon} ^|{yes-icon} ^|{no-icon}
^| *7.x* -> ^|{no-icon} ^|{no-icon} ^|{no-icon} ^|{yes-icon} ^|{yes-icon}
|===
// end::snapshot-compatibility-matrix[]

The following conditions apply when restoring snapshots and indices across versions:

* *Snapshots*: You cannot restore snapshots from later {es} versions into a cluster running an earlier {es} version. For example, you cannot restore a snapshot taken in 7.6.0 to a cluster running 7.5.0.
* *Indices*: You cannot restore indices into a cluster running a version of {es} that is more than _one major version_ newer than the version of {es} used to snapshot the indices. For example, you cannot restore indices from a snapshot taken in 5.0 to a cluster running 7.0.
+
[NOTE]
====
The one caveat is that snapshots taken by {es} 2.0 can be restored in clusters running {es} 5.0.
====
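
The two conditions and the 2.0 to 5.0 caveat can be expressed as a small check.
This is one possible reading of the rules, not code taken from {es}: the
function name `can_restore` and the `MAJOR_SERIES` list are assumptions, and the
caveat is modelled by treating 5.x as the major series that directly follows
2.x, as the table columns do.

```python
# Assumption: major release series in the order the compatibility table
# lists them. 5.x directly follows 2.x, which covers the 2.0 -> 5.0 caveat.
MAJOR_SERIES = [1, 2, 5, 6, 7, 8]


def can_restore(snapshot_version, cluster_version):
    """Return True if a snapshot taken on snapshot_version may be restored on
    cluster_version. Both arguments are (major, minor, patch) tuples."""
    # Snapshots condition: never restore into an earlier version.
    if snapshot_version > cluster_version:
        return False
    # Indices condition: the cluster may be at most one major series newer.
    gap = MAJOR_SERIES.index(cluster_version[0]) - MAJOR_SERIES.index(snapshot_version[0])
    return gap <= 1


later_into_earlier = can_restore((7, 6, 0), (7, 5, 0))   # False: 7.6.0 snapshot, 7.5.0 cluster
two_majors_apart = can_restore((5, 0, 0), (7, 0, 0))     # False: more than one major newer
caveat_case = can_restore((2, 0, 0), (5, 0, 0))          # True: the 2.0 -> 5.0 caveat
```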

Each snapshot can contain indices created in various versions of {es}. This
includes backing indices created for data streams. When restoring a snapshot, it
must be possible to restore all of these indices into the target cluster. If any
indices in a snapshot were created in an incompatible version, you will not be
able to restore the snapshot.

IMPORTANT: When backing up your data prior to an upgrade, keep in mind that you
won't be able to restore snapshots after you upgrade if they contain indices
created in a version that's incompatible with the upgrade version.
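
Before an upgrade, it can help to list which indices in a snapshot would block
a restore on the target version. The sketch below assumes you already know the
major version each index was created in. The index names, the helper
`incompatible_indices`, and the `MAJOR_SERIES` ordering are illustrative
assumptions, not part of any {es} API.

```python
# Assumption: major release series in the order used by the compatibility table.
MAJOR_SERIES = [1, 2, 5, 6, 7, 8]


def incompatible_indices(index_created_major, cluster_major):
    """Return the sorted names of indices that the target cluster cannot read.

    index_created_major maps an index name to the major version it was created in.
    """
    target = MAJOR_SERIES.index(cluster_major)
    blocked = []
    for name, created_major in index_created_major.items():
        gap = target - MAJOR_SERIES.index(created_major)
        # Readable only if created in the same or the directly preceding series.
        if gap < 0 or gap > 1:
            blocked.append(name)
    return sorted(blocked)


# Hypothetical snapshot holding one index created in 6.x and one created in 7.x.
snapshot_indices = {"logs-2019": 6, "logs-2021": 7}
blocked_on_7 = incompatible_indices(snapshot_indices, 7)  # nothing blocks a 7.x restore
blocked_on_8 = incompatible_indices(snapshot_indices, 8)  # the 6.x index blocks an 8.x restore
```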

If you need to restore a snapshot of a data stream or index
that is incompatible with the version of the cluster you are currently running,
you can restore it on the latest compatible version and use
<<reindex-from-remote,reindex-from-remote>> to rebuild the data stream or index on the current
version. Reindexing from remote is only possible if the original data stream or index has
`_source` enabled. Retrieving and reindexing the data can take significantly
longer than simply restoring a snapshot. If you have a large amount of data, we
recommend testing the reindex from remote process with a subset of your data to
understand the time requirements before proceeding.
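
A minimal sketch of a reindex-from-remote request body follows, including an
optional document cap for a trial run on a subset. The host
`http://oldhost:9200`, the index name `my-index`, and the helper
`build_reindex_from_remote_body` are placeholders, and the use of `max_docs`
assumes a version of {es} whose reindex API accepts that parameter.

```python
import json


def build_reindex_from_remote_body(remote_host, source_index, dest_index, max_docs=None):
    """Build the JSON body for a reindex request that reads from a remote cluster."""
    body = {
        "source": {
            "remote": {"host": remote_host},  # the older, compatible cluster
            "index": source_index,
        },
        "dest": {"index": dest_index},
    }
    if max_docs is not None:
        # Assumption: cap the number of documents to time a trial run on a subset.
        body["max_docs"] = max_docs
    return body


# Hypothetical host and index names; the body would be sent as POST _reindex.
trial_body = build_reindex_from_remote_body("http://oldhost:9200", "my-index", "my-index", max_docs=10000)
full_body = build_reindex_from_remote_body("http://oldhost:9200", "my-index", "my-index")
trial_json = json.dumps(trial_body, indent=2)
```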

--

include::register-repository.asciidoc[]
include::take-snapshot.asciidoc[]
include::restore-snapshot.asciidoc[]
include::monitor-snapshot-restore.asciidoc[]
include::delete-snapshot.asciidoc[]
include::../slm/index.asciidoc[]
include::../searchable-snapshots/index.asciidoc[]