[[snapshots-register-repository]]
== Register a snapshot repository

++++
<titleabbrev>Register repository</titleabbrev>
++++
[[snapshots-register-repository-description]]
// tag::snapshots-register-repository-tag[]
You must register a snapshot repository before you can perform snapshot and
restore operations. We recommend creating a new snapshot repository for each
major version. The valid repository settings depend on the repository type.

If you register the same snapshot repository with multiple clusters, only
one cluster should have write access to the repository. All other clusters
connected to that repository should set the repository to `readonly` mode.
// end::snapshots-register-repository-tag[]
IMPORTANT: The snapshot format can change across major versions, so if you have
clusters on different versions trying to write to the same repository, snapshots
written by one version may not be visible to the other and the repository could
be corrupted. While setting the repository to `readonly` on all but one of the
clusters should work with multiple clusters differing by one major version, it
is not a supported configuration.

[source,console]
-----------------------------------
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "my_backup_location"
  }
}
-----------------------------------
// TESTSETUP

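If you script repository registration, the request shown above can be assembled
programmatically. The following is a minimal sketch that only builds the HTTP
method, path, and JSON body and does not send anything. The function name
`register_repository_request` is hypothetical and not part of any client library.

```python
import json
from urllib.parse import quote


def register_repository_request(name, repo_type, settings):
    """Build (method, path, body) for a register-repository request.

    Hypothetical helper: it only assembles the request pieces so they can be
    passed to whatever HTTP client you already use.
    """
    path = "/_snapshot/" + quote(name, safe="")
    body = json.dumps({"type": repo_type, "settings": settings}, indent=2)
    return "PUT", path, body


method, path, body = register_repository_request(
    "my_backup", "fs", {"location": "my_backup_location"}
)
print(method, path)
print(body)
```
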
To retrieve information about a registered repository, use a GET request:

[source,console]
-----------------------------------
GET /_snapshot/my_backup
-----------------------------------

which returns:

[source,console-result]
-----------------------------------
{
  "my_backup": {
    "type": "fs",
    "settings": {
      "location": "my_backup_location"
    }
  }
}
-----------------------------------

To retrieve information about multiple repositories, specify a comma-delimited
list of repositories. You can also use the `*` wildcard when
specifying repository names. For example, the following request retrieves
information about all of the snapshot repositories that start with `repo` or
contain `backup`:

[source,console]
-----------------------------------
GET /_snapshot/repo*,*backup*
-----------------------------------

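To see which names an expression such as `repo*,*backup*` selects, the matching
can be approximated locally. This sketch assumes that `*` behaves like a simple
glob, which may not match the server's rules in every edge case. The repository
names and the `select_repositories` function are made up for illustration.

```python
from fnmatch import fnmatchcase


def select_repositories(expression, registered):
    """Return the registered names matched by a comma-delimited expression.

    Assumption: only `*` is used as a wildcard. fnmatch also gives special
    meaning to `?` and `[`, which this sketch does not account for.
    """
    patterns = expression.split(",")
    return [
        name
        for name in registered
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical repository names.
registered = ["repo_daily", "my_backup", "nightly_backup_2", "archive"]
print(select_repositories("repo*,*backup*", registered))
# ['repo_daily', 'my_backup', 'nightly_backup_2']
```
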
To retrieve information about all registered snapshot repositories, omit the
repository name or specify `_all`:

[source,console]
-----------------------------------
GET /_snapshot
-----------------------------------

or

[source,console]
-----------------------------------
GET /_snapshot/_all
-----------------------------------

You can unregister a repository using the <<delete-snapshot-repo-api,delete
snapshot repository API>>:

[source,console]
-----------------------------------
DELETE /_snapshot/my_backup
-----------------------------------

When a repository is unregistered, {es} only removes the reference to the
location where the repository is storing the snapshots. The snapshots themselves
are left untouched and in place.

[float]
[[snapshots-filesystem-repository]]
=== Shared file system repository

The shared file system repository (`"type": "fs"`) uses a shared file system to store snapshots. To register
a shared file system repository, you must mount the same shared file system to the same location on all
master and data nodes. This location (or one of its parent directories) must be registered in the `path.repo`
setting on all master and data nodes.

Assuming that the shared file system is mounted to `/mount/backups/my_fs_backup_location`, add the following
setting to the `elasticsearch.yml` file:

[source,yaml]
--------------
path.repo: ["/mount/backups", "/mount/longterm_backups"]
--------------

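The rule that the location, or one of its parent directories, must appear in
`path.repo` can be checked before you touch the cluster. The sketch below is a
simplified, purely lexical check for POSIX-style paths. The function name
`is_location_allowed` is hypothetical, and the real validation happens on the
nodes themselves.

```python
from pathlib import PurePosixPath


def is_location_allowed(location, path_repo):
    """Return True if `location` equals or lies under an entry of `path_repo`.

    Lexical comparison only: symlinks and `..` segments are not resolved.
    """
    candidate = PurePosixPath(location)
    for entry in path_repo:
        root = PurePosixPath(entry)
        if candidate == root or root in candidate.parents:
            return True
    return False


path_repo = ["/mount/backups", "/mount/longterm_backups"]
print(is_location_allowed("/mount/backups/my_fs_backup_location", path_repo))  # True
print(is_location_allowed("/mount/other/my_fs_backup_location", path_repo))    # False
```
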
The `path.repo` setting supports Microsoft Windows UNC paths as long as at least the server name and share are specified as
a prefix and backslashes are properly escaped:

[source,yaml]
--------------
path.repo: ["\\\\MY_SERVER\\Snapshots"]
--------------

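The doubled backslashes are needed because a backslash starts an escape sequence
inside a double-quoted string. The sketch below uses Python's `json` module to
produce and decode the escaped form, relying on the fact that JSON and
double-quoted YAML strings both write a literal backslash as `\\`. The helper
names are hypothetical.

```python
import json


def escape_for_double_quotes(path):
    """Return `path` as a double-quoted string with backslashes escaped."""
    return json.dumps(path)


def unescape_double_quoted(scalar):
    """Decode a double-quoted string back into the raw path."""
    return json.loads(scalar)


unc_path = r"\\MY_SERVER\Snapshots"  # the path as Windows sees it
escaped = escape_for_double_quotes(unc_path)
print(f"path.repo: [{escaped}]")
# path.repo: ["\\\\MY_SERVER\\Snapshots"]
```
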
After all nodes are restarted, use the following command to register the shared file system repository with
the name `my_fs_backup`:

[source,console]
-----------------------------------
PUT /_snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_fs_backup_location",
    "compress": true
  }
}
-----------------------------------
// TEST[skip:no access to absolute path]

If the repository location is specified as a relative path, it is resolved against the first path specified
in `path.repo`:

[source,console]
-----------------------------------
PUT /_snapshot/my_fs_backup
{
  "type": "fs",
  "settings": {
    "location": "my_fs_backup_location",
    "compress": true
  }
}
-----------------------------------
// TEST[continued]

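The resolution rule for relative locations can be sketched in a few lines. This
is an illustration of the rule as described above for POSIX-style paths, not the
server's implementation, and `resolve_location` is a hypothetical name.

```python
from pathlib import PurePosixPath


def resolve_location(location, path_repo):
    """Resolve a repository location against the first `path.repo` entry.

    Absolute locations are returned unchanged. Relative ones are joined onto
    the first configured path, as the text above describes.
    """
    candidate = PurePosixPath(location)
    if candidate.is_absolute():
        return str(candidate)
    return str(PurePosixPath(path_repo[0]) / candidate)


path_repo = ["/mount/backups", "/mount/longterm_backups"]
print(resolve_location("my_fs_backup_location", path_repo))
# /mount/backups/my_fs_backup_location
```
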
The following settings are supported:

[horizontal]
`location`:: Location of the snapshots. Mandatory.
`compress`:: Turns on compression of the snapshot files. Compression is applied only to metadata files (index mapping and settings). Data files are not compressed. Defaults to `true`.
`chunk_size`:: Big files can be broken down into chunks during snapshotting if needed. Specify the chunk size as a value and
unit, for example: `1GB`, `10MB`, `5KB`, `500B`. Defaults to `null` (unlimited chunk size).
`max_restore_bytes_per_sec`:: Throttles the per-node restore rate. Defaults to unlimited. Note that restores are also throttled through <<recovery,recovery settings>>.
`max_snapshot_bytes_per_sec`:: Throttles the per-node snapshot rate. Defaults to `40mb` per second.
`readonly`:: Makes the repository read-only. Defaults to `false`.

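To get a feel for what a given `chunk_size` means for a large file, the value
can be converted to bytes and divided into the file size. The sketch below
assumes binary multiples (1KB = 1024 bytes) and handles only the units that
appear in the examples above. Both function names are hypothetical.

```python
import re

# Assumption: units are binary multiples of a byte.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}


def parse_byte_size(text):
    """Convert a value-and-unit string such as `10MB` into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None or match.group(2).lower() not in _UNITS:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def chunk_count(file_size_bytes, chunk_size):
    """Return how many chunks a file of the given size would be split into.

    A `chunk_size` of None stands for the unlimited default: one chunk.
    """
    if chunk_size is None:
        return 1
    size = parse_byte_size(chunk_size)
    # Ceiling division, with a minimum of one chunk for empty files.
    return max(1, -(-file_size_bytes // size))


print(parse_byte_size("10MB"))                 # 10485760
print(chunk_count(25 * 1024**2, "10MB"))       # 3
```
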
[float]
[[snapshots-read-only-repository]]
=== Read-only URL repository

If you register the same snapshot repository with multiple clusters, only one
cluster should have write access to the repository. Having multiple clusters
write to the repository at the same time risks corrupting the contents of the
repository.

To reduce this risk, you can use URL repositories (`"type": "url"`) to give one
or more clusters read-only access to a shared file system repository. As URL
repositories are always read-only, they are a safer and more convenient
alternative to registering a read-only shared file system repository.

The URL specified in the `url` parameter should point to the root of the shared
file system repository.

[source,console]
----
PUT /_snapshot/my_read_only_url_repository
{
  "type": "url",
  "settings": {
    "url": "file:/mount/backups/my_fs_backup_location"
  }
}
----
// TEST[skip:no access to url file path]

The `url` parameter supports the following protocols:

* `file`
* `ftp`
* `http`
* `https`
* `jar`

URLs using the `file` protocol must point to the location of a shared file system
accessible to all master and data nodes in the cluster. This location must be
registered in the `path.repo` setting, similar to a
<<snapshots-filesystem-repository,shared file system repository>>.

URLs using the `ftp`, `http`, or `https` protocols must be explicitly allowed with the
`repositories.url.allowed_urls` setting. This setting supports wildcards (`*`)
in place of a host, path, query, or fragment in the URL. For example:

[source,yaml]
----
repositories.url.allowed_urls: ["http://www.example.org/root/*", "https://*.mydomain.com/*?*#*"]
----

NOTE: URLs using the `ftp`, `http`, `https`, or `jar` protocols do not need to
be registered in the `path.repo` setting.

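The protocol rules in this section can be summarized as a small lookup: given a
repository URL, which node setting has to cover it? The sketch below encodes
only what the text above states and does not evaluate wildcard patterns. The
function name `setting_required_for` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def setting_required_for(url):
    """Return the setting that must cover `url`, or None if the text names none.

    Raises ValueError for protocols the `url` parameter does not support.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme == "file":
        return "path.repo"
    if scheme in {"ftp", "http", "https"}:
        return "repositories.url.allowed_urls"
    if scheme == "jar":
        return None
    raise ValueError(f"unsupported protocol: {scheme!r}")


print(setting_required_for("file:/mount/backups/my_fs_backup_location"))
print(setting_required_for("http://www.example.org/root/my_repo"))
```
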
[float]
[role="xpack"]
[testenv="basic"]
[[snapshots-source-only-repository]]
=== Source only repository

A source repository enables you to create minimal, source-only snapshots that take up to 50% less space on disk.
Source-only snapshots contain stored fields and index metadata. They do not include index or doc values structures
and are not searchable when restored. After restoring a source-only snapshot, you must <<docs-reindex,reindex>>
the data into a new index.

Source repositories delegate to another snapshot repository for storage.

[IMPORTANT]
==================================================

Source-only snapshots are only supported if the `_source` field is enabled and no source filtering is applied.
When you restore a source-only snapshot:

* The restored index is read-only and can only serve `match_all` search or scroll requests to enable reindexing.

* Queries other than `match_all` and `_get` requests are not supported.

* The mapping of the restored index is empty, but the original mapping is available from the type's top-level
`meta` element.

==================================================

When you create a source repository, you must specify the type and name of the delegate repository
where the snapshots will be stored:

[source,console]
-----------------------------------
PUT _snapshot/my_src_only_repository
{
  "type": "source",
  "settings": {
    "delegate_type": "fs",
    "location": "my_backup_location"
  }
}
-----------------------------------
// TEST[continued]

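One way to read the request above is that a source repository wraps the
definition of its delegate: the delegate's type moves into `delegate_type` and
the delegate's own settings sit next to it. The sketch below builds such a body
from an existing repository definition. It is an illustration of the layout
shown above, and `source_only_repository` is a hypothetical helper.

```python
import json


def source_only_repository(delegate):
    """Wrap a repository definition as the body of a source repository.

    `delegate` is a dict with `type` and `settings` keys, in the same shape
    used when registering a repository.
    """
    settings = {"delegate_type": delegate["type"]}
    settings.update(delegate["settings"])
    return {"type": "source", "settings": settings}


delegate = {"type": "fs", "settings": {"location": "my_backup_location"}}
print(json.dumps(source_only_repository(delegate), indent=2))
```
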
[float]
[[snapshots-repository-plugins]]
=== Repository plugins

Other repository backends are available in these official plugins:

* {plugins}/repository-s3.html[repository-s3] for S3 repository support
* {plugins}/repository-hdfs.html[repository-hdfs] for HDFS repository support in Hadoop environments
* {plugins}/repository-azure.html[repository-azure] for Azure storage repositories
* {plugins}/repository-gcs.html[repository-gcs] for Google Cloud Storage repositories

[float]
[[snapshots-repository-verification]]
=== Repository verification
When a repository is registered, it is immediately verified on all master and data nodes to make sure that it is functional
on all nodes currently present in the cluster. Use the `verify` parameter to explicitly disable repository
verification when registering or updating a repository:

[source,console]
-----------------------------------
PUT /_snapshot/my_unverified_backup?verify=false
{
  "type": "fs",
  "settings": {
    "location": "my_unverified_backup_location"
  }
}
-----------------------------------
// TEST[continued]

The verification process can also be executed manually by running the following command:

[source,console]
-----------------------------------
POST /_snapshot/my_unverified_backup/_verify
-----------------------------------
// TEST[continued]

It returns a list of nodes where the repository was successfully verified, or an error message if the verification process failed.

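A script that calls the verify endpoint usually wants the node names out of the
response. The sketch below assumes a response shaped as a `nodes` object keyed
by node ID, each with a `name` field, and an `error` key on failure. That shape
is an assumption and is not shown in this section, so check it against a real
response. The node IDs and names are invented.

```python
def verified_node_names(response):
    """Return the sorted names of nodes that verified the repository.

    Assumed response shape: {"nodes": {"<node id>": {"name": "<node name>"}}}.
    """
    if "error" in response:
        raise RuntimeError(f"verification failed: {response['error']}")
    return sorted(node["name"] for node in response["nodes"].values())


# Hypothetical response body.
sample = {
    "nodes": {
        "node-id-1": {"name": "data-1"},
        "node-id-2": {"name": "master-1"},
    }
}
print(verified_node_names(sample))
# ['data-1', 'master-1']
```
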
[float]
[[snapshots-repository-cleanup]]
=== Repository cleanup
Over time, repositories can accumulate data that is not referenced by any existing snapshot. This is a result of the data safety guarantees
the snapshot functionality provides in failure scenarios during snapshot creation and of the decentralized nature of the snapshot creation
process. This unreferenced data does not affect the performance or safety of a snapshot repository, but it uses more storage
than necessary. To clean up this unreferenced data, call the cleanup endpoint for a repository. This
triggers a complete accounting of the repository's contents and the subsequent deletion of all unreferenced data that was found.

[source,console]
-----------------------------------
POST /_snapshot/my_repository/_cleanup
-----------------------------------
// TEST[continued]

The response to a cleanup request looks as follows:

[source,console-result]
--------------------------------------------------
{
  "results": {
    "deleted_bytes": 20,
    "deleted_blobs": 5
  }
}
--------------------------------------------------

Depending on the concrete repository implementation, the numbers shown for bytes freed and blobs removed are either
approximate or exact. Any non-zero value for the number of blobs removed implies that unreferenced blobs were found and
subsequently cleaned up.

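When cleanup runs from a scheduled job, it helps to turn the response into a
single log line. The sketch below reads the two fields shown in the response
above and treats the byte count as approximate, since it may not be exact for
every repository implementation. The function name `summarize_cleanup` is
hypothetical.

```python
def summarize_cleanup(response):
    """Describe a cleanup response using its `results` object."""
    results = response["results"]
    blobs = results["deleted_blobs"]
    freed = results["deleted_bytes"]
    if blobs == 0:
        return "no unreferenced blobs found"
    return f"removed {blobs} unreferenced blobs, freeing about {freed} bytes"


print(summarize_cleanup({"results": {"deleted_bytes": 20, "deleted_blobs": 5}}))
# removed 5 unreferenced blobs, freeing about 20 bytes
```
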
Most of the cleanup operations executed by this endpoint are also executed automatically when you delete any snapshot from a
repository. If you regularly delete snapshots, you will in most cases see little or no space savings from this functionality
and should lower the frequency of invoking it accordingly.