Add remote cluster state documentation (#5726)

* Add remote cluster state documentation

Signed-off-by: Sooraj Sinha <soosinha@amazon.com>

* Apply suggestions from code review

Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Signed-off-by: Sooraj Sinha <81695996+soosinha@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

---------

Signed-off-by: Sooraj Sinha <soosinha@amazon.com>
Signed-off-by: Sooraj Sinha <81695996+soosinha@users.noreply.github.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Sooraj Sinha 2023-12-05 23:21:46 +05:30 committed by GitHub
parent a55bfa2fa7
commit 2b7a862e8d
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 54 additions and 6 deletions


@ -36,7 +36,6 @@ To enable remote-backed storage for a given cluster, provide the remote store re
# Repository name
node.attr.remote_store.segment.repository: my-repo-1
node.attr.remote_store.translog.repository: my-repo-2
node.attr.remote_store.state.repository: my-repo-3
# Segment repository settings
node.attr.remote_store.repository.my-repo-1.type: s3
@ -50,14 +49,11 @@ node.attr.remote_store.repository.my-repo-2.settings.bucket: <Bucket Name 2>
node.attr.remote_store.repository.my-repo-2.settings.base_path: <Bucket Base Path 2>
node.attr.remote_store.repository.my-repo-2.settings.region: us-east-1
# Cluster state repository settings
node.attr.remote_store.repository.my-repo-3.type: s3
node.attr.remote_store.repository.my-repo-3.settings.bucket: <Bucket Name 3>
node.attr.remote_store.repository.my-repo-3.settings.base_path: <Bucket Base Path 3>
node.attr.remote_store.repository.my-repo-3.settings.region: us-east-1
```
{% include copy-curl.html %}
For more information about configuring settings for the remote cluster state, see [Remote Cluster State]({{site.url}}{{site.baseurl}}/tuning-your-cluster/availability-and-recovery/remote-store/remote-cluster-state/). This is required in order for cluster metadata to persist on the remote store.
You do not have to use three different remote store repositories for segment, translog, and state. All three stores can share the same repository.
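As a minimal sketch of the shared-repository case, the following fragment points all three stores at one repository. The name `my-shared-repo` and the placeholder bucket values are illustrative assumptions; the setting keys are the same ones used in the example above.

```yml
# Hypothetical repository name shared by segment, translog, and state
node.attr.remote_store.segment.repository: my-shared-repo
node.attr.remote_store.translog.repository: my-shared-repo
node.attr.remote_store.state.repository: my-shared-repo

# Settings for the single shared repository (placeholder values)
node.attr.remote_store.repository.my-shared-repo.type: s3
node.attr.remote_store.repository.my-shared-repo.settings.bucket: <Bucket Name>
node.attr.remote_store.repository.my-shared-repo.settings.base_path: <Bucket Base Path>
node.attr.remote_store.repository.my-shared-repo.settings.region: us-east-1
```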
During the bootstrapping process, the remote-backed repositories listed in `opensearch.yml` are automatically registered. After the cluster is created with the `remote_store` settings, all indexes created in that cluster will start uploading data to the configured remote store.


@ -0,0 +1,52 @@
---
layout: default
title: Remote Cluster State
nav_order: 5
parent: Remote-backed storage
grand_parent: Availability and recovery
---
# Remote cluster state
Introduced 2.10
{: .label .label-purple }
The _remote cluster state_ functionality for remote-backed storage protects against cluster state metadata loss resulting from the permanent loss of a majority of the cluster manager nodes in the cluster.
_Cluster state_ is an internal data structure that contains the metadata of the cluster, including the following:
- Index settings
- Index mappings
- Active copies of shards in the cluster
- Cluster-level settings
- Data streams
- Templates
The cluster state metadata is managed by the elected cluster manager node and is essential for the cluster to function properly. If the cluster permanently loses a majority of its cluster manager nodes, it may experience data loss because the latest cluster state metadata might not be present on the surviving cluster manager nodes. Persisting the state of all the cluster manager nodes in the cluster to remote-backed storage provides better durability.
When the remote cluster state feature is enabled, the cluster metadata will be published to a remote repository configured in the cluster. As of OpenSearch 2.10, only index metadata will persist to remote-backed storage.
Any time new cluster manager nodes are launched after disaster recovery, the nodes will automatically bootstrap using the latest index metadata stored in the remote repository. Consequently, the index data will also be restored when the remote store is enabled.
## Configuring the remote cluster state
Remote cluster state settings can be enabled while bootstrapping the cluster. After the remote cluster state is enabled, it can be disabled by updating the settings and performing a rolling restart of all the nodes.
To enable the remote cluster state for a given cluster, add the following cluster-level and repository settings to the cluster's `opensearch.yml` file:
```yml
# Enable Remote cluster state cluster setting
cluster.remote_store.state.enabled: true
# Remote cluster state repository settings
node.attr.remote_store.state.repository: my-remote-state-repo
node.attr.remote_store.repository.my-remote-state-repo.type: s3
node.attr.remote_store.repository.my-remote-state-repo.settings.bucket: <Bucket Name 3>
node.attr.remote_store.repository.my-remote-state-repo.settings.base_path: <Bucket Base Path 3>
node.attr.remote_store.repository.my-remote-state-repo.settings.region: <Bucket region>
```
{% include copy-curl.html %}
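Disabling the feature later, as described at the start of this section, might look like the following sketch. It assumes that "updating the settings" means setting the same flag to `false` in each node's `opensearch.yml` before the rolling restart; the source does not spell out the exact value.

```yml
# Assumption: turning the flag off disables the remote cluster state.
# Apply on every node, then perform a rolling restart of all nodes.
cluster.remote_store.state.enabled: false
```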
## Limitations
The remote cluster state functionality has the following limitations:
- As of OpenSearch 2.10, only index metadata can be uploaded to and restored from remote-backed storage.
- Unsafe bootstrap scripts cannot be run when the remote cluster state is enabled. When a majority of cluster manager nodes are lost and the cluster goes down, the user needs to replace any remaining cluster manager nodes and reseed the nodes in order to bootstrap a new cluster.
- The remote cluster state cannot be enabled without first configuring remote-backed storage.