layout | title | parent | nav_order |
---|---|---|---|
default | Rolling Upgrade | Upgrading OpenSearch | 10 |
Rolling Upgrade
Rolling upgrades, sometimes referred to as "node replacement upgrades," can be performed on running clusters with virtually no downtime. Nodes are individually stopped and upgraded in place. Alternatively, nodes can be stopped and replaced, one at a time, by hosts running the new version. During this process, you can continue to index and query data in your cluster.
This document serves as a high-level, platform-agnostic overview of the rolling upgrade procedure. For specific examples of commands, scripts, and configuration files, refer to the Appendix.
Preparing to upgrade
Review Upgrading OpenSearch for recommendations about backing up your configuration files and creating a snapshot of the cluster state and indexes before you make any changes to your OpenSearch cluster.
Important: OpenSearch nodes cannot be downgraded. If you need to revert the upgrade, then you will need to perform a fresh installation of OpenSearch and restore the cluster from a snapshot. Take a snapshot and store it in a remote repository before beginning the upgrade procedure. {: .important}
Performing the upgrade
- Verify the health of your OpenSearch cluster before you begin. You should resolve any index or shard allocation issues prior to upgrading to ensure that your data is preserved. A status of green indicates that all primary and replica shards are allocated. See Cluster health for more information. The following command queries the _cluster/health API endpoint:
  GET "/_cluster/health?pretty"
  The response should look similar to the following example:
  {
    "cluster_name":"opensearch-dev-cluster",
    "status":"green",
    "timed_out":false,
    "number_of_nodes":4,
    "number_of_data_nodes":4,
    "active_primary_shards":1,
    "active_shards":4,
    "relocating_shards":0,
    "initializing_shards":0,
    "unassigned_shards":0,
    "delayed_unassigned_shards":0,
    "number_of_pending_tasks":0,
    "number_of_in_flight_fetch":0,
    "task_max_waiting_in_queue_millis":0,
    "active_shards_percent_as_number":100.0
  }
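The health check above could be turned into a scripted gate. The following is a minimal sketch, assuming the response has already been fetched and parsed into a dictionary. The function name `ready_for_upgrade` and the rule of also requiring zero relocating, initializing, and unassigned shards are illustrative assumptions, not something OpenSearch provides.

```python
import json


def ready_for_upgrade(health: dict) -> bool:
    """Return True when the health response shows a fully allocated, settled cluster."""
    return (
        health.get("status") == "green"
        and not health.get("timed_out", False)
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


# Abbreviated copy of the example response shown above.
sample = json.loads(
    '{"status": "green", "timed_out": false, "relocating_shards": 0,'
    ' "initializing_shards": 0, "unassigned_shards": 0}'
)
print(ready_for_upgrade(sample))
```

A yellow or red status, or any shards still moving, makes the function return False, which is a reasonable signal to stop and investigate before taking a node offline.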
- Disable shard replication to prevent shard replicas from being created while nodes are being taken offline. This stops the movement of Lucene index segments on nodes in your cluster. You can disable shard replication by querying the _cluster/settings API endpoint:
  PUT "/_cluster/settings?pretty"
  {
    "persistent": {
      "cluster.routing.allocation.enable": "primaries"
    }
  }
  The response should look similar to the following example:
  {
    "acknowledged" : true,
    "persistent" : {
      "cluster" : {
        "routing" : {
          "allocation" : {
            "enable" : "primaries"
          }
        }
      }
    },
    "transient" : { }
  }
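If you script this step, the same request body is needed again later with the value `all`, so it can help to build it in one place. The sketch below only constructs the request and does not send it. The helper name `allocation_request` and the base URL `https://localhost:9200` are assumptions for illustration; authentication and TLS handling are left out.

```python
import json
import urllib.request

# Values accepted by the cluster.routing.allocation.enable setting.
VALID_MODES = {"all", "primaries", "new_primaries", "none"}


def allocation_request(base_url: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets the allocation mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported allocation mode: {mode}")
    body = {"persistent": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical endpoint; replace with the address of your own cluster.
req = allocation_request("https://localhost:9200", "primaries")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it. Rejecting unknown modes up front avoids sending a typo to a production cluster.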
- Perform a flush operation on the cluster to commit transaction log entries to the Lucene index:
  POST "/_flush?pretty"
  The response should look similar to the following example:
  {
    "_shards" : {
      "total" : 4,
      "successful" : 4,
      "failed" : 0
    }
  }
- Review your cluster and identify the first node to upgrade. Eligible cluster manager nodes should be upgraded last because OpenSearch nodes can join a cluster with manager nodes running an older version, but they cannot join a cluster with all manager nodes running a newer version.
- Query the _cat/nodes endpoint to identify which node was promoted to cluster manager. The following command includes additional query parameters that request only the name, version, node.role, and master headers. Note that OpenSearch 1.x versions use the term "master," which has been deprecated and replaced by "cluster_manager" in OpenSearch 2.x and later.
  GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
  The response should look similar to the following example:
  name        version  node.role  master
  os-node-01  7.10.2   dimr       -
  os-node-04  7.10.2   dimr       -
  os-node-03  7.10.2   dimr       -
  os-node-02  7.10.2   dimr       *
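To illustrate how the _cat/nodes output can drive the upgrade order, here is a small sketch that parses the table and sorts the nodes so that cluster-manager-eligible nodes come after other nodes and the currently elected cluster manager (marked with `*`) comes last. The function names are hypothetical, and the ordering rule is one possible reading of the guidance above rather than an official algorithm.

```python
def parse_cat_nodes(text: str) -> list:
    """Turn whitespace-separated _cat/nodes output (with a header row) into dicts."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def upgrade_order(nodes: list) -> list:
    """Non-eligible nodes first, then eligible nodes, then the elected manager."""
    # "m" in node.role marks a manager-eligible node; "*" marks the elected one.
    return sorted(nodes, key=lambda n: ("m" in n["node.role"], n["master"] == "*"))


sample = """
name        version  node.role  master
os-node-01  7.10.2   dimr       -
os-node-04  7.10.2   dimr       -
os-node-03  7.10.2   dimr       -
os-node-02  7.10.2   dimr       *
"""
order = [n["name"] for n in upgrade_order(parse_cat_nodes(sample))]
print(order)
```

Because Python's sort is stable, nodes that tie keep the order in which the API listed them, so only the elected manager moves in this example.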
- Stop the node you are upgrading. Do not delete the volume associated with the container when you delete the container. The new OpenSearch container will use the existing volume. Deleting the volume will result in data loss.
- Confirm that the associated node has been dismissed from the cluster by querying the _cat/nodes API endpoint:
  GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
  The response should look similar to the following example:
  name        version  node.role  master
  os-node-02  7.10.2   dimr       *
  os-node-04  7.10.2   dimr       -
  os-node-03  7.10.2   dimr       -
  os-node-01 is no longer listed because the container has been stopped and deleted.
- Deploy a new container running the desired version of OpenSearch and mapped to the same volume as the container you deleted.
- Query the _cat/nodes endpoint after OpenSearch is running on the new node to confirm that it has joined the cluster:
  GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
  The response should look similar to the following example:
  name        version  node.role  master
  os-node-02  7.10.2   dimr       *
  os-node-04  7.10.2   dimr       -
  os-node-01  7.10.2   dimr       -
  os-node-03  7.10.2   dimr       -
  In the example output, the new OpenSearch node reports a running version of 7.10.2 to the cluster. This is the result of compatibility.override_main_response_version, which is used when connecting to a cluster with legacy clients that check for a version. You can manually confirm the version of the node by calling the /_nodes API endpoint, as in the following command. Replace <nodeName> with the name of your node. See Nodes API to learn more.
  GET "/_nodes/<nodeName>?pretty=true" | jq -r '.nodes | .[] | "\(.name) v\(.version)"'
  The response should look similar to the following example:
  os-node-01 v1.3.7
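For readers who prefer not to depend on jq, the same extraction can be done in a few lines of Python. This is a sketch that assumes the /_nodes response has already been parsed into a dictionary; the node ID key in the sample is a made-up placeholder, and only the `name` and `version` fields used by the jq filter are included.

```python
def node_versions(nodes_response: dict) -> dict:
    """Map each node name to its reported version, like the jq filter above."""
    return {
        info["name"]: info["version"]
        for info in nodes_response.get("nodes", {}).values()
    }


# Trimmed-down response; "hypothetical-node-id" stands in for a real node ID.
sample = {"nodes": {"hypothetical-node-id": {"name": "os-node-01", "version": "1.3.7"}}}
for name, version in node_versions(sample).items():
    print(f"{name} v{version}")
```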
- Repeat steps 5 through 9 for each node in your cluster. Remember to upgrade an eligible cluster manager node last. After replacing the last node, query the _cat/nodes endpoint to confirm that all nodes have joined the cluster. The cluster is now bootstrapped to the new version of OpenSearch. You can verify the cluster version by querying the _cat/nodes API endpoint:
  GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
  The response should look similar to the following example:
  name        version  node.role  master
  os-node-04  1.3.7    dimr       -
  os-node-02  1.3.7    dimr       *
  os-node-01  1.3.7    dimr       -
  os-node-03  1.3.7    dimr       -
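A scripted version of this final check might tally the versions reported by _cat/nodes and treat the upgrade as finished only when every node reports the target version. The sketch below assumes the text output includes the header row requested with `v`, and that the reported versions are accurate (see the earlier note about compatibility.override_main_response_version). The function names are hypothetical.

```python
from collections import Counter


def version_counts(cat_nodes_text: str) -> Counter:
    """Count how many nodes report each version in _cat/nodes text output."""
    lines = [ln.split() for ln in cat_nodes_text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    col = header.index("version")
    return Counter(row[col] for row in rows)


def upgrade_finished(cat_nodes_text: str, target: str) -> bool:
    """True only when at least one node is listed and all report the target version."""
    return set(version_counts(cat_nodes_text)) == {target}


final = """
name        version  node.role  master
os-node-04  1.3.7    dimr       -
os-node-02  1.3.7    dimr       *
os-node-01  1.3.7    dimr       -
os-node-03  1.3.7    dimr       -
"""
print(version_counts(final))
print(upgrade_finished(final, "1.3.7"))
```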
- Reenable shard replication:
  PUT "/_cluster/settings?pretty"
  {
    "persistent": {
      "cluster.routing.allocation.enable": "all"
    }
  }
  The response should look similar to the following example:
  {
    "acknowledged" : true,
    "persistent" : {
      "cluster" : {
        "routing" : {
          "allocation" : {
            "enable" : "all"
          }
        }
      }
    },
    "transient" : { }
  }
- Confirm that the cluster is healthy:
  GET "/_cluster/health?pretty"
  The response should look similar to the following example:
  {
    "cluster_name" : "opensearch-dev-cluster",
    "status" : "green",
    "timed_out" : false,
    "number_of_nodes" : 4,
    "number_of_data_nodes" : 4,
    "discovered_master" : true,
    "active_primary_shards" : 1,
    "active_shards" : 4,
    "relocating_shards" : 0,
    "initializing_shards" : 0,
    "unassigned_shards" : 0,
    "delayed_unassigned_shards" : 0,
    "number_of_pending_tasks" : 0,
    "number_of_in_flight_fetch" : 0,
    "task_max_waiting_in_queue_millis" : 0,
    "active_shards_percent_as_number" : 100.0
  }
- The upgrade is now complete, and you can begin enjoying the latest features and fixes!
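To summarize the order of operations, the procedure above can be sketched as a small driver function. This is an outline under stated assumptions, not a production tool: `api` is a hypothetical callable that sends a request and returns the parsed JSON response, and `replace_node` is a hypothetical callable that stops one node, redeploys it on the new version, and waits for it to rejoin. The per-node verification queries are omitted for brevity, and the example runs against in-memory fakes so that it touches no real cluster.

```python
def rolling_upgrade(api, node_names, replace_node):
    """Run the cluster-level steps in order; api(method, path, body=None) -> dict."""
    # Refuse to start unless the cluster is green.
    health = api("GET", "/_cluster/health")
    if health.get("status") != "green":
        raise RuntimeError("cluster is not green; resolve allocation issues first")

    # Restrict allocation to primaries, then flush.
    api("PUT", "/_cluster/settings",
        {"persistent": {"cluster.routing.allocation.enable": "primaries"}})
    api("POST", "/_flush")

    # Replace nodes one at a time; the caller supplies the order
    # (cluster-manager-eligible nodes last).
    for name in node_names:
        replace_node(name)

    # Restore full allocation and return the final health response.
    api("PUT", "/_cluster/settings",
        {"persistent": {"cluster.routing.allocation.enable": "all"}})
    return api("GET", "/_cluster/health")


# Dry run with fakes that only record what would have been done.
calls = []
replaced = []


def fake_api(method, path, body=None):
    calls.append((method, path))
    return {"status": "green"}


result = rolling_upgrade(fake_api, ["os-node-01", "os-node-02"], replaced.append)
print(calls)
print(replaced)
```

Injecting `api` and `replace_node` keeps the sequencing logic independent of how nodes are deployed, which matches the platform-agnostic intent of this document.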