Jeff Huss 9fd6cf9caf
Add steps for upgrading OpenSearch with Security enabled (#3052)

Signed-off-by: JeffH-AWS <jeffhuss@amazon.com>

2023-03-10 13:43:20 -08:00


layout: default
title: Rolling Upgrade
parent: Upgrading OpenSearch
nav_order: 10

Rolling Upgrade

Rolling upgrades, sometimes referred to as "node replacement upgrades," can be performed on running clusters with virtually no downtime. Nodes are individually stopped and upgraded in place. Alternatively, nodes can be stopped and replaced, one at a time, by hosts running the new version. During this process, you can continue to index and query data in your cluster.

This document serves as a high-level, platform-agnostic overview of the rolling upgrade procedure. For specific examples of commands, scripts, and configuration files, refer to the Appendix.
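The one-node-at-a-time pattern described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an OpenSearch tool: the function name `rolling_upgrade` and the three callables it receives (`stop_node`, `start_upgraded_node`, and `cluster_members`) are hypothetical placeholders for whatever your platform uses to stop a node, bring it back on the new version, and list the nodes currently in the cluster. A production script would likely poll with a timeout rather than check membership once.

```python
from typing import Callable, Iterable, List


def rolling_upgrade(
    nodes: Iterable[str],
    stop_node: Callable[[str], None],            # hypothetical: stops one node
    start_upgraded_node: Callable[[str], None],  # hypothetical: starts it on the new version
    cluster_members: Callable[[], List[str]],    # hypothetical: names of nodes in the cluster
) -> List[str]:
    """Upgrade nodes strictly one at a time and return the names processed."""
    upgraded: List[str] = []
    for name in nodes:
        stop_node(name)
        # The stopped node must have left the cluster before it is replaced.
        if name in cluster_members():
            raise RuntimeError(f"{name} is still listed after being stopped")
        start_upgraded_node(name)
        # The next node is not touched until this one has rejoined.
        if name not in cluster_members():
            raise RuntimeError(f"{name} did not rejoin the cluster")
        upgraded.append(name)
    return upgraded
```

Because only one node is offline at any point in the loop, the remaining nodes can keep serving index and query requests.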

Preparing to upgrade

Review Upgrading OpenSearch for recommendations about backing up your configuration files and creating a snapshot of the cluster state and indexes before you make any changes to your OpenSearch cluster.

Important: OpenSearch nodes cannot be downgraded. To revert an upgrade, you must perform a fresh installation of OpenSearch and restore the cluster from a snapshot. Take a snapshot and store it in a remote repository before beginning the upgrade procedure.

Performing the upgrade

  1. Verify the health of your OpenSearch cluster before you begin. You should resolve any index or shard allocation issues prior to upgrading to ensure that your data is preserved. A status of green indicates that all primary and replica shards are allocated. See Cluster health for more information. The following command queries the _cluster/health API endpoint:
    GET "/_cluster/health?pretty"
    
    The response should look similar to the following example:
    {
        "cluster_name":"opensearch-dev-cluster",
        "status":"green",
        "timed_out":false,
        "number_of_nodes":4,
        "number_of_data_nodes":4,
        "active_primary_shards":1,
        "active_shards":4,
        "relocating_shards":0,
        "initializing_shards":0,
        "unassigned_shards":0,
        "delayed_unassigned_shards":0,
        "number_of_pending_tasks":0,
        "number_of_in_flight_fetch":0,
        "task_max_waiting_in_queue_millis":0,
        "active_shards_percent_as_number":100.0
    }
    
  2. Disable shard replication to prevent shard replicas from being created while nodes are being taken offline. This stops the movement of Lucene index segments on nodes in your cluster. You can disable shard replication by sending the following request to the _cluster/settings API endpoint, which restricts shard allocation to primary shards:
    PUT "/_cluster/settings?pretty"
    {
        "persistent": {
            "cluster.routing.allocation.enable": "primaries"
        }
    }
    
    The response should look similar to the following example:
    {
      "acknowledged" : true,
      "persistent" : {
        "cluster" : {
          "routing" : {
            "allocation" : {
              "enable" : "primaries"
            }
          }
        }
      },
      "transient" : { }
    }
    
  3. Perform a flush operation on the cluster to commit transaction log entries to the Lucene index:
    POST "/_flush?pretty"
    
    The response should look similar to the following example:
    {
      "_shards" : {
        "total" : 4,
        "successful" : 4,
        "failed" : 0
      }
    }
    
  4. Review your cluster and identify the first node to upgrade. Upgrade eligible cluster manager nodes last: OpenSearch nodes can join a cluster whose cluster manager nodes are running an older version, but they cannot join a cluster in which all cluster manager nodes are running a newer version.
  5. Query the _cat/nodes endpoint to identify which node was promoted to cluster manager. The following command includes additional query parameters that request only the name, version, node.role, and master headers. Note that OpenSearch 1.x versions use the term "master," which has been deprecated and replaced by "cluster_manager" in OpenSearch 2.x and later.
    GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
    
    The response should look similar to the following example:
    name        version  node.role  master
    os-node-01  7.10.2   dimr       -
    os-node-04  7.10.2   dimr       -
    os-node-03  7.10.2   dimr       -
    os-node-02  7.10.2   dimr       *
    
  6. Stop the node you are upgrading. If the node runs in a container, do not delete the volume associated with the container when you delete the container. The new OpenSearch container will use the existing volume. Deleting the volume will result in data loss.
  7. Confirm that the associated node has been dismissed from the cluster by querying the _cat/nodes API endpoint:
    GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
    
    The response should look similar to the following example:
    name        version  node.role  master
    os-node-02  7.10.2   dimr       *
    os-node-04  7.10.2   dimr       -
    os-node-03  7.10.2   dimr       -
    
    os-node-01 is no longer listed because the container has been stopped and deleted.
  8. Deploy a new container running the desired version of OpenSearch and mapped to the same volume as the container you deleted.
  9. Query the _cat/nodes endpoint after OpenSearch is running on the new node to confirm that it has joined the cluster:
    GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
    
    The response should look similar to the following example:
    name        version  node.role  master
    os-node-02  7.10.2   dimr       *
    os-node-04  7.10.2   dimr       -
    os-node-01  7.10.2   dimr       -
    os-node-03  7.10.2   dimr       -
    
    In the example output, the new OpenSearch node reports a running version of 7.10.2 to the cluster. This is the result of compatibility.override_main_response_version, which is used when connecting to a cluster with legacy clients that check for a version. You can manually confirm the version of the node by calling the /_nodes API endpoint, as in the following command. Replace <nodeName> with the name of your node. See Nodes API to learn more.
    GET "/_nodes/<nodeName>?pretty=true" | jq -r '.nodes | .[] | "\(.name) v\(.version)"'
    
    The response should look similar to the following example:
    os-node-01 v1.3.7
    
  10. Repeat steps 5 through 9 for each node in your cluster. Remember to upgrade an eligible cluster manager node last. After replacing the last node, query the _cat/nodes endpoint to confirm that all nodes have joined the cluster. The cluster is now bootstrapped to the new version of OpenSearch. You can verify the cluster version by querying the _cat/nodes API endpoint:
    GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t
    
    The response should look similar to the following example:
    name        version  node.role  master
    os-node-04  1.3.7    dimr       -
    os-node-02  1.3.7    dimr       *
    os-node-01  1.3.7    dimr       -
    os-node-03  1.3.7    dimr       -
    
  11. Reenable shard replication:
    PUT "/_cluster/settings?pretty"
    {
        "persistent": {
            "cluster.routing.allocation.enable": "all"
        }
    }
    
    The response should look similar to the following example:
    {
      "acknowledged" : true,
      "persistent" : {
        "cluster" : {
          "routing" : {
            "allocation" : {
              "enable" : "all"
            }
          }
        }
      },
      "transient" : { }
    }
    
  12. Confirm that the cluster is healthy:
    GET "/_cluster/health?pretty"
    
    The response should look similar to the following example:
    {
      "cluster_name" : "opensearch-dev-cluster",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 4,
      "number_of_data_nodes" : 4,
      "discovered_master" : true,
      "active_primary_shards" : 1,
      "active_shards" : 4,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 100.0
    }
    
  13. The upgrade is now complete, and you can begin enjoying the latest features and fixes!
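
The checks in the steps above can be sketched in Python as follows. This is an illustration only: the helper names `parse_cat_nodes`, `upgrade_order`, and `is_green` are hypothetical, and the code works on response text you have already retrieved rather than calling the cluster itself. It assumes that the letter `m` in the `node.role` column marks a cluster-manager-eligible node and that `*` in the `master` column marks the elected cluster manager, as in the sample output. The order is computed from a single response, so rerun it after each node is replaced because a different node may be elected in the meantime.

```python
from typing import Dict, List


def parse_cat_nodes(table: str) -> List[Dict[str, str]]:
    """Turn the text returned by _cat/nodes?v into one dict per node."""
    rows = [line.split() for line in table.strip().splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def upgrade_order(nodes: List[Dict[str, str]]) -> List[str]:
    """Order node names so that eligible cluster managers come last.

    Nodes that are not cluster-manager eligible sort first, eligible nodes
    next, and the currently elected cluster manager at the very end.
    """
    def rank(node: Dict[str, str]):
        elected = node["master"] == "*"
        eligible = "m" in node["node.role"]  # assumption: "m" means eligible
        return (elected, eligible, node["name"])

    return [node["name"] for node in sorted(nodes, key=rank)]


def is_green(health: Dict[str, object]) -> bool:
    """Return True when a _cluster/health response reports a fully allocated cluster."""
    return health.get("status") == "green" and health.get("unassigned_shards") == 0


# Sample _cat/nodes output copied from step 5.
SAMPLE = """\
name        version  node.role  master
os-node-01  7.10.2   dimr       -
os-node-04  7.10.2   dimr       -
os-node-03  7.10.2   dimr       -
os-node-02  7.10.2   dimr       *
"""

if __name__ == "__main__":
    print(upgrade_order(parse_cat_nodes(SAMPLE)))
    # prints ['os-node-01', 'os-node-03', 'os-node-04', 'os-node-02']
```

In this sample, every node is cluster-manager eligible, so the only ordering constraint that takes effect is that os-node-02, the elected cluster manager, is upgraded last.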