OpenSearch/docs/reference/setup/cluster_restart.asciidoc

[[restart-upgrade]]
=== Full cluster restart upgrade

Elasticsearch requires a full cluster restart when upgrading across major
versions.  Rolling upgrades are not supported across major versions. Consult
this <<setup-upgrade,table>> to verify that a full cluster restart is
required.

The process to perform an upgrade with a full cluster restart is as follows:

==== Step 1: Disable shard allocation

When you shut down a node, the allocation process will immediately try to
replicate the shards that were on that node to other nodes in the cluster,
causing a lot of wasted I/O.  This can be avoided by disabling allocation
before shutting down a node:

[source,js]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}
--------------------------------------------------
// AUTOSENSE
// TEST[skip:indexes don't assign]
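
If you script the upgrade, the same settings change can be sent from code. The following is a minimal sketch, not an official client: the `BASE_URL` address and the helper names `allocation_request` and `set_allocation` are assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import json
import urllib.request

# Assumption: the cluster answers plain HTTP on this address.
BASE_URL = "http://localhost:9200"


def allocation_request(mode, base_url=BASE_URL):
    """Build (but do not send) the PUT _cluster/settings request.

    Only the two values used in this procedure are accepted here.
    """
    if mode not in ("none", "all"):
        raise ValueError("mode must be 'none' or 'all'")
    body = {"persistent": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_allocation(mode, base_url=BASE_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(allocation_request(mode, base_url)) as resp:
        return json.load(resp)
```

Calling `set_allocation("none")` before the shutdown and `set_allocation("all")` once every node has rejoined would mirror the two settings requests in this procedure.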

==== Step 2: Perform a synced flush

Shard recovery will be much faster if you stop indexing and issue a
<<indices-synced-flush, synced-flush>> request:

[source,sh]
--------------------------------------------------
POST _flush/synced
--------------------------------------------------
// AUTOSENSE

A synced flush request is a ``best effort'' operation. It will fail if there
are any pending indexing operations, but it is safe to reissue the request
multiple times if necessary.
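
Because the request is safe to repeat, a script can retry it. The sketch below assumes that the parsed response carries a top-level `_shards` object with a `failed` count. `flush` is a hypothetical callable that issues `POST _flush/synced` and returns the decoded JSON body, so the retry logic stays independent of any particular HTTP client.

```python
import time


def synced_flush_with_retry(flush, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call flush() until it reports no failed shards, or give up.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        response = flush()
        # Assumption: failures are reported under "_shards" -> "failed".
        failed = response.get("_shards", {}).get("failed", 0)
        if failed == 0:
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"synced flush still failing after {attempts} attempts")
```

A real `flush` callable would also need to decide how to treat non-2xx HTTP responses. That handling is left out of the sketch.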

==== Step 3: Shut down and upgrade all nodes

Stop all Elasticsearch services on all nodes in the cluster. Each node can be
upgraded following the same procedure described in <<upgrade-node>>.

==== Step 4: Upgrade any plugins

Elasticsearch plugins must be upgraded when upgrading a node.  Use the
`elasticsearch-plugin` script to install the correct version of any plugins
that you need.

==== Step 5: Start the cluster

If you have dedicated master nodes -- nodes with `node.master` set to
`true` (the default) and `node.data` set to `false` --  then it is a good idea
to start them first.  Wait for them to form a cluster and to elect a master
before proceeding with the data nodes. You can check progress by looking at the
logs.

As soon as the <<master-election,minimum number of master-eligible nodes>>
have discovered each other, they will form a cluster and elect a master.  From
that point on, the <<cat-health,`_cat/health`>> and <<cat-nodes,`_cat/nodes`>>
APIs can be used to monitor nodes joining the cluster:

[source,sh]
--------------------------------------------------
GET _cat/health

GET _cat/nodes
--------------------------------------------------
// AUTOSENSE

Use these APIs to check that all nodes have successfully joined the cluster.
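
One way to automate this check is to compare the node names that `_cat/nodes` reports with the names you expect. The sketch below assumes the text was fetched with `GET _cat/nodes?h=name`, so that each line holds a single node name. The function names and the node names in the usage note are hypothetical.

```python
def joined_nodes(cat_nodes_text):
    """Return the node names listed in a `_cat/nodes?h=name` text body."""
    return [line.strip() for line in cat_nodes_text.splitlines() if line.strip()]


def missing_nodes(cat_nodes_text, expected):
    """Return the expected node names that have not joined yet, sorted."""
    return sorted(set(expected) - set(joined_nodes(cat_nodes_text)))
```

For example, `missing_nodes("node-1\nnode-2\n", ["node-1", "node-2", "node-3"])` returns `["node-3"]`, which shows that one node has not yet joined.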

==== Step 6: Wait for yellow

As soon as each node has joined the cluster, it will start to recover any
primary shards that are stored locally.  Initially, the
<<cat-health,`_cat/health`>> request will report a `status` of `red`, meaning
that not all primary shards have been allocated.

Once each node has recovered its local shards, the `status` will become
`yellow`, meaning all primary shards have been recovered, but not all replica
shards are allocated.  This is to be expected because allocation is still
disabled.
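
Waiting for a status can be sketched as a simple polling loop. In the example below, `fetch_status` is a hypothetical callable that returns the text body of `GET _cat/health?h=status` (a single word such as `red`). The timeout and polling interval are arbitrary values chosen for illustration.

```python
import time

# Health states ordered from worst to best.
STATUS_ORDER = {"red": 0, "yellow": 1, "green": 2}


def wait_for_status(fetch_status, target="yellow", timeout_seconds=600,
                    poll_seconds=5, sleep=time.sleep, clock=time.monotonic):
    """Poll until the cluster status is at least `target`.

    Returns the status that satisfied the wait, or raises TimeoutError.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status().strip()
        # Unknown status strings are treated as worse than "red".
        if STATUS_ORDER.get(status, -1) >= STATUS_ORDER[target]:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"status is still {status!r}, wanted {target!r}")
        sleep(poll_seconds)
```

The same helper can be reused with `target="green"` after shard allocation has been turned back on.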

==== Step 7: Reenable allocation

Delaying the allocation of replicas until all nodes have joined the cluster
allows the master to allocate replicas to nodes which already have local shard
copies.   At this point, with all the nodes in the cluster, it is safe to
reenable shard allocation:

[source,js]
------------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}
------------------------------------------------------
// AUTOSENSE

The cluster will now start allocating replica shards to all data nodes. At this
point it is safe to resume indexing and searching, but your cluster will
recover more quickly if you can delay indexing and searching until all shards
have recovered.

You can monitor progress with the <<cat-health,`_cat/health`>> and
<<cat-recovery,`_cat/recovery`>> APIs:

[source,sh]
--------------------------------------------------
GET _cat/health

GET _cat/recovery
--------------------------------------------------
// AUTOSENSE

Once the `status` column in the `_cat/health` output has reached `green`, all
primary and replica shards have been successfully allocated.
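
To summarize recovery progress in a script, you can count shards by recovery stage. The sketch below assumes the text came from `GET _cat/recovery?h=index,shard,stage`, so that every line has exactly three whitespace-separated columns. Check the column names against your version before relying on them. The index name in the usage note is made up.

```python
from collections import Counter


def recovery_stage_counts(cat_recovery_text):
    """Count shard recoveries per stage from a three-column text body."""
    counts = Counter()
    for line in cat_recovery_text.splitlines():
        fields = line.split()
        # Skip blank or unexpected lines rather than guessing their layout.
        if len(fields) == 3:
            counts[fields[2]] += 1
    return counts
```

For the input `"logs 0 done\nlogs 1 index\nlogs 2 done\n"` the function counts two recoveries in stage `done` and one in stage `index`.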