OpenSearch/docs/reference/upgrade/cluster_restart.asciidoc

132 lines
4.0 KiB
Plaintext
Raw Normal View History

[[restart-upgrade]]
== Full cluster restart upgrade
A full cluster restart upgrade requires that you shut all nodes in the cluster
down, upgrade them, and restart the cluster. A full cluster restart was required
when upgrading to major versions prior to 6.x. Elasticsearch 6.x supports
<<rolling-upgrades, rolling upgrades>> from *Elasticsearch 5.6*. Upgrading to
6.x from earlier versions requires a full cluster restart. See the
<<upgrade-paths,Upgrade paths table>> to verify the type of upgrade you need
to perform.
To perform a full cluster restart upgrade:
. *Disable shard allocation.*
+
--
include::disable-shard-alloc.asciidoc[]
--
. *Stop indexing and perform a synced flush.*
+
--
Performing a <<indices-synced-flush, synced-flush>> speeds up shard
recovery.
include::synced-flush.asciidoc[]
--
. *Stop any machine learning jobs that are running.* See
{xpack-ref}/stopping-ml.html[Stopping Machine Learning].
. *Shutdown all nodes.*
+
--
include::shut-down-node.asciidoc[]
--
. *Upgrade all nodes.*
+
--
include::upgrade-node.asciidoc[]
include::set-paths-tip.asciidoc[]
--
. *Upgrade any plugins.*
+
Use the `elasticsearch-plugin` script to install the upgraded version of each
installed Elasticsearch plugin. All plugins must be upgraded when you upgrade
a node.
. *Start each upgraded node.*
+
--
If you have dedicated master nodes, start them first and wait for them to
form a cluster and elect a master before proceeding with your data nodes.
You can check progress by looking at the logs.
As soon as the <<master-election,minimum number of master-eligible nodes>>
have discovered each other, they form a cluster and elect a master. At
that point, you can use <<cat-health,`_cat/health`>> and
<<cat-nodes,`_cat/nodes`>> to monitor nodes joining the cluster:
[source,sh]
--------------------------------------------------
GET _cat/health
GET _cat/nodes
--------------------------------------------------
// CONSOLE
The `status` column returned by `_cat/health` shows the health of each node
in the cluster: `red`, `yellow`, or `green`.
--
. *Wait for all nodes to join the cluster and report a status of yellow.*
+
--
When a node joins the cluster, it begins to recover any primary shards that
are stored locally. The <<cat-health,`_cat/health`>> API initially reports
a `status` of `red`, indicating that not all primary shards have been allocated.
Once a node recovers its local shards, the cluster `status` switches to `yellow`,
indicating that all primary shards have been recovered, but not all replica
shards are allocated. This is to be expected because you have not yet
reenabled allocation. Delaying the allocation of replicas until all nodes
are `yellow` allows the master to allocate replicas to nodes that
already have local shard copies.
--
. *Reenable allocation.*
+
--
When all nodes have joined the cluster and recovered their primary shards,
reenable allocation.
[source,js]
------------------------------------------------------
PUT _cluster/settings
{
"transient": {
"cluster.routing.allocation.enable": "all"
}
}
------------------------------------------------------
// CONSOLE
2018-02-21 13:01:13 -05:00
NOTE: Because <<_precedence_of_settings, transient
settings take precedence over persistent settings>>, this overrides the
persistent setting used to disable shard allocation in the first step. If you
don't explicitly reenable shard allocation after a full cluster restart, the
persistent setting is used and shard allocation remains disabled.
Once allocation is reenabled, the cluster starts allocating replica shards to
the data nodes. At this point it is safe to resume indexing and searching,
but your cluster will recover more quickly if you can wait until all primary
and replica shards have been successfully allocated and the status of all nodes
is `green`.
You can monitor progress with the <<cat-health,`_cat/health`>> and
<<cat-recovery,`_cat/recovery`>> APIs:
[source,sh]
--------------------------------------------------
GET _cat/health
GET _cat/recovery
--------------------------------------------------
// CONSOLE
--
. *Restart machine learning jobs.*