Node repurpose tool docs (#40525)

Added documentation for node repurpose tool and included documentation on how to repurpose nodes safely. Adjusted order of tools in `elasticsearch-node` tool since the repurpose tool is most likely to be used.

Co-Authored-By: David Turner <david.turner@elastic.co>
Henning Andersen 2019-04-09 15:02:03 +02:00 committed by Henning Andersen
parent 40638d7b28
commit c5a77e5d8c
3 changed files with 153 additions and 19 deletions

@ -1,15 +1,17 @@
[[node-tool]]
== elasticsearch-node
The `elasticsearch-node` command enables you to perform unsafe operations that
risk data loss but which may help to recover some data in a disaster.
The `elasticsearch-node` command enables you to perform certain unsafe
operations on a node that are only possible while it is shut down. This command
allows you to adjust the <<modules-node,role>> of a node and may be able to
recover some data after a disaster.
[float]
=== Synopsis
[source,shell]
--------------------------------------------------
bin/elasticsearch-node unsafe-bootstrap|detach-cluster
bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster
[--ordinal <Integer>] [-E <KeyValuePair>]
[-h, --help] ([-s, --silent] | [-v, --verbose])
--------------------------------------------------
@ -17,6 +19,60 @@ bin/elasticsearch-node unsafe-bootstrap|detach-cluster
[float]
=== Description
This tool has three modes:
* `elasticsearch-node repurpose` can be used to delete unwanted data from a
node if it used to be a <<data-node,data node>> or a
<<master-node,master-eligible node>> but has been repurposed so that it no
longer has one or both of these roles.
* `elasticsearch-node unsafe-bootstrap` can be used to perform _unsafe cluster
bootstrapping_. It forces one of the nodes to form a brand-new cluster on
its own, using its local copy of the cluster metadata.
* `elasticsearch-node detach-cluster` enables you to move nodes from one
cluster to another. This can be used to move nodes into a new cluster
created with the `elasticsearch-node unsafe-bootstrap` command. If unsafe
cluster bootstrapping was not possible, it also enables you to move nodes
into a brand-new cluster.
[[node-tool-repurpose]]
[float]
==== Changing the role of a node
There may be situations where you want to repurpose a node without following
the <<change-node-role,proper repurposing processes>>. The `elasticsearch-node
repurpose` tool allows you to delete any excess on-disk data and start a node
after repurposing it.
The intended use is:
* Stop the node
* Update `elasticsearch.yml` by setting `node.master` and `node.data` as
desired
* Run `elasticsearch-node repurpose` on the node
* Start the node
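The four steps above can be sketched as a small shell script. This is only an illustration, not part of the tool: the `ES_HOME` and `ES_CONF` locations, the `elasticsearch` systemd service name, and the helper names `set_yaml_key` and `repurpose_node` are all assumptions that you would adapt to your own installation.

```shell
#!/usr/bin/env bash
# Sketch only: the paths and the service name below are assumptions.
ES_HOME="${ES_HOME:-/usr/share/elasticsearch}"
ES_CONF="${ES_CONF:-/etc/elasticsearch/elasticsearch.yml}"

# Hypothetical helper: set "key: value" in a flat YAML file, replacing an
# existing top-level line for that key or appending a new one.
set_yaml_key() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}:" "$file"; then
    # Keeps a backup copy of the previous file as "$file.bak".
    sed -i.bak "s|^${key}:.*|${key}: ${value}|" "$file"
  else
    printf '%s: %s\n' "$key" "$value" >> "$file"
  fi
}

# Hypothetical wrapper around the documented steps.
repurpose_node() {
  local master="$1" data="$2"
  sudo systemctl stop elasticsearch                # 1. stop the node
  set_yaml_key "$ES_CONF" node.master "$master"    # 2. update the roles
  set_yaml_key "$ES_CONF" node.data "$data"
  "$ES_HOME/bin/elasticsearch-node" repurpose      # 3. prompts for confirmation
  sudo systemctl start elasticsearch               # 4. start the node
}
```

For example, `repurpose_node true false` would follow the steps for turning the node into a dedicated master node. Because the tool asks for confirmation before deleting anything, the wrapper is meant to be run interactively.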
If you run `elasticsearch-node repurpose` on a node with `node.data: false` and
`node.master: true` then it will delete any remaining shard data on that node,
but it will leave the index and cluster metadata alone. If you run
`elasticsearch-node repurpose` on a node with `node.data: false` and
`node.master: false` then it will delete any remaining shard data and index
metadata, but it will leave the cluster metadata alone.
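The two cases described above can be summarised as a small decision function. The function name `repurpose_scope` is hypothetical, and the first branch (a node that keeps `node.data: true` has nothing to clean up) is an assumption that the text above does not state explicitly.

```shell
# Prints which on-disk data `elasticsearch-node repurpose` is described as
# deleting for a given combination of node.master and node.data.
repurpose_scope() {
  local master="$1" data="$2"
  if [ "$data" = "true" ]; then
    echo "nothing"                        # assumption: no excess data on a data node
  elif [ "$master" = "true" ]; then
    echo "shard data"                     # index and cluster metadata are kept
  else
    echo "shard data and index metadata"  # cluster metadata is kept
  fi
}
```

In every case the cluster metadata is left alone.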
[WARNING]
Running this command can lead to data loss for the indices mentioned if the
data contained is not available on other nodes in the cluster. Only run this
tool if you understand and accept the possible consequences, and only after
determining that the node cannot be repurposed cleanly.
The tool provides a summary of the data to be deleted and asks for confirmation
before making any changes. You can get detailed information about the affected
indices and shards by passing the verbose (`-v`) option.
[float]
==== Recovering data after a disaster
Sometimes {es} nodes are temporarily stopped, perhaps because of the need to
perform some maintenance activity or perhaps because of a hardware failure.
After you resolve the temporary condition and restart the node,
@ -53,22 +109,10 @@ way forward that does not risk data loss, but it may be possible to use the
`elasticsearch-node` tool to construct a new cluster that contains some of the
data from the failed cluster.
This tool has two modes:
* `elastisearch-node unsafe-bootstap` can be used if there is at least one
remaining master-eligible node. It forces one of the remaining nodes to form
a brand-new cluster on its own, using its local copy of the cluster metadata.
This is known as _unsafe cluster bootstrapping_.
* `elastisearch-node detach-cluster` enables you to move nodes from one cluster
to another. This can be used to move nodes into the new cluster created with
the `elastisearch-node unsafe-bootstap` command. If unsafe cluster bootstrapping was not
possible, it also enables you to
move nodes into a brand-new cluster.
[[node-tool-unsafe-bootstrap]]
[float]
==== Unsafe cluster bootstrapping
===== Unsafe cluster bootstrapping
If there is at least one remaining master-eligible node, but it is not possible
to restart a majority of them, then the `elasticsearch-node unsafe-bootstrap`
@ -143,7 +187,7 @@ job.
[[node-tool-detach-cluster]]
[float]
==== Detaching nodes from their cluster
===== Detaching nodes from their cluster
It is unsafe for nodes to move between clusters, because different clusters
have completely different cluster metadata. There is no way to safely merge the
@ -206,9 +250,12 @@ The message `Node was successfully detached from the cluster` does not mean
that there has been no data loss; it just means that the tool was able to
complete its job.
[float]
=== Parameters
`repurpose`:: Delete excess data when a node's roles are changed.
`unsafe-bootstrap`:: Specifies to unsafely bootstrap this node as a new
one-node cluster.
@ -230,6 +277,51 @@ to `0`, meaning to use the first node in the data path.
[float]
=== Examples
[float]
==== Repurposing a node as a dedicated master node (master: true, data: false)
In this example, a former data node is repurposed as a dedicated master node.
First update the node's settings to `node.master: true` and `node.data: false`
in its `elasticsearch.yml` config file. Then run the `elasticsearch-node
repurpose` command to find and remove excess shard data:
[source,txt]
----
node$ ./bin/elasticsearch-node repurpose
WARNING: Elasticsearch MUST be stopped before running this tool.
Found 2 shards in 2 indices to clean up
Use -v to see list of paths and indices affected
Node is being re-purposed as master and no-data. Clean-up of shard data will be performed.
Do you want to proceed?
Confirm [y/N] y
Node successfully repurposed to master and no-data.
----
[float]
==== Repurposing a node as a coordinating-only node (master: false, data: false)
In this example, a node that previously held data is repurposed as a
coordinating-only node. First update the node's settings to `node.master:
false` and `node.data: false` in its `elasticsearch.yml` config file. Then run
the `elasticsearch-node repurpose` command to find and remove excess shard data
and index metadata:
[source,txt]
----
node$ ./bin/elasticsearch-node repurpose
WARNING: Elasticsearch MUST be stopped before running this tool.
Found 2 indices (2 shards and 2 index meta data) to clean up
Use -v to see list of paths and indices affected
Node is being re-purposed as no-master and no-data. Clean-up of index data will be performed.
Do you want to proceed?
Confirm [y/N] y
Node successfully repurposed to no-master and no-data.
----
[float]
==== Unsafe cluster bootstrapping
@ -331,4 +423,3 @@ Do you want to proceed?
Confirm [y/N] y
Node was successfully detached from the cluster
----

@ -204,6 +204,49 @@ NOTE: These settings apply only when {xpack} is not installed. To create a
dedicated coordinating node when {xpack} is installed, see <<modules-node-xpack,{xpack} node settings>>.
endif::include-xpack[]
[float]
[[change-node-role]]
=== Changing the role of a node
Each data node maintains the following data on disk:
* the shard data for every shard allocated to that node,
* the index metadata corresponding to every shard allocated to that node, and
* the cluster-wide metadata, such as settings and index templates.
Similarly, each master-eligible node maintains the following data on disk:
* the index metadata for every index in the cluster, and
* the cluster-wide metadata, such as settings and index templates.
Each node checks the contents of its data path at startup. If it discovers
unexpected data then it will refuse to start. This is to avoid importing
unwanted <<modules-gateway-dangling-indices,dangling indices>> which can lead
to a red cluster health. To be more precise, nodes with `node.data: false` will
refuse to start if they find any shard data on disk at startup, and nodes with
both `node.master: false` and `node.data: false` will refuse to start if they
have any index metadata on disk at startup.
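The startup rule can be illustrated with a rough shell approximation. This is not how {es} implements the check. The function name `check_data_path` is hypothetical, and the directory layout it inspects (`nodes/0/indices/<index-uuid>/_state` for index metadata and `nodes/0/indices/<index-uuid>/<shard-number>/` for shard data) is an assumption made for the sake of the example.

```shell
# Approximates the startup check under the assumed layout:
#   $data_path/nodes/0/indices/<index-uuid>/_state     -> index metadata
#   $data_path/nodes/0/indices/<index-uuid>/<number>/  -> shard data
check_data_path() {
  local data_path="$1" master="$2" data="$3"
  local indices="$data_path/nodes/0/indices"
  # No indices directory at all: nothing unexpected can be present.
  [ -d "$indices" ] || { echo "ok"; return 0; }

  # node.data: false must not find any shard directories.
  if [ "$data" = "false" ] &&
     [ -n "$(find "$indices" -mindepth 2 -maxdepth 2 -type d -name '[0-9]*' | head -n 1)" ]; then
    echo "refuse: shard data found"
    return 1
  fi

  # node.master: false and node.data: false must not find index metadata.
  if [ "$master" = "false" ] && [ "$data" = "false" ] &&
     [ -n "$(find "$indices" -mindepth 2 -maxdepth 2 -type d -name '_state' | head -n 1)" ]; then
    echo "refuse: index metadata found"
    return 1
  fi

  echo "ok"
}
```

A call such as `check_data_path /var/lib/elasticsearch false false` (the path is only an example) reports whether a coordinating-only node would be expected to start with that data path.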
It is possible to change the roles of a node by adjusting its
`elasticsearch.yml` file and restarting it. This is known as _repurposing_ a
node. In order to satisfy the checks for unexpected data described above, you
must perform some extra steps to prepare a node for repurposing when setting
its `node.data` or `node.master` roles to `false`:
* If you want to repurpose a data node by changing `node.data` to `false` then
you should first use an <<allocation-filtering,allocation filter>> to safely
migrate all the shard data onto other nodes in the cluster.
* If you want to repurpose a node to have both `node.master: false` and
`node.data: false` then it is simplest to start a brand-new node with an
empty data path and the desired roles. You may find it safest to use an
<<allocation-filtering,allocation filter>> to migrate the shard data
elsewhere in the cluster first.
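As a sketch of the allocation-filter step, the following shell functions build and send a cluster settings update that excludes a node by name so that its shards are moved elsewhere. The function names, the `ES_URL` default, and the choice of a transient setting are assumptions made for illustration. A secured cluster would also need credentials.

```shell
# Builds the request body that excludes the named node from shard allocation.
exclude_node_body() {
  printf '{"transient":{"cluster.routing.allocation.exclude._name":"%s"}}' "$1"
}

# Sends the settings update to the cluster (requires curl and a reachable node).
drain_node() {
  curl -s -X PUT "${ES_URL:-http://localhost:9200}/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(exclude_node_body "$1")"
}
```

After running something like `drain_node data-node-1` (a hypothetical node name), wait until no shards remain on that node before stopping it and changing its roles.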
If it is not possible to follow these extra steps then you may be able to use
the <<node-tool-repurpose,`elasticsearch-node repurpose`>> tool to delete any
excess data that prevents a node from starting.
[float]
== Node data path settings

@ -36,9 +36,9 @@ public class NodeToolCli extends MultiCommand {
super("A CLI tool to do unsafe cluster and index manipulations on current node",
()->{});
CommandLoggingConfigurator.configureLoggingWithoutConfig();
subcommands.put("repurpose", new NodeRepurposeCommand());
subcommands.put("unsafe-bootstrap", new UnsafeBootstrapMasterCommand());
subcommands.put("detach-cluster", new DetachClusterCommand());
subcommands.put("repurpose", new NodeRepurposeCommand());
}
public static void main(String[] args) throws Exception {