diff --git a/docs/reference/cluster/reroute.asciidoc b/docs/reference/cluster/reroute.asciidoc index 0bc8610e0c7..f076a7b8358 100644 --- a/docs/reference/cluster/reroute.asciidoc +++ b/docs/reference/cluster/reroute.asciidoc @@ -1,13 +1,12 @@ [[cluster-reroute]] == Cluster Reroute -The reroute command allows to explicitly execute a cluster reroute -allocation command including specific commands. For example, a shard can -be moved from one node to another explicitly, an allocation can be -canceled, or an unassigned shard can be explicitly allocated on a -specific node. +The reroute command allows for manual changes to the allocation of individual +shards in the cluster. For example, a shard can be moved from one node to +another explicitly, an allocation can be cancelled, and an unassigned shard can +be explicitly allocated to a specific node. -Here is a short example of how a simple reroute API call: +Here is a short example of a simple reroute API call: [source,js] -------------------------------------------------- @@ -32,59 +31,53 @@ POST /_cluster/reroute // CONSOLE // TEST[skip:doc tests run with only a single node] -An important aspect to remember is the fact that once when an allocation -occurs, the cluster will aim at re-balancing its state back to an even -state. For example, if the allocation includes moving a shard from -`node1` to `node2`, in an `even` state, then another shard will be moved -from `node2` to `node1` to even things out. +It is important to note that that after processing any reroute commands +Elasticsearch will perform rebalancing as normal (respecting the values of +settings such as `cluster.routing.rebalance.enable`) in order to remain in a +balanced state. For example, if the requested allocation includes moving a +shard from `node1` to `node2` then this may cause a shard to be moved from +`node2` back to `node1` to even things out. -The cluster can be set to disable allocations, which means that only the -explicitly allocations will be performed. Obviously, only once all -commands has been applied, the cluster will aim to be re-balance its -state. +The cluster can be set to disable allocations using the +`cluster.routing.allocation.enable` setting. If allocations are disabled then +the only allocations that will be performed are explicit ones given using the +`reroute` command, and consequent allocations due to rebalancing. -Another option is to run the commands in `dry_run` (as a URI flag, or in -the request body). This will cause the commands to apply to the current -cluster state, and return the resulting cluster after the commands (and -re-balancing) has been applied. +It is possible to run `reroute` commands in "dry run" mode by using the +`?dry_run` URI query parameter, or by passing `"dry_run": true` in the request +body. This will calculate the result of applying the commands to the current +cluster state, and return the resulting cluster state after the commands (and +re-balancing) has been applied, but will not actually perform the requested +changes. -If the `explain` parameter is specified, a detailed explanation of why the -commands could or could not be executed is returned. +If the `?explain` URI query parameter is included then a detailed explanation +of why the commands could or could not be executed is included in the response. The commands supported are: `move`:: Move a started shard from one node to another node. Accepts `index` and `shard` for index name and shard number, `from_node` for the - node to move the shard `from`, and `to_node` for the node to move the + node to move the shard from, and `to_node` for the node to move the shard to. `cancel`:: - Cancel allocation of a shard (or recovery). Accepts `index` - and `shard` for index name and shard number, and `node` for the node to - cancel the shard allocation on. It also accepts `allow_primary` flag to - explicitly specify that it is allowed to cancel allocation for a primary - shard. This can be used to force resynchronization of existing replicas - from the primary shard by cancelling them and allowing them to be - reinitialized through the standard reallocation process. + Cancel allocation of a shard (or recovery). Accepts `index` and `shard` for + index name and shard number, and `node` for the node to cancel the shard + allocation on. This can be used to force resynchronization of existing + replicas from the primary shard by cancelling them and allowing them to be + reinitialized through the standard recovery process. By default only + replica shard allocations can be cancelled. If it is necessary to cancel + the allocation of a primary shard then the `allow_primary` flag must also + be included in the request. `allocate_replica`:: - Allocate an unassigned replica shard to a node. Accepts the - `index` and `shard` for index name and shard number, and `node` to - allocate the shard to. Takes <> into account. - -Two more commands are available that allow the allocation of a primary shard -to a node. These commands should however be used with extreme care, as primary -shard allocation is usually fully automatically handled by Elasticsearch. -Reasons why a primary shard cannot be automatically allocated include the following: - -- A new index was created but there is no node which satisfies the allocation deciders. -- An up-to-date shard copy of the data cannot be found on the current data nodes in -the cluster. To prevent data loss, the system does not automatically promote a stale -shard copy to primary. + Allocate an unassigned replica shard to a node. Accepts `index` and `shard` + for index name and shard number, and `node` to allocate the shard to. Takes + <> into account. [float] -=== Retry failed shards +=== Retrying failed allocations The cluster will attempt to allocate a shard a maximum of `index.allocation.max_retries` times in a row (defaults to `5`), before giving @@ -93,36 +86,48 @@ structural problems such as having an analyzer which refers to a stopwords file which doesn't exist on all nodes. Once the problem has been corrected, allocation can be manually retried by -calling the <> API with `?retry_failed`, which -will attempt a single retry round for these shards. +calling the <> API with the `?retry_failed` URI +query parameter, which will attempt a single retry round for these shards. [float] === Forced allocation on unrecoverable errors +Two more commands are available that allow the allocation of a primary shard to +a node. These commands should however be used with extreme care, as primary +shard allocation is usually fully automatically handled by Elasticsearch. +Reasons why a primary shard cannot be automatically allocated include the +following: + +- A new index was created but there is no node which satisfies the allocation + deciders. +- An up-to-date shard copy of the data cannot be found on the current data + nodes in the cluster. To prevent data loss, the system does not automatically +promote a stale shard copy to primary. + The following two commands are dangerous and may result in data loss. They are -meant to be used in cases where the original data can not be recovered and the cluster -administrator accepts the loss. If you have suffered a temporary issue that has been -fixed, please see the `retry_failed` flag described above. +meant to be used in cases where the original data can not be recovered and the +cluster administrator accepts the loss. If you have suffered a temporary issue +that can be fixed, please see the `retry_failed` flag described above. To +emphasise: if these commands are performed and then a node joins the cluster +that holds a copy of the affected shard then the copy on the newly-joined node +will be deleted or overwritten. `allocate_stale_primary`:: Allocate a primary shard to a node that holds a stale copy. Accepts the - `index` and `shard` for index name and shard number, and `node` to - allocate the shard to. Using this command may lead to data loss - for the provided shard id. If a node which has the good copy of the - data rejoins the cluster later on, that data will be overwritten with - the data of the stale copy that was forcefully allocated with this - command. To ensure that these implications are well-understood, - this command requires the special field `accept_data_loss` to be - explicitly set to `true` for it to work. + `index` and `shard` for index name and shard number, and `node` to allocate + the shard to. Using this command may lead to data loss for the provided + shard id. If a node which has the good copy of the data rejoins the cluster + later on, that data will be deleted or overwritten with the data of the + stale copy that was forcefully allocated with this command. To ensure that + these implications are well-understood, this command requires the flag + `accept_data_loss` to be explicitly set to `true`. `allocate_empty_primary`:: - Allocate an empty primary shard to a node. Accepts the - `index` and `shard` for index name and shard number, and `node` to - allocate the shard to. Using this command leads to a complete loss - of all data that was indexed into this shard, if it was previously - started. If a node which has a copy of the - data rejoins the cluster later on, that data will be deleted! - To ensure that these implications are well-understood, - this command requires the special field `accept_data_loss` to be - explicitly set to `true` for it to work. + Allocate an empty primary shard to a node. Accepts the `index` and `shard` + for index name and shard number, and `node` to allocate the shard to. Using + this command leads to a complete loss of all data that was indexed into + this shard, if it was previously started. If a node which has a copy of the + data rejoins the cluster later on, that data will be deleted. To ensure + that these implications are well-understood, this command requires the flag + `accept_data_loss` to be explicitly set to `true`.