[[cluster-reroute]]
== Cluster Reroute

The reroute command allows you to explicitly execute a cluster reroute
allocation command that includes specific commands. For example, a shard can
be moved from one node to another explicitly, an allocation can be
canceled, or an unassigned shard can be explicitly allocated on a
specific node.

Here is a short example of a simple reroute API call:

[source,js]
--------------------------------------------------
POST /_cluster/reroute
{
    "commands" : [
        {
            "move" : {
                "index" : "test", "shard" : 0,
                "from_node" : "node1", "to_node" : "node2"
            }
        },
        {
            "allocate_replica" : {
                "index" : "test", "shard" : 1,
                "node" : "node3"
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[skip:doc tests run with only a single node]

It is important to remember that once an allocation occurs, the cluster
will aim to re-balance its state back to an even state. For example, if
the allocation includes moving a shard from `node1` to `node2` while the
cluster is in an even state, then another shard will be moved
from `node2` to `node1` to even things out.

The cluster can be set to disable allocations, which means that only the
explicit allocations will be performed. Only once all commands have been
applied will the cluster aim to re-balance its state.
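
As a rough sketch, allocation could be switched off before issuing explicit
commands with a cluster settings update like the one below. The setting name
`cluster.routing.allocation.enable` and its `none` value are not mentioned
above and are an assumption about the version in use; older releases used
different settings, so check the cluster-level shard allocation settings for
your release.

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
    "transient" : {
        "cluster.routing.allocation.enable" : "none"
    }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:illustrative sketch only]

Under the same assumption, setting the value back to `all` would restore
automatic allocation.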

Another option is to run the commands in `dry_run` mode (as a URI flag, or in
the request body). This calculates the result of applying the commands to the
current cluster state and returns the resulting cluster state after the
commands (and re-balancing) have been applied, without actually changing the
cluster.
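
A minimal sketch of a dry run, using the URI flag form, might look like the
following. The index and node names (`test`, `node1`, `node2`) are
placeholders reused from the example above, and the explicit `=true` value is
an assumption about how the flag is written.

[source,js]
--------------------------------------------------
POST /_cluster/reroute?dry_run=true
{
    "commands" : [
        {
            "move" : {
                "index" : "test", "shard" : 0,
                "from_node" : "node1", "to_node" : "node2"
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[skip:doc tests run with only a single node]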

If the `explain` parameter is specified, a detailed explanation of why the
commands could or could not be executed is returned.
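
For illustration, a request that asks for explanations could be sketched as
below. The index and node names are placeholders, and writing the parameter
as `explain=true` is an assumption about the exact spelling of the flag.

[source,js]
--------------------------------------------------
POST /_cluster/reroute?explain=true
{
    "commands" : [
        {
            "cancel" : {
                "index" : "test", "shard" : 0,
                "node" : "node1"
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[skip:doc tests run with only a single node]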

The commands supported are:

`move`::
Move a started shard from one node to another node. Accepts
`index` and `shard` for index name and shard number, `from_node` for the
node to move the shard from, and `to_node` for the node to move the
shard to.

`cancel`::
Cancel allocation of a shard (or recovery). Accepts `index`
and `shard` for index name and shard number, and `node` for the node to
cancel the shard allocation on. It also accepts the `allow_primary` flag to
explicitly specify that it is allowed to cancel allocation for a primary
shard. The `cancel` command can be used to force resynchronization of
existing replicas from the primary shard by canceling them and allowing
them to be reinitialized through the standard reallocation process.

`allocate_replica`::
Allocate an unassigned replica shard to a node. Accepts
`index` and `shard` for index name and shard number, and `node` to
allocate the shard to. Takes <<modules-cluster,allocation deciders>> into account.
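
The first example above already shows `move` and `allocate_replica`. As a
complementary sketch, a `cancel` command that is also permitted to cancel a
primary shard allocation might look like this. The index and node names are
placeholders.

[source,js]
--------------------------------------------------
POST /_cluster/reroute
{
    "commands" : [
        {
            "cancel" : {
                "index" : "test", "shard" : 0,
                "node" : "node1",
                "allow_primary" : true
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[skip:doc tests run with only a single node]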

Two more commands are available that allow the allocation of a primary shard
to a node. These commands should, however, be used with extreme care, as primary
shard allocation is usually handled fully automatically by Elasticsearch.
Reasons why a primary shard cannot be automatically allocated include the following:

- A new index was created but there is no node which satisfies the allocation deciders.
- An up-to-date shard copy of the data cannot be found on the current data nodes in
the cluster. To prevent data loss, the system does not automatically promote a stale
shard copy to primary.

As a manual override, two commands to forcefully allocate primary shards
are available:

`allocate_stale_primary`::
Allocate a primary shard to a node that holds a stale copy. Accepts
`index` and `shard` for index name and shard number, and `node` to
allocate the shard to. Using this command may lead to data loss
for the provided shard id. If a node which has the good copy of the
data rejoins the cluster later on, that data will be overwritten with
the data of the stale copy that was forcefully allocated with this
command. To ensure that these implications are well understood,
this command requires the special field `accept_data_loss` to be
explicitly set to `true` for it to work.

`allocate_empty_primary`::
Allocate an empty primary shard to a node. Accepts
`index` and `shard` for index name and shard number, and `node` to
allocate the shard to. Using this command leads to a complete loss
of all data that was indexed into this shard, if it was previously
started. If a node which has a copy of the
data rejoins the cluster later on, that data will be deleted!
To ensure that these implications are well understood,
this command requires the special field `accept_data_loss` to be
explicitly set to `true` for it to work.
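
As a hedged sketch of the forced primary allocation commands described above,
the request below allocates a stale copy as primary and acknowledges the
possible data loss. The index and node names are placeholders;
`allocate_empty_primary` takes the same fields.

[source,js]
--------------------------------------------------
POST /_cluster/reroute
{
    "commands" : [
        {
            "allocate_stale_primary" : {
                "index" : "test", "shard" : 1,
                "node" : "node3",
                "accept_data_loss" : true
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[skip:doc tests run with only a single node]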

[float]
=== Retry failed shards

The cluster will attempt to allocate a shard a maximum of
`index.allocation.max_retries` times in a row (defaults to `5`), before giving
up and leaving the shard unallocated. This scenario can be caused by
structural problems such as having an analyzer which refers to a stopwords
file which doesn't exist on all nodes.

Once the problem has been corrected, allocation can be manually retried by
calling the <<cluster-reroute,`_reroute`>> API with `?retry_failed`, which
will attempt a single retry round for these shards.
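
A manual retry could be sketched as the request below. No request body is
needed for this; writing the flag with an explicit `=true` value is an
assumption about how the parameter is spelled.

[source,js]
--------------------------------------------------
POST /_cluster/reroute?retry_failed=true
--------------------------------------------------
// CONSOLE
// TEST[skip:illustrative sketch only]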