[[cluster-allocation-explain]]
=== Cluster allocation explain API
++++
<titleabbrev>Cluster allocation explain</titleabbrev>
++++

Provides explanations for shard allocations in the cluster.

[[cluster-allocation-explain-api-request]]
==== {api-request-title}

`GET /_cluster/allocation/explain`

[[cluster-allocation-explain-api-desc]]
==== {api-description-title}

The cluster allocation explain API explains shard allocation decisions in the
cluster. For an unassigned shard, the API explains why the shard is unassigned.
For an assigned shard, the API explains why the shard remains on its current
node and has not moved or rebalanced to another node. Use this API to diagnose
why a shard is unassigned or why a shard remains on its current node when you
expect it to move.

[[cluster-allocation-explain-api-query-params]]
==== {api-query-parms-title}

`include_disk_info`::
(Optional, boolean) If `true`, returns information about disk usage and
shard sizes. Defaults to `false`.

`include_yes_decisions`::
(Optional, boolean) If `true`, returns `YES` decisions in the explanation.
Defaults to `false`.

[[cluster-allocation-explain-api-request-body]]
==== {api-request-body-title}

`current_node`::
(Optional, string) Specifies the ID or name of a node. If specified, the API
only explains a shard that is currently located on that node.

`index`::
(Optional, string) Specifies the name of the index that you would like an
explanation for.

`primary`::
(Optional, boolean) If `true`, returns an explanation for the primary shard
for the given shard ID.

`shard`::
(Optional, integer) Specifies the ID of the shard that you would like an
explanation for.

You can also have {es} explain the allocation of the first unassigned shard that
it finds by sending an empty body for the request.

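As a rough illustration of how the query parameters and request body fields fit together, the following Python sketch builds the request without sending it. It uses only the standard library. The function name `build_explain_request` and the `http://localhost:9200` address are assumptions made for this example and are not part of {es}.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_explain_request(base_url, index=None, shard=None, primary=None,
                          current_node=None, include_disk_info=False,
                          include_yes_decisions=False):
    """Build (but do not send) a cluster allocation explain request.

    Hypothetical helper for illustration only.
    """
    # Only add query parameters that differ from their documented default.
    params = {}
    if include_disk_info:
        params["include_disk_info"] = "true"
    if include_yes_decisions:
        params["include_yes_decisions"] = "true"

    url = base_url.rstrip("/") + "/_cluster/allocation/explain"
    if params:
        url += "?" + urlencode(params)

    # Leave out unset fields. With no fields at all the body stays empty,
    # which asks for the first unassigned shard.
    fields = {"index": index, "shard": shard, "primary": primary,
              "current_node": current_node}
    body = {name: value for name, value in fields.items() if value is not None}
    data = json.dumps(body).encode("utf-8") if body else None

    headers = {"Content-Type": "application/json"} if data else {}
    return Request(url, data=data, headers=headers, method="GET")


# Assumed local cluster address, used here only to form a URL.
request = build_explain_request("http://localhost:9200",
                                index="my-index-000001", shard=0,
                                primary=True, include_yes_decisions=True)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Calling `build_explain_request("http://localhost:9200")` with no other arguments produces a request with no body, which corresponds to the empty-body case described above.
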
[[cluster-allocation-explain-api-examples]]
==== {api-examples-title}


//////
[source,console]
--------------------------------------------------
PUT /my-index-000001
--------------------------------------------------
// TESTSETUP
//////

[source,console]
--------------------------------------------------
GET /_cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": true
}
--------------------------------------------------


===== Example of the current_node parameter

[source,console]
--------------------------------------------------
GET /_cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": false,
  "current_node": "nodeA" <1>
}
--------------------------------------------------
// TEST[skip:no way of knowing the current_node]

<1> The node on which shard 0 currently has a replica.


===== Examples of unassigned primary shard explanations

//////
[source,console]
--------------------------------------------------
DELETE my-index-000001
--------------------------------------------------
//////

[source,console]
--------------------------------------------------
PUT /my-index-000001?master_timeout=1s&timeout=1s
{
  "settings": {
    "index.routing.allocation.include._name": "non_existent_node",
    "index.routing.allocation.include._tier_preference": null
  }
}

GET /_cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": true
}
--------------------------------------------------
// TEST[continued]


The API returns the following response for an unassigned primary shard:

[source,console-result]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned", <1>
  "unassigned_info" : {
    "reason" : "INDEX_CREATED", <2>
    "at" : "2017-01-04T18:08:16.600Z",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no", <3>
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
      "node_name" : "node-0",
      "transport_address" : "127.0.0.1:9401",
      "node_attributes" : {},
      "node_decision" : "no", <4>
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter", <5>
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]" <6>
        }
      ]
    }
  ]
}
--------------------------------------------------
// TESTRESPONSE[s/"at" : "[^"]*"/"at" : $body.$_path/]
// TESTRESPONSE[s/"node_id" : "[^"]*"/"node_id" : $body.$_path/]
// TESTRESPONSE[s/"transport_address" : "[^"]*"/"transport_address" : $body.$_path/]
// TESTRESPONSE[s/"node_attributes" : \{\}/"node_attributes" : $body.$_path/]

<1> The current state of the shard.
<2> The reason the shard originally became unassigned.
<3> Whether the shard can be allocated.
<4> Whether the shard can be allocated to this particular node.
<5> The decider that led to the `no` decision for the node.
<6> An explanation of why the decider returned a `no` decision, with a hint pointing to the setting that led to the decision.


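When a response lists many nodes, it can help to pull out only the deciders that returned `NO`. The following Python sketch does this for an already parsed response. The function name `blocking_deciders` is hypothetical, and the sample dictionary is a trimmed copy of the response shown above.

```python
def blocking_deciders(explanation):
    """Return (node_name, decider, explanation) for every NO decision."""
    blocked = []
    for node in explanation.get("node_allocation_decisions", []):
        # Nodes that accept the shard may have no "deciders" entry at all.
        for decider in node.get("deciders", []):
            if decider["decision"] == "NO":
                blocked.append((node["node_name"], decider["decider"],
                                decider["explanation"]))
    return blocked


# Trimmed copy of the unassigned primary shard response.
response = {
    "index": "my-index-000001",
    "shard": 0,
    "primary": True,
    "current_state": "unassigned",
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "node-0",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "node does not match index setting "
                                   "[index.routing.allocation.include] filters "
                                   "[_name:\"non_existent_node\"]",
                }
            ],
        }
    ],
}

for node_name, decider, reason in blocking_deciders(response):
    print(f"{node_name}: {decider}: {reason}")
```
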
The API returns the following response for an unassigned primary shard that was
previously allocated to a node in the cluster:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2017-01-04T18:03:28.464Z",
    "details" : "node_left[OIWe8UhhThCK0V5XfmdrmQ]",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster"
}
--------------------------------------------------
// NOTCONSOLE


===== Example of an unassigned replica shard explanation

The API returns the following response for a replica that is unassigned due to
delayed allocation:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2017-01-04T18:53:59.498Z",
    "details" : "node_left[G92ZwuuaRY-9n8_tc-IzEg]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "allocation_delayed",
  "allocate_explanation" : "cannot allocate because the cluster is still waiting 59.8s for the departed node holding a replica to rejoin, despite being allowed to allocate the shard to at least one other node",
  "configured_delay" : "1m", <1>
  "configured_delay_in_millis" : 60000,
  "remaining_delay" : "59.8s", <2>
  "remaining_delay_in_millis" : 59824,
  "node_allocation_decisions" : [
    {
      "node_id" : "pmnHu_ooQWCPEFobZGbpWw",
      "node_name" : "node_t2",
      "transport_address" : "127.0.0.1:9402",
      "node_decision" : "yes"
    },
    {
      "node_id" : "3sULLVJrRneSg0EfBB-2Ew",
      "node_name" : "node_t0",
      "transport_address" : "127.0.0.1:9400",
      "node_decision" : "no",
      "store" : { <3>
        "matching_size" : "4.2kb",
        "matching_size_in_bytes" : 4325
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[my-index-000001][0], node[3sULLVJrRneSg0EfBB-2Ew], [P], s[STARTED], a[id=eV9P8BN1QPqRc3B4PLx6cg]]"
        }
      ]
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE
<1> The configured delay before allocating a replica shard that is missing because the node holding it left the cluster.
<2> The remaining delay before allocating the replica shard.
<3> Information about the shard data found on a node.


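The unassigned-shard responses above differ mainly in their `can_allocate` value. The following Python sketch turns a parsed response into a one-line summary. It handles only the three values that appear on this page (`no`, `no_valid_shard_copy`, and `allocation_delayed`) and passes any other value through unchanged. The function name `describe_unassigned` is hypothetical.

```python
def describe_unassigned(explanation):
    """Summarize an unassigned-shard explanation in one line."""
    status = explanation["can_allocate"]
    reason = explanation["unassigned_info"]["reason"]
    if status == "allocation_delayed":
        # The *_in_millis field avoids parsing strings such as "59.8s".
        seconds = explanation["remaining_delay_in_millis"] / 1000
        return f"{reason}: allocation delayed, {seconds:.1f}s remaining"
    if status == "no_valid_shard_copy":
        return f"{reason}: no valid copy of the shard found in the cluster"
    if status == "no":
        return f"{reason}: allocation is not permitted to any node"
    # Values not shown on this page are reported as-is.
    return f"{reason}: {status}"


# Trimmed copy of the delayed replica response.
delayed = {
    "current_state": "unassigned",
    "unassigned_info": {"reason": "NODE_LEFT"},
    "can_allocate": "allocation_delayed",
    "configured_delay_in_millis": 60000,
    "remaining_delay_in_millis": 59824,
}
print(describe_unassigned(delayed))
```
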
===== Examples of allocated shard explanations

The API returns the following response for an assigned shard that is not
allowed to remain on its current node and is required to move:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "8lWJeJ7tSoui0bxrwuNhTA",
    "name" : "node_t1",
    "transport_address" : "127.0.0.1:9401"
  },
  "can_remain_on_current_node" : "no", <1>
  "can_remain_decisions" : [ <2>
    {
      "decider" : "filter",
      "decision" : "NO",
      "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
    }
  ],
  "can_move_to_other_node" : "no", <3>
  "move_explanation" : "cannot move shard to another node, even though it is not allowed to remain on its current node",
  "node_allocation_decisions" : [
    {
      "node_id" : "_P8olZS8Twax9u6ioN-GGA",
      "node_name" : "node_t0",
      "transport_address" : "127.0.0.1:9400",
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
        }
      ]
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE
<1> Whether the shard is allowed to remain on its current node.
<2> The deciders that factored into the decision not to allow the shard to remain on its current node.
<3> Whether the shard is allowed to be allocated to another node.


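A shard that must move but cannot move anywhere is worth flagging. The following Python sketch reads `can_remain_on_current_node`, `can_remain_decisions`, and `can_move_to_other_node` from a parsed response and reports that situation. The function name `must_move_report` and the shape of its return value are assumptions made for this example.

```python
def must_move_report(explanation):
    """Report on an assigned shard that cannot remain on its current node.

    Returns None when the shard is allowed to stay where it is.
    """
    if explanation.get("can_remain_on_current_node") != "no":
        return None
    blockers = [decision["decider"]
                for decision in explanation.get("can_remain_decisions", [])
                if decision["decision"] == "NO"]
    return {
        "node": explanation["current_node"]["name"],
        "blocked_by": blockers,
        # "no" means no other node currently accepts the shard either.
        "stuck": explanation.get("can_move_to_other_node") == "no",
    }


# Trimmed copy of the response for a shard that is required to move.
must_move = {
    "current_state": "started",
    "current_node": {"name": "node_t1"},
    "can_remain_on_current_node": "no",
    "can_remain_decisions": [{"decider": "filter", "decision": "NO"}],
    "can_move_to_other_node": "no",
}
print(must_move_report(must_move))
```
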
The API returns the following response for an assigned shard that remains on
its current node because moving the shard to another node would not improve the
cluster balance:

[source,js]
--------------------------------------------------
{
  "index" : "my-index-000001",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "wLzJm4N4RymDkBYxwWoJsg",
    "name" : "node_t0",
    "transport_address" : "127.0.0.1:9400",
    "weight_ranking" : 1
  },
  "can_remain_on_current_node" : "yes",
  "can_rebalance_cluster" : "yes", <1>
  "can_rebalance_to_other_node" : "no", <2>
  "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  "node_allocation_decisions" : [
    {
      "node_id" : "oE3EGFc8QN-Tdi5FFEprIA",
      "node_name" : "node_t1",
      "transport_address" : "127.0.0.1:9401",
      "node_decision" : "worse_balance", <3>
      "weight_ranking" : 1
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE
<1> Whether rebalancing is allowed on the cluster.
<2> Whether the shard can be rebalanced to another node.
<3> The reason the shard cannot be rebalanced to the node, in this case indicating that it offers no better balance than the current node.

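In a larger cluster, `node_allocation_decisions` can contain many entries. The following Python sketch groups node names by their `node_decision` value so that patterns such as "every node reports `worse_balance`" stand out. The function name `nodes_by_decision` is hypothetical, and the sample is a trimmed copy of the response above.

```python
from collections import defaultdict


def nodes_by_decision(explanation):
    """Group node names by their node_decision value."""
    groups = defaultdict(list)
    for node in explanation.get("node_allocation_decisions", []):
        groups[node["node_decision"]].append(node["node_name"])
    return dict(groups)


# Trimmed copy of the rebalance response.
rebalance = {
    "can_rebalance_cluster": "yes",
    "can_rebalance_to_other_node": "no",
    "node_allocation_decisions": [
        {"node_name": "node_t1", "node_decision": "worse_balance"},
    ],
}
print(nodes_by_decision(rebalance))
```
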