parent b5e958b2e0
commit 84acb65ca1
@ -18,7 +18,11 @@ $ curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "number_of_pending_tasks" : 0
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}
--------------------------------------------------

@ -2,130 +2,17 @@

== Index Shard Allocation

This module provides per-index settings to control the allocation of shards to
nodes.
nodes:

[float]
[[shard-allocation-filtering]]
=== Shard Allocation Filtering
* <<shard-allocation-filtering,Shard allocation filtering>>: Controlling which shards are allocated to which nodes.
* <<delayed-allocation,Delayed allocation>>: Delaying allocation of unassigned shards caused by a node leaving.
* <<allocation-total-shards,Total shards per node>>: A hard limit on the number of shards from the same index per node.

Shard allocation filtering allows you to specify which nodes are allowed
to host the shards of a particular index.
include::allocation/filtering.asciidoc[]

NOTE: The per-index shard allocation filters explained below work in
conjunction with the cluster-wide allocation filters explained in
<<shards-allocation>>.
include::allocation/delayed.asciidoc[]

It is possible to assign arbitrary metadata attributes to each node at
startup. For instance, nodes could be assigned a `rack` and a `size`
attribute as follows:

[source,sh]
------------------------
bin/elasticsearch --node.rack rack1 --node.size big <1>
------------------------
<1> These attribute settings can also be specified in the `elasticsearch.yml` config file.

These metadata attributes can be used with the
`index.routing.allocation.*` settings to allocate an index to a particular
group of nodes. For instance, we can move the index `test` to either `big` or
`medium` nodes as follows:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include.size": "big,medium"
}
------------------------
// AUTOSENSE

Alternatively, we can move the index `test` away from the `small` nodes with
an `exclude` rule:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.exclude.size": "small"
}
------------------------
// AUTOSENSE

Multiple rules can be specified, in which case all conditions must be
satisfied. For instance, we could move the index `test` to `big` nodes in
`rack1` with the following:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include.size": "big",
  "index.routing.allocation.include.rack": "rack1"
}
------------------------
// AUTOSENSE

NOTE: If some conditions cannot be satisfied then shards will not be moved.

The following settings are _dynamic_, allowing live indices to be moved from
one set of nodes to another:

`index.routing.allocation.include.{attribute}`::

    Assign the index to a node whose `{attribute}` has at least one of the
    comma-separated values.

`index.routing.allocation.require.{attribute}`::

    Assign the index to a node whose `{attribute}` has _all_ of the
    comma-separated values.

`index.routing.allocation.exclude.{attribute}`::

    Assign the index to a node whose `{attribute}` has _none_ of the
    comma-separated values.

These special attributes are also supported:

[horizontal]
`_name`:: Match nodes by node name
`_ip`:: Match nodes by IP address (the IP address associated with the hostname)
`_host`:: Match nodes by hostname

All attribute values can be specified with wildcards, eg:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include._ip": "192.168.2.*"
}
------------------------
// AUTOSENSE

[float]
=== Total Shards Per Node

The cluster-level shard allocator tries to spread the shards of a single index
across as many nodes as possible. However, depending on how many shards and
indices you have, and how big they are, it may not always be possible to spread
shards evenly.

The following _dynamic_ setting allows you to specify a hard limit on the total
number of shards from a single index allowed per node:

`index.routing.allocation.total_shards_per_node`::

    The maximum number of shards (replicas and primaries) that will be
    allocated to a single node. Defaults to unbounded.

[WARNING]
=======================================
This setting imposes a hard limit which can result in some shards not
being allocated.

Use with caution.
=======================================
include::allocation/total_shards.asciidoc[]

@ -0,0 +1,91 @@

[[delayed-allocation]]
=== Delaying allocation when a node leaves

When a node leaves the cluster for whatever reason, intentional or otherwise,
the master reacts by:

* Promoting a replica shard to primary to replace any primaries that were on the node.
* Allocating replica shards to replace the missing replicas (assuming there are enough nodes).
* Rebalancing shards evenly across the remaining nodes.

These actions are intended to protect the cluster against data loss by
ensuring that every shard is fully replicated as soon as possible.

Even though we throttle concurrent recoveries both at the
<<recovery,node level>> and at the <<shards-allocation,cluster level>>, this
``shard-shuffle'' can still put a lot of extra load on the cluster which
may not be necessary if the missing node is likely to return soon. Imagine
this scenario:

* Node 5 loses network connectivity.
* The master promotes a replica shard to primary for each primary that was on Node 5.
* The master allocates new replicas to other nodes in the cluster.
* Each new replica makes an entire copy of the primary shard across the network.
* More shards are moved to different nodes to rebalance the cluster.
* Node 5 returns after a few minutes.
* The master rebalances the cluster by allocating shards to Node 5.

If the master had just waited for a few minutes, then the missing shards could
have been re-allocated to Node 5 with the minimum of network traffic. This
process would be even quicker for idle shards (shards not receiving indexing
requests) which have been automatically <<indices-synced-flush,sync-flushed>>.

The allocation of replica shards which become unassigned because a node has
left can be delayed with the `index.unassigned.node_left.delayed_timeout`
dynamic setting, which defaults to `0` (reassign shards immediately).

This setting can be updated on a live index (or on all indices):

[source,js]
------------------------------
PUT /_all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}
------------------------------
// AUTOSENSE

With delayed allocation enabled, the above scenario changes to look like this:

* Node 5 loses network connectivity.
* The master promotes a replica shard to primary for each primary that was on Node 5.
* The master logs a message that allocation of unassigned shards has been delayed, and for how long.
* The cluster remains yellow because there are unassigned replica shards.
* Node 5 returns after a few minutes, before the `timeout` expires.
* The missing replicas are re-allocated to Node 5 (and sync-flushed shards recover almost immediately).

NOTE: This setting will not affect the promotion of replicas to primaries, nor
will it affect the assignment of replicas that have not been assigned
previously.

==== Monitoring delayed unassigned shards

The number of shards whose allocation has been delayed by this timeout setting
can be viewed with the <<cluster-health,cluster health API>>:

[source,js]
------------------------------
GET _cluster/health <1>
------------------------------
<1> This request will return a `delayed_unassigned_shards` value.

==== Removing a node permanently

If a node is not going to return and you would like Elasticsearch to allocate
the missing shards immediately, just update the timeout to zero:

[source,js]
------------------------------
PUT /_all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "0"
  }
}
------------------------------
// AUTOSENSE

You can reset the timeout as soon as the missing shards have started to recover.
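
Should you want the delay back later, the same call can restore a non-zero
value; a minimal sketch, simply reusing the `5m` value from the earlier
example (pick whatever delay suits your environment):

[source,js]
------------------------------
PUT /_all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}
------------------------------
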
@ -0,0 +1,97 @@

[[shard-allocation-filtering]]
=== Shard Allocation Filtering

Shard allocation filtering allows you to specify which nodes are allowed
to host the shards of a particular index.

NOTE: The per-index shard allocation filters explained below work in
conjunction with the cluster-wide allocation filters explained in
<<shards-allocation>>.

It is possible to assign arbitrary metadata attributes to each node at
startup. For instance, nodes could be assigned a `rack` and a `size`
attribute as follows:

[source,sh]
------------------------
bin/elasticsearch --node.rack rack1 --node.size big <1>
------------------------
<1> These attribute settings can also be specified in the `elasticsearch.yml` config file.
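
For reference, a minimal sketch of the equivalent `elasticsearch.yml` entries,
assuming the same `rack` and `size` attributes used in the command above (the
attribute names are arbitrary; any custom names work the same way):

[source,yaml]
------------------------
node.rack: rack1
node.size: big
------------------------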

These metadata attributes can be used with the
`index.routing.allocation.*` settings to allocate an index to a particular
group of nodes. For instance, we can move the index `test` to either `big` or
`medium` nodes as follows:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include.size": "big,medium"
}
------------------------
// AUTOSENSE

Alternatively, we can move the index `test` away from the `small` nodes with
an `exclude` rule:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.exclude.size": "small"
}
------------------------
// AUTOSENSE

Multiple rules can be specified, in which case all conditions must be
satisfied. For instance, we could move the index `test` to `big` nodes in
`rack1` with the following:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include.size": "big",
  "index.routing.allocation.include.rack": "rack1"
}
------------------------
// AUTOSENSE

NOTE: If some conditions cannot be satisfied then shards will not be moved.

The following settings are _dynamic_, allowing live indices to be moved from
one set of nodes to another:

`index.routing.allocation.include.{attribute}`::

    Assign the index to a node whose `{attribute}` has at least one of the
    comma-separated values.

`index.routing.allocation.require.{attribute}`::

    Assign the index to a node whose `{attribute}` has _all_ of the
    comma-separated values (see the sketch after this list).

`index.routing.allocation.exclude.{attribute}`::

    Assign the index to a node whose `{attribute}` has _none_ of the
    comma-separated values.
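
As an illustrative sketch of the `require` rule (reusing the `size` attribute
from above), the following would keep the index `test` only on nodes whose
`size` attribute is set to `big`:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.require.size": "big"
}
------------------------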

These special attributes are also supported:

[horizontal]
`_name`:: Match nodes by node name
`_ip`:: Match nodes by IP address (the IP address associated with the hostname)
`_host`:: Match nodes by hostname

All attribute values can be specified with wildcards, eg:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include._ip": "192.168.2.*"
}
------------------------
// AUTOSENSE

@ -0,0 +1,26 @@

[[allocation-total-shards]]
=== Total Shards Per Node

The cluster-level shard allocator tries to spread the shards of a single index
across as many nodes as possible. However, depending on how many shards and
indices you have, and how big they are, it may not always be possible to spread
shards evenly.

The following _dynamic_ setting allows you to specify a hard limit on the total
number of shards from a single index allowed per node:

`index.routing.allocation.total_shards_per_node`::

    The maximum number of shards (replicas and primaries) that will be
    allocated to a single node. Defaults to unbounded.
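
For example, a minimal sketch (the limit of `2` is purely illustrative) that
caps the index `test` at two of its shards per node, using the same dynamic
settings API as the examples above:

[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}
------------------------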

[WARNING]
=======================================
This setting imposes a hard limit which can result in some shards not
being allocated.

Use with caution.
=======================================