[DOCS] Reworked the shard allocation filtering info. (#36456)
* [DOCS] Reworked the shard allocation filtering info. Closes #36079 * Added multiple index allocation settings example back. * Removed extraneous space
This commit is contained in:
parent
c3a6d1998a
commit
c9e03e6ead
|
@ -1,29 +1,54 @@
|
|||
[[shard-allocation-filtering]]
|
||||
=== Shard Allocation Filtering
|
||||
=== Index-level shard allocation filtering
|
||||
|
||||
Shard allocation filtering allows you to specify which nodes are allowed
|
||||
to host the shards of a particular index.
|
||||
You can use shard allocation filters to control where {es} allocates shards of
|
||||
a particular index. These per-index filters are applied in conjunction with
|
||||
<<allocation-filtering, cluster-wide allocation filtering>> and
|
||||
<<allocation-awareness, allocation awareness>>.
|
||||
|
||||
NOTE: The per-index shard allocation filters explained below work in
|
||||
conjunction with the cluster-wide allocation filters explained in
|
||||
<<shards-allocation>>.
|
||||
Shard allocation filters can be based on custom node attributes or the built-in
|
||||
`_name`, `host_ip`, `publish_ip`, `_ip`, and `_host` attributes.
|
||||
<<index-lifecycle-management, Index lifecycle management>> uses filters based
|
||||
on custom node attributes to determine how to reallocate shards when moving
|
||||
between phases.
|
||||
|
||||
It is possible to assign arbitrary metadata attributes to each node at
|
||||
startup. For instance, nodes could be assigned a `rack` and a `size`
|
||||
attribute as follows:
|
||||
The `cluster.routing.allocation` settings are dynamic, enabling live indices to
|
||||
be moved from one set of nodes to another. Shards are only relocated if it is
|
||||
possible to do so without breaking another routing constraint, such as never
|
||||
allocating a primary and replica shard on the same node.
|
||||
|
||||
For example, you could use a custom node attribute to indicate a node's
|
||||
performance characteristics and use shard allocation filtering to route shards
|
||||
for a particular index to the most appropriate class of hardware.
|
||||
|
||||
[float]
|
||||
[[index-allocation-filters]]
|
||||
==== Enabling index-level shard allocation filtering
|
||||
|
||||
To filter based on a custom node attribute:
|
||||
|
||||
. Specify the filter characteristics with a custom node attribute in each
|
||||
node's `elasticsearch.yml` configuration file. For example, if you have `small`,
|
||||
`medium`, and `big` nodes, you could add a `size` attribute to filter based
|
||||
on node size.
|
||||
+
|
||||
[source,yaml]
|
||||
--------------------------------------------------------
|
||||
node.attr.size: medium
|
||||
--------------------------------------------------------
|
||||
+
|
||||
You can also set custom attributes when you start a node:
|
||||
+
|
||||
[source,sh]
|
||||
------------------------
|
||||
bin/elasticsearch -Enode.attr.rack=rack1 -Enode.attr.size=big <1>
|
||||
------------------------
|
||||
<1> These attribute settings can also be specified in the `elasticsearch.yml` config file.
|
||||
|
||||
These metadata attributes can be used with the
|
||||
`index.routing.allocation.*` settings to allocate an index to a particular
|
||||
group of nodes. For instance, we can move the index `test` to either `big` or
|
||||
`medium` nodes as follows:
|
||||
|
||||
--------------------------------------------------------
|
||||
`./bin/elasticsearch -Enode.attr.size=medium
|
||||
--------------------------------------------------------
|
||||
|
||||
. Add a routing allocation filter to the index. The `index.routing.allocation`
|
||||
settings support three types of filters: `include`, `exclude`, and `require`.
|
||||
For example, to tell {es} to allocate shards from the `test` index to either
|
||||
`big` or `medium` nodes, use `index.routing.allocation.include`:
|
||||
+
|
||||
[source,js]
|
||||
------------------------
|
||||
PUT test/_settings
|
||||
|
@ -33,24 +58,11 @@ PUT test/_settings
|
|||
------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/^/PUT test\n/]
|
||||
|
||||
Alternatively, we can move the index `test` away from the `small` nodes with
|
||||
an `exclude` rule:
|
||||
|
||||
[source,js]
|
||||
------------------------
|
||||
PUT test/_settings
|
||||
{
|
||||
"index.routing.allocation.exclude.size": "small"
|
||||
}
|
||||
------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/^/PUT test\n/]
|
||||
|
||||
Multiple rules can be specified, in which case all conditions must be
|
||||
satisfied. For instance, we could move the index `test` to `big` nodes in
|
||||
`rack1` with the following:
|
||||
|
||||
+
|
||||
If you specify multiple filters, all conditions must be satisfied for shards to
|
||||
be relocated. For example, to move the `test` index to `big` nodes in `rack1`,
|
||||
you could specify:
|
||||
+
|
||||
[source,js]
|
||||
------------------------
|
||||
PUT test/_settings
|
||||
|
@ -62,10 +74,9 @@ PUT test/_settings
|
|||
// CONSOLE
|
||||
// TEST[s/^/PUT test\n/]
|
||||
|
||||
NOTE: If some conditions cannot be satisfied then shards will not be moved.
|
||||
|
||||
The following settings are _dynamic_, allowing live indices to be moved from
|
||||
one set of nodes to another:
|
||||
[float]
|
||||
[[index-allocation-settings]]
|
||||
==== Index allocation filter settings
|
||||
|
||||
`index.routing.allocation.include.{attribute}`::
|
||||
|
||||
|
@ -82,7 +93,7 @@ one set of nodes to another:
|
|||
Assign the index to a node whose `{attribute}` has _none_ of the
|
||||
comma-separated values.
|
||||
|
||||
These special attributes are also supported:
|
||||
The index allocation settings support the following built-in attributes:
|
||||
|
||||
[horizontal]
|
||||
`_name`:: Match nodes by node name
|
||||
|
@ -91,7 +102,7 @@ These special attributes are also supported:
|
|||
`_ip`:: Match either `_host_ip` or `_publish_ip`
|
||||
`_host`:: Match nodes by hostname
|
||||
|
||||
All attribute values can be specified with wildcards, eg:
|
||||
You can use wildcards when specifying attribute values, for example:
|
||||
|
||||
[source,js]
|
||||
------------------------
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
[[allocation-total-shards]]
|
||||
=== Total Shards Per Node
|
||||
=== Total shards per node
|
||||
|
||||
The cluster-level shard allocator tries to spread the shards of a single index
|
||||
across as many nodes as possible. However, depending on how many shards and
|
||||
|
@ -28,6 +28,3 @@ allocated.
|
|||
|
||||
Use with caution.
|
||||
=======================================
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -1,114 +1,110 @@
|
|||
[[allocation-awareness]]
|
||||
=== Shard Allocation Awareness
|
||||
=== Shard allocation awareness
|
||||
|
||||
When running nodes on multiple VMs on the same physical server, on multiple
|
||||
racks, or across multiple zones or domains, it is more likely that two nodes on
|
||||
the same physical server, in the same rack, or in the same zone or domain will
|
||||
crash at the same time, rather than two unrelated nodes crashing
|
||||
simultaneously.
|
||||
You can use custom node attributes as _awareness attributes_ to enable {es}
|
||||
to take your physical hardware configuration into account when allocating shards.
|
||||
If {es} knows which nodes are on the same physical server, in the same rack, or
|
||||
in the same zone, it can distribute the primary shard and its replica shards to
|
||||
minimise the risk of losing all shard copies in the event of a failure.
|
||||
|
||||
If Elasticsearch is _aware_ of the physical configuration of your hardware, it
|
||||
can ensure that the primary shard and its replica shards are spread across
|
||||
different physical servers, racks, or zones, to minimise the risk of losing
|
||||
all shard copies at the same time.
|
||||
When shard allocation awareness is enabled with the
|
||||
`cluster.routing.allocation.awareness.attributes` setting, shards are only
|
||||
allocated to nodes that have values set for the specified awareness
|
||||
attributes. If you use multiple awareness attributes, {es} considers
|
||||
each attribute separately when allocating shards.
|
||||
|
||||
The shard allocation awareness settings allow you to tell Elasticsearch about
|
||||
your hardware configuration.
|
||||
|
||||
As an example, let's assume we have several racks. When we start a node, we
|
||||
can tell it which rack it is in by assigning it an arbitrary metadata
|
||||
attribute called `rack_id` -- we could use any attribute name. For example:
|
||||
|
||||
[source,sh]
|
||||
----------------------
|
||||
./bin/elasticsearch -Enode.attr.rack_id=rack_one <1>
|
||||
----------------------
|
||||
<1> This setting could also be specified in the `elasticsearch.yml` config file.
|
||||
|
||||
Now, we need to set up _shard allocation awareness_ by telling Elasticsearch
|
||||
which attributes to use. This can be configured in the `elasticsearch.yml`
|
||||
file on *all* master-eligible nodes, or it can be set (and changed) with the
|
||||
The allocation awareness settings can be configured in
|
||||
`elasticsearch.yml` and updated dynamically with the
|
||||
<<cluster-update-settings,cluster-update-settings>> API.
|
||||
|
||||
For our example, we'll set the value in the config file:
|
||||
{es} prefers using shards in the same location (with the same
|
||||
awareness attribute values) to process search or GET requests. Using local
|
||||
shards is usually faster than crossing rack or zone boundaries.
|
||||
|
||||
NOTE: The number of attribute values determines how many shard copies are
|
||||
allocated in each location. If the number of nodes in each location is
|
||||
unbalanced and there are a lot of replicas, replica shards might be left
|
||||
unassigned.
|
||||
|
||||
[float]
|
||||
[[enabling-awareness]]
|
||||
==== Enabling shard allocation awareness
|
||||
|
||||
To enable shard allocation awareness:
|
||||
|
||||
. Specify the location of each node with a custom node attribute. For example,
|
||||
if you want Elasticsearch to distribute shards across different racks, you might
|
||||
set an awareness attribute called `rack_id` in each node's `elasticsearch.yml`
|
||||
config file.
|
||||
+
|
||||
[source,yaml]
|
||||
--------------------------------------------------------
|
||||
cluster.routing.allocation.awareness.attributes: rack_id
|
||||
node.attr.rack_id: rack_one
|
||||
--------------------------------------------------------
|
||||
+
|
||||
You can also set custom attributes when you start a node:
|
||||
+
|
||||
[source,sh]
|
||||
--------------------------------------------------------
|
||||
`./bin/elasticsearch -Enode.attr.rack_id=rack_one`
|
||||
--------------------------------------------------------
|
||||
|
||||
With this config in place, let's say we start two nodes with
|
||||
`node.attr.rack_id` set to `rack_one`, and we create an index with 5 primary
|
||||
shards and 1 replica of each primary. All primaries and replicas are
|
||||
. Tell {es} to take one or more awareness attributes into account when
|
||||
allocating shards by setting
|
||||
`cluster.routing.allocation.awareness.attributes` in *every* master-eligible
|
||||
node's `elasticsearch.yml` config file.
|
||||
+
|
||||
--
|
||||
[source,yaml]
|
||||
--------------------------------------------------------
|
||||
cluster.routing.allocation.awareness.attributes: rack_id <1>
|
||||
--------------------------------------------------------
|
||||
<1> Specify multiple attributes as a comma-separated list.
|
||||
--
|
||||
+
|
||||
You can also use the
|
||||
<<cluster-update-settings,cluster-update-settings>> API to set or update
|
||||
a cluster's awareness attributes.
|
||||
|
||||
With this example configuration, if you start two nodes with
|
||||
`node.attr.rack_id` set to `rack_one` and create an index with 5 primary
|
||||
shards and 1 replica of each primary, all primaries and replicas are
|
||||
allocated across the two nodes.
|
||||
|
||||
Now, if we start two more nodes with `node.attr.rack_id` set to `rack_two`,
|
||||
Elasticsearch will move shards across to the new nodes, ensuring (if possible)
|
||||
that no two copies of the same shard will be in the same rack. However if
|
||||
`rack_two` were to fail, taking down both of its nodes, Elasticsearch will
|
||||
still allocate the lost shard copies to nodes in `rack_one`.
|
||||
If you add two nodes with `node.attr.rack_id` set to `rack_two`,
|
||||
{es} moves shards to the new nodes, ensuring (if possible)
|
||||
that no two copies of the same shard are in the same rack.
|
||||
|
||||
.Prefer local shards
|
||||
*********************************************
|
||||
|
||||
When executing search or GET requests, with shard awareness enabled,
|
||||
Elasticsearch will prefer using local shards -- shards in the same awareness
|
||||
group -- to execute the request. This is usually faster than crossing between
|
||||
racks or across zone boundaries.
|
||||
|
||||
*********************************************
|
||||
|
||||
Multiple awareness attributes can be specified, in which case each attribute
|
||||
is considered separately when deciding where to allocate the shards.
|
||||
|
||||
[source,yaml]
|
||||
-------------------------------------------------------------
|
||||
cluster.routing.allocation.awareness.attributes: rack_id,zone
|
||||
-------------------------------------------------------------
|
||||
|
||||
NOTE: When using awareness attributes, shards will not be allocated to nodes
|
||||
that don't have values set for those attributes.
|
||||
|
||||
NOTE: Number of primary/replica of a shard allocated on a specific group of
|
||||
nodes with the same awareness attribute value is determined by the number of
|
||||
attribute values. When the number of nodes in groups is unbalanced and there
|
||||
are many replicas, replica shards may be left unassigned.
|
||||
If `rack_two` fails and takes down both its nodes, by default {es}
|
||||
allocates the lost shard copies to nodes in `rack_one`. To prevent multiple
|
||||
copies of a particular shard from being allocated in the same location, you can
|
||||
enable forced awareness.
|
||||
|
||||
[float]
|
||||
[[forced-awareness]]
|
||||
=== Forced Awareness
|
||||
==== Forced awareness
|
||||
|
||||
Imagine that you have two zones and enough hardware across the two zones to
|
||||
host all of your primary and replica shards. But perhaps the hardware in a
|
||||
single zone, while sufficient to host half the shards, would be unable to host
|
||||
*ALL* the shards.
|
||||
By default, if one location fails, Elasticsearch assigns all of the missing
|
||||
replica shards to the remaining locations. While you might have sufficient
|
||||
resources across all locations to host your primary and replica shards, a single
|
||||
location might be unable to host *ALL* of the shards.
|
||||
|
||||
With ordinary awareness, if one zone lost contact with the other zone,
|
||||
Elasticsearch would assign all of the missing replica shards to a single zone.
|
||||
But in this example, this sudden extra load would cause the hardware in the
|
||||
remaining zone to be overloaded.
|
||||
To prevent a single location from being overloaded in the event of a failure,
|
||||
you can set `cluster.routing.allocation.awareness.force` so no replicas are
|
||||
allocated until nodes are available in another location.
|
||||
|
||||
Forced awareness solves this problem by *NEVER* allowing copies of the same
|
||||
shard to be allocated to the same zone.
|
||||
|
||||
For example, lets say we have an awareness attribute called `zone`, and we
|
||||
know we are going to have two zones, `zone1` and `zone2`. Here is how we can
|
||||
force awareness on a node:
|
||||
For example, if you have an awareness attribute called `zone` and configure nodes
|
||||
in `zone1` and `zone2`, you can use forced awareness to prevent Elasticsearch
|
||||
from allocating replicas if only one zone is available:
|
||||
|
||||
[source,yaml]
|
||||
-------------------------------------------------------------------
|
||||
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 <1>
|
||||
cluster.routing.allocation.awareness.attributes: zone
|
||||
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 <1>
|
||||
-------------------------------------------------------------------
|
||||
<1> We must list all possible values that the `zone` attribute can have.
|
||||
|
||||
Now, if we start 2 nodes with `node.attr.zone` set to `zone1` and create an
|
||||
index with 5 shards and 1 replica. The index will be created, but only the 5
|
||||
primary shards will be allocated (with no replicas). Only when we start more
|
||||
nodes with `node.attr.zone` set to `zone2` will the replicas be allocated.
|
||||
|
||||
The `cluster.routing.allocation.awareness.*` settings can all be updated
|
||||
dynamically on a live cluster with the
|
||||
<<cluster-update-settings,cluster-update-settings>> API.
|
||||
|
||||
<1> Specify all possible values for the awareness attribute.
|
||||
|
||||
With this example configuration, if you start two nodes with `node.attr.zone` set
|
||||
to `zone1` and create an index with 5 shards and 1 replica, Elasticsearch creates
|
||||
the index and allocates the 5 primary shards but no replicas. Replicas are
|
||||
only allocated once nodes with `node.attr.zone` set to `zone2` are available.
|
||||
|
|
|
@ -1,13 +1,37 @@
|
|||
[[allocation-filtering]]
|
||||
=== Shard Allocation Filtering
|
||||
=== Cluster-level shard allocation filtering
|
||||
|
||||
While <<index-modules-allocation>> provides *per-index* settings to control the
|
||||
allocation of shards to nodes, cluster-level shard allocation filtering allows
|
||||
you to allow or disallow the allocation of shards from *any* index to
|
||||
particular nodes.
|
||||
You can use cluster-level shard allocation filters to control where {es}
|
||||
allocates shards from any index. These cluster wide filters are applied in
|
||||
conjunction with <<shard-allocation-filtering, per-index allocation filtering>>
|
||||
and <<allocation-awareness, allocation awareness>>.
|
||||
|
||||
The available _dynamic_ cluster settings are as follows, where `{attribute}`
|
||||
refers to an arbitrary node attribute.:
|
||||
Shard allocation filters can be based on custom node attributes or the built-in
|
||||
`_name`, `_ip`, and `_host` attributes.
|
||||
|
||||
The `cluster.routing.allocation` settings are dynamic, enabling live indices to
|
||||
be moved from one set of nodes to another. Shards are only relocated if it is
|
||||
possible to do so without breaking another routing constraint, such as never
|
||||
allocating a primary and replica shard on the same node.
|
||||
|
||||
The most common use case for cluster-level shard allocation filtering is when
|
||||
you want to decommission a node. To move shards off of a node prior to shutting
|
||||
it down, you could create a filter that excludes the node by its IP address:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT _cluster/settings
|
||||
{
|
||||
"transient" : {
|
||||
"cluster.routing.allocation.exclude._ip" : "10.0.0.1"
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
[float]
|
||||
[[cluster-routing-settings]]
|
||||
==== Cluster routing settings
|
||||
|
||||
`cluster.routing.allocation.include.{attribute}`::
|
||||
|
||||
|
@ -24,36 +48,14 @@ refers to an arbitrary node attribute.:
|
|||
Do not allocate shards to a node whose `{attribute}` has _any_ of the
|
||||
comma-separated values.
|
||||
|
||||
These special attributes are also supported:
|
||||
The cluster allocation settings support the following built-in attributes:
|
||||
|
||||
[horizontal]
|
||||
`_name`:: Match nodes by node names
|
||||
`_ip`:: Match nodes by IP addresses (the IP address associated with the hostname)
|
||||
`_host`:: Match nodes by hostnames
|
||||
|
||||
The typical use case for cluster-wide shard allocation filtering is when you
|
||||
want to decommission a node, and you would like to move the shards from that
|
||||
node to other nodes in the cluster before shutting it down.
|
||||
|
||||
For instance, we could decommission a node using its IP address as follows:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT _cluster/settings
|
||||
{
|
||||
"transient" : {
|
||||
"cluster.routing.allocation.exclude._ip" : "10.0.0.1"
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
NOTE: Shards will only be relocated if it is possible to do so without
|
||||
breaking another routing constraint, such as never allocating a primary and
|
||||
replica shard to the same node.
|
||||
|
||||
In addition to listing multiple values as a comma-separated list, all
|
||||
attribute values can be specified with wildcards, eg:
|
||||
You can use wildcards when specifying attribute values, for example:
|
||||
|
||||
[source,js]
|
||||
------------------------
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
[[disk-allocator]]
|
||||
=== Disk-based Shard Allocation
|
||||
=== Disk-based shard allocation
|
||||
|
||||
Elasticsearch considers the available disk space on a node before deciding
|
||||
whether to allocate new shards to that node or to actively relocate shards away
|
||||
|
|
|
@ -1,12 +1,12 @@
|
|||
[[shards-allocation]]
|
||||
=== Cluster Level Shard Allocation
|
||||
=== Cluster level shard allocation
|
||||
|
||||
Shard allocation is the process of allocating shards to nodes. This can
|
||||
happen during initial recovery, replica allocation, rebalancing, or
|
||||
when nodes are added or removed.
|
||||
|
||||
[float]
|
||||
=== Shard Allocation Settings
|
||||
=== Shard allocation settings
|
||||
|
||||
The following _dynamic_ settings may be used to control shard allocation and recovery:
|
||||
|
||||
|
@ -59,7 +59,7 @@ one of the active allocation ids in the cluster state.
|
|||
setting only applies if multiple nodes are started on the same machine.
|
||||
|
||||
[float]
|
||||
=== Shard Rebalancing Settings
|
||||
=== Shard rebalancing settings
|
||||
|
||||
The following _dynamic_ settings may be used to control the rebalancing of
|
||||
shards across the cluster:
|
||||
|
@ -98,7 +98,7 @@ Specify when shard rebalancing is allowed:
|
|||
or <<forced-awareness,forced awareness>>.
|
||||
|
||||
[float]
|
||||
=== Shard Balancing Heuristics
|
||||
=== Shard balancing heuristics
|
||||
|
||||
The following settings are used together to determine where to place each
|
||||
shard. The cluster is balanced when no allowed rebalancing operation can bring the weight
|
||||
|
|
Loading…
Reference in New Issue