[DOCS] Reworked the shard allocation filtering info. (#36456)

* [DOCS] Reworked the shard allocation filtering info. Closes #36079

* Added multiple index allocation settings example back.

* Removed extraneous space
debadair 2018-12-11 07:44:57 -08:00 committed by GitHub
parent c3a6d1998a
commit c9e03e6ead
GPG Key ID: 4AEE18F83AFDEB23 (no known key found for this signature in database)
6 changed files with 177 additions and 171 deletions

View File

@@ -1,29 +1,54 @@
[[shard-allocation-filtering]]
=== Index-level shard allocation filtering

You can use shard allocation filters to control where {es} allocates shards of
a particular index. These per-index filters are applied in conjunction with
<<allocation-filtering, cluster-wide allocation filtering>> and
<<allocation-awareness, allocation awareness>>.

Shard allocation filters can be based on custom node attributes or the built-in
`_name`, `_host_ip`, `_publish_ip`, `_ip`, and `_host` attributes.
<<index-lifecycle-management, Index lifecycle management>> uses filters based
on custom node attributes to determine how to reallocate shards when moving
between phases.

The `cluster.routing.allocation` settings are dynamic, enabling live indices to
be moved from one set of nodes to another. Shards are only relocated if it is
possible to do so without breaking another routing constraint, such as never
allocating a primary and replica shard on the same node.

For example, you could use a custom node attribute to indicate a node's
performance characteristics and use shard allocation filtering to route shards
for a particular index to the most appropriate class of hardware.

[float]
[[index-allocation-filters]]
==== Enabling index-level shard allocation filtering

To filter based on a custom node attribute:

. Specify the filter characteristics with a custom node attribute in each
node's `elasticsearch.yml` configuration file. For example, if you have `small`,
`medium`, and `big` nodes, you could add a `size` attribute to filter based
on node size.
+
[source,yaml]
--------------------------------------------------------
node.attr.size: medium
--------------------------------------------------------
+
You can also set custom attributes when you start a node:
+
[source,sh]
--------------------------------------------------------
./bin/elasticsearch -Enode.attr.size=medium
--------------------------------------------------------

. Add a routing allocation filter to the index. The `index.routing.allocation`
settings support three types of filters: `include`, `exclude`, and `require`.
For example, to tell {es} to allocate shards from the `test` index to either
`big` or `medium` nodes, use `index.routing.allocation.include`.
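For illustration, a sketch of what such a filter update might look like; the request body is an assumption based on the `index.routing.allocation.include.{attribute}` pattern and the `size` attribute from the example above:

[source,js]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include.size": "big,medium"
}
------------------------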
@@ -33,24 +58,11 @@ PUT test/_settings

If you specify multiple filters, all conditions must be satisfied for shards to
be relocated. For example, to move the `test` index to `big` nodes in `rack1`,
you could specify both filters.
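A sketch of combining the two filters, assuming a custom `rack` attribute alongside `size` (the attribute name `rack` is an assumption; only the value `rack1` appears above):

[source,js]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include.size": "big",
  "index.routing.allocation.include.rack": "rack1"
}
------------------------

Both conditions must match a node before any shard of `test` is allocated to it.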
@@ -62,10 +74,9 @@ PUT test/_settings

[float]
[[index-allocation-settings]]
==== Index allocation filter settings

`index.routing.allocation.include.{attribute}`::
@@ -82,7 +93,7 @@ one set of nodes to another:
Assign the index to a node whose `{attribute}` has _none_ of the
comma-separated values.

The index allocation settings support the following built-in attributes:

[horizontal]
`_name`:: Match nodes by node name
@@ -91,7 +102,7 @@ These special attributes are also supported:
`_ip`:: Match either `_host_ip` or `_publish_ip`
`_host`:: Match nodes by hostname

You can use wildcards when specifying attribute values.
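As an illustration, a sketch of a wildcard filter on the built-in `_ip` attribute (the address pattern is an assumption):

[source,js]
------------------------
PUT test/_settings
{
  "index.routing.allocation.include._ip": "192.168.2.*"
}
------------------------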

View File

@@ -1,5 +1,5 @@
[[allocation-total-shards]]
=== Total shards per node

The cluster-level shard allocator tries to spread the shards of a single index
across as many nodes as possible. However, depending on how many shards and

@@ -28,6 +28,3 @@ allocated.
Use with caution.
=======================================
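A sketch of the kind of per-node cap this section describes, assuming the `index.routing.allocation.total_shards_per_node` setting (the index name and limit value are illustrative):

[source,js]
------------------------
PUT test/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}
------------------------

If the limit is too low for the number of nodes available, some shards may remain unassigned, which is why the section advises caution.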

View File

@@ -1,114 +1,110 @@
[[allocation-awareness]]
=== Shard allocation awareness

You can use custom node attributes as _awareness attributes_ to enable {es}
to take your physical hardware configuration into account when allocating shards.
If {es} knows which nodes are on the same physical server, in the same rack, or
in the same zone, it can distribute the primary shard and its replica shards to
minimise the risk of losing all shard copies in the event of a failure.

When shard allocation awareness is enabled with the
`cluster.routing.allocation.awareness.attributes` setting, shards are only
allocated to nodes that have values set for the specified awareness
attributes. If you use multiple awareness attributes, {es} considers
each attribute separately when allocating shards.

The allocation awareness settings can be configured in
`elasticsearch.yml` and updated dynamically with the
<<cluster-update-settings,cluster-update-settings>> API.

{es} prefers using shards in the same location (with the same
awareness attribute values) to process search or GET requests. Using local
shards is usually faster than crossing rack or zone boundaries.

NOTE: The number of attribute values determines how many shard copies are
allocated in each location. If the number of nodes in each location is
unbalanced and there are a lot of replicas, replica shards might be left
unassigned.

[float]
[[enabling-awareness]]
==== Enabling shard allocation awareness

To enable shard allocation awareness:

. Specify the location of each node with a custom node attribute. For example,
if you want Elasticsearch to distribute shards across different racks, you might
set an awareness attribute called `rack_id` in each node's `elasticsearch.yml`
config file.
+
[source,yaml]
--------------------------------------------------------
node.attr.rack_id: rack_one
--------------------------------------------------------
+
You can also set custom attributes when you start a node:
+
[source,sh]
--------------------------------------------------------
./bin/elasticsearch -Enode.attr.rack_id=rack_one
--------------------------------------------------------

. Tell {es} to take one or more awareness attributes into account when
allocating shards by setting
`cluster.routing.allocation.awareness.attributes` in *every* master-eligible
node's `elasticsearch.yml` config file.
+
--
[source,yaml]
--------------------------------------------------------
cluster.routing.allocation.awareness.attributes: rack_id <1>
--------------------------------------------------------
<1> Specify multiple attributes as a comma-separated list.
--
+
You can also use the
<<cluster-update-settings,cluster-update-settings>> API to set or update
a cluster's awareness attributes.
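A sketch of setting the awareness attributes dynamically through the cluster settings API; the `rack_id` attribute follows the example above, and the use of a `persistent` setting is an assumption:

[source,js]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}
--------------------------------------------------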
With this example configuration, if you start two nodes with
`node.attr.rack_id` set to `rack_one` and create an index with 5 primary
shards and 1 replica of each primary, all primaries and replicas are
allocated across the two nodes.
If you add two nodes with `node.attr.rack_id` set to `rack_two`,
{es} moves shards to the new nodes, ensuring (if possible)
that no two copies of the same shard are in the same rack.

If `rack_two` fails and takes down both its nodes, by default {es}
allocates the lost shard copies to nodes in `rack_one`. To prevent multiple
copies of a particular shard from being allocated in the same location, you can
enable forced awareness.

[float]
[[forced-awareness]]
==== Forced awareness

By default, if one location fails, Elasticsearch assigns all of the missing
replica shards to the remaining locations. While you might have sufficient
resources across all locations to host your primary and replica shards, a single
location might be unable to host *ALL* of the shards.

To prevent a single location from being overloaded in the event of a failure,
you can set `cluster.routing.allocation.awareness.force` so no replicas are
allocated until nodes are available in another location.

For example, if you have an awareness attribute called `zone` and configure nodes
in `zone1` and `zone2`, you can use forced awareness to prevent Elasticsearch
from allocating replicas if only one zone is available:

[source,yaml]
-------------------------------------------------------------------
cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2 <1>
-------------------------------------------------------------------
<1> Specify all possible values for the awareness attribute.

With this example configuration, if you start two nodes with `node.attr.zone` set
to `zone1` and create an index with 5 shards and 1 replica, Elasticsearch creates
the index and allocates the 5 primary shards but no replicas. Replicas are
only allocated once nodes with `node.attr.zone` set to `zone2` are available.
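Because the `cluster.routing.allocation.awareness.*` settings can be updated dynamically on a live cluster, the same configuration can be applied through the cluster settings API. A sketch, assuming the `zone` attribute above and a `persistent` setting:

[source,js]
-------------------------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone",
    "cluster.routing.allocation.awareness.force.zone.values": "zone1,zone2"
  }
}
-------------------------------------------------------------------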

View File

@@ -1,13 +1,37 @@
[[allocation-filtering]]
=== Cluster-level shard allocation filtering

You can use cluster-level shard allocation filters to control where {es}
allocates shards from any index. These cluster-wide filters are applied in
conjunction with <<shard-allocation-filtering, per-index allocation filtering>>
and <<allocation-awareness, allocation awareness>>.

Shard allocation filters can be based on custom node attributes or the built-in
`_name`, `_ip`, and `_host` attributes.

The `cluster.routing.allocation` settings are dynamic, enabling live indices to
be moved from one set of nodes to another. Shards are only relocated if it is
possible to do so without breaking another routing constraint, such as never
allocating a primary and replica shard on the same node.

The most common use case for cluster-level shard allocation filtering is when
you want to decommission a node. To move shards off of a node prior to shutting
it down, you could create a filter that excludes the node by its IP address:

[source,js]
--------------------------------------------------
PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}
--------------------------------------------------
// CONSOLE

[float]
[[cluster-routing-settings]]
==== Cluster routing settings

`cluster.routing.allocation.include.{attribute}`::
@@ -24,36 +48,14 @@ refers to an arbitrary node attribute.:
Do not allocate shards to a node whose `{attribute}` has _any_ of the
comma-separated values.

The cluster allocation settings support the following built-in attributes:

[horizontal]
`_name`:: Match nodes by node names
`_ip`:: Match nodes by IP addresses (the IP address associated with the hostname)
`_host`:: Match nodes by hostnames

You can use wildcards when specifying attribute values.
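As an illustration, a sketch of a wildcard value on the built-in `_ip` attribute, following the exclude filter shown earlier (the address pattern is an assumption):

[source,js]
------------------------
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "192.168.2.*"
  }
}
------------------------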

View File

@@ -1,5 +1,5 @@
[[disk-allocator]]
=== Disk-based shard allocation

Elasticsearch considers the available disk space on a node before deciding
whether to allocate new shards to that node or to actively relocate shards away
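The disk-based decisions described here are driven by watermark settings. A sketch of adjusting them dynamically, assuming the standard `cluster.routing.allocation.disk.watermark.low` and `...high` settings (the percentage values are illustrative):

[source,js]
--------------------------------------------------
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}
--------------------------------------------------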

View File

@@ -1,12 +1,12 @@
[[shards-allocation]]
=== Cluster-level shard allocation

Shard allocation is the process of allocating shards to nodes. This can
happen during initial recovery, replica allocation, rebalancing, or
when nodes are added or removed.

[float]
=== Shard allocation settings

The following _dynamic_ settings may be used to control shard allocation and recovery:
@@ -59,7 +59,7 @@ one of the active allocation ids in the cluster state.
setting only applies if multiple nodes are started on the same machine.

[float]
=== Shard rebalancing settings

The following _dynamic_ settings may be used to control the rebalancing of
shards across the cluster:
@@ -98,7 +98,7 @@ Specify when shard rebalancing is allowed:
or <<forced-awareness,forced awareness>>.

[float]
=== Shard balancing heuristics

The following settings are used together to determine where to place each
shard. The cluster is balanced when no allowed rebalancing operation can bring the weight
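As an illustration, a sketch of tuning one of these heuristics dynamically; the `cluster.routing.allocation.balance.threshold` setting is assumed to be among the balancing settings this section lists, and the value is illustrative:

[source,js]
--------------------------------------------------
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.balance.threshold": 1.0
  }
}
--------------------------------------------------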