[[index-modules-allocation]]
== Index Shard Allocation
This module provides per-index settings to control the allocation of shards to
nodes.
[float]
[[shard-allocation-filtering]]
=== Shard Allocation Filtering
Shard allocation filtering allows you to specify which nodes are allowed
to host the shards of a particular index.
NOTE: The per-index shard allocation filters explained below work in
conjunction with the cluster-wide allocation filters explained in
<<shards-allocation>>.
It is possible to assign arbitrary metadata attributes to each node at
startup. For instance, nodes could be assigned a `rack` and a `size`
attribute as follows:
[source,sh]
------------------------
bin/elasticsearch --node.rack rack1 --node.size big <1>
------------------------
<1> These attribute settings can also be specified in the `elasticsearch.yml` config file.
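For reference, the equivalent `elasticsearch.yml` entries would look like
this (a minimal sketch covering the two attributes used above):
[source,yaml]
------------------------
node.rack: rack1
node.size: big
------------------------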
These metadata attributes can be used with the
`index.routing.allocation.*` settings to allocate an index to a particular
group of nodes. For instance, we can move the index `test` to either `big` or
`medium` nodes as follows:
[source,json]
------------------------
PUT test/_settings
{
"index.routing.allocation.include.size": "big,medium"
}
------------------------
// AUTOSENSE
Alternatively, we can move the index `test` away from the `small` nodes with
an `exclude` rule:
[source,json]
------------------------
PUT test/_settings
{
"index.routing.allocation.exclude.size": "small"
}
------------------------
// AUTOSENSE
Multiple rules can be specified, in which case all conditions must be
satisfied. For instance, we could move the index `test` to `big` nodes in
`rack1` with the following:
[source,json]
------------------------
PUT test/_settings
{
"index.routing.allocation.include.size": "big",
"index.routing.allocation.include.rack": "rack1"
}
------------------------
// AUTOSENSE
NOTE: If some conditions cannot be satisfied then shards will not be moved.
The following settings are _dynamic_, allowing live indices to be moved from
one set of nodes to another:
`index.routing.allocation.include.{attribute}`::
Assign the index to a node whose `{attribute}` has at least one of the
comma-separated values.
`index.routing.allocation.require.{attribute}`::
Assign the index to a node whose `{attribute}` has _all_ of the
comma-separated values.
`index.routing.allocation.exclude.{attribute}`::
Assign the index to a node whose `{attribute}` has _none_ of the
comma-separated values.
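For instance, a `require` rule could insist that shards of the index `test`
are placed only on nodes whose `rack` attribute is `rack1` (a sketch reusing
the attributes from the startup example above):
[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.require.rack": "rack1"
}
------------------------
// AUTOSENSE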
These special attributes are also supported:
[horizontal]
`_name`:: Match nodes by node name
`_ip`:: Match nodes by IP address (the IP address associated with the hostname)
`_host`:: Match nodes by hostname
All attribute values can be specified with wildcards, e.g.:
[source,json]
------------------------
PUT test/_settings
{
"index.routing.allocation.include._ip": "192.168.2.*"
}
------------------------
// AUTOSENSE
[float]
=== Total Shards Per Node
The cluster-level shard allocator tries to spread the shards of a single index
across as many nodes as possible. However, depending on how many shards and
indices you have, and how big they are, it may not always be possible to spread
shards evenly.
The following _dynamic_ setting allows you to specify a hard limit on the total
number of shards from a single index allowed per node:
`index.routing.allocation.total_shards_per_node`::
The maximum number of shards (replicas and primaries) that will be
allocated to a single node. Defaults to unbounded.
[WARNING]
=======================================
This setting imposes a hard limit which can result in some shards not
being allocated.
Use with caution.
=======================================
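For example, the following sketch limits the index `test` to at most two of
its shards per node, using the same update-settings API as the examples
above (the value `2` is purely illustrative):
[source,json]
------------------------
PUT test/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}
------------------------
// AUTOSENSE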