[[modules-cluster]]
== Cluster

[float]
[[shards-allocation]]
=== Shards Allocation

Shards allocation is the process of allocating shards to nodes. This can
happen during initial recovery, replica allocation, rebalancing, or
handling nodes being added or removed.

The following settings may be used:

`cluster.routing.allocation.allow_rebalance`::
Controls when rebalancing will happen, based on the total state of all
the index shards in the cluster. `always`, `indices_primaries_active`,
and `indices_all_active` are allowed, defaulting to
`indices_all_active` to reduce chatter during initial recovery.

`cluster.routing.allocation.cluster_concurrent_rebalance`::
Controls how many concurrent shard rebalances are allowed cluster wide.
Defaults to `2`.

`cluster.routing.allocation.node_initial_primaries_recoveries`::
Controls the number of initial recoveries of primaries that are allowed
per node. Since most of the time the local gateway is used, these
recoveries should be fast and more of them can be handled per node
without creating load.

`cluster.routing.allocation.node_concurrent_recoveries`::
How many concurrent recoveries are allowed to happen on a node.
Defaults to `2`.

`cluster.routing.allocation.enable`::
Controls shard allocation for all indices, by allowing specific kinds
of shard to be allocated (see the examples after this list).

Can be set to:

* `all` (default) - Allows shard allocation for all kinds of shards.
* `primaries` - Allows shard allocation only for primary shards.
* `new_primaries` - Allows shard allocation only for primary shards for new indices.
* `none` - No shard allocations of any kind are allowed for any index.

`cluster.routing.allocation.same_shard.host`::
Performs a check to prevent allocation of multiple instances of the
same shard on a single host, based on host name and host address.
Defaults to `false`, meaning that no check is performed by default. This
setting only applies if multiple nodes are started on the same machine.

`indices.recovery.concurrent_streams`::
The number of streams to open (on a *node* level) to recover a
shard from a peer shard. Defaults to `3`.
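
Several of these settings can simply be placed in `elasticsearch.yml`.
As a minimal sketch, the values below restate the defaults documented
above and are illustrative rather than tuning recommendations:

[source,js]
--------------------------------------------------
# Illustrative values (the documented defaults), not recommendations
cluster.routing.allocation.cluster_concurrent_rebalance: 2
cluster.routing.allocation.node_concurrent_recoveries: 2
indices.recovery.concurrent_streams: 3
--------------------------------------------------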
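
`cluster.routing.allocation.enable` is commonly toggled around cluster
maintenance. As a sketch, assuming it is one of the settings that can be
changed through the <<cluster-update-settings,cluster update settings API>>,
allocation could be disabled before a restart and re-enabled afterwards:

[source,js]
--------------------------------------------------
# Sketch: disable allocation before maintenance...
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.enable" : "none"
    }
}'

# ...and restore the default afterwards
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.enable" : "all"
    }
}'
--------------------------------------------------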
[float]
[[allocation-awareness]]
=== Shard Allocation Awareness

Cluster allocation awareness allows you to configure shard and replica
allocation across generic attributes associated with the nodes. Let's
explain it through an example:

Assume we have several racks. When we start a node, we can configure an
attribute called `rack_id` (any attribute name works), for example, here
is a sample config:

----------------------
node.rack_id: rack_one
----------------------
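
The same attribute could also be passed on the command line when
starting the node. This is only a sketch, assuming the `es.`-prefixed
system properties supported by the standard startup scripts:

[source,js]
--------------------------------------------------
# Assumption: es.* system properties map onto node settings
bin/elasticsearch -Des.node.rack_id=rack_one
--------------------------------------------------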

The above sets an attribute called `rack_id` for the relevant node with
a value of `rack_one`. Now, we need to configure the `rack_id` attribute
as one of the awareness allocation attributes (set it in the config of
*all* master-eligible nodes):

--------------------------------------------------------
cluster.routing.allocation.awareness.attributes: rack_id
--------------------------------------------------------

This means that the `rack_id` attribute will be used for awareness-based
allocation of shards and their replicas. For example, let's say we start
2 nodes with `node.rack_id` set to `rack_one`, and create a single index
with 5 shards and 1 replica. The index will be fully deployed on the
current nodes (5 primary shards and 5 replicas, 10 shards in total).

Now, if we start two more nodes, with `node.rack_id` set to `rack_two`,
shards will relocate to even out the number of shards across the nodes,
but a shard and its replica will not be allocated on nodes with the same
`rack_id` value.

The awareness attributes can hold several values, for example:

-------------------------------------------------------------
cluster.routing.allocation.awareness.attributes: rack_id,zone
-------------------------------------------------------------

*NOTE*: When using awareness attributes, shards will not be allocated to
nodes that don't have values set for those attributes.

[float]
[[forced-awareness]]
=== Forced Awareness

Sometimes we know in advance the number of values an awareness attribute
can have, and moreover, we would like never to have more replicas than
needed allocated on a specific group of nodes with the same awareness
attribute value. For that, we can force awareness on specific
attributes.

For example, let's say we have an awareness attribute called `zone`, and
we know we are going to have two zones, `zone1` and `zone2`. Here is how
we can force awareness on a node:

[source,js]
-------------------------------------------------------------------
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
cluster.routing.allocation.awareness.attributes: zone
-------------------------------------------------------------------

Now, let's say we start 2 nodes with `node.zone` set to `zone1` and
create an index with 5 shards and 1 replica. The index will be created,
but only the 5 primary shards will be allocated (with no replicas). Only
when we start more nodes with `node.zone` set to `zone2` will the
replicas be allocated.

[float]
==== Automatic Preference When Searching / GETing

When executing a search, or doing a get, the node receiving the request
will prefer to execute the request on shards that exist on nodes that
have the same attribute values as the executing node.

[float]
==== Realtime Settings Update

The settings can be updated using the <<cluster-update-settings,cluster update settings API>> on a live cluster. For example:
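
As a sketch, assuming the awareness attributes are among the dynamically
updatable settings, the `rack_id` attribute from the earlier example
could be configured cluster wide like this:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "persistent" : {
        "cluster.routing.allocation.awareness.attributes" : "rack_id"
    }
}'
--------------------------------------------------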
[float]
[[allocation-filtering]]
=== Shard Allocation Filtering

Allows control over the allocation of indices on nodes based on
include/exclude filters. The filters can be set both on the index level
and on the cluster level. Let's start with an example of setting it on
the index level:

Let's say we have 4 nodes, each with a specific attribute called `tag`
associated with it (the name of the attribute can be any name). Each
node has a specific value associated with `tag`. Node 1 has a setting
`node.tag: value1`, Node 2 a setting of `node.tag: value2`, and so on.

We can create an index that will only deploy on nodes that have `tag`
set to `value1` or `value2` by setting
`index.routing.allocation.include.tag` to `value1,value2`. For example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.tag" : "value1,value2"
}'
--------------------------------------------------

On the other hand, we can create an index that will be deployed on all
nodes except for nodes with a `tag` of value `value3` by setting
`index.routing.allocation.exclude.tag` to `value3`. For example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.exclude.tag" : "value3"
}'
--------------------------------------------------

`index.routing.allocation.require.*` can be used to specify a number of
rules, all of which MUST match for a shard to be allocated to a node.
This is in contrast to `include`, which will include a node if ANY rule
matches.

The `include`, `exclude` and `require` values can have generic simple
matching wildcards, for example, `value1*`. A special attribute name
called `_ip` can be used to match on node IP addresses. In addition, the
`_host` attribute can be used to match on either the node's hostname or
its IP address. Similarly, the `_name` and `_id` attributes can be used
to match on node name and node id respectively.
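
As a sketch combining the two features above, and assuming wildcard
matching applies to the special attributes as well, nodes could be
excluded by name pattern (the `node_old*` pattern here is hypothetical):

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.exclude._name" : "node_old*"
}'
--------------------------------------------------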

Obviously a node can have several attributes associated with it, and
both the attribute name and value are controlled in the setting. For
example, here is a sample of several node configurations:

[source,js]
--------------------------------------------------
node.group1: group1_value1
node.group2: group2_value4
--------------------------------------------------

In the same manner, `include`, `exclude` and `require` can work against
several attributes, for example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.group1" : "xxx",
    "index.routing.allocation.include.group2" : "yyy",
    "index.routing.allocation.exclude.group3" : "zzz",
    "index.routing.allocation.require.group4" : "aaa"
}'
--------------------------------------------------

The provided settings can also be updated in real time using the update
settings API, allowing you to "move" indices (shards) around in
realtime.

Cluster wide filtering can also be defined, and updated in real time
using the cluster update settings API. This setting can come in handy
for things like decommissioning nodes (even if the replica count is set
to 0). Here is a sample of how to decommission a node based on its `_ip`
address:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
    }
}'
--------------------------------------------------