166 lines
5.9 KiB
Plaintext
166 lines
5.9 KiB
Plaintext
[[modules-discovery-zen]]
|
|
=== Zen Discovery
|
|
|
|
The zen discovery is the built in discovery module for elasticsearch and
|
|
the default. It provides both multicast and unicast discovery as well
|
|
being easily extended to support cloud environments.
|
|
|
|
The zen discovery is integrated with other modules, for example, all
|
|
communication between nodes is done using the
|
|
<<modules-transport,transport>> module.
|
|
|
|
It is separated into several sub modules, which are explained below:
|
|
|
|
[float]
|
|
[[ping]]
|
|
==== Ping
|
|
|
|
This is the process where a node uses the discovery mechanisms to find
|
|
other nodes. There is support for both multicast and unicast based
|
|
discovery (can be used in conjunction as well).
|
|
|
|
[float]
|
|
[[multicast]]
|
|
===== Multicast
|
|
|
|
Multicast ping discovery of other nodes is done by sending one or more
|
|
multicast requests where existing nodes that exists will receive and
|
|
respond to. It provides the following settings with the
|
|
`discovery.zen.ping.multicast` prefix:
|
|
|
|
[cols="<,<",options="header",]
|
|
|=======================================================================
|
|
|Setting |Description
|
|
|`group` |The group address to use. Defaults to `224.2.2.4`.
|
|
|
|
|`port` |The port to use. Defaults to `54328`.
|
|
|
|
|`ttl` |The ttl of the multicast message. Defaults to `3`.
|
|
|
|
|`address` |The address to bind to, defaults to `null` which means it
|
|
will bind to all available network interfaces.
|
|
|
|
|`enabled` |Whether multicast ping discovery is enabled. Defaults to `true`.
|
|
|=======================================================================
|
|
|
|
[float]
|
|
[[unicast]]
|
|
===== Unicast
|
|
|
|
The unicast discovery allows to perform the discovery when multicast is
|
|
not enabled. It basically requires a list of hosts to use that will act
|
|
as gossip routers. It provides the following settings with the
|
|
`discovery.zen.ping.unicast` prefix:
|
|
|
|
[cols="<,<",options="header",]
|
|
|=======================================================================
|
|
|Setting |Description
|
|
|`hosts` |Either an array setting or a comma delimited setting. Each
|
|
value is either in the form of `host:port`, or in the form of
|
|
`host[port1-port2]`.
|
|
|=======================================================================
|
|
|
|
The unicast discovery uses the
|
|
<<modules-transport,transport>> module to
|
|
perform the discovery.
|
|
|
|
[float]
|
|
[[master-election]]
|
|
==== Master Election
|
|
|
|
As part of the initial ping process a master of the cluster is either
|
|
elected or joined to. This is done automatically. The
|
|
`discovery.zen.ping_timeout` (which defaults to `3s`) allows to
|
|
configure the election to handle cases of slow or congested networks
|
|
(higher values assure less chance of failure). Once a node joins, it
|
|
will send a join request to the master (`discovery.zen.join_timeout`)
|
|
with a timeout defaulting at 20 times the ping timeout.
|
|
coming[1.3.0,Previously defaulted to 10 times the ping timeout].
|
|
|
|
Nodes can be excluded from becoming a master by setting `node.master` to
|
|
`false`. Note, once a node is a client node (`node.client` set to
|
|
`true`), it will not be allowed to become a master (`node.master` is
|
|
automatically set to `false`).
|
|
|
|
The `discovery.zen.minimum_master_nodes` allows to control the minimum
|
|
number of master eligible nodes a node should "see" in order to operate
|
|
within the cluster. Its recommended to set it to a higher value than 1
|
|
when running more than 2 nodes in the cluster.
|
|
|
|
[float]
|
|
[[fault-detection]]
|
|
==== Fault Detection
|
|
|
|
There are two fault detection processes running. The first is by the
|
|
master, to ping all the other nodes in the cluster and verify that they
|
|
are alive. And on the other end, each node pings to master to verify if
|
|
its still alive or an election process needs to be initiated.
|
|
|
|
The following settings control the fault detection process using the
|
|
`discovery.zen.fd` prefix:
|
|
|
|
[cols="<,<",options="header",]
|
|
|=======================================================================
|
|
|Setting |Description
|
|
|`ping_interval` |How often a node gets pinged. Defaults to `1s`.
|
|
|
|
|`ping_timeout` |How long to wait for a ping response, defaults to
|
|
`30s`.
|
|
|
|
|`ping_retries` |How many ping failures / timeouts cause a node to be
|
|
considered failed. Defaults to `3`.
|
|
|=======================================================================
|
|
|
|
[float]
|
|
==== External Multicast
|
|
|
|
The multicast discovery also supports external multicast requests to
|
|
discover nodes. The external client can send a request to the multicast
|
|
IP/group and port, in the form of:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"request" : {
|
|
"cluster_name": "test_cluster"
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
And the response will be similar to node info response (with node level
|
|
information only, including transport/http addresses, and node
|
|
attributes):
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"response" : {
|
|
"cluster_name" : "test_cluster",
|
|
"transport_address" : "...",
|
|
"http_address" : "...",
|
|
"attributes" : {
|
|
"..."
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
Note, it can still be enabled, with disabled internal multicast
|
|
discovery, but still have external discovery working by keeping
|
|
`discovery.zen.ping.multicast.enabled` set to `true` (the default), but,
|
|
setting `discovery.zen.ping.multicast.ping.enabled` to `false`.
|
|
|
|
[float]
|
|
==== Cluster state updates
|
|
|
|
The master node is the only node in a cluster that can make changes to the
|
|
cluster state. The master node processes one cluster state update at a time,
|
|
applies the required changes and publishes the updated cluster state to all
|
|
the other nodes in the cluster. Each node receives the publish message,
|
|
updates its own cluster state and replies to the master node, which waits for
|
|
all nodes to respond, up to a timeout, before going ahead processing the next
|
|
updates in the queue. The `discovery.zen.publish_timeout` is set by default
|
|
to 30 seconds and can be changed dynamically through the
|
|
<<cluster-update-settings,cluster update settings api>> added[1.1.0, The
|
|
setting existed before but wasn't dynamic].
|