OpenSearch/docs/reference/modules/discovery/discovery.asciidoc

189 lines
7.7 KiB
Plaintext

[[modules-discovery-hosts-providers]]
=== Discovery
Discovery is the process by which the cluster formation module finds other
nodes with which to form a cluster. This process runs when you start an
Elasticsearch node or when a node believes the master node failed and continues
until the master node is found or a new master node is elected.
This process starts with a list of _seed_ addresses from one or more
<<built-in-hosts-providers,hosts providers>>, together with the addresses of
any master-eligible nodes that were in the last known cluster. The process
operates in two phases: First, each node probes the seed addresses by
connecting to each address and attempting to identify the node to which it is
connected. Secondly it shares with the remote node a list of all of its known
master-eligible peers and the remote node responds with _its_ peers in turn.
The node then probes all the new nodes that it just discovered, requests their
peers, and so on.
If the node is not master-eligible then it continues this discovery process
until it has discovered an elected master node. If no elected master is
discovered then the node will retry after `discovery.find_peers_interval` which
defaults to `1s`.
If the node is master-eligible then it continues this discovery process until it
has either discovered an elected master node or else it has discovered enough
masterless master-eligible nodes to complete an election. If neither of these
occur quickly enough then the node will retry after
`discovery.find_peers_interval` which defaults to `1s`.
[[built-in-hosts-providers]]
==== Hosts providers
By default the cluster formation module offers two hosts providers to configure
the list of seed nodes: a _settings_-based and a _file_-based hosts provider.
It can be extended to support cloud environments and other forms of hosts
providers via {plugins}/discovery.html[discovery plugins]. Hosts providers are
configured using the `discovery.zen.hosts_provider` setting, which defaults to
the _settings_-based hosts provider. Multiple hosts providers can be specified
as a list.
[float]
[[settings-based-hosts-provider]]
===== Settings-based hosts provider
The settings-based hosts provider uses a node setting to configure a static list
of hosts to use as seed nodes. These hosts can be specified as hostnames or IP
addresses; hosts specified as hostnames are resolved to IP addresses during each
round of discovery. Note that if you are in an environment where DNS resolutions
vary with time, you might need to adjust your <<networkaddress-cache-ttl,JVM
security settings>>.
The list of hosts is set using the <<unicast.hosts,`discovery.zen.ping.unicast.hosts`>> static
setting. For example:
[source,yaml]
--------------------------------------------------
discovery.zen.ping.unicast.hosts:
- 192.168.1.10:9300
- 192.168.1.11 <1>
- seeds.mydomain.com <2>
--------------------------------------------------
<1> The port will default to `transport.profiles.default.port` and fallback to
`transport.port` if not specified.
<2> A hostname that resolves to multiple IP addresses will try all resolved
addresses.
Additionally, the `discovery.zen.ping.unicast.hosts.resolve_timeout` configures
the amount of time to wait for DNS lookups on each round of discovery. This is
specified as a <<time-units, time unit>> and defaults to 5s.
Unicast discovery uses the <<modules-transport,transport>> module to perform the
discovery.
[float]
[[file-based-hosts-provider]]
===== File-based hosts provider
The file-based hosts provider configures a list of hosts via an external file.
Elasticsearch reloads this file when it changes, so that the list of seed nodes
can change dynamically without needing to restart each node. For example, this
gives a convenient mechanism for an Elasticsearch instance that is run in a
Docker container to be dynamically supplied with a list of IP addresses to
connect to when those IP addresses may not be known at node startup.
To enable file-based discovery, configure the `file` hosts provider as follows:
[source,txt]
----------------------------------------------------------------
discovery.zen.hosts_provider: file
----------------------------------------------------------------
Then create a file at `$ES_PATH_CONF/unicast_hosts.txt` in the format described
below. Any time a change is made to the `unicast_hosts.txt` file the new changes
will be picked up by Elasticsearch and the new hosts list will be used.
Note that the file-based discovery plugin augments the unicast hosts list in
`elasticsearch.yml`. If there are valid unicast host entries in
`discovery.zen.ping.unicast.hosts`, they are used in addition to those
supplied in `unicast_hosts.txt`.
The `discovery.zen.ping.unicast.hosts.resolve_timeout` setting also applies to
DNS lookups for nodes specified by address via file-based discovery. This is
specified as a <<time-units, time unit>> and defaults to 5s.
The format of the file is to specify one node entry per line. Each node entry
consists of the host (host name or IP address) and an optional transport port
number. If the port number is specified, is must come immediately after the
host (on the same line) separated by a `:`. If the port number is not
specified, a default value of 9300 is used.
For example, this is an example of `unicast_hosts.txt` for a cluster with four
nodes that participate in unicast discovery, some of which are not running on
the default port:
[source,txt]
----------------------------------------------------------------
10.10.10.5
10.10.10.6:9305
10.10.10.5:10005
# an IPv6 address
[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:9301
----------------------------------------------------------------
Host names are allowed instead of IP addresses (similar to
`discovery.zen.ping.unicast.hosts`), and IPv6 addresses must be specified in
brackets with the port coming after the brackets.
You can also add comments to this file. All comments must appear on
their lines starting with `#` (i.e. comments cannot start in the middle of a
line).
[float]
[[ec2-hosts-provider]]
===== EC2 hosts provider
The {plugins}/discovery-ec2.html[EC2 discovery plugin] adds a hosts provider
that uses the https://github.com/aws/aws-sdk-java[AWS API] to find a list of
seed nodes.
[float]
[[azure-classic-hosts-provider]]
===== Azure Classic hosts provider
The {plugins}/discovery-azure-classic.html[Azure Classic discovery plugin] adds
a hosts provider that uses the Azure Classic API find a list of seed nodes.
[float]
[[gce-hosts-provider]]
===== Google Compute Engine hosts provider
The {plugins}/discovery-gce.html[GCE discovery plugin] adds a hosts provider
that uses the GCE API find a list of seed nodes.
[float]
==== Discovery settings
The discovery process is controlled by the following settings.
`discovery.find_peers_interval`::
Sets how long a node will wait before attempting another discovery round.
Defaults to `1s`.
`discovery.request_peers_timeout`::
Sets how long a node will wait after asking its peers again before
considering the request to have failed. Defaults to `3s`.
`discovery.probe.connect_timeout`::
Sets how long to wait when attempting to connect to each address. Defaults
to `3s`.
`discovery.probe.handshake_timeout`::
Sets how long to wait when attempting to identify the remote node via a
handshake. Defaults to `1s`.
`discovery.cluster_formation_warning_timeout`::
Sets how long a node will try to form a cluster before logging a warning
that the cluster did not form. Defaults to `10s`.
If a cluster has not formed after `discovery.cluster_formation_warning_timeout`
has elapsed then the node will log a warning message that starts with the phrase
`master not discovered` which describes the current state of the discovery
process.