mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-13 08:25:26 +00:00
keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of distributed systems such as Elasticsearch. This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5 minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings unless they are deemed to be set too high by providing better out-of-the-box defaults.
177 lines
7.8 KiB
Plaintext
177 lines
7.8 KiB
Plaintext
[[modules-transport]]
|
|
=== Transport
|
|
|
|
The transport networking layer is used for internal communication between nodes
|
|
within the cluster. Each call that goes from one node to the other uses
|
|
the transport layer (for example, when an HTTP GET request is processed
|
|
by one node, and should actually be processed by another node that holds
|
|
the data). The transport module is also used for the `TransportClient` in the
|
|
{es} Java API.
|
|
|
|
The transport mechanism is completely asynchronous in nature, meaning
|
|
that there is no blocking thread waiting for a response. The benefit of
|
|
using asynchronous communication is first solving the
|
|
http://en.wikipedia.org/wiki/C10k_problem[C10k problem], as well as
|
|
being the ideal solution for scatter (broadcast) / gather operations such
|
|
as search in Elasticsearch.
|
|
|
|
[[transport-settings]]
|
|
==== Transport settings
|
|
|
|
The internal transport communicates over TCP. You can configure it with the
|
|
following settings:
|
|
|
|
[cols="<,<",options="header",]
|
|
|=======================================================================
|
|
|Setting |Description
|
|
|`transport.port` |A bind port range. Defaults to `9300-9400`.
|
|
|
|
|`transport.publish_port` |The port that other nodes in the cluster
|
|
should use when communicating with this node. Useful when a cluster node
|
|
is behind a proxy or firewall and the `transport.port` is not directly
|
|
addressable from the outside. Defaults to the actual port assigned via
|
|
`transport.port`.
|
|
|
|
|`transport.bind_host` |The host address to bind the transport service to. Defaults to `transport.host` (if set) or `network.bind_host`.
|
|
|
|
|`transport.publish_host` |The host address to publish for nodes in the cluster to connect to. Defaults to `transport.host` (if set) or `network.publish_host`.
|
|
|
|
|`transport.host` |Used to set the `transport.bind_host` and the `transport.publish_host`.
|
|
|
|
|
|
|`transport.connect_timeout` |The connect timeout for initiating a new connection (in
|
|
time setting format). Defaults to `30s`.
|
|
|
|
|`transport.compress` |Set to `true` to enable compression (`DEFLATE`) between
|
|
all nodes. Defaults to `false`.
|
|
|
|
|`transport.ping_schedule` | Schedule a regular application-level ping message
|
|
to ensure that transport connections between nodes are kept alive. Defaults to
|
|
`5s` in the transport client and `-1` (disabled) elsewhere. It is preferable
|
|
to correctly configure TCP keep-alives instead of using this feature, because
|
|
TCP keep-alives apply to all kinds of long-lived connections and not just to
|
|
transport connections.
|
|
|
|
|=======================================================================
|
|
|
|
It also uses the common
|
|
<<modules-network,network settings>>.
|
|
|
|
[[transport-profiles]]
|
|
===== Transport profiles
|
|
|
|
Elasticsearch allows you to bind to multiple ports on different interfaces by
|
|
the use of transport profiles. See this example configuration
|
|
|
|
[source,yaml]
|
|
--------------
|
|
transport.profiles.default.port: 9300-9400
|
|
transport.profiles.default.bind_host: 10.0.0.1
|
|
transport.profiles.client.port: 9500-9600
|
|
transport.profiles.client.bind_host: 192.168.0.1
|
|
transport.profiles.dmz.port: 9700-9800
|
|
transport.profiles.dmz.bind_host: 172.16.1.2
|
|
--------------
|
|
|
|
The `default` profile is special. It is used as a fallback for any other
|
|
profiles, if those do not have a specific configuration setting set, and is how
|
|
this node connects to other nodes in the cluster.
|
|
|
|
The following parameters can be configured on each transport profile, as in the
|
|
example above:
|
|
|
|
* `port`: The port to which to bind.
|
|
* `bind_host`: The host to which to bind.
|
|
* `publish_host`: The host which is published in informational APIs.
|
|
* `tcp.no_delay`: Configures the `TCP_NO_DELAY` option for this socket.
|
|
* `tcp.keep_alive`: Configures the `SO_KEEPALIVE` option for this socket, which
|
|
determines whether it sends TCP keepalive probes.
|
|
* `tcp.keep_idle`: Configures the `TCP_KEEPIDLE` option for this socket, which
|
|
determines the time in seconds that a connection must be idle before
|
|
starting to send TCP keepalive probes. Defaults to `-1` which means to use
|
|
the smaller of 300 or the system default. May not be greater than 300. Only
|
|
applicable on Linux and macOS, and requires Java 11 or newer.
|
|
* `tcp.keep_interval`: Configures the `TCP_KEEPINTVL` option for this socket,
|
|
which determines the time in seconds between sending TCP keepalive probes.
|
|
Defaults to `-1` which means to use the smaller of 300 or the system
|
|
default. May not be greater than 300. Only applicable on Linux and macOS,
|
|
and requires Java 11 or newer.
|
|
* `tcp.keep_count`: Configures the `TCP_KEEPCNT` option for this socket, which
|
|
determines the number of unacknowledged TCP keepalive probes that may be
|
|
sent on a connection before it is dropped. Defaults to `-1` which means to
|
|
use the system default. Only applicable on Linux and macOS, and requires
|
|
Java 11 or newer.
|
|
* `tcp.reuse_address`: Configures the `SO_REUSEADDR` option for this socket.
|
|
* `tcp.send_buffer_size`: Configures the send buffer size of the socket.
|
|
* `tcp.receive_buffer_size`: Configures the receive buffer size of the socket.
|
|
|
|
[[long-lived-connections]]
|
|
===== Long-lived idle connections
|
|
|
|
Elasticsearch opens a number of long-lived TCP connections between each pair of
|
|
nodes in the cluster, and some of these connections may be idle for an extended
|
|
period of time. Nonetheless, Elasticsearch requires these connections to remain
|
|
open, and it can disrupt the operation of the cluster if any inter-node
|
|
connections are closed by an external influence such as a firewall. It is
|
|
important to configure your network to preserve long-lived idle connections
|
|
between Elasticsearch nodes, for instance by leaving `tcp.keep_alive` enabled
|
|
and ensuring that the keepalive interval is shorter than any timeout that might
|
|
cause idle connections to be closed, or by setting `transport.ping_schedule` if
|
|
keepalives cannot be configured.
|
|
|
|
|
|
[[request-compression]]
|
|
===== Request compression
|
|
|
|
By default, the `transport.compress` setting is `false` and network-level
|
|
request compression is disabled between nodes in the cluster. This default
|
|
normally makes sense for local cluster communication as compression has a
|
|
noticeable CPU cost and local clusters tend to be set up with fast network
|
|
connections between nodes.
|
|
|
|
The `transport.compress` setting always configures local cluster request
|
|
compression and is the fallback setting for remote cluster request compression.
|
|
If you want to configure remote request compression differently than local
|
|
request compression, you can set it on a per-remote cluster basis using the
|
|
<<remote-cluster-settings,`cluster.remote.${cluster_alias}.transport.compress` setting>>.
|
|
|
|
|
|
[[response-compression]]
|
|
===== Response compression
|
|
|
|
The compression settings do not configure compression for responses. {es} will
|
|
compress a response if the inbound request was compressed--even when compression
|
|
is not enabled. Similarly, {es} will not compress a response if the inbound
|
|
request was uncompressed--even when compression is enabled.
|
|
|
|
|
|
[[transport-tracer]]
|
|
==== Transport tracer
|
|
|
|
The transport layer has a dedicated tracer logger which, when activated, logs incoming and out going requests. The log can be dynamically activated
|
|
by setting the level of the `org.elasticsearch.transport.TransportService.tracer` logger to `TRACE`:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
PUT _cluster/settings
|
|
{
|
|
"transient" : {
|
|
"logger.org.elasticsearch.transport.TransportService.tracer" : "TRACE"
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
You can also control which actions will be traced, using a set of include and exclude wildcard patterns. By default every request will be traced
|
|
except for fault detection pings:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
PUT _cluster/settings
|
|
{
|
|
"transient" : {
|
|
"transport.tracer.include" : "*",
|
|
"transport.tracer.exclude" : "internal:coordination/fault_detection/*"
|
|
}
|
|
}
|
|
--------------------------------------------------
|