[[bootstrap-checks]]
== Bootstrap Checks

Collectively, we have a lot of experience with users suffering
unexpected issues because they have not configured
<<important-settings,important settings>>. In previous versions of
Elasticsearch, misconfiguration of some of these settings was logged
as a warning. Understandably, users sometimes miss these log messages.
To ensure that these settings receive the attention that they deserve,
Elasticsearch has bootstrap checks upon startup.

These bootstrap checks inspect a variety of Elasticsearch and system
settings and compare them to values that are safe for the operation of
Elasticsearch. If Elasticsearch is in development mode, any bootstrap
checks that fail appear as warnings in the Elasticsearch log. If
Elasticsearch is in production mode, any bootstrap checks that fail will
cause Elasticsearch to refuse to start.

There are some bootstrap checks that are always enforced to prevent
Elasticsearch from running with incompatible settings. These checks are
documented individually.

[discrete]
[[dev-vs-prod-mode]]
=== Development vs. production mode

By default, Elasticsearch binds to loopback addresses for <<modules-http,HTTP>>
and <<modules-transport,transport (internal)>> communication. This is fine for
downloading and playing with Elasticsearch as well as everyday development, but
it's useless for production systems. To join a cluster, an Elasticsearch node
must be reachable via transport communication. To join a cluster via a
non-loopback address, a node must bind transport to a non-loopback address and
not be using <<single-node-discovery,single-node discovery>>. Thus, we consider
an Elasticsearch node to be in development mode if it cannot form a cluster
with another machine via a non-loopback address, and in production mode
if it can join a cluster via non-loopback addresses.

Note that HTTP and transport can be configured independently via
<<modules-http,`http.host`>> and <<modules-transport,`transport.host`>>; this
can be useful for configuring a single node to be reachable via HTTP for testing
purposes without triggering production mode.
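
For example, a minimal sketch of an `elasticsearch.yml` that exposes HTTP for
testing while keeping transport on loopback (the addresses shown are only an
illustration):

[source,yaml]
----
# Expose HTTP for testing tools while transport stays on loopback,
# so this node remains in development mode.
http.host: 0.0.0.0
transport.host: 127.0.0.1
----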

[[single-node-discovery]]
[discrete]
=== Single-node discovery

We recognize that some users need to bind transport to an external interface for
testing their usage of the transport client. For this situation, we provide the
discovery type `single-node` (configure it by setting `discovery.type` to
`single-node`); in this situation, a node will elect itself master and will not
join a cluster with any other node.
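
A minimal sketch of this configuration in `elasticsearch.yml`, assuming you also
want transport bound to an external interface for testing (the address is purely
hypothetical):

[source,yaml]
----
# Elect this node as master and never join a cluster with any other node.
discovery.type: single-node
# Hypothetical external bind address for transport testing.
transport.host: 192.168.1.10
----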

[discrete]
=== Forcing the bootstrap checks

If you are running a single node in production, it is possible to evade the
bootstrap checks (either by not binding transport to an external interface, or
by binding transport to an external interface and setting the discovery type to
`single-node`). For this situation, you can force execution of the bootstrap
checks by setting the system property `es.enforce.bootstrap.checks` to `true`
(set this in <<jvm-options>>, or by adding `-Des.enforce.bootstrap.checks=true`
to the environment variable `ES_JAVA_OPTS`). We strongly encourage you to do
this if you are in this specific situation. This system property can be used to
force execution of the bootstrap checks independent of the node configuration.
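
For example, a sketch of enabling this via the environment (the <<jvm-options>>
approach is equivalent):

[source,sh]
----
# Append the system property to any existing JVM options before starting the node.
export ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.enforce.bootstrap.checks=true"
----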

=== Heap size check

If a JVM is started with unequal initial and max heap size, it can be
prone to pauses as the JVM heap is resized during system usage. To avoid
these resize pauses, it's best to start the JVM with the initial heap
size equal to the maximum heap size. Additionally, if
<<bootstrap-memory_lock,`bootstrap.memory_lock`>> is enabled, the JVM
will lock the initial size of the heap on startup. If the initial heap
size is not equal to the maximum heap size, then after a resize not all
of the JVM heap will be locked in memory. To pass the heap size check,
you must configure the <<heap-size,heap size>>.
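
A minimal sketch of the relevant `jvm.options` entries, assuming a 4 GB heap is
appropriate for your hardware (the value itself is only an illustration):

[source,txt]
----
# Set the initial and maximum heap size to the same value.
-Xms4g
-Xmx4g
----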

=== File descriptor check

File descriptors are a Unix construct for tracking open "files". In Unix
though, https://en.wikipedia.org/wiki/Everything_is_a_file[everything is
a file]. For example, "files" could be a physical file, a virtual file
(e.g., `/proc/loadavg`), or network sockets. Elasticsearch requires
lots of file descriptors (e.g., every shard is composed of multiple
segments and other files, plus connections to other nodes, etc.). This
bootstrap check is enforced on OS X and Linux. To pass the file
descriptor check, you might have to configure <<file-descriptors,file
descriptors>>.
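
One common way to raise the limit on Linux, sketched here under the assumption
that the node runs as the `elasticsearch` user and that 65535 open files are
enough for your workload:

[source,txt]
----
# /etc/security/limits.conf: raise the open-files limit for the elasticsearch user.
elasticsearch  -  nofile  65535
----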

=== Memory lock check

When the JVM does a major garbage collection it touches every page of
the heap. If any of those pages are swapped out to disk they will have
to be swapped back into memory. That causes lots of disk thrashing, and
Elasticsearch would much rather spend that I/O servicing requests. There
are several ways to configure a system to disallow swapping. One way is
by requesting the JVM to lock the heap in memory through `mlockall`
(Unix) or virtual lock (Windows). This is done via the Elasticsearch
setting <<bootstrap-memory_lock,`bootstrap.memory_lock`>>. However,
there are cases where this setting can be passed to Elasticsearch but
Elasticsearch is not able to lock the heap (e.g., if the `elasticsearch`
user does not have `memlock unlimited`). The memory lock check verifies
that, *if* the `bootstrap.memory_lock` setting is enabled, the JVM was
successfully able to lock the heap. To pass the memory lock check, you
might have to configure <<bootstrap-memory_lock,`bootstrap.memory_lock`>>.
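
A sketch of the two pieces involved, assuming the node runs as the
`elasticsearch` user: enable the setting in `elasticsearch.yml`, and make sure
the operating system allows that user to lock memory.

[source,yaml]
----
# elasticsearch.yml: ask the JVM to lock the heap in memory.
bootstrap.memory_lock: true
----

[source,txt]
----
# /etc/security/limits.conf: allow the elasticsearch user to lock unlimited memory.
elasticsearch  -  memlock  unlimited
----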

[[max-number-threads-check]]
=== Maximum number of threads check

Elasticsearch executes requests by breaking the request down into stages
and handing those stages off to different thread pool executors. There
are different <<modules-threadpool,thread pool executors>> for a variety
of tasks within Elasticsearch. Thus, Elasticsearch needs the ability to
create a lot of threads. The maximum number of threads check ensures
that the Elasticsearch process has the rights to create enough threads
under normal use. This check is enforced only on Linux. If you are on
Linux, to pass the maximum number of threads check, you must configure
your system to allow the Elasticsearch process the ability to create at
least 4096 threads. This can be done via `/etc/security/limits.conf`
using the `nproc` setting (note that you might have to increase the
limits for the `root` user too).
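
For example, assuming the node runs as the `elasticsearch` user:

[source,txt]
----
# /etc/security/limits.conf: allow the elasticsearch user to create at least 4096 threads.
elasticsearch  -  nproc  4096
----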

=== Max file size check

The segment files that are the components of individual shards and the translog
generations that are components of the translog can get large (exceeding
multiple gigabytes). On systems where the max size of files that can be created
by the Elasticsearch process is limited, this can lead to failed writes.
Therefore, the safest option here is that the max file size is unlimited, and
that is what the max file size bootstrap check enforces. To pass the max file
size check, you must configure your system to allow the Elasticsearch process
the ability to write files of unlimited size. This can be done via
`/etc/security/limits.conf` by setting `fsize` to `unlimited` (note that
you might have to increase the limits for the `root` user too).
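
For example, again assuming the `elasticsearch` user:

[source,txt]
----
# /etc/security/limits.conf: do not cap the size of files the elasticsearch user can create.
elasticsearch  -  fsize  unlimited
----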

[[max-size-virtual-memory-check]]
=== Maximum size virtual memory check

Elasticsearch and Lucene use `mmap` to great effect to map portions of
an index into the Elasticsearch address space. This keeps certain index
data off the JVM heap but in memory for blazing fast access. For this to
be effective, the Elasticsearch process should have unlimited address
space. The maximum size virtual memory check enforces that the
Elasticsearch process has unlimited address space and is enforced only
on Linux. To pass the maximum size virtual memory check, you must
configure your system to allow the Elasticsearch process the ability to
have unlimited address space. This can be done by adding
`<user> - as unlimited` to `/etc/security/limits.conf`. This may require
you to increase the limits for the `root` user too.
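
For example, assuming the node runs as the `elasticsearch` user:

[source,txt]
----
# /etc/security/limits.conf: give the elasticsearch user an unlimited address space.
elasticsearch  -  as  unlimited
----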

=== Maximum map count check

Continuing from the previous <<max-size-virtual-memory-check,point>>, to
use `mmap` effectively, Elasticsearch also requires the ability to
create many memory-mapped areas. The maximum map count check verifies
that the kernel allows a process to have at least 262,144 memory-mapped
areas and is enforced on Linux only. To pass the maximum map count
check, you must configure `vm.max_map_count` via `sysctl` to be at least
`262144`.
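
A sketch of both the immediate and the persistent way to set this on Linux:

[source,sh]
----
# Apply immediately (does not survive a reboot).
sysctl -w vm.max_map_count=262144

# Persist across reboots by adding the setting to /etc/sysctl.conf.
echo 'vm.max_map_count=262144' >> /etc/sysctl.conf
----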

Note that the maximum map count check is only needed if you are using
`mmapfs` or `hybridfs` as the <<index-modules-store,store type>> for your
indices. If you <<allow-mmap,do not allow>> the use of `mmap` then this
bootstrap check will not be enforced.

=== Client JVM check

OpenJDK-derived JDKs provide two different JVMs: the client JVM and the
server JVM. These JVMs use different compilers for producing executable
machine code from Java bytecode. The client JVM is tuned for startup
time and memory footprint while the server JVM is tuned for maximizing
performance. The difference in performance between the two VMs can be
substantial. The client JVM check ensures that Elasticsearch is not
running inside the client JVM. To pass the client JVM check, you must
start Elasticsearch with the server VM. On modern systems and operating
systems, the server VM is the default.
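
If your JVM does not default to the server VM, one way to request it explicitly
is the standard `-server` HotSpot flag, sketched here as a `jvm.options` entry
(whether the flag is honored depends on the JVM build):

[source,txt]
----
# Request the server VM rather than the client VM.
-server
----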

=== Use serial collector check

There are various garbage collectors for the OpenJDK-derived JVMs
targeting different workloads. The serial collector in particular is
best suited for single logical CPU machines or extremely small heaps,
neither of which are suitable for running Elasticsearch. Using the
serial collector with Elasticsearch can be devastating for performance.
The serial collector check ensures that Elasticsearch is not configured
to run with the serial collector. To pass the serial collector check,
you must not start Elasticsearch with the serial collector (whether it
is the default for the JVM that you are using or you have explicitly
specified it with `-XX:+UseSerialGC`). Note that the default JVM
configuration that ships with Elasticsearch configures Elasticsearch to
use the CMS collector.
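
In practice this means leaving the collector configured in the bundled
`jvm.options` in place, for example the CMS flag, and not adding
`-XX:+UseSerialGC`:

[source,txt]
----
# Keep a concurrent collector such as CMS; do not replace it with -XX:+UseSerialGC.
-XX:+UseConcMarkSweepGC
----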

=== System call filter check

Elasticsearch installs system call filters of various flavors depending
on the operating system (e.g., seccomp on Linux). These system call
filters are installed to prevent the ability to execute system calls
related to forking as a defense mechanism against arbitrary code
execution attacks on Elasticsearch. The system call filter check ensures
that if system call filters are enabled, then they were successfully
installed. To pass the system call filter check you must either fix any
configuration errors on your system that prevented system call filters
from installing (check your logs), or *at your own risk* disable system
call filters by setting `bootstrap.system_call_filter` to `false`.
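
A sketch of the opt-out, which should only be used if you accept the reduced
protection:

[source,yaml]
----
# elasticsearch.yml: disable system call filters *at your own risk*.
bootstrap.system_call_filter: false
----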

=== OnError and OnOutOfMemoryError checks

The JVM options `OnError` and `OnOutOfMemoryError` enable executing
arbitrary commands if the JVM encounters a fatal error (`OnError`) or an
`OutOfMemoryError` (`OnOutOfMemoryError`). However, by default,
Elasticsearch system call filters (seccomp) are enabled and these
filters prevent forking. Thus, using `OnError` or `OnOutOfMemoryError`
and system call filters are incompatible. The `OnError` and
`OnOutOfMemoryError` checks prevent Elasticsearch from starting if
either of these JVM options is used and system call filters are
enabled. This check is always enforced. To pass this check, do not
enable `OnError` or `OnOutOfMemoryError`; instead, upgrade to Java 8u92
and use the JVM flag `ExitOnOutOfMemoryError`. While this does not have
the full capabilities of `OnError` or `OnOutOfMemoryError`, arbitrary
forking is not supported with seccomp enabled.
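
For example, a `jvm.options` entry using the flag mentioned above (the JVM
simply exits on `OutOfMemoryError` instead of forking a command):

[source,txt]
----
# Exit the JVM on OutOfMemoryError rather than running an arbitrary command.
-XX:+ExitOnOutOfMemoryError
----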

=== Early-access check

The OpenJDK project provides early-access snapshots of upcoming releases. These
releases are not suitable for production. The early-access check detects these
early-access snapshots. To pass this check, you must start Elasticsearch on a
release build of the JVM.

=== G1GC check

Early versions of the HotSpot JVM that shipped with JDK 8 are known to
have issues that can lead to index corruption when the G1GC collector is
enabled. The versions impacted are those earlier than the version of
HotSpot that shipped with JDK 8u40. The G1GC check detects these early
versions of the HotSpot JVM.

=== All permission check

The all permission check ensures that the security policy used during bootstrap
does not grant the `java.security.AllPermission` to Elasticsearch. Running with
the all permission granted is equivalent to disabling the security manager.

=== Discovery configuration check

By default, when Elasticsearch first starts up it will try to discover other
nodes running on the same host. If no elected master can be discovered within a
few seconds then Elasticsearch will form a cluster that includes any other
nodes that were discovered. It is useful to be able to form this cluster
without any extra configuration in development mode, but this is unsuitable for
production because it's possible to form multiple clusters and lose data as a
result.

This bootstrap check ensures that discovery is not running with the default
configuration. It can be satisfied by setting at least one of the following
properties (see the sketch after this list):

- `discovery.seed_hosts`
- `discovery.seed_providers`
- `cluster.initial_master_nodes`
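
For example, a minimal sketch for a small production cluster; the node names and
addresses are purely hypothetical:

[source,yaml]
----
# elasticsearch.yml: point discovery at the other master-eligible nodes
# and name the nodes that may form the initial cluster.
discovery.seed_hosts:
  - 192.168.1.10
  - 192.168.1.11
cluster.initial_master_nodes:
  - node-1
  - node-2
----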