Reorganize configuring Elasticsearch docs
This commit reorganizes some of the content in the configuring Elasticsearch section of the docs. The changes are:

- move JVM options out of system configuration into configuring Elasticsearch
- move JVM options to its own page of the docs
- move configuring the heap to important Elasticsearch settings
- move configuring the heap to its own page of the docs
- move all important settings to individual pages in the docs
- remove bootstrap.memory_lock from important settings, this is covered in the swap section of system configuration

Relates #27755
parent 77617c8e62
commit 008296e2b6
@@ -137,12 +137,12 @@ final class JvmOptionsParser {
  * only
  * </li>
  * <li>
- * a line starting with a number followed by a dash is treated as a JVM option that applies to the matching Java specified major
- * version and all larger Java major versions
+ * a line starting with a number followed by a dash followed by a colon is treated as a JVM option that applies to the matching
+ * Java specified major version and all larger Java major versions
  * </li>
  * <li>
- * a line starting with a number followed by a dash followed by a number is treated as a JVM option that applies to the
- * specified range of matching Java major versions
+ * a line starting with a number followed by a dash followed by a number followed by a colon is treated as a JVM option that
+ * applies to the specified range of matching Java major versions
  * </li>
  * </ul>
  *
@@ -40,6 +40,8 @@ include::setup/install.asciidoc[]
 
 include::setup/configuration.asciidoc[]
 
+include::setup/jvm-options.asciidoc[]
+
 include::setup/secure-settings.asciidoc[]
 
 include::setup/logging-config.asciidoc[]
@@ -67,13 +67,12 @@ If a JVM is started with unequal initial and max heap size, it can be
 prone to pauses as the JVM heap is resized during system usage. To avoid
 these resize pauses, it's best to start the JVM with the initial heap
 size equal to the maximum heap size. Additionally, if
-<<bootstrap.memory_lock,`bootstrap.memory_lock`>> is enabled, the JVM
+<<bootstrap-memory_lock,`bootstrap.memory_lock`>> is enabled, the JVM
 will lock the initial size of the heap on startup. If the initial heap
 size is not equal to the maximum heap size, after a resize it will not
 be the case that all of the JVM heap is locked in memory. To pass the
 heap size check, you must configure the <<heap-size,heap size>>.
 
-
 === File descriptor check
 
 File descriptors are a Unix construct for tracking open "files". In Unix
@@ -95,13 +94,13 @@ Elasticsearch would much rather use to service requests. There are
 several ways to configure a system to disallow swapping. One way is by
 requesting the JVM to lock the heap in memory through `mlockall` (Unix)
 or virtual lock (Windows). This is done via the Elasticsearch setting
-<<bootstrap.memory_lock,`bootstrap.memory_lock`>>. However, there are
+<<bootstrap-memory_lock,`bootstrap.memory_lock`>>. However, there are
 cases where this setting can be passed to Elasticsearch but
 Elasticsearch is not able to lock the heap (e.g., if the `elasticsearch`
 user does not have `memlock unlimited`). The memory lock check verifies
 that *if* the `bootstrap.memory_lock` setting is enabled, that the JVM
 was successfully able to lock the heap. To pass the memory lock check,
-you might have to configure <<mlockall,`mlockall`>>.
+you might have to configure <<bootstrap-memory_lock,`bootstrap.memory_lock`>>.
 
 [[max-number-threads-check]]
 === Maximum number of threads check
@@ -2,211 +2,31 @@
|
|||
== Important Elasticsearch configuration
|
||||
|
||||
While Elasticsearch requires very little configuration, there are a number of
|
||||
settings which need to be configured manually and should definitely be
|
||||
configured before going into production.
|
||||
settings which need to be considered before going into production.
|
||||
|
||||
* <<path-settings,`path.data` and `path.logs`>>
|
||||
* <<cluster.name,`cluster.name`>>
|
||||
* <<node.name,`node.name`>>
|
||||
* <<bootstrap.memory_lock,`bootstrap.memory_lock`>>
|
||||
* <<network.host,`network.host`>>
|
||||
* <<unicast.hosts,`discovery.zen.ping.unicast.hosts`>>
|
||||
* <<minimum_master_nodes,`discovery.zen.minimum_master_nodes`>>
|
||||
* <<heap-dump-path,JVM heap dump path>>
|
||||
The following settings *must* be considered before going to production:
|
||||
|
||||
[float]
|
||||
[[path-settings]]
|
||||
=== `path.data` and `path.logs`
|
||||
* <<path-settings,Path settings>>
|
||||
* <<cluster.name,Cluster name>>
|
||||
* <<node.name,Node name>>
|
||||
* <<network.host,Network host>>
|
||||
* <<discovery-settings,Discovery settings>>
|
||||
* <<heap-size,Heap size>>
|
||||
* <<heap-dump-path,Heap dump path>>
|
||||
* <<gc-logging,GC logging>>
|
||||
|
||||
If you are using the `.zip` or `.tar.gz` archives, the `data` and `logs`
|
||||
directories are sub-folders of `$ES_HOME`. If these important folders are
|
||||
left in their default locations, there is a high risk of them being deleted
|
||||
while upgrading Elasticsearch to a new version.
|
||||
include::important-settings/path-settings.asciidoc[]
|
||||
|
||||
In production use, you will almost certainly want to change the locations of
|
||||
the data and log folder:
|
||||
include::important-settings/cluster-name.asciidoc[]
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
path:
|
||||
logs: /var/log/elasticsearch
|
||||
data: /var/data/elasticsearch
|
||||
--------------------------------------------------
|
||||
include::important-settings/node-name.asciidoc[]
|
||||
|
||||
The RPM and Debian distributions already use custom paths for `data` and
|
||||
`logs`.
|
||||
include::important-settings/network-host.asciidoc[]
|
||||
|
||||
The `path.data` settings can be set to multiple paths, in which case all paths
|
||||
will be used to store data (although the files belonging to a single shard
|
||||
will all be stored on the same data path):
|
||||
include::important-settings/discovery-settings.asciidoc[]
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
path:
|
||||
data:
|
||||
- /mnt/elasticsearch_1
|
||||
- /mnt/elasticsearch_2
|
||||
- /mnt/elasticsearch_3
|
||||
--------------------------------------------------
|
||||
include::important-settings/heap-size.asciidoc[]
|
||||
|
||||
[float]
|
||||
[[cluster.name]]
|
||||
=== `cluster.name`
|
||||
include::important-settings/heap-dump-path.asciidoc[]
|
||||
|
||||
A node can only join a cluster when it shares its `cluster.name` with all the
|
||||
other nodes in the cluster. The default name is `elasticsearch`, but you
|
||||
should change it to an appropriate name which describes the purpose of the
|
||||
cluster.
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
cluster.name: logging-prod
|
||||
--------------------------------------------------
|
||||
|
||||
Make sure that you don't reuse the same cluster names in different
|
||||
environments, otherwise you might end up with nodes joining the wrong cluster.
|
||||
|
||||
[float]
|
||||
[[node.name]]
|
||||
=== `node.name`
|
||||
|
||||
By default, Elasticsearch will take the 7 first character of the randomly generated uuid used as the node id.
|
||||
Note that the node id is persisted and does not change when a node restarts and therefore the default node name
|
||||
will also not change.
|
||||
|
||||
It is worth configuring a more meaningful name which will also have the
|
||||
advantage of persisting after restarting the node:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
node.name: prod-data-2
|
||||
--------------------------------------------------
|
||||
|
||||
The `node.name` can also be set to the server's HOSTNAME as follows:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
node.name: ${HOSTNAME}
|
||||
--------------------------------------------------
|
||||
|
||||
[float]
|
||||
[[bootstrap.memory_lock]]
|
||||
=== `bootstrap.memory_lock`
|
||||
|
||||
It is vitally important to the health of your node that none of the JVM is
|
||||
ever swapped out to disk. One way of achieving that is set the
|
||||
`bootstrap.memory_lock` setting to `true`.
|
||||
|
||||
For this setting to have effect, other system settings need to be configured
|
||||
first. See <<mlockall>> for more details about how to set up memory locking
|
||||
correctly.
|
||||
|
||||
[float]
|
||||
[[network.host]]
|
||||
=== `network.host`
|
||||
|
||||
By default, Elasticsearch binds to loopback addresses only -- e.g. `127.0.0.1`
|
||||
and `[::1]`. This is sufficient to run a single development node on a server.
|
||||
|
||||
TIP: In fact, more than one node can be started from the same `$ES_HOME` location
|
||||
on a single node. This can be useful for testing Elasticsearch's ability to
|
||||
form clusters, but it is not a configuration recommended for production.
|
||||
|
||||
In order to communicate and to form a cluster with nodes on other servers,
|
||||
your node will need to bind to a non-loopback address. While there are many
|
||||
<<modules-network,network settings>>, usually all you need to configure is
|
||||
`network.host`:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
network.host: 192.168.1.10
|
||||
--------------------------------------------------
|
||||
|
||||
The `network.host` setting also understands some special values such as
|
||||
`_local_`, `_site_`, `_global_` and modifiers like `:ip4` and `:ip6`, details
|
||||
of which can be found in <<network-interface-values>>.
|
||||
|
||||
IMPORTANT: As soon you provide a custom setting for `network.host`,
|
||||
Elasticsearch assumes that you are moving from development mode to production
|
||||
mode, and upgrades a number of system startup checks from warnings to
|
||||
exceptions. See <<dev-vs-prod>> for more information.
|
||||
|
||||
[float]
|
||||
[[unicast.hosts]]
|
||||
=== `discovery.zen.ping.unicast.hosts`
|
||||
|
||||
Out of the box, without any network configuration, Elasticsearch will bind to
|
||||
the available loopback addresses and will scan ports 9300 to 9305 to try to
|
||||
connect to other nodes running on the same server. This provides an auto-
|
||||
clustering experience without having to do any configuration.
|
||||
|
||||
When the moment comes to form a cluster with nodes on other servers, you have
|
||||
to provide a seed list of other nodes in the cluster that are likely to be
|
||||
live and contactable. This can be specified as follows:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
discovery.zen.ping.unicast.hosts:
|
||||
- 192.168.1.10:9300
|
||||
- 192.168.1.11 <1>
|
||||
- seeds.mydomain.com <2>
|
||||
--------------------------------------------------
|
||||
<1> The port will default to `transport.profiles.default.port` and fallback to `transport.tcp.port` if not specified.
|
||||
<2> A hostname that resolves to multiple IP addresses will try all resolved addresses.
|
||||
|
||||
[float]
|
||||
[[minimum_master_nodes]]
|
||||
=== `discovery.zen.minimum_master_nodes`
|
||||
|
||||
To prevent data loss, it is vital to configure the
|
||||
`discovery.zen.minimum_master_nodes` setting so that each master-eligible node
|
||||
knows the _minimum number of master-eligible nodes_ that must be visible in
|
||||
order to form a cluster.
|
||||
|
||||
Without this setting, a cluster that suffers a network failure is at risk of
|
||||
having the cluster split into two independent clusters -- a split brain --
|
||||
which will lead to data loss. A more detailed explanation is provided
|
||||
in <<split-brain>>.
|
||||
|
||||
To avoid a split brain, this setting should be set to a _quorum_ of master-
|
||||
eligible nodes:
|
||||
|
||||
(master_eligible_nodes / 2) + 1
|
||||
|
||||
In other words, if there are three master-eligible nodes, then minimum master
|
||||
nodes should be set to `(3 / 2) + 1` or `2`:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
discovery.zen.minimum_master_nodes: 2
|
||||
--------------------------------------------------
|
||||
|
||||
[float]
|
||||
[[heap-dump-path]]
|
||||
=== JVM heap dump path
|
||||
|
||||
The <<rpm,RPM>> and <<deb,Debian>> package distributions default to configuring
|
||||
the JVM to dump the heap on out of memory exceptions to
|
||||
`/var/lib/elasticsearch`. If this path is not suitable for storing heap dumps,
|
||||
you should modify the entry `-XX:HeapDumpPath=/var/lib/elasticsearch` in
|
||||
<<jvm-options,`jvm.options`>> to an alternate path. If you specify a filename
|
||||
instead of a directory, the JVM will repeatedly use the same file; this is one
|
||||
mechanism for preventing heap dumps from accumulating in the heap dump path.
|
||||
Alternatively, you can configure a scheduled task via your OS to remove heap
|
||||
dumps that are older than a configured age.
|
||||
|
||||
Note that the archive distributions do not configure the heap dump path by
|
||||
default. Instead, the JVM will default to dumping to the working directory for
|
||||
the Elasticsearch process. If you wish to configure a heap dump path, you should
|
||||
modify the entry `#-XX:HeapDumpPath=/heap/dump/path` in
|
||||
<<jvm-options,`jvm.options`>> to remove the comment marker `#` and to specify an
|
||||
actual path.
|
||||
|
||||
[float]
|
||||
[[gc-logging]]
|
||||
=== GC logging
|
||||
|
||||
By default, Elasticsearch enables GC logs. These are configured in
|
||||
<<jvm-options,`jvm.options`>> and default to the same default location as the
|
||||
Elasticsearch logs. The default configuration rotates the logs every 64 MB and
|
||||
can consume up to 2 GB of disk space.
|
||||
include::important-settings/gc-logging.asciidoc[]
|
||||
|
|
|
@@ -0,0 +1,14 @@
|
|||
[[cluster.name]]
|
||||
=== `cluster.name`
|
||||
|
||||
A node can only join a cluster when it shares its `cluster.name` with all the
|
||||
other nodes in the cluster. The default name is `elasticsearch`, but you should
|
||||
change it to an appropriate name which describes the purpose of the cluster.
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
cluster.name: logging-prod
|
||||
--------------------------------------------------
|
||||
|
||||
Make sure that you don't reuse the same cluster names in different environments,
|
||||
otherwise you might end up with nodes joining the wrong cluster.
|
|
@@ -0,0 +1,58 @@
|
|||
[[discovery-settings]]
|
||||
=== Discovery settings
|
||||
|
||||
Elasticsearch uses a custom discovery implementation called "Zen Discovery" for
|
||||
node-to-node clustering and master election. There are two important discovery
|
||||
settings that should be configured before going to production.
|
||||
|
||||
[float]
|
||||
[[unicast.hosts]]
|
||||
==== `discovery.zen.ping.unicast.hosts`
|
||||
|
||||
Out of the box, without any network configuration, Elasticsearch will bind to
|
||||
the available loopback addresses and will scan ports 9300 to 9305 to try to
|
||||
connect to other nodes running on the same server. This provides an auto-
|
||||
clustering experience without having to do any configuration.
|
||||
|
||||
When the moment comes to form a cluster with nodes on other servers, you have to
|
||||
provide a seed list of other nodes in the cluster that are likely to be live and
|
||||
contactable. This can be specified as follows:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
discovery.zen.ping.unicast.hosts:
|
||||
- 192.168.1.10:9300
|
||||
- 192.168.1.11 <1>
|
||||
- seeds.mydomain.com <2>
|
||||
--------------------------------------------------
|
||||
<1> The port will default to `transport.profiles.default.port` and fall back to
|
||||
`transport.tcp.port` if not specified.
|
||||
<2> A hostname that resolves to multiple IP addresses will try all resolved
|
||||
addresses.
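The port fallback described in callout <1> can be sketched in Python. This is only an illustration under stated assumptions: `parse_seed_host` and `DEFAULT_TRANSPORT_PORT` are hypothetical names, 9300 stands in for whatever the transport port settings resolve to, and IPv6 literals and DNS resolution are deliberately left out.

```python
# Assumed default for illustration only; the real value comes from the
# transport port settings named in the callout above.
DEFAULT_TRANSPORT_PORT = 9300


def parse_seed_host(entry, default_port=DEFAULT_TRANSPORT_PORT):
    """Split 'host' or 'host:port' into a (host, port) tuple.

    Entries without an explicit port get default_port.
    IPv6 literals are not handled in this sketch.
    """
    text = entry.strip()
    host, sep, port = text.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    return text, default_port


seeds = ["192.168.1.10:9300", "192.168.1.11", "seeds.mydomain.com"]
parsed = [parse_seed_host(s) for s in seeds]
```

Only the first entry carries its own port; the other two pick up the default.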
|
||||
|
||||
[float]
|
||||
[[minimum_master_nodes]]
|
||||
==== `discovery.zen.minimum_master_nodes`
|
||||
|
||||
To prevent data loss, it is vital to configure the
|
||||
`discovery.zen.minimum_master_nodes` setting so that each master-eligible node
|
||||
knows the _minimum number of master-eligible nodes_ that must be visible in
|
||||
order to form a cluster.
|
||||
|
||||
Without this setting, a cluster that suffers a network failure is at risk of
|
||||
having the cluster split into two independent clusters -- a split brain -- which
|
||||
will lead to data loss. A more detailed explanation is provided in
|
||||
<<split-brain>>.
|
||||
|
||||
To avoid a split brain, this setting should be set to a _quorum_ of
|
||||
master-eligible nodes:
|
||||
|
||||
(master_eligible_nodes / 2) + 1
|
||||
|
||||
In other words, if there are three master-eligible nodes, then minimum master
|
||||
nodes should be set to `(3 / 2) + 1` or `2`:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
discovery.zen.minimum_master_nodes: 2
|
||||
--------------------------------------------------
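The quorum formula above uses integer division. A minimal Python sketch of the arithmetic follows; the function name `minimum_master_nodes` is a hypothetical helper, not part of Elasticsearch.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Return the quorum (master_eligible_nodes / 2) + 1 using integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Quorum for a few cluster sizes, e.g. three master-eligible nodes give 2.
quorums = {n: minimum_master_nodes(n) for n in (1, 2, 3, 4, 5)}
```

Note that two master-eligible nodes also require a quorum of 2, so losing either node leaves the cluster unable to elect a master.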
|
|
@@ -0,0 +1,7 @@
|
|||
[[gc-logging]]
|
||||
=== GC logging
|
||||
|
||||
By default, Elasticsearch enables GC logs. These are configured in
|
||||
<<jvm-options,`jvm.options`>> and default to the same default location as the
|
||||
Elasticsearch logs. The default configuration rotates the logs every 64 MB and
|
||||
can consume up to 2 GB of disk space.
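As a quick check of the two figures above, the number of rotated files they imply can be worked out as follows. This is arithmetic only, assuming 1 GB = 1024 MB; the variable names are illustrative and the file count is inferred, not stated in the text.

```python
ROTATE_SIZE_MB = 64        # a new GC log file is started every 64 MB
MAX_TOTAL_MB = 2 * 1024    # total disk budget of 2 GB, assuming 1 GB = 1024 MB

# Number of 64 MB files that fit into the 2 GB budget.
max_rotated_files = MAX_TOTAL_MB // ROTATE_SIZE_MB
```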
|
|
@@ -0,0 +1,19 @@
|
|||
[[heap-dump-path]]
|
||||
=== JVM heap dump path
|
||||
|
||||
The <<rpm,RPM>> and <<deb,Debian>> package distributions default to configuring
|
||||
the JVM to dump the heap on out of memory exceptions to
|
||||
`/var/lib/elasticsearch`. If this path is not suitable for storing heap dumps,
|
||||
you should modify the entry `-XX:HeapDumpPath=/var/lib/elasticsearch` in
|
||||
<<jvm-options,`jvm.options`>> to an alternate path. If you specify a filename
|
||||
instead of a directory, the JVM will repeatedly use the same file; this is one
|
||||
mechanism for preventing heap dumps from accumulating in the heap dump path.
|
||||
Alternatively, you can configure a scheduled task via your OS to remove heap
|
||||
dumps that are older than a configured age.
|
||||
|
||||
Note that the archive distributions do not configure the heap dump path by
|
||||
default. Instead, the JVM will default to dumping to the working directory for
|
||||
the Elasticsearch process. If you wish to configure a heap dump path, you should
|
||||
modify the entry `#-XX:HeapDumpPath=/heap/dump/path` in
|
||||
<<jvm-options,`jvm.options`>> to remove the comment marker `#` and to specify an
|
||||
actual path.
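The scheduled cleanup mentioned above could look roughly like the following Python sketch. It is an assumption-laden illustration: `remove_old_heap_dumps` is a hypothetical helper, it only considers files ending in `.hprof` (the extension the JVM uses for heap dumps by default), and the demo runs against a temporary directory rather than a real heap dump path.

```python
import os
import tempfile
import time
from pathlib import Path


def remove_old_heap_dumps(dump_dir, max_age_days, now=None):
    """Delete *.hprof files in dump_dir older than max_age_days.

    Returns the names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for path in sorted(Path(dump_dir).glob("*.hprof")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed


# Demo in a throwaway directory: one stale dump, one fresh dump.
with tempfile.TemporaryDirectory() as tmp:
    stale = Path(tmp, "java_pid1.hprof")
    stale.touch()
    os.utime(stale, (0, 0))  # backdate the file to the Unix epoch
    fresh = Path(tmp, "java_pid2.hprof")
    fresh.touch()
    removed = remove_old_heap_dumps(tmp, max_age_days=7)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

A script like this would be run from whatever scheduler the operating system provides; the age threshold is a choice for the operator.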
|
|
@@ -0,0 +1,72 @@
|
|||
[[heap-size]]
|
||||
=== Setting the heap size
|
||||
|
||||
By default, Elasticsearch tells the JVM to use a heap with a minimum and maximum
|
||||
size of 1 GB. When moving to production, it is important to configure heap size
|
||||
to ensure that Elasticsearch has enough heap available.
|
||||
|
||||
Elasticsearch will assign the entire heap specified in
|
||||
<<jvm-options,jvm.options>> via the `Xms` (minimum heap size) and `Xmx` (maximum
|
||||
heap size) settings.
|
||||
|
||||
The value for these settings depends on the amount of RAM available on your
|
||||
server. Good rules of thumb are:
|
||||
|
||||
* Set the minimum heap size (`Xms`) and maximum heap size (`Xmx`) to be equal to
|
||||
each other.
|
||||
|
||||
* The more heap available to Elasticsearch, the more memory it can use for
|
||||
caching. But note that too much heap can subject you to long garbage
|
||||
collection pauses.
|
||||
|
||||
* Set `Xmx` to no more than 50% of your physical RAM, to ensure that there is
|
||||
enough physical RAM left for kernel file system caches.
|
||||
|
||||
* Don’t set `Xmx` to above the cutoff that the JVM uses for compressed object
|
||||
pointers (compressed oops); the exact cutoff varies but is near 32 GB. You can
|
||||
verify that you are under the limit by looking for a line in the logs like the
|
||||
following:
|
||||
+
|
||||
heap size [1.9gb], compressed ordinary object pointers [true]
|
||||
|
||||
* Even better, try to stay below the threshold for zero-based compressed oops;
|
||||
the exact cutoff varies; 26 GB is safe on most systems, and the cutoff can be as large
|
||||
as 30 GB on some systems. You can verify that you are under the limit by
|
||||
starting Elasticsearch with the JVM options `-XX:+UnlockDiagnosticVMOptions
|
||||
-XX:+PrintCompressedOopsMode` and looking for a line like the following:
|
||||
+
|
||||
--
|
||||
heap address: 0x000000011be00000, size: 27648 MB, zero based Compressed Oops
|
||||
|
||||
showing that zero-based compressed oops are enabled instead of
|
||||
|
||||
heap address: 0x0000000118400000, size: 28672 MB, Compressed Oops with base: 0x00000001183ff000
|
||||
--
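The rules of thumb above can be combined into a small calculation. The following Python sketch is only a starting point under stated assumptions: `suggested_heap_mb` and `heap_flags` are hypothetical helpers, the 26 GB figure is the conservative zero-based compressed oops threshold quoted above, and the result should still be verified against the log lines shown.

```python
# Conservative cutoff taken from the "26 GB is safe on most systems" rule above.
SAFE_ZERO_BASED_OOPS_MB = 26 * 1024


def suggested_heap_mb(physical_ram_mb, cutoff_mb=SAFE_ZERO_BASED_OOPS_MB):
    """Return a heap size in MB: at most 50% of RAM and at most cutoff_mb."""
    if physical_ram_mb <= 0:
        raise ValueError("physical_ram_mb must be positive")
    return min(physical_ram_mb // 2, cutoff_mb)


def heap_flags(heap_mb):
    """Return equal minimum and maximum heap flags, as the first rule recommends."""
    return ["-Xms{}m".format(heap_mb), "-Xmx{}m".format(heap_mb)]


# A server with 64 GB of RAM: half would be 32 GB, so the cutoff applies instead.
flags = heap_flags(suggested_heap_mb(64 * 1024))
```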
|
||||
|
||||
Here are examples of how to set the heap size via the jvm.options file:
|
||||
|
||||
[source,txt]
|
||||
------------------
|
||||
-Xms2g <1>
|
||||
-Xmx2g <2>
|
||||
------------------
|
||||
<1> Set the minimum heap size to 2g.
|
||||
<2> Set the maximum heap size to 2g.
|
||||
|
||||
It is also possible to set the heap size via an environment variable. This can
|
||||
be done by commenting out the `Xms` and `Xmx` settings in the
|
||||
<<jvm-options,`jvm.options`>> file and setting these values via `ES_JAVA_OPTS`:
|
||||
|
||||
[source,sh]
|
||||
------------------
|
||||
ES_JAVA_OPTS="-Xms2g -Xmx2g" ./bin/elasticsearch <1>
|
||||
ES_JAVA_OPTS="-Xms4000m -Xmx4000m" ./bin/elasticsearch <2>
|
||||
------------------
|
||||
<1> Set the minimum and maximum heap size to 2 GB.
|
||||
<2> Set the minimum and maximum heap size to 4000 MB.
|
||||
|
||||
NOTE: Configuring the heap for the <<windows-service,Windows service>> is
|
||||
different than the above. The values initially populated for the Windows service
|
||||
can be configured as above but are different after the service has been
|
||||
installed. Consult the <<windows-service,Windows service documentation>> for
|
||||
additional details.
|
|
@@ -0,0 +1,29 @@
|
|||
[[network.host]]
|
||||
=== `network.host`
|
||||
|
||||
By default, Elasticsearch binds to loopback addresses only -- e.g. `127.0.0.1`
|
||||
and `[::1]`. This is sufficient to run a single development node on a server.
|
||||
|
||||
TIP: In fact, more than one node can be started from the same `$ES_HOME`
|
||||
location on a single server. This can be useful for testing Elasticsearch's
|
||||
ability to form clusters, but it is not a configuration recommended for
|
||||
production.
|
||||
|
||||
In order to communicate and to form a cluster with nodes on other servers, your
|
||||
node will need to bind to a non-loopback address. While there are many
|
||||
<<modules-network,network settings>>, usually all you need to configure is
|
||||
`network.host`:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
network.host: 192.168.1.10
|
||||
--------------------------------------------------
|
||||
|
||||
The `network.host` setting also understands some special values such as
|
||||
`_local_`, `_site_`, `_global_` and modifiers like `:ip4` and `:ip6`, details of
|
||||
which can be found in <<network-interface-values>>.
|
||||
|
||||
IMPORTANT: As soon as you provide a custom setting for `network.host`,
|
||||
Elasticsearch assumes that you are moving from development mode to production
|
||||
mode, and upgrades a number of system startup checks from warnings to
|
||||
exceptions. See <<dev-vs-prod>> for more information.
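To make the loopback versus non-loopback distinction concrete, here is a small Python sketch using the standard `ipaddress` module. It is not how Elasticsearch itself decides between development and production mode; `is_loopback_literal` is a hypothetical helper that only understands literal IP addresses and raises `ValueError` for hostnames or special values such as `_site_`.

```python
import ipaddress


def is_loopback_literal(network_host):
    """Return True if network_host is a literal loopback IP address.

    Square brackets around IPv6 literals are accepted. Anything that is
    not a literal IP address raises ValueError.
    """
    address = ipaddress.ip_address(network_host.strip().strip("[]"))
    return address.is_loopback


# The two default bind addresses are loopback; the example address is not.
checks = {host: is_loopback_literal(host)
          for host in ("127.0.0.1", "[::1]", "192.168.1.10")}
```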
|
|
@@ -0,0 +1,22 @@
|
|||
[[node.name]]
|
||||
=== `node.name`
|
||||
|
||||
By default, Elasticsearch uses the first 7 characters of the randomly generated
node id (a UUID) as the node name. Note that the node id is persisted and does
|
||||
not change when a node restarts and therefore the default node name will also
|
||||
not change.
|
||||
|
||||
It is worth configuring a more meaningful name which will also have the
|
||||
advantage of persisting after restarting the node:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
node.name: prod-data-2
|
||||
--------------------------------------------------
|
||||
|
||||
The `node.name` can also be set to the server's HOSTNAME as follows:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
node.name: ${HOSTNAME}
|
||||
--------------------------------------------------
|
|
@@ -0,0 +1,32 @@
|
|||
[[path-settings]]
|
||||
=== `path.data` and `path.logs`
|
||||
|
||||
If you are using the `.zip` or `.tar.gz` archives, the `data` and `logs`
|
||||
directories are sub-folders of `$ES_HOME`. If these important folders are left
|
||||
in their default locations, there is a high risk of them being deleted while
|
||||
upgrading Elasticsearch to a new version.
|
||||
|
||||
In production use, you will almost certainly want to change the locations of the
|
||||
data and log folders:
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
path:
|
||||
logs: /var/log/elasticsearch
|
||||
data: /var/data/elasticsearch
|
||||
--------------------------------------------------
|
||||
|
||||
The RPM and Debian distributions already use custom paths for `data` and `logs`.
|
||||
|
||||
The `path.data` setting can be set to multiple paths, in which case all paths
|
||||
will be used to store data (although the files belonging to a single shard will
|
||||
all be stored on the same data path):
|
||||
|
||||
[source,yaml]
|
||||
--------------------------------------------------
|
||||
path:
|
||||
data:
|
||||
- /mnt/elasticsearch_1
|
||||
- /mnt/elasticsearch_2
|
||||
- /mnt/elasticsearch_3
|
||||
--------------------------------------------------
|
|
@@ -0,0 +1,82 @@
|
|||
[[jvm-options]]
|
||||
=== Setting JVM options
|
||||
|
||||
The preferred method of setting JVM options (including system properties and JVM
|
||||
flags) is via the `jvm.options` configuration file. The default location of this
|
||||
file is `config/jvm.options` (when installing from the tar or zip distributions)
|
||||
and `/etc/elasticsearch/jvm.options` (when installing from the Debian or RPM
|
||||
packages).
|
||||
|
||||
This file contains a line-delimited list of JVM arguments following
|
||||
a special syntax:
|
||||
|
||||
* lines consisting of whitespace only are ignored
|
||||
* lines beginning with `#` are treated as comments and are ignored
|
||||
+
|
||||
[source,text]
|
||||
-------------------------------------
|
||||
# this is a comment
|
||||
-------------------------------------
|
||||
|
||||
* lines beginning with a `-` are treated as a JVM option that applies
|
||||
independent of the version of the JVM
|
||||
+
|
||||
[source,text]
|
||||
-------------------------------------
|
||||
-Xmx2g
|
||||
-------------------------------------
|
||||
|
||||
* lines beginning with a number followed by a `:` followed by a `-` are treated
|
||||
as a JVM option that applies only if the version of the JVM matches the number
|
||||
+
|
||||
[source,text]
|
||||
-------------------------------------
|
||||
8:-Xmx2g
|
||||
-------------------------------------
|
||||
|
||||
* lines beginning with a number followed by a `-` followed by a `:` are treated
|
||||
as a JVM option that applies only if the version of the JVM is greater than or
|
||||
equal to the number
|
||||
+
|
||||
[source,text]
|
||||
-------------------------------------
|
||||
8-:-Xmx2g
|
||||
-------------------------------------
|
||||
|
||||
* lines beginning with a number followed by a `-` followed by a number followed
|
||||
by a `:` are treated as a JVM option that applies only if the version of the
|
||||
JVM falls in the range of the two numbers
|
||||
+
|
||||
[source,text]
|
||||
-------------------------------------
|
||||
8-9:-Xmx2g
|
||||
-------------------------------------
|
||||
|
||||
* all other lines are rejected
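The line syntax listed above can be sketched as a small Python function. This is an illustration of the documented rules only, not the actual `JvmOptionsParser` implementation; the name `applicable_option` and the use of `ValueError` for rejected lines are assumptions.

```python
import re

# "<n>:", "<n>-:" or "<n>-<m>:" followed by an option that starts with "-".
_VERSIONED = re.compile(r"^(\d+)(?:(-)(\d+)?)?:(-.*)$")


def applicable_option(line, java_major_version):
    """Return the JVM option on this line if it applies, else None.

    Blank lines and comments yield None. Lines matching none of the
    documented forms raise ValueError.
    """
    text = line.rstrip("\r\n")
    if not text.strip() or text.startswith("#"):
        return None                      # whitespace-only line or comment
    if text.startswith("-"):
        return text                      # applies to every JVM version
    match = _VERSIONED.match(text)
    if match is None:
        raise ValueError("invalid jvm.options line: %r" % line)
    lower = int(match.group(1))
    if match.group(2) is None:
        upper = lower                    # "8:-Xmx2g": exact version
    elif match.group(3) is None:
        upper = None                     # "8-:-Xmx2g": this version or later
    else:
        upper = int(match.group(3))      # "8-9:-Xmx2g": inclusive range
    if java_major_version < lower:
        return None
    if upper is not None and java_major_version > upper:
        return None
    return match.group(4)


lines = ["# this is a comment", "-Xmx2g", "8:-Xmx2g", "8-:-Xmx2g", "8-9:-Xmx2g"]
options_for_java_10 = [applicable_option(l, 10) for l in lines]
```

For a Java 10 JVM, only the unconditional line and the open-ended `8-:` line contribute an option.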
|
||||
|
||||
You can add custom JVM flags to this file and check this configuration into your
|
||||
version control system.
|
||||
|
||||
An alternative mechanism for setting Java Virtual Machine options is via the
|
||||
`ES_JAVA_OPTS` environment variable. For instance:
|
||||
|
||||
[source,sh]
|
||||
---------------------------------
|
||||
export ES_JAVA_OPTS="$ES_JAVA_OPTS -Djava.io.tmpdir=/path/to/temp/dir"
|
||||
./bin/elasticsearch
|
||||
---------------------------------
|
||||
|
||||
When using the RPM or Debian packages, `ES_JAVA_OPTS` can be specified in the
|
||||
<<sysconfig,system configuration file>>.
|
||||
|
||||
The JVM has a built-in mechanism for observing the `JAVA_TOOL_OPTIONS`
|
||||
environment variable. We intentionally ignore this environment variable in our
|
||||
packaging scripts. The primary reason for this is that on some OS (e.g., Ubuntu)
|
||||
there are agents installed by default via this environment variable that we do
|
||||
not want interfering with Elasticsearch.
|
||||
|
||||
Additionally, some other Java programs support the `JAVA_OPTS` environment
|
||||
variable. This is *not* a mechanism built into the JVM but instead a convention
|
||||
in the ecosystem. However, we do not support this environment variable, instead
|
||||
supporting setting JVM options via the `jvm.options` file or the environment
|
||||
variable `ES_JAVA_OPTS` as above.
|
|
@@ -6,13 +6,13 @@ resources available to it. In order to do so, you need to configure your
 operating system to allow the user running Elasticsearch to access more
 resources than allowed by default.
 
-The following settings *must* be addressed before going to production:
+The following settings *must* be considered before going to production:
 
-* <<heap-size,Set JVM heap size>>
 * <<setup-configuration-memory,Disable swapping>>
 * <<file-descriptors,Increase file descriptors>>
 * <<vm-max-map-count,Ensure sufficient virtual memory>>
 * <<max-number-of-threads,Ensure sufficient threads>>
+* <<networkaddress-cache-ttl,JVM DNS cache settings>>
 
 [[dev-vs-prod]]
 [float]
@@ -31,8 +31,6 @@ lose data because of a malconfigured server.
 
 include::sysconfig/configuring.asciidoc[]
 
-include::sysconfig/heap_size.asciidoc[]
-
 include::sysconfig/swap.asciidoc[]
 
 include::sysconfig/file-descriptors.asciidoc[]
@@ -36,7 +36,6 @@ The new limit is only applied during the current session.
 
 You can consult all currently applied limits with `ulimit -a`.
 
-
 [[limits.conf]]
 ==== `/etc/security/limits.conf`
 
@@ -66,7 +65,6 @@ following line:
 --------------------------------
 ===============================
 
-
 [[sysconfig]]
 ==== Sysconfig file
 
@@ -81,7 +79,6 @@ Debian:: `/etc/default/elasticsearch`
 However, for systems which use `systemd`, system limits need to be specified
 via <<systemd,systemd>>.
 
-
 [[systemd]]
 ==== Systemd configuration
 
@@ -110,57 +107,3 @@ Once finished, run the following command to reload units:
|
|||
---------------------------------
|
||||
sudo systemctl daemon-reload
|
||||
---------------------------------
|
||||
|
||||
[[jvm-options]]
|
||||
==== Setting JVM options
|
||||
|
||||
The preferred method of setting Java Virtual Machine options (including
|
||||
system properties and JVM flags) is via the `jvm.options` configuration
|
||||
file. The default location of this file is `config/jvm.options` (when
|
||||
installing from the tar or zip distributions) and
|
||||
`/etc/elasticsearch/jvm.options` (when installing from the Debian or RPM
|
||||
packages). This file contains a line-delimited list of JVM arguments following
|
||||
a special syntax:
|
||||
- lines beginning with `#` are treated as comments and are ignored
|
||||
- lines consisting only of whitespace are ignored
|
||||
- lines beginning with a `-` are treated as a JVM option that applies
|
||||
independent of the version of the JVM
|
||||
- lines beginning with a number followed by a `:` followed by a `-` are treated
|
||||
as a JVM option that applies only if the version of the JVM matches the
|
||||
number
|
||||
- lines beginning with a number followed by a `-` followed by a `:` are treated
|
||||
as a JVM option that applies only if the version of the JVM is greater than
|
||||
or equal to the number
|
||||
- lines beginning with a number followed by a `-` followed by a number followed
|
||||
by a `:` are treated as a JVM option that applies only if
|
||||
the version of the JVM falls in the range of the two numbers
|
||||
- all other lines are rejected
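
The line syntax above can be sketched as a small classifier. This is only an illustration, not the actual Elasticsearch parser: the function name `applicable_option` and the regular expression are assumptions made for this example, and the range form is assumed to be written as `8-9:-Xmx2g` (number, dash, number, colon), as described in the `JvmOptionsParser` Javadoc.

```python
import re

# Assumed pattern: an optional version prefix ("8:", "8-:" or "8-9:")
# followed by the JVM option itself, which must start with "-".
_LINE = re.compile(r"^(?:(\d+)(?:(-)(\d+)?)?:)?(-.*)$")


def applicable_option(line, java_major):
    """Return the JVM option if it applies to java_major, else None.

    Comment and whitespace-only lines yield None; lines that match no
    known form raise ValueError, mirroring "all other lines are rejected".
    """
    if line.startswith("#") or not line.strip():
        return None
    match = _LINE.match(line.rstrip())
    if match is None:
        raise ValueError("improperly formatted line: %r" % line)
    lower, dash, upper, option = match.groups()
    if lower is None:
        # Plain "-..." line: applies independent of the JVM version.
        return option
    low = int(lower)
    if dash is None:
        high = low              # "8:"   applies to exactly this version
    elif upper is None:
        high = None             # "8-:"  applies to this version and above
    else:
        high = int(upper)       # "8-9:" applies to the inclusive range
    if java_major < low:
        return None
    if high is not None and java_major > high:
        return None
    return option
```

For instance, under these assumptions `applicable_option("8-9:-Xmx2g", 10)` returns `None` because Java 10 falls outside the range, while a line without a leading `-` or version prefix raises `ValueError`.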
|
||||
|
||||
|
||||
You can add custom JVM flags to this file and
|
||||
check this configuration into your version control system.
|
||||
|
||||
An alternative mechanism for setting Java Virtual Machine options is
|
||||
via the `ES_JAVA_OPTS` environment variable. For instance:
|
||||
|
||||
[source,sh]
|
||||
---------------------------------
|
||||
export ES_JAVA_OPTS="$ES_JAVA_OPTS -Djava.io.tmpdir=/path/to/temp/dir"
|
||||
./bin/elasticsearch
|
||||
---------------------------------
|
||||
|
||||
When using the RPM or Debian packages, `ES_JAVA_OPTS` can be specified in the
|
||||
<<sysconfig,system configuration file>>.
|
||||
|
||||
The JVM has a built-in mechanism for observing the `JAVA_TOOL_OPTIONS`
|
||||
environment variable. We intentionally ignore this environment variable in our
|
||||
packaging scripts. The primary reason is that on some operating systems (e.g., Ubuntu)
|
||||
there are agents installed by default via this environment variable that we do
|
||||
not want interfering with Elasticsearch.
|
||||
|
||||
Additionally, some other Java programs support the `JAVA_OPTS` environment
|
||||
variable. This is *not* a mechanism built into the JVM but instead a convention
|
||||
in the ecosystem. However, we do not support this environment variable, instead
|
||||
supporting setting JVM options via the `jvm.options` file or the environment
|
||||
variable `ES_JAVA_OPTS` as above.
|
||||
|
||||
|
|
|
@ -1,74 +0,0 @@
|
|||
[[heap-size]]
|
||||
=== Set JVM heap size via jvm.options
|
||||
|
||||
By default, Elasticsearch tells the JVM to use a heap with a minimum
|
||||
and maximum size of 1 GB. When moving to production, it is
|
||||
important to configure heap size to ensure that Elasticsearch has enough
|
||||
heap available.
|
||||
|
||||
Elasticsearch will assign the entire heap specified in <<jvm-options,jvm.options>>
|
||||
via the Xms (minimum heap size) and Xmx (maximum heap size) settings.
|
||||
|
||||
The values for these settings depend on the amount of RAM available on
|
||||
your server. Good rules of thumb are:
|
||||
|
||||
* Set the minimum heap size (Xms) and maximum heap size (Xmx) to be
|
||||
equal to each other.
|
||||
|
||||
* The more heap available to Elasticsearch, the more memory it can use for
|
||||
caching. But note that too much heap can subject you to long garbage
|
||||
collection pauses.
|
||||
|
||||
* Set Xmx to no more than 50% of your physical RAM, to ensure that there
|
||||
is enough physical RAM left for kernel file system caches.
|
||||
|
||||
* Don’t set Xmx to above the cutoff that the JVM uses for compressed
|
||||
object pointers (compressed oops); the exact cutoff varies but is
|
||||
near 32 GB. You can verify that you are under the limit by looking
|
||||
for a line in the logs like the following:
|
||||
+
|
||||
heap size [1.9gb], compressed ordinary object pointers [true]
|
||||
|
||||
* Even better, try to stay below the threshold for zero-based
|
||||
compressed oops; the exact cutoff varies but 26 GB is safe on most
|
||||
systems and can be as large as 30 GB on some systems. You can verify
|
||||
that you are under the limit by starting Elasticsearch with the JVM
|
||||
options `-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode`
|
||||
and looking for a line like the following:
|
||||
+
|
||||
--
|
||||
heap address: 0x000000011be00000, size: 27648 MB, zero based Compressed Oops
|
||||
|
||||
showing that zero-based compressed oops are enabled instead of
|
||||
|
||||
heap address: 0x0000000118400000, size: 28672 MB, Compressed Oops with base: 0x00000001183ff000
|
||||
--
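
The rules of thumb above can be combined into a rough starting value. The following is a minimal sketch, not an official sizing tool: the helper name `suggested_heap_options` is hypothetical, and the 26 GB ceiling is simply the "safe on most systems" figure quoted above for zero-based compressed oops, so it should still be verified on the actual machine.

```python
def suggested_heap_options(physical_ram_mb, ceiling_mb=26 * 1024):
    """Return jvm.options lines for a heap sized by the rules of thumb.

    The heap is at most 50% of physical RAM and is capped at ceiling_mb
    (assumed here to be 26 GB). Xms and Xmx are set to the same value.
    """
    heap_mb = min(physical_ram_mb // 2, ceiling_mb)
    return ["-Xms%dm" % heap_mb, "-Xmx%dm" % heap_mb]


# Example: print the lines for a server with 64 GB of RAM.
for option in suggested_heap_options(64 * 1024):
    print(option)
```

The result is only a starting point; the log line and the `-XX:+PrintCompressedOopsMode` output described above remain the way to confirm that compressed oops are actually in effect.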
|
||||
|
||||
Here are examples of how to set the heap size via the jvm.options file:
|
||||
|
||||
[source,txt]
|
||||
------------------
|
||||
-Xms2g <1>
|
||||
-Xmx2g <2>
|
||||
------------------
|
||||
<1> Set the minimum heap size to 2g.
|
||||
<2> Set the maximum heap size to 2g.
|
||||
|
||||
It is also possible to set the heap size via an environment variable.
|
||||
This can be done by commenting out the `Xms` and `Xmx` settings
|
||||
in the jvm.options file and setting these values via `ES_JAVA_OPTS`:
|
||||
|
||||
[source,sh]
|
||||
------------------
|
||||
ES_JAVA_OPTS="-Xms2g -Xmx2g" ./bin/elasticsearch <1>
|
||||
ES_JAVA_OPTS="-Xms4000m -Xmx4000m" ./bin/elasticsearch <2>
|
||||
------------------
|
||||
<1> Set the minimum and maximum heap size to 2 GB.
|
||||
<2> Set the minimum and maximum heap size to 4000 MB.
|
||||
|
||||
NOTE: Configuring the heap for the <<windows-service,Windows service>>
|
||||
differs from the above. The values initially populated for the
|
||||
Windows service can be configured as above but are different after the
|
||||
service has been installed. Consult the
|
||||
<<windows-service,Windows service documentation>> for additional
|
||||
details.
|
|
@ -42,7 +42,7 @@ Another option available on Linux systems is to ensure that the sysctl value
|
|||
should not lead to swapping under normal circumstances, while still allowing the
|
||||
whole system to swap in emergency conditions.
|
||||
|
||||
[[mlockall]]
|
||||
[[bootstrap-memory_lock]]
|
||||
==== Enable `bootstrap.memory_lock`
|
||||
|
||||
Another option is to use
|
||||
|
|