Reorganize configuring Elasticsearch docs

This commit reorganizes some of the content in the configuring
Elasticsearch section of the docs. The changes are:
 - move JVM options out of system configuration into configuring
   Elasticsearch
 - move JVM options to its own page of the docs
 - move configuring the heap to important Elasticsearch settings
 - move configuring the heap to its own page of the docs
 - move all important settings to individual pages in the docs
 - remove bootstrap.memory_lock from important settings, this is covered
   in the swap section of system configuration

Relates #27755
Jason Tedor 2017-12-12 10:24:37 -05:00 committed by GitHub
parent 77617c8e62
commit 008296e2b6
17 changed files with 365 additions and 342 deletions

View File

@ -137,12 +137,12 @@ final class JvmOptionsParser {
* only
* </li>
* <li>
* a line starting with a number followed by a dash is treated as a JVM option that applies to the matching Java specified major
* version and all larger Java major versions
* a line starting with a number followed by a dash followed by a colon is treated as a JVM option that applies to the matching
* Java specified major version and all larger Java major versions
* </li>
* <li>
* a line starting with a number followed by a dash followed by a number is treated as a JVM option that applies to the
* specified range of matching Java major versions
* a line starting with a number followed by a dash followed by a number followed by a colon is treated as a JVM option that
* applies to the specified range of matching Java major versions
* </li>
* </ul>
*

View File

@ -40,6 +40,8 @@ include::setup/install.asciidoc[]
include::setup/configuration.asciidoc[]
include::setup/jvm-options.asciidoc[]
include::setup/secure-settings.asciidoc[]
include::setup/logging-config.asciidoc[]

View File

@ -67,13 +67,12 @@ If a JVM is started with unequal initial and max heap size, it can be
prone to pauses as the JVM heap is resized during system usage. To avoid
these resize pauses, it's best to start the JVM with the initial heap
size equal to the maximum heap size. Additionally, if
<<bootstrap.memory_lock,`bootstrap.memory_lock`>> is enabled, the JVM
<<bootstrap-memory_lock,`bootstrap.memory_lock`>> is enabled, the JVM
will lock the initial size of the heap on startup. If the initial heap
size is not equal to the maximum heap size, after a resize it will not
be the case that all of the JVM heap is locked in memory. To pass the
heap size check, you must configure the <<heap-size,heap size>>.
=== File descriptor check
File descriptors are a Unix construct for tracking open "files". In Unix
@ -95,13 +94,13 @@ Elasticsearch would much rather use to service requests. There are
several ways to configure a system to disallow swapping. One way is by
requesting the JVM to lock the heap in memory through `mlockall` (Unix)
or virtual lock (Windows). This is done via the Elasticsearch setting
<<bootstrap.memory_lock,`bootstrap.memory_lock`>>. However, there are
<<bootstrap-memory_lock,`bootstrap.memory_lock`>>. However, there are
cases where this setting can be passed to Elasticsearch but
Elasticsearch is not able to lock the heap (e.g., if the `elasticsearch`
user does not have `memlock unlimited`). The memory lock check verifies
that *if* the `bootstrap.memory_lock` setting is enabled, that the JVM
was successfully able to lock the heap. To pass the memory lock check,
you might have to configure <<mlockall,`mlockall`>>.
you might have to configure <<bootstrap-memory_lock,`bootstrap.memory_lock`>>.
[[max-number-threads-check]]
=== Maximum number of threads check

View File

@ -2,211 +2,31 @@
== Important Elasticsearch configuration
While Elasticsearch requires very little configuration, there are a number of
settings which need to be configured manually and should definitely be
configured before going into production.
settings which need to be considered before going into production.
* <<path-settings,`path.data` and `path.logs`>>
* <<cluster.name,`cluster.name`>>
* <<node.name,`node.name`>>
* <<bootstrap.memory_lock,`bootstrap.memory_lock`>>
* <<network.host,`network.host`>>
* <<unicast.hosts,`discovery.zen.ping.unicast.hosts`>>
* <<minimum_master_nodes,`discovery.zen.minimum_master_nodes`>>
* <<heap-dump-path,JVM heap dump path>>
The following settings *must* be considered before going to production:
[float]
[[path-settings]]
=== `path.data` and `path.logs`
* <<path-settings,Path settings>>
* <<cluster.name,Cluster name>>
* <<node.name,Node name>>
* <<network.host,Network host>>
* <<discovery-settings,Discovery settings>>
* <<heap-size,Heap size>>
* <<heap-dump-path,Heap dump path>>
* <<gc-logging,GC logging>>
If you are using the `.zip` or `.tar.gz` archives, the `data` and `logs`
directories are sub-folders of `$ES_HOME`. If these important folders are
left in their default locations, there is a high risk of them being deleted
while upgrading Elasticsearch to a new version.
include::important-settings/path-settings.asciidoc[]
In production use, you will almost certainly want to change the locations of
the data and log folder:
include::important-settings/cluster-name.asciidoc[]
[source,yaml]
--------------------------------------------------
path:
logs: /var/log/elasticsearch
data: /var/data/elasticsearch
--------------------------------------------------
include::important-settings/node-name.asciidoc[]
The RPM and Debian distributions already use custom paths for `data` and
`logs`.
include::important-settings/network-host.asciidoc[]
The `path.data` settings can be set to multiple paths, in which case all paths
will be used to store data (although the files belonging to a single shard
will all be stored on the same data path):
include::important-settings/discovery-settings.asciidoc[]
[source,yaml]
--------------------------------------------------
path:
data:
- /mnt/elasticsearch_1
- /mnt/elasticsearch_2
- /mnt/elasticsearch_3
--------------------------------------------------
include::important-settings/heap-size.asciidoc[]
[float]
[[cluster.name]]
=== `cluster.name`
include::important-settings/heap-dump-path.asciidoc[]
A node can only join a cluster when it shares its `cluster.name` with all the
other nodes in the cluster. The default name is `elasticsearch`, but you
should change it to an appropriate name which describes the purpose of the
cluster.
[source,yaml]
--------------------------------------------------
cluster.name: logging-prod
--------------------------------------------------
Make sure that you don't reuse the same cluster names in different
environments, otherwise you might end up with nodes joining the wrong cluster.
[float]
[[node.name]]
=== `node.name`
By default, Elasticsearch will take the 7 first character of the randomly generated uuid used as the node id.
Note that the node id is persisted and does not change when a node restarts and therefore the default node name
will also not change.
It is worth configuring a more meaningful name which will also have the
advantage of persisting after restarting the node:
[source,yaml]
--------------------------------------------------
node.name: prod-data-2
--------------------------------------------------
The `node.name` can also be set to the server's HOSTNAME as follows:
[source,yaml]
--------------------------------------------------
node.name: ${HOSTNAME}
--------------------------------------------------
[float]
[[bootstrap.memory_lock]]
=== `bootstrap.memory_lock`
It is vitally important to the health of your node that none of the JVM is
ever swapped out to disk. One way of achieving that is set the
`bootstrap.memory_lock` setting to `true`.
For this setting to have effect, other system settings need to be configured
first. See <<mlockall>> for more details about how to set up memory locking
correctly.
[float]
[[network.host]]
=== `network.host`
By default, Elasticsearch binds to loopback addresses only -- e.g. `127.0.0.1`
and `[::1]`. This is sufficient to run a single development node on a server.
TIP: In fact, more than one node can be started from the same `$ES_HOME` location
on a single node. This can be useful for testing Elasticsearch's ability to
form clusters, but it is not a configuration recommended for production.
In order to communicate and to form a cluster with nodes on other servers,
your node will need to bind to a non-loopback address. While there are many
<<modules-network,network settings>>, usually all you need to configure is
`network.host`:
[source,yaml]
--------------------------------------------------
network.host: 192.168.1.10
--------------------------------------------------
The `network.host` setting also understands some special values such as
`_local_`, `_site_`, `_global_` and modifiers like `:ip4` and `:ip6`, details
of which can be found in <<network-interface-values>>.
IMPORTANT: As soon as you provide a custom setting for `network.host`,
Elasticsearch assumes that you are moving from development mode to production
mode, and upgrades a number of system startup checks from warnings to
exceptions. See <<dev-vs-prod>> for more information.
[float]
[[unicast.hosts]]
=== `discovery.zen.ping.unicast.hosts`
Out of the box, without any network configuration, Elasticsearch will bind to
the available loopback addresses and will scan ports 9300 to 9305 to try to
connect to other nodes running on the same server. This provides an auto-
clustering experience without having to do any configuration.
When the moment comes to form a cluster with nodes on other servers, you have
to provide a seed list of other nodes in the cluster that are likely to be
live and contactable. This can be specified as follows:
[source,yaml]
--------------------------------------------------
discovery.zen.ping.unicast.hosts:
- 192.168.1.10:9300
- 192.168.1.11 <1>
- seeds.mydomain.com <2>
--------------------------------------------------
<1> The port will default to `transport.profiles.default.port` and fallback to `transport.tcp.port` if not specified.
<2> A hostname that resolves to multiple IP addresses will try all resolved addresses.
[float]
[[minimum_master_nodes]]
=== `discovery.zen.minimum_master_nodes`
To prevent data loss, it is vital to configure the
`discovery.zen.minimum_master_nodes` setting so that each master-eligible node
knows the _minimum number of master-eligible nodes_ that must be visible in
order to form a cluster.
Without this setting, a cluster that suffers a network failure is at risk of
having the cluster split into two independent clusters -- a split brain --
which will lead to data loss. A more detailed explanation is provided
in <<split-brain>>.
To avoid a split brain, this setting should be set to a _quorum_ of master-
eligible nodes:
(master_eligible_nodes / 2) + 1
In other words, if there are three master-eligible nodes, then minimum master
nodes should be set to `(3 / 2) + 1` or `2`:
[source,yaml]
--------------------------------------------------
discovery.zen.minimum_master_nodes: 2
--------------------------------------------------
[float]
[[heap-dump-path]]
=== JVM heap dump path
The <<rpm,RPM>> and <<deb,Debian>> package distributions default to configuring
the JVM to dump the heap on out of memory exceptions to
`/var/lib/elasticsearch`. If this path is not suitable for storing heap dumps,
you should modify the entry `-XX:HeapDumpPath=/var/lib/elasticsearch` in
<<jvm-options,`jvm.options`>> to an alternate path. If you specify a filename
instead of a directory, the JVM will repeatedly use the same file; this is one
mechanism for preventing heap dumps from accumulating in the heap dump path.
Alternatively, you can configure a scheduled task via your OS to remove heap
dumps that are older than a configured age.
Note that the archive distributions do not configure the heap dump path by
default. Instead, the JVM will default to dumping to the working directory for
the Elasticsearch process. If you wish to configure a heap dump path, you should
modify the entry `#-XX:HeapDumpPath=/heap/dump/path` in
<<jvm-options,`jvm.options`>> to remove the comment marker `#` and to specify an
actual path.
[float]
[[gc-logging]]
=== GC logging
By default, Elasticsearch enables GC logs. These are configured in
<<jvm-options,`jvm.options`>> and default to the same default location as the
Elasticsearch logs. The default configuration rotates the logs every 64 MB and
can consume up to 2 GB of disk space.
include::important-settings/gc-logging.asciidoc[]

View File

@ -0,0 +1,14 @@
[[cluster.name]]
=== `cluster.name`
A node can only join a cluster when it shares its `cluster.name` with all the
other nodes in the cluster. The default name is `elasticsearch`, but you should
change it to an appropriate name which describes the purpose of the cluster.
[source,yaml]
--------------------------------------------------
cluster.name: logging-prod
--------------------------------------------------
Make sure that you don't reuse the same cluster names in different environments,
otherwise you might end up with nodes joining the wrong cluster.

View File

@ -0,0 +1,58 @@
[[discovery-settings]]
=== Discovery settings
Elasticsearch uses a custom discovery implementation called "Zen Discovery" for
node-to-node clustering and master election. There are two important discovery
settings that should be configured before going to production.
[float]
[[unicast.hosts]]
==== `discovery.zen.ping.unicast.hosts`
Out of the box, without any network configuration, Elasticsearch will bind to
the available loopback addresses and will scan ports 9300 to 9305 to try to
connect to other nodes running on the same server. This provides an auto-
clustering experience without having to do any configuration.
When the moment comes to form a cluster with nodes on other servers, you have to
provide a seed list of other nodes in the cluster that are likely to be live and
contactable. This can be specified as follows:
[source,yaml]
--------------------------------------------------
discovery.zen.ping.unicast.hosts:
- 192.168.1.10:9300
- 192.168.1.11 <1>
- seeds.mydomain.com <2>
--------------------------------------------------
<1> The port will default to `transport.profiles.default.port` and fall back to
`transport.tcp.port` if not specified.
<2> A hostname that resolves to multiple IP addresses will try all resolved
addresses.
[float]
[[minimum_master_nodes]]
==== `discovery.zen.minimum_master_nodes`
To prevent data loss, it is vital to configure the
`discovery.zen.minimum_master_nodes` setting so that each master-eligible node
knows the _minimum number of master-eligible nodes_ that must be visible in
order to form a cluster.
Without this setting, a cluster that suffers a network failure is at risk of
having the cluster split into two independent clusters -- a split brain -- which
will lead to data loss. A more detailed explanation is provided in
<<split-brain>>.
To avoid a split brain, this setting should be set to a _quorum_ of
master-eligible nodes:
(master_eligible_nodes / 2) + 1
In other words, if there are three master-eligible nodes, then minimum master
nodes should be set to `(3 / 2) + 1` or `2`:
[source,yaml]
--------------------------------------------------
discovery.zen.minimum_master_nodes: 2
--------------------------------------------------
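The quorum formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `minimum_master_nodes` is a hypothetical helper and not part of Elasticsearch.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the quorum size: (master_eligible_nodes / 2) + 1, using integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    # Integer division matches the docs' example: (3 / 2) + 1 == 2.
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for nodes in (1, 3, 4, 5):
        print(f"{nodes} master-eligible nodes -> "
              f"discovery.zen.minimum_master_nodes: {minimum_master_nodes(nodes)}")
```

The value has to be recomputed whenever the number of master-eligible nodes changes, since a stale setting no longer represents a quorum.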

View File

@ -0,0 +1,7 @@
[[gc-logging]]
=== GC logging
By default, Elasticsearch enables GC logs. These are configured in
<<jvm-options,`jvm.options`>> and default to the same default location as the
Elasticsearch logs. The default configuration rotates the logs every 64 MB and
can consume up to 2 GB of disk space.

View File

@ -0,0 +1,19 @@
[[heap-dump-path]]
=== JVM heap dump path
The <<rpm,RPM>> and <<deb,Debian>> package distributions default to configuring
the JVM to dump the heap on out of memory exceptions to
`/var/lib/elasticsearch`. If this path is not suitable for storing heap dumps,
you should modify the entry `-XX:HeapDumpPath=/var/lib/elasticsearch` in
<<jvm-options,`jvm.options`>> to an alternate path. If you specify a filename
instead of a directory, the JVM will repeatedly use the same file; this is one
mechanism for preventing heap dumps from accumulating in the heap dump path.
Alternatively, you can configure a scheduled task via your OS to remove heap
dumps that are older than a configured age.
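As one possible shape for such a scheduled task, the following Python sketch deletes heap dumps older than a configured age. It assumes the dumps carry the JVM's default `.hprof` extension and live directly in the dump directory; the function name and the seven-day age in the usage line are illustrative choices, not Elasticsearch defaults.

```python
import time
from pathlib import Path


def remove_old_heap_dumps(dump_dir, max_age_days, now=None):
    """Delete *.hprof files in dump_dir older than max_age_days; return the removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # age threshold in seconds since the epoch
    removed = []
    for path in Path(dump_dir).glob("*.hprof"):
        # Compare the file's last-modified time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    # Assumed path: the RPM/Debian default mentioned above.
    for dump in remove_old_heap_dumps("/var/lib/elasticsearch", max_age_days=7):
        print(f"removed {dump}")
```

A script like this could be run from cron or a systemd timer by a user with write access to the heap dump directory.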
Note that the archive distributions do not configure the heap dump path by
default. Instead, the JVM will default to dumping to the working directory for
the Elasticsearch process. If you wish to configure a heap dump path, you should
modify the entry `#-XX:HeapDumpPath=/heap/dump/path` in
<<jvm-options,`jvm.options`>> to remove the comment marker `#` and to specify an
actual path.

View File

@ -0,0 +1,72 @@
[[heap-size]]
=== Setting the heap size
By default, Elasticsearch tells the JVM to use a heap with a minimum and maximum
size of 1 GB. When moving to production, it is important to configure heap size
to ensure that Elasticsearch has enough heap available.
Elasticsearch will assign the entire heap specified in
<<jvm-options,jvm.options>> via the `Xms` (minimum heap size) and `Xmx` (maximum
heap size) settings.
The values for these settings depend on the amount of RAM available on your
server. Good rules of thumb are:
* Set the minimum heap size (`Xms`) and maximum heap size (`Xmx`) to be equal to
each other.
* The more heap available to Elasticsearch, the more memory it can use for
caching. But note that too much heap can subject you to long garbage
collection pauses.
* Set `Xmx` to no more than 50% of your physical RAM, to ensure that there is
enough physical RAM left for kernel file system caches.
* Don't set `Xmx` to above the cutoff that the JVM uses for compressed object
pointers (compressed oops); the exact cutoff varies but is near 32 GB. You can
verify that you are under the limit by looking for a line in the logs like the
following:
+
heap size [1.9gb], compressed ordinary object pointers [true]
* Even better, try to stay below the threshold for zero-based compressed oops;
the exact cutoff varies, but 26 GB is safe on most systems and the threshold can
be as large as 30 GB on some systems. You can verify that you are under the limit by
starting Elasticsearch with the JVM options `-XX:+UnlockDiagnosticVMOptions
-XX:+PrintCompressedOopsMode` and looking for a line like the following:
+
--
heap address: 0x000000011be00000, size: 27648 MB, zero based Compressed Oops
showing that zero-based compressed oops are enabled instead of
heap address: 0x0000000118400000, size: 28672 MB, Compressed Oops with base: 0x00000001183ff000
--
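The rules of thumb above can be combined into a rough starting point, sketched here in Python. The helper names are hypothetical, and the 26 GB cap is simply the "safe on most systems" figure quoted above; the real cutoff varies by system and should be verified from the logs as described.

```python
def suggested_heap_mb(physical_ram_mb, zero_based_cutoff_mb=26 * 1024):
    """Return a heap size in MB: at most 50% of RAM, capped at the assumed cutoff."""
    return min(physical_ram_mb // 2, zero_based_cutoff_mb)


def jvm_heap_options(heap_mb):
    """Return equal Xms/Xmx flags so the JVM never resizes the heap."""
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


if __name__ == "__main__":
    for ram_gb in (8, 32, 128):
        heap = suggested_heap_mb(ram_gb * 1024)
        print(f"{ram_gb} GB RAM -> {' '.join(jvm_heap_options(heap))}")
```

This gives only an upper bound from the two memory rules; the appropriate heap for a given workload may well be smaller.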
Here are examples of how to set the heap size via the jvm.options file:
[source,txt]
------------------
-Xms2g <1>
-Xmx2g <2>
------------------
<1> Set the minimum heap size to 2g.
<2> Set the maximum heap size to 2g.
It is also possible to set the heap size via an environment variable. This can
be done by commenting out the `Xms` and `Xmx` settings in the
<<jvm-options,`jvm.options`>> file and setting these values via `ES_JAVA_OPTS`:
[source,sh]
------------------
ES_JAVA_OPTS="-Xms2g -Xmx2g" ./bin/elasticsearch <1>
ES_JAVA_OPTS="-Xms4000m -Xmx4000m" ./bin/elasticsearch <2>
------------------
<1> Set the minimum and maximum heap size to 2 GB.
<2> Set the minimum and maximum heap size to 4000 MB.
NOTE: Configuring the heap for the <<windows-service,Windows service>> is
different than the above. The values initially populated for the Windows service
can be configured as above but are different after the service has been
installed. Consult the <<windows-service,Windows service documentation>> for
additional details.

View File

@ -0,0 +1,29 @@
[[network.host]]
=== `network.host`
By default, Elasticsearch binds to loopback addresses only -- e.g. `127.0.0.1`
and `[::1]`. This is sufficient to run a single development node on a server.
TIP: In fact, more than one node can be started from the same `$ES_HOME`
location on a single server. This can be useful for testing Elasticsearch's
ability to form clusters, but it is not a configuration recommended for
production.
In order to communicate and to form a cluster with nodes on other servers, your
node will need to bind to a non-loopback address. While there are many
<<modules-network,network settings>>, usually all you need to configure is
`network.host`:
[source,yaml]
--------------------------------------------------
network.host: 192.168.1.10
--------------------------------------------------
The `network.host` setting also understands some special values such as
`_local_`, `_site_`, `_global_` and modifiers like `:ip4` and `:ip6`, details of
which can be found in <<network-interface-values>>.
IMPORTANT: As soon as you provide a custom setting for `network.host`,
Elasticsearch assumes that you are moving from development mode to production
mode, and upgrades a number of system startup checks from warnings to
exceptions. See <<dev-vs-prod>> for more information.

View File

@ -0,0 +1,22 @@
[[node.name]]
=== `node.name`
By default, Elasticsearch uses the first seven characters of the randomly
generated UUID that serves as the node id as the node name. Note that the node id
is persisted and does not change when a node restarts, and therefore the default
node name will also not change.
It is worth configuring a more meaningful name which will also have the
advantage of persisting after restarting the node:
[source,yaml]
--------------------------------------------------
node.name: prod-data-2
--------------------------------------------------
The `node.name` can also be set to the server's HOSTNAME as follows:
[source,yaml]
--------------------------------------------------
node.name: ${HOSTNAME}
--------------------------------------------------

View File

@ -0,0 +1,32 @@
[[path-settings]]
=== `path.data` and `path.logs`
If you are using the `.zip` or `.tar.gz` archives, the `data` and `logs`
directories are sub-folders of `$ES_HOME`. If these important folders are left
in their default locations, there is a high risk of them being deleted while
upgrading Elasticsearch to a new version.
In production use, you will almost certainly want to change the locations of the
data and log folder:
[source,yaml]
--------------------------------------------------
path:
logs: /var/log/elasticsearch
data: /var/data/elasticsearch
--------------------------------------------------
The RPM and Debian distributions already use custom paths for `data` and `logs`.
The `path.data` setting can be set to multiple paths, in which case all paths
will be used to store data (although the files belonging to a single shard will
all be stored on the same data path):
[source,yaml]
--------------------------------------------------
path:
data:
- /mnt/elasticsearch_1
- /mnt/elasticsearch_2
- /mnt/elasticsearch_3
--------------------------------------------------

View File

@ -0,0 +1,82 @@
[[jvm-options]]
=== Setting JVM options
The preferred method of setting JVM options (including system properties and JVM
flags) is via the `jvm.options` configuration file. The default location of this
file is `config/jvm.options` (when installing from the tar or zip distributions)
and `/etc/elasticsearch/jvm.options` (when installing from the Debian or RPM
packages).
This file contains a line-delimited list of JVM arguments following
a special syntax:
* lines consisting of whitespace only are ignored
* lines beginning with `#` are treated as comments and are ignored
+
[source,text]
-------------------------------------
# this is a comment
-------------------------------------
* lines beginning with a `-` are treated as a JVM option that applies
independent of the version of the JVM
+
[source,text]
-------------------------------------
-Xmx2g
-------------------------------------
* lines beginning with a number followed by a `:` followed by a `-` are treated
as a JVM option that applies only if the version of the JVM matches the number
+
[source,text]
-------------------------------------
8:-Xmx2g
-------------------------------------
* lines beginning with a number followed by a `-` followed by a `:` are treated
as a JVM option that applies only if the version of the JVM is greater than or
equal to the number
+
[source,text]
-------------------------------------
8-:-Xmx2g
-------------------------------------
* lines beginning with a number followed by a `-` followed by a number followed
by a `:` are treated as a JVM option that applies only if the version of the
JVM falls in the range of the two numbers
+
[source,text]
-------------------------------------
8-9:-Xmx2g
-------------------------------------
* all other lines are rejected
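To make the line rules above concrete, here is a minimal Python sketch that applies them to a single line. It is not the real `JvmOptionsParser`; the function name is hypothetical, and treating the version range as inclusive at both ends is an assumption.

```python
import re

# number, optional "-", optional second number, ":", then an option starting with "-"
_VERSIONED = re.compile(r"^(\d+)(?:(-)(\d+)?)?:(-.*)$")


def jvm_option_for(line, java_major_version):
    """Return the option if it applies, None if ignored or not applicable.

    Raises ValueError for lines that match none of the documented forms.
    """
    text = line.rstrip()
    if not text or text.startswith("#"):
        return None  # whitespace-only line or comment
    if text.startswith("-"):
        return text  # applies independent of the JVM version
    match = _VERSIONED.match(text)
    if match is None:
        raise ValueError(f"invalid jvm.options line: {line!r}")
    lower = int(match.group(1))
    if match.group(2) is None:
        upper = lower  # "8:-Xmx2g": exactly this version
    elif match.group(3) is None:
        upper = None  # "8-:-Xmx2g": this version and later
    else:
        upper = int(match.group(3))  # "8-9:-Xmx2g": range, assumed inclusive
    in_range = java_major_version >= lower and (upper is None or java_major_version <= upper)
    return match.group(4) if in_range else None


if __name__ == "__main__":
    for sample in ("# this is a comment", "-Xmx2g", "8:-Xmx2g", "8-:-Xmx2g", "8-9:-Xmx2g"):
        print(f"{sample!r} on Java 9 -> {jvm_option_for(sample, 9)!r}")
```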
You can add custom JVM flags to this file and check this configuration into your
version control system.
An alternative mechanism for setting Java Virtual Machine options is via the
`ES_JAVA_OPTS` environment variable. For instance:
[source,sh]
---------------------------------
export ES_JAVA_OPTS="$ES_JAVA_OPTS -Djava.io.tmpdir=/path/to/temp/dir"
./bin/elasticsearch
---------------------------------
When using the RPM or Debian packages, `ES_JAVA_OPTS` can be specified in the
<<sysconfig,system configuration file>>.
The JVM has a built-in mechanism for observing the `JAVA_TOOL_OPTIONS`
environment variable. We intentionally ignore this environment variable in our
packaging scripts. The primary reason for this is that on some OS (e.g., Ubuntu)
there are agents installed by default via this environment variable that we do
not want interfering with Elasticsearch.
Additionally, some other Java programs support the `JAVA_OPTS` environment
variable. This is *not* a mechanism built into the JVM but instead a convention
in the ecosystem. However, we do not support this environment variable, instead
supporting setting JVM options via the `jvm.options` file or the environment
variable `ES_JAVA_OPTS` as above.

View File

@ -6,13 +6,13 @@ resources available to it. In order to do so, you need to configure your
operating system to allow the user running Elasticsearch to access more
resources than allowed by default.
The following settings *must* be addressed before going to production:
The following settings *must* be considered before going to production:
* <<heap-size,Set JVM heap size>>
* <<setup-configuration-memory,Disable swapping>>
* <<file-descriptors,Increase file descriptors>>
* <<vm-max-map-count,Ensure sufficient virtual memory>>
* <<max-number-of-threads,Ensure sufficient threads>>
* <<networkaddress-cache-ttl,JVM DNS cache settings>>
[[dev-vs-prod]]
[float]
@ -31,8 +31,6 @@ lose data because of a malconfigured server.
include::sysconfig/configuring.asciidoc[]
include::sysconfig/heap_size.asciidoc[]
include::sysconfig/swap.asciidoc[]
include::sysconfig/file-descriptors.asciidoc[]

View File

@ -36,7 +36,6 @@ The new limit is only applied during the current session.
You can consult all currently applied limits with `ulimit -a`.
[[limits.conf]]
==== `/etc/security/limits.conf`
@ -66,7 +65,6 @@ following line:
--------------------------------
===============================
[[sysconfig]]
==== Sysconfig file
@ -81,7 +79,6 @@ Debian:: `/etc/default/elasticsearch`
However, for systems which use `systemd`, system limits need to be specified
via <<systemd,systemd>>.
[[systemd]]
==== Systemd configuration
@ -110,57 +107,3 @@ Once finished, run the following command to reload units:
---------------------------------
sudo systemctl daemon-reload
---------------------------------
[[jvm-options]]
==== Setting JVM options
The preferred method of setting Java Virtual Machine options (including
system properties and JVM flags) is via the `jvm.options` configuration
file. The default location of this file is `config/jvm.options` (when
installing from the tar or zip distributions) and
`/etc/elasticsearch/jvm.options` (when installing from the Debian or RPM
packages). This file contains a line-delimited list of JVM arguments following
a special syntax:
- lines beginning with `#` are treated as comments and are ignored
- lines consisting only of whitespace are ignored
- lines beginning with a `-` are treated as a JVM option that applies
independent of the version of the JVM
- lines beginning with a number followed by a `:` followed by a `-` are treated
as a JVM option that applies only if the version of the JVM matches the
number
- lines beginning with a number followed by a `-` followed by a `:` are treated
as a JVM option that applies only if the version of the JVM is greater than
or equal to the number
- lines beginning with a number followed by a `-` followed by a `:` followed by
a `-` followed by a number are treated as a JVM option that applies only if
the version of the JVM falls in the range of the two numbers
- all other lines are rejected
You can add custom JVM flags to this file and
check this configuration into your version control system.
An alternative mechanism for setting Java Virtual Machine options is
via the `ES_JAVA_OPTS` environment variable. For instance:
[source,sh]
---------------------------------
export ES_JAVA_OPTS="$ES_JAVA_OPTS -Djava.io.tmpdir=/path/to/temp/dir"
./bin/elasticsearch
---------------------------------
When using the RPM or Debian packages, `ES_JAVA_OPTS` can be specified in the
<<sysconfig,system configuration file>>.
The JVM has a built-in mechanism for observing the `JAVA_TOOL_OPTIONS`
environment variable. We intentionally ignore this environment variable in our
packaging scripts. The primary reason for this is that on some OS (e.g., Ubuntu)
there are agents installed by default via this environment variable that we do
not want interfering with Elasticsearch.
Additionally, some other Java programs support the `JAVA_OPTS` environment
variable. This is *not* a mechanism built into the JVM but instead a convention
in the ecosystem. However, we do not support this environment variable, instead
supporting setting JVM options via the `jvm.options` file or the environment
variable `ES_JAVA_OPTS` as above.

View File

@ -1,74 +0,0 @@
[[heap-size]]
=== Set JVM heap size via jvm.options
By default, Elasticsearch tells the JVM to use a heap with a minimum
and maximum size of 1 GB. When moving to production, it is
important to configure heap size to ensure that Elasticsearch has enough
heap available.
Elasticsearch will assign the entire heap specified in <<jvm-options,jvm.options>>
via the Xms (minimum heap size) and Xmx (maximum heap size) settings.
The value for these setting depends on the amount of RAM available on
your server. Good rules of thumb are:
* Set the minimum heap size (Xms) and maximum heap size (Xmx) to be
equal to each other.
* The more heap available to Elasticsearch, the more memory it can use for
caching. But note that too much heap can subject you to long garbage
collection pauses.
* Set Xmx to no more than 50% of your physical RAM, to ensure that there
is enough physical RAM left for kernel file system caches.
* Don't set Xmx to above the cutoff that the JVM uses for compressed
object pointers (compressed oops); the exact cutoff varies but is
near 32 GB. You can verify that you are under the limit by looking
for a line in the logs like the following:
+
heap size [1.9gb], compressed ordinary object pointers [true]
* Even better, try to stay below the threshold for zero-based
compressed oops; the exact cutoff varies but 26 GB is safe on most
systems, but can be as large as 30 GB on some systems. You can verify
that you are under the limit by starting Elasticsearch with the JVM
options `-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode`
and looking for a line like the following:
+
--
heap address: 0x000000011be00000, size: 27648 MB, zero based Compressed Oops
showing that zero-based compressed oops are enabled instead of
heap address: 0x0000000118400000, size: 28672 MB, Compressed Oops with base: 0x00000001183ff000
--
Here are examples of how to set the heap size via the jvm.options file:
[source,txt]
------------------
-Xms2g <1>
-Xmx2g <2>
------------------
<1> Set the minimum heap size to 2g.
<2> Set the maximum heap size to 2g.
It is also possible to set the heap size via an environment variable.
This can be done by commenting out the `Xms` and `Xmx` settings
in the jvm.options file and setting these values via `ES_JAVA_OPTS`:
[source,sh]
------------------
ES_JAVA_OPTS="-Xms2g -Xmx2g" ./bin/elasticsearch <1>
ES_JAVA_OPTS="-Xms4000m -Xmx4000m" ./bin/elasticsearch <2>
------------------
<1> Set the minimum and maximum heap size to 2 GB.
<2> Set the minimum and maximum heap size to 4000 MB.
NOTE: Configuring the heap for the <<windows-service,Windows service>>
is different than the above. The values initially populated for the
Windows service can be configured as above but are different after the
service has been installed. Consult the
<<windows-service,Windows service documentation>> for additional
details.

View File

@ -42,7 +42,7 @@ Another option available on Linux systems is to ensure that the sysctl value
should not lead to swapping under normal circumstances, while still allowing the
whole system to swap in emergency conditions.
[[mlockall]]
[[bootstrap-memory_lock]]
==== Enable `bootstrap.memory_lock`
Another option is to use