[[setup-configuration]]
== Configuration

[float]
=== Environment Variables

The Elasticsearch startup scripts define a built-in set of `JAVA_OPTS` that is passed to the JVM. The most important of these are `-Xmx`, which controls the maximum memory allowed for the process, and `-Xms`, which controls the minimum memory allocated to the process (_in general, the more memory allocated to the process, the better_).

Most of the time it is better to leave the default `JAVA_OPTS` as they are, and use the `ES_JAVA_OPTS` environment variable to set or change JVM settings or arguments.

The `ES_HEAP_SIZE` environment variable allows you to set the heap memory that will be allocated to the Elasticsearch Java process. It assigns the same value to both the min and max heap, though these can be set explicitly (not recommended) with `ES_MIN_MEM` (defaults to `256m`) and `ES_MAX_MEM` (defaults to `1g`).

It is recommended to set the min and max memory to the same value, and to enable <<setup-configuration-memory,`mlockall`>>.
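
For example, a minimal sketch of setting these variables in a shell before starting Elasticsearch; the `4g` heap and the GC flag are illustrative values only, not recommendations:

[source,sh]
--------------------------------------------------
# Illustrative heap size: ES_HEAP_SIZE sets both the min (-Xms) and max (-Xmx) heap
export ES_HEAP_SIZE=4g

# Extra JVM arguments are passed through in addition to the built-in JAVA_OPTS
export ES_JAVA_OPTS="-XX:+PrintGCDetails"

./bin/elasticsearch
--------------------------------------------------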

[float]
[[system]]
=== System Configuration

[float]
[[file-descriptors]]
==== File Descriptors

Make sure to increase the number of open file descriptors on the machine (or for the user running Elasticsearch). Setting it to 32k or even 64k is recommended.
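
One common way to raise the limit on Linux is sketched below; the `elasticsearch` user name and the `65536` value are assumptions for illustration:

[source,sh]
--------------------------------------------------
# Temporary: raise the open file limit for the current shell before starting Elasticsearch
ulimit -n 65536

# Persistent: allow the elasticsearch user 64k open files (takes effect on next login)
echo "elasticsearch  -  nofile  65536" | sudo tee -a /etc/security/limits.conf
--------------------------------------------------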

To test how many open files the process can open, start it with `-Des.max-open-files` set to `true`. This will print the number of open files the process can open on startup.
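
For instance (a sketch; the flag is passed to the startup script exactly as written):

[source,sh]
--------------------------------------------------
./bin/elasticsearch -Des.max-open-files=true
--------------------------------------------------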

Alternatively, you can retrieve the `max_file_descriptors` for each node using the <<cluster-nodes-info>> API, with:

[source,sh]
--------------------------------------------------
curl localhost:9200/_nodes/process?pretty
--------------------------------------------------

[float]
[[vm-max-map-count]]
==== Virtual memory

Elasticsearch uses a <<default_fs,`hybrid mmapfs / niofs`>> directory by default to store its indices. The default operating system limits on mmap counts are likely to be too low, which may result in out of memory exceptions. On Linux, you can increase the limits by running the following command as `root`:

[source,sh]
-------------------------------------
sysctl -w vm.max_map_count=262144
-------------------------------------

To set this value permanently, update the `vm.max_map_count` setting in `/etc/sysctl.conf`.
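
For example, a minimal sketch of persisting the setting and reloading it without a reboot:

[source,sh]
-------------------------------------
# Append the setting to /etc/sysctl.conf, then reload kernel parameters
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
-------------------------------------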

NOTE: If you installed Elasticsearch using a package (.deb, .rpm) this setting will be changed automatically. To verify, run `sysctl vm.max_map_count`.

[float]
[[setup-configuration-memory]]
==== Memory Settings

Most operating systems try to use as much memory as possible for file system caches and eagerly swap out unused application memory, possibly resulting in the Elasticsearch process being swapped. Swapping is very bad for performance and for node stability, so it should be avoided at all costs.

There are three options:

* **Disable swap**
+
--

The simplest option is to completely disable swap. Usually Elasticsearch is the only service running on a box, and its memory usage is controlled by the `ES_HEAP_SIZE` environment variable. There should be no need to have swap enabled.

On Linux systems, you can disable swap temporarily by running `sudo swapoff -a`. To disable it permanently, you will need to edit the `/etc/fstab` file and comment out any lines that contain the word `swap`.
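
A minimal sketch of both steps; the fstab entry shown in the comment is only an illustrative example of what such a line might look like:

[source,sh]
--------------------------------------------------
# Disable swap until the next reboot
sudo swapoff -a

# Then comment out any swap entries in /etc/fstab so swap stays off after a reboot,
# e.g. change a line such as
#   /dev/mapper/vg0-swap  none  swap  sw  0  0
# to
#   # /dev/mapper/vg0-swap  none  swap  sw  0  0
--------------------------------------------------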

On Windows, the equivalent can be achieved by disabling the paging file entirely via `System Properties → Advanced → Performance → Advanced → Virtual memory`.

--

* **Configure `swappiness`**
+
--
The second option is to ensure that the sysctl value `vm.swappiness` is set to `0`. This reduces the kernel's tendency to swap and should not lead to swapping under normal circumstances, while still allowing the whole system to swap in emergency conditions.

NOTE: From kernel version 3.5-rc1 and above, a `swappiness` of `0` will cause the OOM killer to kill the process instead of allowing swapping. You will need to set `swappiness` to `1` to still allow swapping in emergencies.
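
A sketch of applying this, using `1` rather than `0` to stay in line with the NOTE above:

[source,sh]
--------------------------------------------------
# Take effect immediately
sudo sysctl -w vm.swappiness=1

# Persist the value across reboots
echo "vm.swappiness=1" | sudo tee -a /etc/sysctl.conf
--------------------------------------------------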

--

* **`mlockall`**
+
--
The third option is to use http://opengroup.org/onlinepubs/007908799/xsh/mlockall.html[mlockall] on Linux/Unix systems, or https://msdn.microsoft.com/en-us/library/windows/desktop/aa366895%28v=vs.85%29.aspx[VirtualLock] on Windows, to try to lock the process address space into RAM, preventing any Elasticsearch memory from being swapped out. This can be done by adding this line to the `config/elasticsearch.yml` file:

[source,yaml]
--------------
bootstrap.mlockall: true
--------------

After starting Elasticsearch, you can see whether this setting was applied successfully by checking the value of `mlockall` in the output from this request:

[source,sh]
--------------
curl http://localhost:9200/_nodes/process?pretty
--------------

If you see that `mlockall` is `false`, then it means that the `mlockall` request has failed. The most probable reason, on Linux/Unix systems, is that the user running Elasticsearch doesn't have permission to lock memory. This can be granted by running `ulimit -l unlimited` as `root` before starting Elasticsearch.
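
For example, a sketch that assumes Elasticsearch is started from the same root shell in which the limit was raised:

[source,sh]
--------------
# As root: remove the memory lock limit for this shell, then start Elasticsearch
ulimit -l unlimited
./bin/elasticsearch
--------------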

Another possible reason why `mlockall` can fail is that the temporary directory (usually `/tmp`) is mounted with the `noexec` option. This can be solved by specifying a new temp directory when starting Elasticsearch:

[source,sh]
--------------
./bin/elasticsearch -Djna.tmpdir=/path/to/new/dir
--------------

WARNING: `mlockall` might cause the JVM or shell session to exit if it tries to allocate more memory than is available!
--

[float]
[[settings]]
=== Elasticsearch Settings

The *elasticsearch* configuration files can be found under the `ES_HOME/config` folder. The folder comes with two files: `elasticsearch.yml` for configuring the different Elasticsearch <<modules,modules>>, and `logging.yml` for configuring Elasticsearch logging.

The configuration format is http://www.yaml.org/[YAML]. Here is an example of changing the address that all network based modules will use to bind and publish to:

[source,yaml]
--------------------------------------------------
network :
    host : 10.0.0.4
--------------------------------------------------

[float]
[[paths]]
==== Paths

In production use, you will almost certainly want to change paths for data and log files:

[source,yaml]
--------------------------------------------------
path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch
--------------------------------------------------

[float]
[[cluster-name]]
==== Cluster name

Also, don't forget to give your production cluster a name, which is used to discover and auto-join other nodes:

[source,yaml]
--------------------------------------------------
cluster:
  name: <NAME OF YOUR CLUSTER>
--------------------------------------------------

Make sure that you don't reuse the same cluster names in different environments, otherwise you might end up with nodes joining the wrong cluster. For instance, you could use `logging-dev`, `logging-stage`, and `logging-prod` for the development, staging, and production clusters.

[float]
[[node-name]]
==== Node name

You may also want to change the default node name for each node to something like the display hostname. By default, Elasticsearch will randomly pick a Marvel character name from a list of around 3000 names when your node starts up.

[source,yaml]
--------------------------------------------------
node:
  name: <NAME OF YOUR NODE>
--------------------------------------------------

The hostname of the machine is provided in the `HOSTNAME` environment variable. If you run only a single Elasticsearch node of a cluster on each machine, you can set the node name to the hostname using the `${...}` notation:

[source,yaml]
--------------------------------------------------
node:
  name: ${HOSTNAME}
--------------------------------------------------

[float]
[[styles]]
==== Configuration styles

Internally, all settings are collapsed into "namespaced" settings. For example, the above gets collapsed into `node.name`. This means that it's easy to support other configuration formats, for example http://www.json.org[JSON]. If JSON is a preferred configuration format, simply rename the `elasticsearch.yml` file to `elasticsearch.json` and add:

[source,js]
--------------------------------------------------
{
    "network" : {
        "host" : "10.0.0.4"
    }
}
--------------------------------------------------

It also means that it's easy to provide the settings externally, either using the `ES_JAVA_OPTS` environment variable or as parameters to the `elasticsearch` command, for example:

[source,sh]
--------------------------------------------------
$ elasticsearch -Des.network.host=10.0.0.4
--------------------------------------------------

Another option is to use the `es.default.` prefix instead of the `es.` prefix, which means the default setting will be used only if not explicitly set in the configuration file.
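
For instance, a sketch of the difference, reusing the `network.host` setting from the example above:

[source,sh]
--------------------------------------------------
# Overrides network.host even if it is set in the configuration file
$ elasticsearch -Des.network.host=10.0.0.4

# Used only as a fallback when network.host is absent from the configuration file
$ elasticsearch -Des.default.network.host=10.0.0.4
--------------------------------------------------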

Another option is to use the `${...}` notation within the configuration file, which will be resolved to the value of an environment variable, for example:

[source,js]
--------------------------------------------------
{
    "network" : {
        "host" : "${ES_NET_HOST}"
    }
}
--------------------------------------------------

Additionally, for settings that you do not wish to store in the configuration file, you can use the value `${prompt.text}` or `${prompt.secret}` and start Elasticsearch in the foreground. `${prompt.secret}` has echoing disabled so that the value entered will not be shown in your terminal; `${prompt.text}` will allow you to see the value as you type it in. For example:

[source,yaml]
--------------------------------------------------
node:
  name: ${prompt.text}
--------------------------------------------------

On execution of the `elasticsearch` command, you will be prompted to enter the actual value like so:

[source,sh]
--------------------------------------------------
Enter value for [node.name]:
--------------------------------------------------

NOTE: Elasticsearch will not start if `${prompt.text}` or `${prompt.secret}` is used in the settings and the process is run as a service or in the background.

[float]
[[configuration-index-settings]]
=== Index Settings

Indices created within the cluster can provide their own settings. For example, the following creates an index with a `5s` refresh interval instead of the default (the format can be either YAML or JSON):

[source,sh]
--------------------------------------------------
$ curl -XPUT http://localhost:9200/kimchy/ -d \
'
index:
    refresh_interval: 5s
'
--------------------------------------------------
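
The same request with a JSON body would look like the following sketch of the equivalent form:

[source,sh]
--------------------------------------------------
$ curl -XPUT http://localhost:9200/kimchy/ -d '{
    "index" : {
        "refresh_interval" : "5s"
    }
}'
--------------------------------------------------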

Index level settings can be set on the node level as well. For example, the following can be set within the `elasticsearch.yml` file:

[source,yaml]
--------------------------------------------------
index :
    refresh_interval: 5s
--------------------------------------------------

This means that every index created on a node started with this configuration will use a `5s` refresh interval *unless the index explicitly sets it*. In other words, any index level settings override what is set in the node configuration. Of course, the above can also be set as a "collapsed" setting, for example:

[source,sh]
--------------------------------------------------
$ elasticsearch -Des.index.refresh_interval=5s
--------------------------------------------------

All of the index level configuration can be found within each <<index-modules,index module>>.

[float]
[[logging]]
=== Logging

Elasticsearch uses an internal logging abstraction and comes, out of the box, with http://logging.apache.org/log4j/1.2/[log4j]. It tries to simplify log4j configuration by using http://www.yaml.org/[YAML] to configure it, and the logging configuration file is `config/logging.yml`. The http://en.wikipedia.org/wiki/JSON[JSON] and http://en.wikipedia.org/wiki/.properties[properties] formats are also supported. Multiple configuration files can be loaded, in which case they will get merged, as long as they start with the `logging.` prefix and end with one of the supported suffixes (either `.yml`, `.yaml`, `.json` or `.properties`). The logger section contains the Java packages and their corresponding log level, where it is possible to omit the `org.elasticsearch` prefix. The appender section contains the destinations for the logs. Extensive information on how to customize logging and all the supported appenders can be found in the http://logging.apache.org/log4j/1.2/manual.html[log4j documentation].

Additional appenders and other logging classes provided by http://logging.apache.org/log4j/extras/[log4j-extras] are also available, out of the box.

[float]
[[deprecation-logging]]
==== Deprecation logging

In addition to regular logging, Elasticsearch allows you to enable logging of deprecated actions. For example, this allows you to determine early whether you will need to migrate certain functionality in the future. By default, deprecation logging is disabled. You can enable it in the `config/logging.yml` file by setting the deprecation log level to `DEBUG`:

[source,yaml]
--------------------------------------------------
deprecation: DEBUG, deprecation_log_file
--------------------------------------------------

This will create a daily rolling deprecation log file in your log directory. Check this file regularly, especially when you intend to upgrade to a new major version.