[[setup-configuration]]
== Configuration

[float]
=== Environment Variables

The startup scripts pass a built-in set of `JAVA_OPTS` to the JVM they
start. The most important settings are `-Xmx`, which controls the maximum
memory allowed for the process, and `-Xms`, which controls the minimum
memory allocated to the process (_in general, the more memory allocated
to the process, the better_).

In most cases it is better to leave the default `JAVA_OPTS` as they are,
and use the `ES_JAVA_OPTS` environment variable to set or change
JVM settings or arguments.

The `ES_HEAP_SIZE` environment variable sets the heap memory
that will be allocated to the Elasticsearch Java process. It allocates
the same value to both the minimum and the maximum, though those can be set
explicitly (not recommended) by setting `ES_MIN_MEM` (defaults to
`256m`) and `ES_MAX_MEM` (defaults to `1gb`).

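As a minimal sketch of the variables described above, the following sets a
heap size and one extra JVM option before starting a node. The `4g` value and
the `-XX:+HeapDumpOnOutOfMemoryError` option are illustrative assumptions, not
recommendations from this guide; choose values that suit your hardware.

[source,sh]
--------------------------------------------------
# Illustrative values only (assumptions, not defaults).
export ES_HEAP_SIZE=4g                                  # applied to both min and max heap
export ES_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError"   # extra JVM argument; JAVA_OPTS is left untouched

echo "ES_HEAP_SIZE=${ES_HEAP_SIZE} ES_JAVA_OPTS=${ES_JAVA_OPTS}"

# Then start the node from ES_HOME in the same shell:
# ./bin/elasticsearch
--------------------------------------------------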
It is recommended to set the min and max memory to the same value, and
enable <<setup-configuration-memory,`mlockall`>>.

[float]
[[system]]
=== System Configuration

[float]
[[file-descriptors]]
==== File Descriptors

Make sure to increase the number of open file descriptors on the
machine (or for the user running Elasticsearch). Setting it to 32k or
even 64k is recommended.

To test how many files the process can open, start it with
`-Des.max-open-files` set to `true`. This prints the number of
files the process can open on startup.

Alternatively, you can retrieve the `max_file_descriptors` for each node
using the <<cluster-nodes-info>> API, with:

[source,js]
--------------------------------------------------
curl localhost:9200/_nodes/process?pretty
--------------------------------------------------

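Before starting a node, the limit that applies to the current shell can be
read with `ulimit -n`. The sketch below compares it with the 32k figure
suggested above (taken here as 32768). The `check_fd_limit` helper is a
hypothetical name introduced only for illustration.

[source,sh]
--------------------------------------------------
# Report whether an open-file limit meets a required minimum.
# $1: the limit to check, $2: required minimum (defaults to 32768).
check_fd_limit() {
  limit="$1"
  required="${2:-32768}"
  if [ "$limit" = "unlimited" ] || [ "$limit" -ge "$required" ]; then
    echo "ok: $limit"
  else
    echo "too low: $limit (want >= $required)"
  fi
}

# Check the soft limit of the shell that will start Elasticsearch.
check_fd_limit "$(ulimit -n)"
--------------------------------------------------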
[float]
[[vm-max-map-count]]
==== Virtual memory

Elasticsearch uses a <<default_fs,`hybrid mmapfs / niofs`>> directory by default to store its indices. The default
operating system limit on mmap counts is likely to be too low, which may
result in out of memory exceptions. On Linux, you can increase the limit by
running the following command as `root`:

[source,bash]
-------------------------------------
sysctl -w vm.max_map_count=262144
-------------------------------------

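To see whether a change is needed at all, the current value can be read from
`/proc/sys/vm/max_map_count` on Linux. The following is a small sketch under
that assumption; `needs_higher_map_count` is a hypothetical helper, and a
reported value of `0` means the file could not be read.

[source,sh]
--------------------------------------------------
# Succeeds (exit status 0) when the current value ($1) is below
# the required one ($2, defaults to 262144).
needs_higher_map_count() {
  [ "$1" -lt "${2:-262144}" ]
}

# Fall back to 0 when the file does not exist (for example, not on Linux).
current="$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)"

if needs_higher_map_count "$current"; then
  echo "vm.max_map_count is $current, below 262144"
else
  echo "vm.max_map_count is $current"
fi
--------------------------------------------------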
To set this value permanently, update the `vm.max_map_count` setting in
`/etc/sysctl.conf`.

[float]
[[setup-configuration-memory]]
==== Memory Settings

The Linux kernel tries to use as much memory as possible for file system
caches and eagerly swaps out unused application memory, possibly resulting
in the Elasticsearch process being swapped. Swapping is very bad for
performance and for node stability, so it should be avoided at all costs.

There are three options:

* **Disable swap**
+
--
The simplest option is to completely disable swap. Usually Elasticsearch
is the only service running on a box, and its memory usage is controlled
by the `ES_HEAP_SIZE` environment variable. There should be no need
to have swap enabled. On Linux systems, you can disable swap temporarily
by running: `sudo swapoff -a`. To disable it permanently, you will need
to edit the `/etc/fstab` file and comment out any lines that contain the
word `swap`.
--

* **Configure `swappiness`**
+
--
The second option is to ensure that the sysctl value `vm.swappiness` is set
to `0`. This reduces the kernel's tendency to swap and should not lead to
swapping under normal circumstances, while still allowing the whole system
to swap in emergency conditions.

NOTE: From kernel version 3.5-rc1 onwards, a `swappiness` of `0` will
cause the OOM killer to kill the process instead of allowing swapping.
You will need to set `swappiness` to `1` to still allow swapping in
emergencies.
--

* **`mlockall`**
+
--
The third option, on Linux/Unix systems only, is to use
http://opengroup.org/onlinepubs/007908799/xsh/mlockall.html[mlockall] to
try to lock the process address space into RAM, preventing any Elasticsearch
memory from being swapped out. This can be done by adding this line
to the `config/elasticsearch.yml` file:

[source,yaml]
--------------
bootstrap.mlockall: true
--------------

After starting Elasticsearch, you can see whether this setting was applied
successfully by checking the value of `mlockall` in the output from this
request:

[source,sh]
--------------
curl http://localhost:9200/_nodes/process?pretty
--------------

If you see that `mlockall` is `false`, then it means that the `mlockall`
request has failed. The most probable reason is that the user running
Elasticsearch doesn't have permission to lock memory. This can be granted
by running `ulimit -l unlimited` as `root` before starting Elasticsearch.

Another possible reason why `mlockall` can fail is that the temporary directory
(usually `/tmp`) is mounted with the `noexec` option. This can be solved by
specifying a new temp directory, by starting Elasticsearch with:

[source,sh]
--------------
./bin/elasticsearch -Djna.tmpdir=/path/to/new/dir
--------------

WARNING: `mlockall` might cause the JVM or shell session to exit if it tries
to allocate more memory than is available!
--

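The options above can be checked from the shell that will start the node. The
following is a rough sketch for Linux: it reads the current `vm.swappiness`
from `/proc/sys/vm/swappiness` and the locked-memory limit from `ulimit -l`.
The `describe_memlock` helper is a hypothetical name used only for
illustration.

[source,sh]
--------------
# Current swappiness; "unknown" when the file cannot be read (for example, not on Linux).
swappiness="$(cat /proc/sys/vm/swappiness 2>/dev/null || echo unknown)"
# Locked-memory limit of this shell, as reported by ulimit.
memlock="$(ulimit -l)"

echo "vm.swappiness=$swappiness"
echo "max locked memory=$memlock"

# Explain what a given locked-memory limit means for mlockall.
describe_memlock() {
  if [ "$1" = "unlimited" ]; then
    echo "memory locking is not limited for this shell"
  else
    echo "memory locking is limited to $1; mlockall may fail"
  fi
}

describe_memlock "$memlock"
--------------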
[float]
[[settings]]
=== Elasticsearch Settings

Elasticsearch configuration files can be found under the `ES_HOME/config`
folder. The folder comes with two files: `elasticsearch.yml`, for
configuring the different Elasticsearch
<<modules,modules>>, and `logging.yml`, for
configuring Elasticsearch logging.

The configuration format is http://www.yaml.org/[YAML]. Here is an
example of changing the address all network based modules will use to
bind and publish to:

[source,yaml]
--------------------------------------------------
network :
    host : 10.0.0.4
--------------------------------------------------

[float]
[[paths]]
==== Paths

In production use, you will almost certainly want to change paths for
data and log files:

[source,yaml]
--------------------------------------------------
path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch
--------------------------------------------------

[float]
[[cluster-name]]
==== Cluster name

Also, don't forget to give your production cluster a name, which is used
to discover and auto-join other nodes:

[source,yaml]
--------------------------------------------------
cluster:
  name: <NAME OF YOUR CLUSTER>
--------------------------------------------------

[float]
[[node-name]]
==== Node name

You may also want to change the default node name for each node to
something like the display hostname. By default Elasticsearch will
randomly pick a Marvel character name from a list of around 3000 names
when your node starts up.

[source,yaml]
--------------------------------------------------
node:
  name: <NAME OF YOUR NODE>
--------------------------------------------------

[float]
[[styles]]
==== Configuration styles

Internally, all settings are collapsed into "namespaced" settings. For
example, the node name setting above gets collapsed into `node.name`. This means that
it is easy to support other configuration formats, for example,
http://www.json.org[JSON]. If JSON is a preferred configuration format,
simply rename the `elasticsearch.yml` file to `elasticsearch.json` and
add:

[source,yaml]
--------------------------------------------------
{
    "network" : {
        "host" : "10.0.0.4"
    }
}
--------------------------------------------------

It also means that it is easy to provide the settings externally, either
using `ES_JAVA_OPTS` or as parameters to the `elasticsearch`
command, for example:

[source,sh]
--------------------------------------------------
$ elasticsearch -Des.network.host=10.0.0.4
--------------------------------------------------

Another option is to use the `es.default.` prefix instead of the `es.` prefix,
in which case the value acts as a default and is used only if the setting
is not explicitly set in the configuration file.

Another option is to use the `${...}` notation within the configuration
file, which resolves to an environment setting, for example:

[source,js]
--------------------------------------------------
{
    "network" : {
        "host" : "${ES_NET_HOST}"
    }
}
--------------------------------------------------

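For the placeholder to resolve, the variable has to be present in the
environment of the process that starts the node. A minimal sketch, reusing the
illustrative address from the earlier examples:

[source,sh]
--------------------------------------------------
# Export the variable referenced as ${ES_NET_HOST} in the configuration file.
export ES_NET_HOST=10.0.0.4
echo "network.host will resolve to ${ES_NET_HOST}"

# Then start the node from the same shell so that it inherits the variable:
# ./bin/elasticsearch
--------------------------------------------------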
The location of the configuration file can be set externally using a
system property:

[source,sh]
--------------------------------------------------
$ elasticsearch -Des.config=/path/to/config/file
--------------------------------------------------

[float]
[[configuration-index-settings]]
=== Index Settings

Indices created within the cluster can provide their own settings. For
example, the following creates an index with memory based storage
instead of the default file system based one (the format can be either
YAML or JSON):

[source,sh]
--------------------------------------------------
$ curl -XPUT http://localhost:9200/kimchy/ -d \
'
index :
    store:
        type: memory
'
--------------------------------------------------

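To confirm which settings an index was created with, its settings can be read
back over HTTP. This is only a sketch: it assumes a node listening on
`localhost:9200` and the `kimchy` index from the example above, and
`show_index_settings` is a hypothetical helper name. The call is left
commented out so that the snippet does nothing until a node is running.

[source,sh]
--------------------------------------------------
# Print the settings of one index.
# $1: index name, $2: host and port (defaults to localhost:9200).
show_index_settings() {
  curl -s "http://${2:-localhost:9200}/$1/_settings?pretty"
}

# show_index_settings kimchy
--------------------------------------------------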
Index level settings can be set on the node level as well, for example,
within the `elasticsearch.yml` file, the following can be set:

[source,yaml]
--------------------------------------------------
index :
    store:
        type: memory
--------------------------------------------------

This means that every index that gets created on the specific node
started with the mentioned configuration will store the index in memory
*unless the index explicitly sets it*. In other words, any index level
settings override what is set in the node configuration. Of course, the
above can also be set as a "collapsed" setting, for example:

[source,sh]
--------------------------------------------------
$ elasticsearch -Des.index.store.type=memory
--------------------------------------------------

All of the index level configuration can be found within each
<<index-modules,index module>>.

[float]
[[logging]]
=== Logging

Elasticsearch uses an internal logging abstraction and comes, out of the
box, with http://logging.apache.org/log4j/[log4j]. It tries to simplify
log4j configuration by using http://www.yaml.org/[YAML] to configure it.
The logging configuration file is `config/logging.yml`.