OpenSearch/docs/reference/setup/install/docker.asciidoc

434 lines
16 KiB
Plaintext
Raw Normal View History

[[docker]]
=== Install {es} with Docker
{es} is also available as Docker images.
The images use https://hub.docker.com/_/centos/[centos:8] as the base image.
A list of all published Docker images and tags is available at
https://www.docker.elastic.co[www.docker.elastic.co]. The source files
are in
https://github.com/elastic/elasticsearch/blob/{branch}/distribution/docker[Github].
These images are free to use under the Elastic license. They contain open source
and free commercial features and access to paid commercial features.
{kibana-ref}/managing-licenses.html[Start a 30-day trial] to try out all of the
paid commercial features. See the
https://www.elastic.co/subscriptions[Subscriptions] page for information about
Elastic license levels.
==== Pulling the image
Obtaining {es} for Docker is as simple as issuing a +docker pull+ command
against the Elastic Docker registry.
ifeval::["{release-state}"=="unreleased"]
WARNING: Version {version} of {es} has not yet been released, so no
Docker image is currently available for this version.
endif::[]
ifeval::["{release-state}"!="unreleased"]
[source,sh,subs="attributes"]
--------------------------------------------
docker pull {docker-repo}:{version}
--------------------------------------------
Alternatively, you can download other Docker images that contain only features
available under the Apache 2.0 license. To download the images, go to
https://www.docker.elastic.co[www.docker.elastic.co].
endif::[]
[[docker-cli-run-dev-mode]]
==== Starting a single node cluster with Docker
ifeval::["{release-state}"=="unreleased"]
WARNING: Version {version} of the {es} Docker image has not yet been released.
endif::[]
ifeval::["{release-state}"!="unreleased"]
To start a single-node {es} cluster for development or testing, specify
<<single-node-discovery,single-node discovery>> to bypass the <<bootstrap-checks,bootstrap checks>>:
[source,sh,subs="attributes"]
--------------------------------------------
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" {docker-image}
--------------------------------------------
endif::[]
[[docker-compose-file]]
==== Starting a multi-node cluster with Docker Compose
To get a three-node {es} cluster up and running in Docker,
you can use Docker Compose:
. Create a `docker-compose.yml` file:
ifeval::["{release-state}"=="unreleased"]
+
--
WARNING: Version {version} of {es} has not yet been released, so a
`docker-compose.yml` is not available for this version.
endif::[]
ifeval::["{release-state}"!="unreleased"]
[source,yaml,subs="attributes"]
--------------------------------------------
include::docker-compose.yml[]
--------------------------------------------
endif::[]
This sample Docker Compose file brings up a three-node {es} cluster.
Node `es01` listens on `localhost:9200` and `es02` and `es03` talk to `es01` over a Docker network.
2020-10-27 11:22:44 -04:00
Please note that this configuration exposes port 9200 on all network interfaces, and given how
Docker manipulates `iptables` on Linux, this means that your {es} cluster is publically accessible,
potentially ignoring any firewall settings. If you don't want to expose port 9200 and instead use
a reverse proxy, replace `9200:9200` with `127.0.0.1:9200:9200` in the docker-compose.yml file.
{es} will then only be accessible from the host machine itself.
The https://docs.docker.com/storage/volumes[Docker named volumes]
`data01`, `data02`, and `data03` store the node data directories so the data persists across restarts.
If they don't already exist, `docker-compose` creates them when you bring up the cluster.
--
. Make sure Docker Engine is allotted at least 4GiB of memory.
In Docker Desktop, you configure resource usage on the Advanced tab in Preference (macOS)
or Settings (Windows).
+
NOTE: Docker Compose is not pre-installed with Docker on Linux.
See docs.docker.com for installation instructions:
https://docs.docker.com/compose/install[Install Compose on Linux]
. Run `docker-compose` to bring up the cluster:
+
[source,sh,subs="attributes"]
--------------------------------------------
docker-compose up
--------------------------------------------
. Submit a `_cat/nodes` request to see that the nodes are up and running:
+
[source,sh]
--------------------------------------------------
curl -X GET "localhost:9200/_cat/nodes?v&pretty"
--------------------------------------------------
// NOTCONSOLE
Log messages go to the console and are handled by the configured Docker logging driver.
By default you can access logs with `docker logs`.
To stop the cluster, run `docker-compose down`.
The data in the Docker volumes is preserved and loaded
2019-11-11 04:10:13 -05:00
when you restart the cluster with `docker-compose up`.
To **delete the data volumes** when you bring down the cluster,
specify the `-v` option: `docker-compose down -v`.
[[next-getting-started-tls-docker]]
===== Start a multi-node cluster with TLS enabled
See <<configuring-tls-docker>> and
{stack-gs}/get-started-docker.html#get-started-docker-tls[Run the {stack} in Docker with TLS enabled].
[[docker-prod-prerequisites]]
==== Using the Docker images in production
The following requirements and recommendations apply when running {es} in Docker in production.
===== Set `vm.max_map_count` to at least `262144`
The `vm.max_map_count` kernel setting must be set to at least `262144` for production use.
How you set `vm.max_map_count` depends on your platform:
* Linux
+
--
The `vm.max_map_count` setting should be set permanently in `/etc/sysctl.conf`:
[source,sh]
--------------------------------------------
grep vm.max_map_count /etc/sysctl.conf
vm.max_map_count=262144
--------------------------------------------
To apply the setting on a live system, run:
[source,sh]
--------------------------------------------
sysctl -w vm.max_map_count=262144
--------------------------------------------
--
* macOS with https://docs.docker.com/docker-for-mac[Docker for Mac]
+
--
The `vm.max_map_count` setting must be set within the xhyve virtual machine:
. From the command line, run:
+
[source,sh]
--------------------------------------------
screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
--------------------------------------------
. Press enter and use`sysctl` to configure `vm.max_map_count`:
+
[source,sh]
--------------------------------------------
sysctl -w vm.max_map_count=262144
--------------------------------------------
. To exit the `screen` session, type `Ctrl a d`.
--
* Windows and macOS with https://www.docker.com/products/docker-desktop[Docker Desktop]
+
--
The `vm.max_map_count` setting must be set via docker-machine:
[source,sh]
--------------------------------------------
docker-machine ssh
sudo sysctl -w vm.max_map_count=262144
--------------------------------------------
--
* Windows with https://docs.docker.com/docker-for-windows/wsl[Docker Desktop WSL 2 backend]
+
--
The `vm.max_map_count` setting must be set in the docker-desktop container:
[source,sh]
--------------------------------------------
wsl -d docker-desktop
sysctl -w vm.max_map_count=262144
--------------------------------------------
--
===== Configuration files must be readable by the `elasticsearch` user
By default, {es} runs inside the container as user `elasticsearch` using
uid:gid `1000:0`.
IMPORTANT: One exception is https://docs.openshift.com/container-platform/3.6/creating_images/guidelines.html#openshift-specific-guidelines[Openshift],
which runs containers using an arbitrarily assigned user ID.
Openshift presents persistent volumes with the gid set to `0`, which works without any adjustments.
If you are bind-mounting a local directory or file, it must be readable by the `elasticsearch` user.
In addition, this user must have write access to the <<path-settings,data and log dirs>>.
A good strategy is to grant group access to gid `0` for the local directory.
For example, to prepare a local directory for storing data through a bind-mount:
[source,sh]
--------------------------------------------
mkdir esdatadir
chmod g+rwx esdatadir
chgrp 0 esdatadir
--------------------------------------------
As a last resort, you can force the container to mutate the ownership of
any bind-mounts used for the <<path-settings,data and log dirs>> through the
environment variable `TAKE_FILE_OWNERSHIP`. When you do this, they will be owned by
uid:gid `1000:0`, which provides the required read/write access to the {es} process.
===== Increase ulimits for nofile and nproc
Increased ulimits for <<setting-system-settings,nofile>> and <<max-number-threads-check,nproc>>
must be available for the {es} containers.
Verify the https://github.com/moby/moby/tree/ea4d1243953e6b652082305a9c3cda8656edab26/contrib/init[init system]
for the Docker daemon sets them to acceptable values.
To check the Docker daemon defaults for ulimits, run:
[source,sh]
--------------------------------------------
docker run --rm centos:8 /bin/bash -c 'ulimit -Hn && ulimit -Sn && ulimit -Hu && ulimit -Su'
--------------------------------------------
If needed, adjust them in the Daemon or override them per container.
For example, when using `docker run`, set:
[source,sh]
--------------------------------------------
--ulimit nofile=65535:65535
--------------------------------------------
===== Disable swapping
Swapping needs to be disabled for performance and node stability.
For information about ways to do this, see <<setup-configuration-memory>>.
If you opt for the `bootstrap.memory_lock: true` approach,
you also need to define the `memlock: true` ulimit in the
https://docs.docker.com/engine/reference/commandline/dockerd/#default-ulimits[Docker Daemon],
or explicitly set for the container as shown in the <<docker-compose-file, sample compose file>>.
When using `docker run`, you can specify:
-e "bootstrap.memory_lock=true" --ulimit memlock=-1:-1
===== Randomize published ports
The image https://docs.docker.com/engine/reference/builder/#/expose[exposes]
TCP ports 9200 and 9300. For production clusters, randomizing the
published ports with `--publish-all` is recommended,
unless you are pinning one container per host.
[[docker-set-heap-size]]
===== Set the heap size
To configure the heap size, you can bind mount a <<jvm-options,JVM options>>
file under `/usr/share/elasticsearch/config/jvm.options.d` that includes your
desired <<heap-size,heap size>> settings. Note that while the default root
`jvm.options` file sets a default heap of 1 GB, any value you set in a
bind-mounted JVM options file will override it.
While setting the heap size via bind-mounted JVM options is the recommended
method, you can also configure this by using the `ES_JAVA_OPTS` environment
variable to set the heap size. For example, to use 16 GB, specify
`-e ES_JAVA_OPTS="-Xms16g -Xmx16g"` with `docker run`. Note that while the
default root `jvm.options` file sets a default heap of 1 GB, any value you set
in `ES_JAVA_OPTS` will override it. The `docker-compose.yml` file above sets the heap size to 512 MB.
IMPORTANT: You must <<heap-size,configure the heap size>> even if you are
https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory[limiting
memory access] to the container.
===== Pin deployments to a specific image version
Pin your deployments to a specific version of the {es} Docker image. For
example +docker.elastic.co/elasticsearch/elasticsearch:{version}+.
===== Always bind data volumes
You should use a volume bound on `/usr/share/elasticsearch/data` for the following reasons:
. The data of your {es} node won't be lost if the container is killed
. {es} is I/O sensitive and the Docker storage driver is not ideal for fast I/O
. It allows the use of advanced
https://docs.docker.com/engine/extend/plugins/#volume-plugins[Docker volume plugins]
===== Avoid using `loop-lvm` mode
If you are using the devicemapper storage driver, do not use the default `loop-lvm` mode.
Configure docker-engine to use
https://docs.docker.com/engine/userguide/storagedriver/device-mapper-driver/#configure-docker-with-devicemapper[direct-lvm].
===== Centralize your logs
Consider centralizing your logs by using a different
https://docs.docker.com/engine/admin/logging/overview/[logging driver]. Also
note that the default json-file logging driver is not ideally suited for
production use.
[[docker-configuration-methods]]
==== Configuring {es} with Docker
When you run in Docker, the <<config-files-location,{es} configuration files>> are loaded from
`/usr/share/elasticsearch/config/`.
To use custom configuration files, you <<docker-config-bind-mount, bind-mount the files>>
over the configuration files in the image.
You can set individual {es} configuration parameters using Docker environment variables.
The <<docker-compose-file, sample compose file>> and the
<<docker-cli-run-dev-mode, single-node example>> use this method.
To use the contents of a file to set an environment variable, suffix the environment
variable name with `_FILE`. This is useful for passing secrets such as passwords to {es}
without specifying them directly.
For example, to set the {es} bootstrap password from a file, you can bind mount the
file and set the `ELASTIC_PASSWORD_FILE` environment variable to the mount location.
2020-10-27 11:22:44 -04:00
If you mount the password file to `/run/secrets/bootstrapPassword.txt`, specify:
[source,sh]
--------------------------------------------
-e ELASTIC_PASSWORD_FILE=/run/secrets/bootstrapPassword.txt
--------------------------------------------
You can also override the default command for the image to pass {es} configuration
parameters as command line options. For example:
[source,sh]
--------------------------------------------
docker run <various parameters> bin/elasticsearch -Ecluster.name=mynewclustername
--------------------------------------------
While bind-mounting your configuration files is usually the preferred method in production,
you can also <<_c_customized_image, create a custom Docker image>>
that contains your configuration.
[[docker-config-bind-mount]]
===== Mounting {es} configuration files
Create custom config files and bind-mount them over the corresponding files in the Docker image.
For example, to bind-mount `custom_elasticsearch.yml` with `docker run`, specify:
[source,sh]
--------------------------------------------
-v full_path_to/custom_elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
--------------------------------------------
IMPORTANT: The container **runs {es} as user `elasticsearch` using
uid:gid `1000:0`**. Bind mounted host directories and files must be accessible by this user,
and the data and log directories must be writable by this user.
Password-protected Keystore Feature Branch PR (#51123) (#51510) * Reload secure settings with password (#43197) If a password is not set, we assume an empty string to be compatible with previous behavior. Only allow the reload to be broadcast to other nodes if TLS is enabled for the transport layer. * Add passphrase support to elasticsearch-keystore (#38498) This change adds support for keystore passphrases to all subcommands of the elasticsearch-keystore cli tool and adds a subcommand for changing the passphrase of an existing keystore. The work to read the passphrase in Elasticsearch when loading, which will be addressed in a different PR. Subcommands of elasticsearch-keystore can handle (open and create) passphrase protected keystores When reading a keystore, a user is only prompted for a passphrase only if the keystore is passphrase protected. When creating a keystore, a user is allowed (default behavior) to create one with an empty passphrase Passphrase can be set to be empty when changing/setting it for an existing keystore Relates to: #32691 Supersedes: #37472 * Restore behavior for force parameter (#44847) Turns out that the behavior of `-f` for the add and add-file sub commands where it would also forcibly create the keystore if it didn't exist, was by design - although undocumented. This change restores that behavior auto-creating a keystore that is not password protected if the force flag is used. The force OptionSpec is moved to the BaseKeyStoreCommand as we will presumably want to maintain the same behavior in any other command that takes a force option. * Handle pwd protected keystores in all CLI tools (#45289) This change ensures that `elasticsearch-setup-passwords` and `elasticsearch-saml-metadata` can handle a password protected elasticsearch.keystore. For setup passwords the user would be prompted to add the elasticsearch keystore password upon running the tool. There is no option to pass the password as a parameter as we assume the user is present in order to enter the desired passwords for the built-in users. For saml-metadata, we prompt for the keystore password at all times even though we'd only need to read something from the keystore when there is a signing or encryption configuration. * Modify docs for setup passwords and saml metadata cli (#45797) Adds a sentence in the documentation of `elasticsearch-setup-passwords` and `elasticsearch-saml-metadata` to describe that users would be prompted for the keystore's password when running these CLI tools, when the keystore is password protected. Co-Authored-By: Lisa Cawley <lcawley@elastic.co> * Elasticsearch keystore passphrase for startup scripts (#44775) This commit allows a user to provide a keystore password on Elasticsearch startup, but only prompts when the keystore exists and is encrypted. The entrypoint in Java code is standard input. When the Bootstrap class is checking for secure keystore settings, it checks whether or not the keystore is encrypted. If so, we read one line from standard input and use this as the password. For simplicity's sake, we allow a maximum passphrase length of 128 characters. (This is an arbitrary limit and could be increased or eliminated. It is also enforced in the keystore tools, so that a user can't create a password that's too long to enter at startup.) In order to provide a password on standard input, we have to account for four different ways of starting Elasticsearch: the bash startup script, the Windows batch startup script, systemd startup, and docker startup. We use wrapper scripts to reduce systemd and docker to the bash case: in both cases, a wrapper script can read a passphrase from the filesystem and pass it to the bash script. In order to simplify testing the need for a passphrase, I have added a has-passwd command to the keystore tool. This command can run silently, and exit with status 0 when the keystore has a password. It exits with status 1 if the keystore doesn't exist or exists and is unencrypted. A good deal of the code-change in this commit has to do with refactoring packaging tests to cleanly use the same tests for both the "archive" and the "package" cases. This required not only moving tests around, but also adding some convenience methods for an abstraction layer over distribution-specific commands. * Adjust docs for password protected keystore (#45054) This commit adds relevant parts in the elasticsearch-keystore sub-commands reference docs and in the reload secure settings API doc. * Fix failing Keystore Passphrase test for feature branch (#50154) One problem with the passphrase-from-file tests, as written, is that they would leave a SystemD environment variable set when they failed, and this setting would cause elasticsearch startup to fail for other tests as well. By using a try-finally, I hope that these tests will fail more gracefully. It appears that our Fedora and Ubuntu environments may be configured to store journald information under /var rather than under /run, so that it will persist between boots. Our destructive tests that read from the journal need to account for this in order to avoid trying to limit the output we check in tests. * Run keystore management tests on docker distros (#50610) * Add Docker handling to PackagingTestCase Keystore tests need to be able to run in the Docker case. We can do this by using a DockerShell instead of a plain Shell when Docker is running. * Improve ES startup check for docker Previously we were checking truncated output for the packaged JDK as an indication that Elasticsearch had started. With new preliminary password checks, we might get a false positive from ES keystore commands, so we have to check specifically that the Elasticsearch class from the Bootstrap package is what's running. * Test password-protected keystore with Docker (#50803) This commit adds two tests for the case where we mount a password-protected keystore into a Docker container and provide a password via a Docker environment variable. We also fix a logging bug where we were logging the identifier for an array of strings rather than the contents of that array. * Add documentation for keystore startup prompting (#50821) When a keystore is password-protected, Elasticsearch will prompt at startup. This commit adds documentation for this prompt for the archive, systemd, and Docker cases. Co-authored-by: Lisa Cawley <lcawley@elastic.co> * Warn when unable to upgrade keystore on debian (#51011) For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This commit brings the same logic to the code for Debian packages. See the posttrans file for gets executed for RPMs. * Restore handling of string input Adds tests that were mistakenly removed. One of these tests proved we were not handling the the stdin (-x) option correctly when no input was added. This commit restores the original approach of reading stdin one char at a time until there is no more (-1, \r, \n) instead of using readline() that might return null * Apply spotless reformatting * Use '--since' flag to get recent journal messages When we get Elasticsearch logs from journald, we want to fetch only log messages from the last run. There are two reasons for this. First, if there are many logs, we might get a string that's too large for our utility methods. Second, when we're looking for a specific message or error, we almost certainly want to look only at messages from the last execution. Previously, we've been trying to do this by clearing out the physical files under the journald process. But there seems to be some contention over these directories: if journald writes a log file in between when our deletion command deletes the file and when it deletes the log directory, the deletion will fail. It seems to me that we might be able to use journald's "--since" flag to retrieve only log messages from the last run, and that this might be less likely to fail due to race conditions in file deletion. Unfortunately, it looks as if the "--since" flag has a granularity of one-second. I've added a two-second sleep to make sure that there's a sufficient gap between the test that will read from journald and the test before it. * Use new journald wrapper pattern * Update version added in secure settings request Co-authored-by: Lisa Cawley <lcawley@elastic.co> Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
2020-01-28 05:32:32 -05:00
[[docker-keystore-bind-mount]]
===== Mounting an {es} keystore
By default, {es} will auto-generate a keystore file for secure settings. This
file is obfuscated but not encrypted. If you want to encrypt your
<<secure-settings,secure settings>> with a password, you must use the
`elasticsearch-keystore` utility to create a password-protected keystore and
bind-mount it to the container as
`/usr/share/elasticsearch/config/elasticsearch.keystore`. In order to provide
the Docker container with the password at startup, set the Docker environment
value `KEYSTORE_PASSWORD` to the value of your password. For example, a `docker
run` command might have the following options:
[source, sh]
--------------------------------------------
-v full_path_to/elasticsearch.keystore:/usr/share/elasticsearch/config/elasticsearch.keystore
-E KEYSTORE_PASSWORD=mypassword
--------------------------------------------
[[_c_customized_image]]
===== Using custom Docker images
In some environments, it might make more sense to prepare a custom image that contains
your configuration. A `Dockerfile` to achieve this might be as simple as:
[source,sh,subs="attributes"]
--------------------------------------------
FROM docker.elastic.co/elasticsearch/elasticsearch:{version}
COPY --chown=elasticsearch:elasticsearch elasticsearch.yml /usr/share/elasticsearch/config/
--------------------------------------------
You could then build and run the image with:
[source,sh]
--------------------------------------------
docker build --tag=elasticsearch-custom .
docker run -ti -v /usr/share/elasticsearch/data elasticsearch-custom
--------------------------------------------
Some plugins require additional security permissions.
You must explicitly accept them either by:
* Attaching a `tty` when you run the Docker image and allowing the permissions when prompted.
* Inspecting the security permissions and accepting them (if appropriate) by adding the `--batch` flag to the plugin install command.
See {plugins}/_other_command_line_parameters.html[Plugin management]
for more information.
include::next-steps.asciidoc[]