This commit slightly reworks the recommendations in the docs about setting the
heap size:
* the "rules of thumb" are actually instructions that should be followed
* the reason for setting `Xmx` to 50% of the heap size is more subtle than just
leaving space for the filesystem cache
* it is normal to see Elasticsearch using more memory than `Xmx`
* replace `cutoff` and `limit` with `threshold` since all three terms are used
interchangeably
* since we recommend setting `Xmx` equal to `Xms`, avoid talking about setting
`Xmx` in isolation
Relates #41954
In cases where node names and transport addresses can be muddled, it is unclear
that `cluster.initial_master_nodes: master-a:9300` means to look for a node
called `master-a:9300` rather than a node called `master-a` with transport port
`9300`. This commit adds docs to that effect.
Today's `docker-compose` docs are missing the `discovery.seed_nodes` config on
one of the nodes. With today's configuration the cluster can still form the
first time it is started, because `cluster.initial_master_nodes` requires both
nodes to bootstrap the cluster which ensures that each discover the other.
However if `es02` is elected master it will remove `es01` from the voting
configuration and then when restarted it will form a cluster on its own without
needing to do any discovery. Meanwhile `es01` doesn't know how to find `es02`
after a restart so will be unable to join this cluster.
This commit fixes this by adding the missing configuration.
The following phrase causes confusion:
> Alternatively the IP addresses or hostnames (if node name defaults to the
> host name) can be used.
This change clarifies the conditions under which you can use a hostname, and
adds an anchor to the note introduced in (#41137) so we can link directly to it
in conversations with users.
* [DOCS] Fix broken link to Elasticsearch Docker source code
* [DOCS] Link to Dockerfile in elastic/elasticsearch repo
* [DOCS] Link to Docker source files in elastic/elasticsearch repo
This commit changes the note in docs about required java version to note
the existence of the bundled jdk and how to bring your own java. It also
reorganizes the zip/targz docs as zip is no longer suitable on
Linux/MacOS.
While running these commands from alias, facing issues using kill `cat pid`, In some situations, the more compact:
```
pkill -F /var/run/myProcess.pid
```
is the way to go.
This fixes a bug in the sensing of the current OS family in the test cluster
formation code. Previously all builds would assume every environment
was windows and would jump to using the windows zip build. This fixes
the OS sensing code as well as updates some tests to account for
different build flavors.
Backport of #38457
In #38333 and #38350 we moved away from the `discovery.zen` settings namespace
since these settings have an effect even though Zen Discovery itself is being
phased out. This change aligns the documentation and the names of related
classes and methods with the newly-introduced naming conventions.
Renames the following settings to remove the mention of `zen` in their names:
- `discovery.zen.hosts_provider` -> `discovery.seed_providers`
- `discovery.zen.ping.unicast.concurrent_connects` -> `discovery.seed_resolver.max_concurrent_resolvers`
- `discovery.zen.ping.unicast.hosts.resolve_timeout` -> `discovery.seed_resolver.timeout`
- `discovery.zen.ping.unicast.hosts` -> `discovery.seed_addresses`
This commit adds classifiers to the distributions indicating the
OS (for archives) and platform. The current OSes are for windows, darwin (ie
macos) and linux. This change will allow future OS/architecture specific
changes to the distributions. Note the docs using distribution links
have been updated, but will be reworked in a followup to make OS
specific instructions for the archives.
In order to support JSON log format, a custom pattern layout was used and its configuration is enclosed in ESJsonLayout. Users are free to use their own patterns, but if smooth Beats integration is needed, they should use ESJsonLayout. EvilLoggerTests are left intact to make sure user's custom log patterns work fine.
To populate additional fields node.id and cluster.uuid which are not available at start time,
a cluster state update will have to be received and the values passed to log4j pattern converter.
A ClusterStateObserver.Listener is used to receive only one ClusteStateUpdate. Once update is received the nodeId and clusterUUid are set in a static field in a NodeAndClusterIdConverter.
Following fields are expected in JSON log lines: type, tiemstamp, level, component, cluster.name, node.name, node.id, cluster.uuid, message, stacktrace
see ESJsonLayout.java for more details and field descriptions
Docker log4j2 configuration is now almost the same as the one use for ES binary.
The only difference is that docker is using console appenders, whereas ES is using file appenders.
relates: #32850
Some systems default to a nofile ulimit of 65535. To reduce the pain of
deploying Elasticsearch to such systems, this commit lowers the required
limit from 65536 to 65535.
With this commit we rename `node.store.allow_mmapfs` to
`node.store.allow_mmap`. Previously this setting has controlled whether
`mmapfs` could be used as a store type. With the introduction of
`hybridfs` which also relies on memory-mapping,
`node.store.allow_mmapfs` also applies to `hybridfs` and thus we rename
it in order to convey that it is actually used to allow memory-mapping
but not a specific store type.
Relates #36668
Relates #37070
This commit overhauls the documentation of discovery and cluster coordination,
removing mention of the Zen Discovery module and replacing it with docs for the
new cluster coordination mechanism introduced in 7.0.
Relates #32006
This is related to #36652. In 7.0 we plan to deprecate a number of
settings that make reference to the concept of a tcp transport. We
mostly just have a single transport type now (based on tcp). Settings
should only reference tcp if they are referring to socket options. This
commit updates the settings in the docs. And removes string usages of
the old settings. Additionally it adds a missing remote compress setting
to the docs.
When a security manager is present, the JVM will cache positive hostname
lookups indefinitely. This can be problematic, especially in the modern
world with cloud services where DNS addresses can change, or
environments using Docker containers where IP addresses could be
considered ephemeral. This behavior impacts cluster discovery,
cross-cluster replication and cross-cluster search, reindex from remote,
snapshot repositories, webhooks in Watcher, external authentication
mechanisms, and the Elastic Stack Monitoring Service. The experience of
watching a DNS lookup change yet not be reflected within Elasticsearch
is a poor experience for users. The reason the JVM has this is guard
against DNS cache posioning attacks. Yet, there is already a defense in
the modern world against such attacks: TLS. With proper certificate
validation, even if a resolver falls prey to a DNS cache poisoning
attack, using TLS would neuter the attack. Therefore we have a policy
with dubious security value that significantly impacts usability. As
such we make the usability/security tradeoff towards usability, since
the security risks are very low. This commit introduces new system
properties that Elasticsearch observes to override the JVM DNS cache
policy.
In real deployments it is important that clusters are properly configured to
avoid accidentally forming multiple independent clusters at cluster
bootstrapping time. However we also expect to be able to unpack Elasticsearch
and start up one or more nodes without any up-front configuration, and have
them do their best to find each other and form a cluster after a few seconds.
This change adds a delayed automatic bootstrapping process to nodes that start
up with no relevant settings set to support the desired out-of-the-box
experience without compromising safety in properly-configured deployments.
Recent Docker for Mac releases[1] have a different path to the tty for
accessing the console of the xhyve vm, required for altering the
`vm.max_map_count` sysctl.
Update instructions on how to enter the xhyve vm for altering the
`vm.max_map_count` sysctl setting on Docker for Mac.
Closes#34817
[1]
https://forums.docker.com/t/is-it-possible-to-ssh-to-the-xhyve-machine/17426/13
If the underlying mount point for the JNA temporary directory is mounted
noexec on Linux, then the JVM will not be able to map the native code in
as executable. This will prevent JNA from executing and will prevent
Elasticsearch from being able to execute some functions that rely on
native code (e.g., memory locking, and installing system call
filters). We do not want to get into the business of catching exceptions
and parsing messages towards this because these exception messages can
change on us. We also do not want to jump through a lot of hoops to
check the underlying mount point for noexec. Instead, we will rely on
documentation to address this problem. This commit adds to the important
system configuration section of the docs that the JNA temporary
directory is not on a mount point with the noexec mount option.
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored by the build and red from `META-INF`.
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.
Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.
1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.
As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
The main installation instructions page for the Windows MSI installer includes a header at the top to indicate that the installer is in beta, but the Installing Elasticsearch page does not. This commit adds the beta label to the MSI entry within the installation options.
The maximum map count boostrap check can be a hindrance to users that do
not own the underlying platform on which they are executing
Elasticsearch. This is because addressing it requires tuning the kernel
and a platform provider might now allow this, especially on shared
infrastructure. However, this bootstrap check is not needed if mmapfs is
not in use. Today we do not have a way for the user to communicate that
they are not going to use mmapfs. This commit therefore adds a setting
that enables the user to disallow mmapfs. When mmapfs is disallowed, the
maximum map count bootstrap check is not enforced. Additionally, we
fallback to a different default index store and prevent the explicit use
of mmapfs for an index.
The docs here incorrectly state that it is okay for a heap dump file to
exist when heap dump path is configured to a fixed filename. This is
incorrect, the JVM will fail to write the heap dump if a heap dump file
already exists at the specified location (see the DumpWriter constructor
DumpWriter::DumpWriter(const char* path) in the JVM source).
On some Linux distributions tmpfiles.d cleans files and
directories under /tmp if they haven't been accessed for
10 days.
This can cause problems for ML as ML is currently the only
component that uses the temp directory more than a few
seconds after startup. If you didn't open an ML job for
10 days and then tried to open one then the temp directory
would have been deleted.
This commit prevents the problem occurring in the case of
Elasticsearch being managed by systemd, as systemd private
temp directories are not subject to periodic cleanup (by
default).
Additionally there are now some docs to warn people about
the risk and suggest a manual mitigation for .tar.gz users.
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.
Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```
The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```
Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```
The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.
This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.
Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
We removed the default_fs store type yet the docs still contain a
reference to them. This commit addresses that by removing this
reference, and changing a reference to this section of the docs to
instead refer to mmapfs.
In the section of the bootstrap checks docs for the maximum map count
check, we refer to max size virtual memory check and explicitly call out
the maximum size virtual memory check as being the previous
point. However, this is not correct as the previous point is currently
the max file size check. It does make sense for these two checks to be
proximate to each other in the docs so this commit reorders the checks
so that the maximum size virtual memory check indeed comes before the
maximum map count check. This makes the sense in the maximum map count
check correct.
This commit removes the link to an oss-MSI; there is only one version of the MSI, which includes X-Pack.
(cherry picked from commit d2e5db8a806ec8a25162f79db5209aceed4f30f7)
This switches the docs tests from the `oss-zip` distribution to the
`zip` distribution so they have xpack installed and configured with the
default basic license. The goal is to be able to merge the
`x-pack/docs` directory into the `docs` directory, marking the x-pack
docs with some kind of marker. This is the first step in that process.
This also enables `-Dtests.distribution` support for the `docs`
directory so you can run the tests against the `oss-zip` distribution
with something like
```
./gradlew -p docs check -Dtests.distribution=oss-zip
```
We can set up Jenkins to run both.
Relates to #30665
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
Historically, the bootstrap checks used 2048 as the minimum limit for
the maximum number of threads. This limit was guided by the fact that
the number of processors was artificially capped at 32. This limit was
removed in 6.0.0 and the minimum limit was raised to 4096 to accommodate
this. However, the docs were not updated and this commit addresses that
miss.
This is a follow up to a previous change which set the error file path
for the package distributions. The observation here is that we always
set the working directory of Elasticsearch to the root of the
installation (i.e., Elasticsearch home). Therefore, we can specify the
error file path relative to this directory and default it to the logs
directory, similar to the package distributions.
This is a follow up to a previous change which set the heap dump path
for the package distributions. The observation here is that we always
set the working directory of Elasticsearch to to the root of
installation (i.e., Elasticsearch home). Therefore, we can specify the
heap dump path relative to this directory and default it to the data
directory, similar to the package distributions.
We previously specified the -server flag to force the JVM to use the
server JVM. This is the default on all the systems that we support when
using a 64-bit JVM (and we no longer support 32-bit JVMs). There was
some trouble with this flag for the Windows service since procrun did
not understand what to do with it; as such, we had to filter this flag
out in the service. When we migrated to parsing JVM options in Java (via
the JVM options parser) we simplified this situation and removed
specifying the -server flag. This commit removes a leftover statement
that we are forcing the server JVM.
Relates #28738
The Windows service will use a private temporary directory under the
user that is performing the installation. In cases when the service will
run as a different user, operators need a method to set this temporary
directory elsewhere. We have such a mechanism, so this commit merely
adds a note to the documentation on how to utilize it.
Relates #28712
* Only bind loopback addresses when binding to local
Today when binding to local (the default) we bind to any address that is
a loopback address, or any address on an interface that declares itself
as a loopback interface. Yet, not all addresses on loopback interfaces
are loopback addresses. This arises on macOS where there is a link-local
address assigned to the loopback interface (fe80::1%lo0) and in Docker
services where virtual IPs of the service are assigned to the loopback
interface (docker/libnetwork#1877). These situations cause problems:
- because we do not handle the scope ID of a link-local address, we end
up bound to an address for which publishing of that address does not
allow that address to be reached (since we drop the scope)
- the virtual IPs in the Docker situation are not loopback addresses,
they are not link-local addresses, so we end up bound to interfaces
that cause the bootstrap checks to be enforced even though the
instance is only bound to local
We address this by only binding to actual loopback addresses, and skip
binding to any address on a loopback interface that is not a loopback
address. This lets us simplify some code where in the bootstrap checks
we were skipping link-local addresses, and in writing the ports file
where we had to skip link-local addresses because again the formatting
of them does not allow them to be connected to by another node (to be
clear, they could be connected to via the scope-qualified address, but
that information is not written out).
Relates #28029
This commit reorganizes some of the content in the configuring
Elasticsearch section of the docs. The changes are:
- move JVM options out of system configuration into configuring
Elasticsearch
- move JVM options to its own page of the docs
- move configuring the heap to important Elasticsearch settings
- move configuring the heap to its own page of the docs
- move all important settings to individual pages in the docs
- remove bootstrap.memory_lock from important settings, this is covered
in the swap section of system configuration
Relates #27755
JDK 9 has removed JVM options that were valid in JDK 8 (e.g., GC logging
flags) and replaced them with new flags that are not available in JDK
8. This means that a single JVM options file can no longer apply to JDK
8 and JDK 9, complicating development, complicating our packaging story,
and complicating operations. This commit extends the JVM options syntax
to specify the range of versions the option applies to. If the running
JVM matches the range of versions, the flag will be used to start the
JVM otherwise the flag will be ignored.
We implement this parser in Java for simplicity, and with this we start
our first step towards a Java launcher.
Relates #27675
For too long we have been groping around in the dark when faced with GC
issues because we rarely have GC logs at our disposal. This commit
enables GC logging by default out of the box.
Relates #27610
When these docs were moved they should have been moved to the system
configuration docs. This commit does that, and also fixes a missing
heading that broke the docs build.