When a security manager is present, the JVM will cache positive hostname
lookups indefinitely. This can be problematic, especially in the modern
world with cloud services where DNS addresses can change, or
environments using Docker containers where IP addresses could be
considered ephemeral. This behavior impacts cluster discovery,
cross-cluster replication and cross-cluster search, reindex from remote,
snapshot repositories, webhooks in Watcher, external authentication
mechanisms, and the Elastic Stack Monitoring Service. The experience of
watching a DNS lookup change yet not be reflected within Elasticsearch
is a poor experience for users. The reason the JVM has this is guard
against DNS cache posioning attacks. Yet, there is already a defense in
the modern world against such attacks: TLS. With proper certificate
validation, even if a resolver falls prey to a DNS cache poisoning
attack, using TLS would neuter the attack. Therefore we have a policy
with dubious security value that significantly impacts usability. As
such we make the usability/security tradeoff towards usability, since
the security risks are very low. This commit introduces new system
properties that Elasticsearch observes to override the JVM DNS cache
policy.
Currently is `java` is not in $PATH the preinst script fails
prematurely and prevents an appropriate message from getting displayed
to the user.
Make package installation more user friendly when java is not in
$PATH and add a test for it.
Also use a she-bang in the preinst script, as, at least in Debian,
maintainer scripts must start with the #! convention [1].
Relates #31845
[1] https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html
In the long run we want to move all of startup to a Java program. This
will simplify our startup scripts and make maintenance of startup less
dependent on the underlying platform that we run on. This commit moves
the creation of the temporary directory off of system-dependent commands
and onto a simple Java program.
This commit updates the procrun manager and service exes to 1.1.0. There
are a few bug fixes, including for a bug which can cause lingering
processes when removing the service.
To pass the HOSTNAME envrionment variable to the Windows service, we
have to add some command line flags to the service invocation. Namely,
we have to specify that we are passing HOSTNAME variable, and we will
pass for it the value of %%COMPUTERNAME%%. This ensures that if the
hostname is changed, we pick this up the next time that the service is
started. This change is needed for the service now that we use the
HOSTNAME as the default node name.
#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0
Relates to #31389
* Add commented out JVM options for G1GC
These options are available now that we will be supporting G1GC for Java 10 and
above. They are also designed so that the CMS options don't have to be commented
out in order for the G1 options to take effect.
* Update wording
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.
Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```
The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```
Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```
The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.
This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.
Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
The C2 compiler in JDK 10 appears to have an issue compiling to AVX-512
instructions (on hardware that supports such). As a workaround, this
commit adds a JVM flag on JDK 10+ to disable the use of AVX-512
instructions until a fix is introduced to the JDK. Instead, we use a
flag to enable AVX and AVX2 only.
Note: Based on my reading of the C2 code, this flag does not appear to
have any impact on hardware that does not support AVX2. I have tested
this manually on an Intel Atom C2538 processor that supports neither AVX
nor AVX2. I have also tested this manually on an Intel i5-3317U
processor that supports AVX but not AVX2.
The java version checker requires being written with java 7 APIs.
In order to use java 8 apis in other launcher utilities, this commit
moves the java version checker back to its own jar.
This commit adjusts the indentation in the CLI scripts to give a clear
visual indication that the line being indented is a continuation of the
previous line.
A previous refactoring of the CLI scripts migrated all of the CLI tools
to shell to a common script, elasticsearch-cli. This approach is fine in
Bash where it is easy to tear arguments apart but it doesn't work so
well on Windows where quoting is insane. To avoid having to tear the
arguments apart to separate the first argument to elasticsearch-cli from
the remaining arguments, we instead choose a strategy where we can avoid
tearing the arguments apart. To do this, we will instead pass the main
class by an environment variable and then we can pass the arguments
straight through. This will let us avoid awful quoting issues on
Windows. This is the Windows side of that effort and the Bash side was
in a previous commit.
A previous refactoring of the CLI scripts migrated all of the CLI tools
to shell to a common script, elasticsearch-cli. This approach is fine in
Bash where it is easy to tear arguments apart but it doesn't work so
well on Windows where quoting is insane. To avoid having to tear the
arguments apart to separate the first argument to elasticsearch-cli from
the remaining arguments, we instead choose a strategy where we can avoid
tearing the arguments apart. To do this, we will instead pass the main
class by an environment variable and then we can pass the arguments
straight through. This will let us avoid awful quoting issues on
Windows. This is the non-Windows side of that effort and the Windows
side will be in a follow-up.
If you invoke elasticsearch-plugin (or any other CLI script on Windows)
with a path that has a percent-encoded space (or any other
percent-encoded character) because the CLI scripts now shell into a
common shell script (elasticsearch-cli) the percent-encoded space ends
up being interpreted as a parameter. For example passing install --batch
file:/c:/encoded%20%space/analysis-icu-7.0.0.zip to elasticsearch-plugin
leads to the %20 being interpreted as %2 followed by a zero. Here, the
%2 is interpreted as the second parameter (--batch) and the
InstallPluginCommand class ends up seeing
file:/c/encoded--batch0space/analysis-icu-7.0.0.zip as the path which
will not exist. This commit addresses this by escaping the %* that is
used to pass the parameters to the common CLI script so that the common
script sees the correct parameters without the %2 being substituted.
We sign our official plugins yet this is not well-advertised and not at
all consumed during plugin installation. For plugins that are installed
over the intertubes, verifying that the downloaded artifact is signed by
our signing key would establish both integrity and validity of the
downloaded artifact. The chain of trust here is simple: our installable
artifacts (archive and package distributions) so that if a user trusts
our packages via their signatures, and our plugin installer (which would
be executing trusted code) verifies the downloaded plugin, then the user
can trust the downloaded plugin too. This commit adds verification of
official plugins downloaded during installation. We do not add
verification for offline plugin installs; a user can download our
signatures and verify the artifacts themselves.
This commit also needs to solve a few interesting challenges. One of
these is that we want the bouncy castle JARs on the classpath only for
the plugin installer, but not for the runtime
Elasticsearch. Additionally, we want these JARs to not be present for
the JAR hell checks. To address this, we shift these JARs into a
sub-directory of lib (lib/tools/plugin-cli) that is only loaded for the
plugin installer, and in the plugin installer we filter any JARs in this
directory from the JAR hell check.
This commit reduces the Windows CLI scripts to one-liners by moving all
of the redundant logic to an elasticsearch-cli script. This commit is
only the Windows side, a previous commit covered the Linux side.
This commit reduces the Linux CLI scripts to one-liners by moving all of
the redundant logic to an elasticsearch-cli script. This commit is only
the Linux side, a follow-up will do this for Windows too.
If the elasticsearch-env bash script chooses $ES_TMPDIR
then it also creates the directory. This change makes
elasticsearch-env.bat do the same thing: if %ES_TMPDIR%
is chosen by the script then the script will ensure it
exists, but if %ES_TMPDIR% is already set then the user
is responsible for creating it.
Relates #27609
Relates #28217
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
This is a follow up to a previous change which set the heap dump path
for the package distributions. The observation here is that we always
set the working directory of Elasticsearch to to the root of
installation (i.e., Elasticsearch home). Therefore, we can specify the
heap dump path relative to this directory and default it to the data
directory, similar to the package distributions.
The cd command on Windows has an oddity regarding changing
directories. If the drive of the current directory is a different drive
than than of the directory that was passed to the cd command, cd acts in
query mode and does not change the current directory. Instead, a flag is
needed to put the cd command into set mode so that the directory
actually changes. This causes a problem when starting Elasticsearch from
a directory different than the one where it is installed and this commit
fixes the issue.
Today we allow any other method of starting Elastisearch to override
jvm.options via ES_JAVA_OPTS. Yet, for some settings in the Windows
service, we do not allow this. This commit removes this in favor of
being consistent with other packaging choices.
This commit adds a JVM flag to ensure that the JVM fatal error logs land
in the default log directory. Users that wish to use an alternative
location should change the path configured here.
This commit moves the distribution specific tasks into the respective
archives and packages builds. The collocation of common and distribution
specific tasks make it much easier to reason about what is expected in a
particular distribution.
There is a bug in the for statement where we execute the JVM options
parser. The bug manfiests in the handling of paths with ) in the
name. The problem is this: we use a for statement to capture the output
of the JVM options parser. A for statement that executes a command
defers execution to cmd. There is this gem from the help:
1. If all of the following conditions are met, then quote characters
on the command line are preserved:
- no /S switch
- exactly two quote characters
- no special characters between the two quote characters,
where special is one of: &<>()@^|
- there are one or more whitespace characters between the
two quote characters
- the string between the two quote characters is the name
of an executable file.
2. Otherwise, old behavior is to see if the first character is
a quote character and if so, strip the leading character and
remove the last quote character on the command line, preserving
any text after the last quote character.
This means that the ) causes the quotes to be stripped which ruins
everything. This commit fixes this by delaying expansion of the paths.
Relates #28753
Previously a user could set a custom config path to a relative directory
using ES_PATH_CONF. In a previous change related to enabling GC logging
by default, we forced the working directory for Elasticsearch to be
ES_HOME. This had the impact of causing all relative paths to be
relative to ES_HOME, against the intent of the user. This commit
addresses this by making ES_PATH_CONF absolute before we switch the
working directory to ES_HOME.
Relates #28700
When Elasticsearch is run as a service we should not use the console
logger otherwise we end up duplicating logging (to the Elasticsearch
logs and whereever standard output is captured). Previously we disabled
the console logger when started as a service using systemd (otherwise
the console logs are duplicated to the journal). This commit does the
same for the Windows service, starting Elasticsearch with the --quiet
flag to avoid standard output being written to the service stdout logs.
Relates #28618
Java 9 added some enhancements to the internationalization support that
impact our date parsing support. To ensure flawless BWC and consistent
behavior going forward Java 9 runtimes requrie the system property
`java.locale.providers=COMPAT` to be set.
Closes#10984
We document that users can set custom service names on Windows. Alas,
the functionality does not work. This commit fixes the issue by passing
the environment variable SERVICE_ID as the service name otherwise
defaulting to elasticsearch-service-x64.
Relates #25255
JDK 9 has removed JVM options that were valid in JDK 8 (e.g., GC logging
flags) and replaced them with new flags that are not available in JDK
8. This means that a single JVM options file can no longer apply to JDK
8 and JDK 9, complicating development, complicating our packaging story,
and complicating operations. This commit extends the JVM options syntax
to specify the range of versions the option applies to. If the running
JVM matches the range of versions, the flag will be used to start the
JVM otherwise the flag will be ignored.
We implement this parser in Java for simplicity, and with this we start
our first step towards a Java launcher.
Relates #27675
GNU mktemp and BSD mktemp have different command line flags. On some
macOS systems users have mktemp from coreutils in their PATH overriding
the system mktemp from BSD. This commit adds detection for the coreutils
mktemp versus the BSD mktemp and uses the appropriate syntax based on
the detection.
Relates #27659
The LimitMEMLOCK suggestion was removed from systemd service file and
instead users should use an override file, so a comment in the
environment file should be updated to reflect the same.
Relates #27630
For too long we have been groping around in the dark when faced with GC
issues because we rarely have GC logs at our disposal. This commit
enables GC logging by default out of the box.
Relates #27610
This change ensures that the temporary directory used for java.io.tmpdir
is a private temporary directory. To achieve this we use mktemp on macOS
and Linux to give us a private temporary directory and the value of the
environment variable TMP on Windows. For this to work with our
packaging, we add java.io.tmpdir=${ES_TMPDIR} to our packaged
jvm.options, we set ES_TMPDIR respectively in our startup scripts, and
resolve the value of the template ${ES_TMPDIR} at startup.
Relates #27609
The existing log rotation configuration allowed the index
and search slow log to grow unbounded. This commit removes the
date based rotation and adds the same size based rotation, that
the depreciation log already has.
This commit fixes an issue with the handling of paths containing
parentheses on Windows. When such a path is used as a component of
Elasticsearch home, then a later echo statement that is guarded by an if
will fail because the parentheses in the path will be confused with the
parentheses defining the if block. This commit fixes the issue by
protecting this echo statement by wrapping the possibly offending path
in quotes.
Relates #26916
* Removes minimum master nodes default number
At the moment the elasticsearch.yml contains the minimum master node setting commented out but with a value of 3. This has lead to users uncommenting the value and assuming it is a good default without reading that they need to change it to a quorum of master eligible nodes causing split brain in their cluster and defeating the point of the setting.
The default of 3 is not even a good default for our recommended setup of 3 dedicated master eligible nodes.
This changes the value o fthe commented out setting to something that will not produce valid config and should highlight that the value needs to be changed so users no longer uncomment the line without considering what the correct value for their setup should be.
* Addresses review comment
The JVM defaults to dumping the heap to the working directory of
Elasticsearch. For the RPM and Debian packages, this location is
/usr/share/elasticsearch. This directory is not writable by the
elasticsearch user, so by default heap dumps in this situation are
lost. This commit modifies the packaging for the RPM and Debian packages
to set the heap dump path to /var/lib/elasticsearch as the default
location for dumping the heap. This location is writable by the
elasticsearch user by default. We add documentation of this important
setting if /var/lib/elasticsearch is not suitable for receiving heap
dumps.
Relates #26755
When creating the keystore explicitly (from executing
elasticsearch-keystore create) or implicitly (for plugins that require
the keystore to be created on install) on an Elasticsearch package
installation, we are running as the root user. This leaves
/etc/elasticsearch/elasticsearch.keystore having the wrong ownership
(root:root) so that the elasticsearch user can not read the keystore on
startup. This commit adds setgid to /etc/elasticsearch on package
installation so that when executing this directory (as we would when
creating the keystore), we will end up with the correct ownership
(root:elasticsearch). Additionally, we set the permissions on the
keystore to be 660 so that the elasticsearch user via its group can read
this file on startup.
Relates #26412
This commit makes the security code aware of the Java 9 FilePermission changes (see #21534) and allows us to remove the `jdk.io.permissionsUseCanonicalPath` system property.
When Elasticsearch starts up, it tries to create a keystore if one does
not exist; this is so the keystore can be seeded. With the RPM and
Debian packages, the keystore would be located in
/etc/elasticsearch. This configuration directory is typically not
writable by the elasticsearch user so the Elasticsearch process will not
have permission to create the keystore. Instead, the RPM and Debian
packages should create the keystore (if it does not exist) on package
installation. This commit enables these packages to do that in the
post-install routines.
Relates #26282
We need to check if JAVA_TOOL_OPTIONS, and JAVA_OPTS are set, and if
ES_PATH_CONF is not set. However, if these variables are defined and
contain quotes, the current mechanism busts on them. Instead, we should
use safer mechanism for checking if these variable are defined or
not. This commit does that.
Relates #26268
We previously explicitly set the HOSTNAME environment variable so that
${HOSTNAME} could be used a placeholder for defining the node.name in
elasticsearch.yml. We removed explicitly setting this because bash
defines HOSTNAME. The problem is that bash defines HOSTNAME as a bash
variable, not as an environment variable. Therefore, to restore the
previous behavior, we export the bash value for HOSTNAME as an
environment variable named HOSTNAME. For consistency between Windows and
the Unix-like systems, we also define HOSTNAME with a value equal to the
environment variable COMPUTERNAME on Windows.
Relates #26262