This change is the first towards providing the ability to store
sensitive settings in elasticsearch. It adds the
`elasticsearch-keystore` tool, which allows managing a java keystore.
The keystore is loaded upon node startup in Elasticsearch, and used by
the Setting infrastructure when a setting is configured as secure.
There are a lot of caveats to this PR. The most important is it only
provides the tool and setting infrastructure for secure strings. It does
not yet provide for keystore passwords, keypairs, certificates, or even
convert any existing string settings to secure string settings. Those
will all come in follow up PRs. But this PR was already too big, so this
at least gets a basic version of the infrastructure in.
The two main things to look at. The first is the `SecureSetting` class,
which extends `Setting`, but removes the assumption for the raw value of the
setting to be a string. SecureSetting provides, for now, a single
helper, `stringSetting()` to create a SecureSetting which will return a
SecureString (which is like String, but is closeable, so that the
underlying character array can be cleared). The second is the
`KeyStoreWrapper` class, which wraps the java `KeyStore` to provide a
simpler api (we do not need the entire keystore api) and also extend
the serialized format to add metadata needed for loading the keystore
with no assumptions about keystore type (so that we can change this in
the future) as well as whether the keystore has a password (so that we
can know whether prompting is necessary when we add support for keystore
passwords).
Today if an older version of a plugin exists, we fail to notify the user
with a helpful error message. This happens because during plugin
verification, we attempt to read the plugin descriptors for all existing
plugins. When an older version of a plugin is sitting on disk, we will
attempt to read this old plugin descriptor and fail due to a version
mismatch. This leads to an unhelpful error message. Instead, we should
check for existence of the plugin as part of the verification phase, but
before attempting to read plugin descriptors for existing plugins. This
enables us to provide a helpful error message to the user.
Relates #22305
* Internal: Refactor SettingCommand into EnvironmentAwareCommand
This change renames and changes the behavior of SettingCommand to have
its primary method take in a fully initialized Environment for
elasticsearch instead of just a map of settings. All of the subclasses
of SettingCommand already did this at some point, so this just removes
duplication.
We are currenlty checking that no deprecation warnings are emitted in our query tests. That can be moved to ESTestCase (disabled in ESIntegTestCase) as it allows us to easily catch where our tests use deprecated features and assert on the expected warnings.
Today in the codebase we refer to seccomp everywhere instead of system
call filter even if we are not specifically referring to Linux. This
commit is a purely mechanical change to refer to system call filter
where appropriate instead of the general seccomp, and only leaves
seccomp in place when actually referring to the Linux implementation.
Relates #22243
This commit enables CLI commands to be closeable and installs a runtime
shutdown hook to ensure that if the JVM shuts down (as opposed to
aborting) the close method is called.
It is not enough to wrap uses of commands in main methods in
try-with-resources blocks as these will not run if, say, the virtual
machine is terminated in response to SIGINT, or system shutdown event.
Relates #22126
* Scripting: Remove groovy scripting language
Groovy was deprecated in 5.0. This change removes it, along with the
legacy default language infrastructure in scripting.
This changes adds a test discovery (which internally uses the existing
mock zenping by default). Having the mock the test framework selects be a discovery
greatly simplifies discovery setup (no more weird callback to a Node
method).
Today if you start Elasticsearch with the status logger configured to
the warn level, or use a transport client with the default status logger
level, you will see warn messages about deprecation loggers being
created with different message factories and that formatting might be
broken. This happens because the deprecation logger is constructed using
the message factory from its parent, an artifact leftover from the first
Log4j 2 implementation that used a custom message factory. When that
custom message factory was removed, this constructor invocation should
have been changed to not explicitly use the message factory from the
parent. This commit fixes this invocation. However, we also had some
status checking to all tests to ensure that there are no warn status log
messages that might indicate a configuration problem with Log4j 2. These
assertions blow up badly without the fix for the deprecation logger
construction, and also caught a misconfiguration in one of the logging
tests.
Relates #21339
The usage information for `elasticsearch-plugin` is quiet verbose and makes the
actual error message that is shown when trying to remove an non-existing plugin
hard to spot. This changes the error code to not trigger printing the usage
information.
Closes#21250
Plugins: Remove pluggability of ZenPing
ZenPing is the part of zen discovery which knows how to ping nodes.
There is only one alternative implementation, which is just for testing.
This change removes the ability to add custom zen pings, and instead
hooks in the MockZenPing for tests through an overridden method in
MockNode. This also folds in the ZenPingService (which was really just a
single method) into ZenDiscovery, and removes the idea of having
multiple ZenPing instances. Finally, this was the last usage of the
ExtensionPoint classes, so that is also removed here.
`LocalDiscovery` is a discovery implementation that uses static in memory maps to keep track of current live nodes. This is used extensively in our tests in order to speed up cluster formation (i.e., shortcut the 3 second ping period used by `ZenDiscovery` by default). This is sad as that mean that most of the test run using a different discovery semantics than what is used in production. Instead of replacing the entire discovery logic, we can use a similar approach to only shortcut the pinging components.
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.
This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
this change adds a hard limit to `index.number_of_shard` that prevents
indices from being created that have more than 1024 shards. This is still
a huge limit and can only be changed via settings a system property.
Today when executing the install plugin command without a plugin id, we
end up throwing an NPE because the plugin id is null yet we just keep
going (ultimatley we try to lookup the null plugin id in a set, the
direct cause of the NPE). This commit modifies the install command so
that a missing plugin id is detected and help is provided to the user.
Relates #20660
When testing tribe nodes in an integration test, we should pass the classpath
plugins of the node down to the tribe client nodes. Without this the tribe client
nodes could be prevented from communicating with the tribes.
Today when CLI tools are executed, logging statements can intentionally
or unintentionally be executed when logging is not configured. This
leads to log messages that the status logger is not configured. This
commit reworks logging configuration for CLI tools so that logging is
always configured.
Relates #20575
Today when starting Elasticsearch without a Log4j 2 configuration file,
we end up throwing an array index out of bounds exception. This is
because we are passing no configuration files to Log4j. Instead, we
should throw a useful error message to the user. This commit modifies
the Log4j configuration setup to throw a user exception if no Log4j
configuration files are present in the config directory.
Relates #20493
The Log4j shutdown hack test tests that a hack we have in place to
workaround a bug in Log4j during shutdown is effective. Log4j can use
JMX to control logging levels, but we disable this through the use of a
system property log4j2.disable.jmx (mainly because there is no need for
this feature, but it also means granting additional security
permissions). The bug in Log4j is that during shutdown, it neglects to
check whether or not its usage of JMX is disable and so it attempts to
unregister management beans, leading to a permissions violation. The
test works by attempting to shutdown Log4j and thus triggering the bad
code path. With the Log4j hack in place, we have introduced jar hell so
that its our code running instead of code from the Log4j jar. Our code
correctly checks that the usage of JMX is disabled and thus does not
trip on a permissions violation. The test was a little complicated in
that it attempted to just grant the minimal permissions needed for Log4j
to do its thing, but this can sometimes lead to other unwanted
permissions violations because the permissions put in place are more
restrictive necessary. This commit simplifies this situation by
rewriting the test to only deny Log4j the sole permission needed to
trigger the bug.
Relates #20476
Today when setting the logging level via the command-line or an API
call, the expectation is that the logging level should trickle down the
hiearchy to descendant loggers. However, this is not necessarily the
case. For example, if loggers x and x.y are already configured then
setting the logging level on x will not descend to x.y. This is because
the logging config for x.y has already been forked from the logging
config for x. Therefore, we must explicitly descend the hierarchy when
setting the logging level and that is what this commit does.
Relates #20463
Today we add a prefix when logging within Elasticsearch. This prefix
contains the node name, and index and shard-level components if
appropriate.
Due to some implementation details with Log4j 2 , this does not work for
integration tests; instead what we see is the node name for the last
node to startup. The implementation detail here is that Log4j 2 there is
only one logger for a name, message factory pair, and the key derived
from the message factory is the class name of the message factory. So,
when the last node starts up and starts setting prefixes on its message
factories, it will impact the loggers for the other nodes.
Additionally, the prefixes are lost when logging an exception. This is
due to another implementation detail in Log4j 2. Namely, since we log
exceptions using a parameterized message, Log4j 2 decides that that
means that we do not want to use the message factory that we have
provided (the prefix message factory) and so logs the exception without
the prefix.
This commit fixes both of these issues.
Relates #20429
This commit adds a -q/--quiet option to Elasticsearch so that it does not log anything in the console and closes stdout & stderr streams. This is useful for SystemD to avoid duplicate logs in both journalctl and /var/log/elasticsearch/elasticsearch.log while still allows the JVM to print error messages in stdout/stderr if needed.
closes#17220
The evil logger tests rely on external configuration. This configuration
is shared between these tests which means that changing the
configuration for one test can cause an unrelated test to fail. In
particular, removing the appenders on the root logger so that inherited
loggers in one test do not have a console and file appender by default
breaks tests that were expecting the root logger to have these
appenders. This commit separates these configs so that these tests are
not subject to this problem.
Log4j has a bug where on shutdown it ignores that JMX might be disabled;
since it does not respect this on shutdown, it proceeds to attempt to
access JMX leading to a security exception that should have otherwise
not occurred had it respected that JMX is disabled. This commit
intentionally introduces jar hell with the Server class to work around
this bug until a fix is released.
Relates #20389
Previously we would disable console logging in certain circumstances
(for example, if Elasticsearch is not in the foreground, or if
Elasticsearch is in the foreground but an exception was thrown during
bootstrap). This commit makes this handling work with Log4j 2. This will
prevent users from seeing double bootstrap check failure messages.
Relates #20387
The 5.x series of Elasticsearch emits a warning if any of the old
logging configuration formats are present. This commit removes that
warning.
Relates #20386
By default, when an exception causes the JVM to terminate, the stack
trace is printed. In the case of failing bootstrap checks, this stack
trace is useless to the user, and might even distract them from seeing
that the bootstrap checks failed for reasons under their control. With
this commit, we cause the stack trace for a failing bootstrap check to
be truncated.
We also modify some methods to not declare that they throw the top level
checked exception type Exception, but instead explicitly declare the
exceptions that they throw. These exceptions are caught and wrapped in a
BootstrapException so that we can percolate only two exception types out
of Bootstrap#init as checked exception, BootstrapException and
NodeValidationException.
Relates #19989
The logging configuration tests write to log files which are deleted at
the end of the test. If these files are not closed, some operating
systems will complain when these deletes are performed. This commit
ensures that the logging system is properly shutdown so that these files
can be properly deleted.
The evil logging tests write to log files which are deleted at the end
of the test. If these files are not closed, some operating systems will
complain when these deletes are performed. This commit ensures that the
logging system is properly shutdown so that these files can be properly
deleted.
This commit expands on the message printed when config files are
preserved when removing a plugin to give the user an indication of the
reason the config files are preserved.
When removing a plugin with a config directory, we preserve the config
directory. This is because the workflow for upgrading a plugin involves
removing and then installing the plugin again and losing the plugin
config in this case would be terrible. This commit causes a message
regarding this to be printed in case the user wants to manually delete
these files.
This commit enables CLI tools to have console logging. For the CLI
tools, we skip configuring the logging infrastructure via the config
file, and instead set the level only via a system property.
This commit fixes failing evil logging configuration tests. The test for
resolving multiple configuration files was failing after
9a58fc2348 removed some of the
configuration needed for this test. The solution is revert the removal
of that configuration, but remove additivity from the test logger to
prevent the evil logger tests from failing.
This commit defaults the max local storage nodes to one. The motivation
for this change is that a default value greather than one is dangerous
as users sometimes end up unknowingly starting a second node and start
thinking that they have encountered data loss.
Relates #19964
This commit fixes a test bug in
EvilJNANativesTests#testSetMaximumNumberOfThreads. Namely, the test was
not checking whether or not the value from /proc/self/limits was equal
to "unlimited" before attempting to parse as a long. This commit fixes
that error.
Today `node.mode` and `node.local` serve almost the same purpose, they
are a shortcut for `discovery.type` and `transport.type`. If `node.local: true`
or `node.mode: local` is set elasticsearch will start in _local_ mode which means
only nodes within the same JVM are discovered and a non-network based transport
is used. The _local_ mode it only really used in tests or if nodes are embedded.
For both, embedding and tests explicit configuration via `discovery.type` and `transport.type`
should be preferred.
This change removes all the usage of these settings and by-default doesn't
configure a default transport implemenation since netty is now a module. Yet, to make
the user expericence flawless, plugins or modules can set a `http.type.default` and
`transport.type.default`. Plugins set this via `PluginService#additionalSettings()`
which enforces _set-once_ which prevents node startup if set multiple times. This means
that our distributions will just startup with netty transport since it's packaged as a
module unless `transport.type` or `http.transport.type` is explicitly set.
This change also found a bunch of bugs since several NamedWriteables were not registered if a
transport client is used. Now that we don't rely on the `node.mode` leniency which is inherited
instead of using explicit settings, `TransportClient` uses `AssertingLocalTransport` which detects these problems since it serializes all messages.
Closes#16234
this commit moves the most of the http related integ tests out into it's own
`qa/smoke-test-http` project where most of the test can run against the external cluster.
The top-level class Throwable represents all errors and exceptions in
Java. This hierarchy is divided into Error and Exception, the former
being serious problems that applications should not try to catch and the
latter representing exceptional conditions that an application might
want to catch and handle. This commit renames
org.elasticsearch.cli.UserError to org.elasticsearch.UserException to
make its name consistent with where it falls in this hierarchy.
Relates #19254