This commit introduces a consistent, type-safe manner for handling
global build parameters throughout our build logic. Primarily this
replaces the existing usages of extra properties with static accessors.
It also introduces an explicit API for initialization and mutation of
any such parameters, as well as better error handling for uninitialized
or eager access of parameter values.
Closes #42042
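As a rough illustration of the pattern, here is a minimal sketch in Java; the class and member names are hypothetical, not the actual build-logic API:

```java
// Hypothetical sketch of a type-safe holder for a global build parameter.
public class BuildParams {
    private static String gitRevision; // null until explicitly initialized

    // Explicit initialization API, called once by the build plugin.
    public static void init(String revision) {
        gitRevision = revision;
    }

    // Static accessor that fails loudly on uninitialized or eager access.
    public static String getGitRevision() {
        if (gitRevision == null) {
            throw new IllegalStateException(
                "build parameter [gitRevision] has not been initialized yet");
        }
        return gitRevision;
    }
}
```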
* Always pass user-specified MaxDirectMemorySize
We had been testing whether a user had passed a value for
MaxDirectMemorySize by parsing the output of "java -XX:+PrintFlagsFinal
-version". If MaxDirectMemorySize equals zero, we set it to half of max
heap. The problem is that on Windows with JDK 8, a JDK bug incorrectly
truncates values over 4g and returns multiples of 4g as zero. In order
to always respect the user-defined settings, we need to check our input
to see if an "-XX:MaxDirectMemorySize" value has been passed (see the
sketch below).
* Always warn for Windows/jdk8 ergo issue
Even if a user has set MaxDirectMemorySize, they are not protected from
this JDK bug going forward. With this change, we issue a general warning
for the Windows/JDK 8 issue, and a specific warning if MaxDirectMemorySize
is unset.
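A hedged sketch of the explicit input check described in the first item; the class and helper names are illustrative:

```java
import java.util.List;

class JvmOptionCheck {
    // Illustrative: did the user explicitly pass -XX:MaxDirectMemorySize?
    static boolean userSetMaxDirectMemorySize(List<String> userJvmOptions) {
        return userJvmOptions.stream()
                .anyMatch(option -> option.startsWith("-XX:MaxDirectMemorySize="));
    }
}
```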
This commit moves JVM options that we are setting on behalf of the user
that we do not expect them to fiddle with out of the jvm.options
configuration file and into the JVM options parser. In this way, we
discourage fiddling with these settings, but more importantly, we ensure
that as we evolve or add to these settings, a user will pick them up
instead of being left behind because they have a modified jvm.options
file and do not pick up any new settings that come with the distribution.
Our JVM ergonomics extract max heap size from JDK PrintFlagsFinal output.
On JDK 8, there is a system-dependent bug where memory sizes are cast to
32-bit integers. On affected systems (namely, Windows), when 1/4 of physical
memory is more than the maximum integer value, the output of PrintFlagsFinal
will be inaccurate. In the pathological case, where the max heap size would
be a multiple of 4g, the test will fail.
The practical effect of this bug, beyond test failures, is that we may set
MaxDirectMemorySize to an incorrect value on Windows. This commit adds a
warning about this situation during startup.
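For illustration only (this is not the JDK's code), the arithmetic behind the truncation:

```java
class TruncationDemo {
    public static void main(String[] args) {
        long thirtyTwoGigabytes = 32L * 1024 * 1024 * 1024; // a multiple of 4g
        // Truncating to 32 bits drops the high bits: 32g is 8 * 2^32, so it becomes 0.
        System.out.println((int) thirtyTwoGigabytes); // prints 0
    }
}
```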
This commit removes the option to change the netty system properties to
re-enable the direct buffer pooling. It also removes the need for us to
disable the buffer pooling in the system properties file. Instead, we
programmatically create an allocator that is used by our networking
layer.
This commit does introduce an Elasticsearch property which allows the
user to fall back to the netty default allocator. If they choose this
option, they can configure the default allocator however they wish using
the standard netty properties.
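A hedged sketch of what programmatically creating the allocator might look like; the fallback property name here is hypothetical, and the real implementation may configure the allocator differently:

```java
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.UnpooledByteBufAllocator;

class NettyAllocatorFactory {
    // Illustrative: honor a (hypothetical) escape-hatch property to fall back to
    // Netty's default allocator, otherwise build a heap-preferring unpooled allocator.
    static ByteBufAllocator create() {
        if (Boolean.parseBoolean(System.getProperty("es.use.netty.default.allocator", "false"))) {
            return ByteBufAllocator.DEFAULT; // configurable via the standard Netty properties
        }
        return new UnpooledByteBufAllocator(false); // preferDirect = false
    }
}
```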
This test has been failing very frequently on Windows checks for 7.4. We've also seen it for 7.x as well as for the new 7.5 tests. I'm muting it during the investigation so we can see if it is masking any other issues.
Compilation was accidentally broken here when a backport used code from
JDK 9, which is not supported in 7.x. This commit addresses this by
using JDK 8 compatible APIs.
This commit moves the ES_TMPDIR substitution that we do for JVM options
into the JVM options parser itself. This solves a problem where, because
we did not make the substitution before ergonomics parsing, the JVM that
we start for computing the ergonomic values could fail to start.
Additionally, moving this substitution here enables us to simplify the
shell scripts, since we no longer need to implement it there, once for
Bash and once for Windows.
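A minimal sketch of the substitution step, assuming the parser sees the options as a list of strings (names are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;

class TmpDirSubstitution {
    // Illustrative: expand the ${ES_TMPDIR} placeholder inside JVM options
    // before any further parsing or ergonomics runs.
    static List<String> substitute(List<String> jvmOptions, String esTmpDir) {
        return jvmOptions.stream()
                .map(option -> option.replace("${ES_TMPDIR}", esTmpDir))
                .collect(Collectors.toList());
    }
}
```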
This is the Java side of https://github.com/elastic/ml-cpp/pull/593
with a fallback so that ml-cpp bundles with either the
new or old directory structure work for the time being.
A few days after merging the C++ changes a followup to
this change will be made that removes the fallback.
We currently configure io.netty.allocator.numDirectArenas to be 0 in the
JVM ergonomics class. This is a config that we always want to set, so it
makes sense to move it to jvm.options.
Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard output. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools.
This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr.
Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).
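A hedged usage sketch; the Terminal below is a simplified stand-in for the real class, showing only the split between standard output and standard error:

```java
import java.io.PrintStream;

// Simplified stand-in for the Terminal API described above.
class Terminal {
    private final PrintStream out = System.out;
    private final PrintStream err = System.err;

    void println(String msg)      { out.println(msg); }  // legitimate output, safe to pipe
    void errorPrintln(String msg) { err.println(msg); }  // warnings and errors

    public static void main(String[] args) {
        Terminal terminal = new Terminal();
        terminal.errorPrintln("warning: this tool is deprecated"); // stderr, visible in pipelines
        terminal.println("keystore.seed");                         // stdout, consumable by other tools
    }
}
```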
Elasticsearch does not grant Netty reflection access to get Unsafe. The
only mechanism that currently exists to free direct buffers in a timely
manner is to use Unsafe. This leads to the occasional scenario, under
heavy network load, in which direct byte buffers slowly build up without
being freed.
This commit disables Netty direct buffer pooling and moves to a strategy
of using a single thread-local direct buffer for interfacing with sockets.
This will reduce the memory usage from networking. Elasticsearch
currently derives very little value from direct buffer usage (TLS,
compression, Lucene, Elasticsearch handling, etc all use heap bytes). So
this seems like the correct trade-off until that changes.
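A minimal sketch of the thread-local strategy; the buffer size and names are illustrative, not the actual networking code:

```java
import java.nio.ByteBuffer;

class SocketBuffers {
    private static final int BUFFER_SIZE = 1 << 16; // 64 KiB, chosen for illustration

    // One reusable direct buffer per thread for copying heap bytes to and from
    // sockets, instead of relying on pooled direct buffers.
    private static final ThreadLocal<ByteBuffer> DIRECT_BUFFER =
            ThreadLocal.withInitial(() -> ByteBuffer.allocateDirect(BUFFER_SIZE));

    static ByteBuffer get() {
        ByteBuffer buffer = DIRECT_BUFFER.get();
        buffer.clear(); // ready for reuse by the current thread
        return buffer;
    }
}
```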
This commit applies a normalization process to environment paths, both
in how they are stored internally and in their settings values. This
normalization is done via two means:
- we make the paths absolute
- we remove redundant name elements from the path (what Java calls
"normalization")
This change ensures that when we compare and refer to these paths within
the system, we are using a common ground. For example, prior to the
change if the data path was relative, we would not compare it correctly
to paths from disk usage. This is because the paths in disk usage were
being made absolute.
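In Java terms, the two steps amount to the following (a sketch, not the exact environment code):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathNormalization {
    // Make the path absolute, then strip redundant name elements ("." and "..").
    static Path normalize(Path path) {
        return path.toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // e.g. "data/../data" resolves to an absolute ".../data"
        System.out.println(normalize(Paths.get("data/../data")));
    }
}
```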
* Mute failing test
tracked in #44552
* mute EvilSecurityTests
tracking in #44558
* Fix line endings in ESJsonLayoutTests
* Mute failing ForecastIT test on windows
Tracking in #44609
* mute BasicRenormalizationIT.testDefaultRenormalization
tracked in #44613
* fix mute testDefaultRenormalization
* Increase busyWait timeout; Windows is slow
* Mute failure unconfigured node name
* mute x-pack internal cluster test windows
tracking #44610
* Mute JvmErgonomicsTests on windows
Tracking #44669
* mute SharedClusterSnapshotRestoreIT testParallelRestoreOperationsFromSingleSnapshot
Tracking #44671
* Mute NodeTests on Windows
Tracking #44256
Today when checksumming a plugin zip during plugin install, we read all
of the bytes of the zip into memory at once. When trying to run the
plugin installer on a small heap (say, 64 MiB), this can lead to the
plugin installer running out of memory when checksumming large
plugins. This commit addresses this by reading the plugin bytes in 8 KiB
chunks, thus using a constant amount of memory independent of the size
of the plugin.
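A sketch of the chunked approach; the digest algorithm shown is an assumption for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ChunkedChecksum {
    // Digest the plugin zip in 8 KiB chunks so memory use stays constant
    // regardless of the size of the plugin.
    static byte[] checksum(Path zip) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-512");
        try (InputStream in = Files.newInputStream(zip)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return digest.digest();
    }
}
```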
This change makes the process of verifying the signature of
official plugins FIPS 140 compliant by defaulting to use the
BouncyCastle FIPS provider and adding a dependency to bcpg-fips
that implements parts of OpenPGP in a FIPS compliant manner.
In already FIPS 140 enabled environments that use the
BouncyCastle FIPS provider, the bcfips dependency is redundant
but doesn't cause an issue, as it will be added only to the classpath
of the cli-tools.
This is a backport of #44224
Java 8 presents the JVM options slightly differently when displaying via
-XX:+PrintFlagsFinal. This commit adapts the JVM options parser for this
possibility.
Relates #42009
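A hedged sketch of a parser tolerant of both output styles; the exact column layout differs across JDKs, so the regex is illustrative rather than exhaustive:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsFinalParser {
    // Tolerates both ":=" (JDK 8 marks modified values this way) and "=" as the
    // assignment marker in -XX:+PrintFlagsFinal output.
    private static final Pattern LINE =
            Pattern.compile("^\\s*\\S+\\s+(?<flag>\\S+)\\s+:?=\\s+(?<value>\\d+)");

    static Optional<Long> parse(String line, String flag) {
        Matcher matcher = LINE.matcher(line);
        if (matcher.find() && matcher.group("flag").equals(flag)) {
            return Optional.of(Long.parseLong(matcher.group("value")));
        }
        return Optional.empty();
    }
}
```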
This commit removes manual parsing of JVM options when calculating
ergonomics. This is to avoid a situation that we parse values
differently than the JVM would. In fact, we already have a bug along
these lines today. It is possible to start the JVM with the same flag
multiple times on the command line. In this case, the last value
wins. For example, -Xmx1g -Xmx2g would start the JVM with a heap size of
two gigabytes. Our JVM ergonomics ignores this possibility and instead
takes the first value!
Our strategy to avoid manual parsing of the JVM options is to start the
Java command line parser (without actually starting a JVM) by invoking
java with the same command line flags as presented and request that the
JVM tell us what values it would start with. This ensures that we have
the correct values when making ergonomic decisions.
Moreover, our previous strategy also ignored ES_JAVA_OPTS, which could
override the heap size as well, leading to incorrect ergonomic
choices. This commit addresses this issue too.
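A hedged sketch of that strategy; flag handling and error checking are simplified:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class FinalJvmOptions {
    // Ask the JVM itself which flag values it would start with, given the
    // user's options (including anything from ES_JAVA_OPTS).
    static List<String> printFlagsFinal(List<String> userJvmOptions)
            throws IOException, InterruptedException {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.addAll(userJvmOptions);
        command.add("-XX:+PrintFlagsFinal");
        command.add("-version");

        Process process = new ProcessBuilder(command).start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        return lines; // parse MaxHeapSize, MaxDirectMemorySize, etc. from these lines
    }
}
```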
hamcrest has some improvements in newer versions, like FileMatchers,
that make assertions about file existence cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
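For example (assuming the upgraded hamcrest's org.hamcrest.io.FileMatchers is available):

```java
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.io.FileMatchers.anExistingFile;

import java.io.File;

class FileAssertions {
    // A matcher-based assertion reports the offending path on failure,
    // unlike a bare assertTrue(file.exists()).
    static void assertFileExists(File file) {
        assertThat(file, anExistingFile());
    }
}
```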
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
This commit deprecates versions of Java prior to Java 11. This commit
will cause a warning to be printed to standard error when any command
line tool is invoked, or when Elasticsearch is started. Additionally, we
log a deprecation message when Elasticsearch is started.
* Testing conventions now checks for tests in main
This is the last outstanding feature of the old NamingConventionsTask,
so time to remove it.
* PR review
We added some special handling for installing and removing the
ingest-geoip and ingest-user-agent plugins when we converted them to
modules. This special handling was done to minimize breaking users in a
minor release. However, we do not want to maintain this behavior forever, so
this commit removes that special handling in the master branch so that
starting with 7.0.0 this special handling will be gone.
* Don't print download progress in batch mode
With this change we will no longer provide the progress bar in batch
mode.
Assuming that this mode is mainly for consumption by tools which
will serialize the output, we shouldn't print a progress bar update
for every percentile.
* PR review
In the long run we want to move all of startup to a Java program. This
will simplify our startup scripts and make maintenance of startup less
dependent on the underlying platform that we run on. This commit moves
the creation of the temporary directory off of system-dependent commands
and onto a simple Java program.
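A minimal sketch of such a program; the directory prefix is illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TempDirectory {
    public static void main(String[] args) throws IOException {
        // Create the temporary directory from Java rather than with
        // platform-specific shell commands.
        Path tmpDir = Files.createTempDirectory("elasticsearch-");
        // The startup scripts can capture this path (e.g. as ES_TMPDIR).
        System.out.println(tmpDir);
    }
}
```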
The list of official plugins accidentally included `qa` projects like,
well, `qa` and `amazon-ec2`. This changes the mechanism that we use to
build the list and adds a test to catch this.
Closes #35623
With this change, `Version` no longer carries information about the qualifier,
but we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored by the build and read from `META-INF`.
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However, this optimization is not activated in this change; there is another issue open to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted, but we need to forbid them completely: #33309
Closes #32899
- third party audit detects jar hell with JDK so we disable it
- the jdk-non-portable check in forbiddenapis detects classes being used from the
JDK (for FIPS) that are not portable; this is intended, so we don't
scan for it on FIPS.
- different exclusion rules for third party audit on fips
Closes #33179
Ensure our tests can run in a FIPS JVM
JKS keystores cannot be used in a FIPS JVM as attempting to use one
in order to init a KeyManagerFactory or a TrustManagerFactory is not
allowed. (JKS keystore algorithms for private key encryption are not
FIPS 140 approved.)
This commit replaces JKS keystores in our tests with the
corresponding PEM encoded key and certificates both for key and trust
configurations.
Whenever it's not possible to refactor the test, i.e. when we are
testing that we can load a JKS keystore, etc., we attempt to
mute the test when we are running in a FIPS 140 JVM. Testing for the
JVM is naive and is based on the name of the security provider; since
we control the testing infrastructure, this should be reliable
enough.
Other cases of tests being muted are the ones that involve custom
TrustStoreManagers or KeyStoreManagers, null TLS ciphers, and the
SAMLAuthenticator class, as we cannot sign XML documents in the
way we were doing. SAMLAuthenticator tests in a FIPS JVM can be
re-enabled with precomputed and signed SAML messages at a later stage.
This will be covered in a subsequent PR.
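A sketch of the naive provider-name check described above (assuming this is roughly how the test infrastructure detects a FIPS JVM):

```java
import java.security.Security;
import java.util.Locale;

class FipsCheck {
    // Naive: rely on the name of the highest-priority security provider.
    static boolean inFipsJvm() {
        return Security.getProviders()[0].getName()
                .toLowerCase(Locale.ROOT)
                .contains("fips");
    }
}
```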
With this commit we add the possibility to define further JVM options (and
system properties) based on the current environment. As a proof of concept, it
chooses Netty's allocator ergonomically based on the maximum defined heap size.
We switch to the unpooled allocator at 1GB heap size (value determined
experimentally, see #30684 for more details). We are also explicit about the
choice of the allocator in either case.
Relates #30684
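A hedged sketch of the ergonomic choice; the 1 GB threshold comes from the text above, while the structure of the real parser is simplified away:

```java
import java.util.Collections;
import java.util.List;

class NettyAllocatorErgonomics {
    private static final long ONE_GB = 1L << 30;

    // Return the extra JVM option choosing Netty's allocator explicitly,
    // based on the configured maximum heap size.
    static List<String> choose(long maxHeapSizeInBytes) {
        String allocatorType = maxHeapSizeInBytes <= ONE_GB ? "unpooled" : "pooled";
        return Collections.singletonList("-Dio.netty.allocator.type=" + allocatorType);
    }
}
```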
This was silly; Bouncy Castle has an armored input stream for reading
keys in ASCII armor format. This means that we do not need to strip the
header ourselves and base64 decode the key. This had problems anyway
because of discrepancies in the padding that Bouncy Castle would produce
and the JDK base64 decoder was expecting. Now that we armor input/output
the whole way during tests, we fix all random failures in test cases
too.
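A sketch of reading an armored key with Bouncy Castle (a minimal example, not the plugin installer's exact code):

```java
import java.io.IOException;
import java.io.InputStream;

import org.bouncycastle.bcpg.ArmoredInputStream;
import org.bouncycastle.openpgp.PGPException;
import org.bouncycastle.openpgp.PGPPublicKeyRingCollection;
import org.bouncycastle.openpgp.operator.jcajce.JcaKeyFingerprintCalculator;

class ArmoredKeyReader {
    // Read an ASCII-armored public key directly; no manual header stripping
    // or base64 decoding required.
    static PGPPublicKeyRingCollection read(InputStream armoredKey) throws IOException, PGPException {
        try (ArmoredInputStream in = new ArmoredInputStream(armoredKey)) {
            return new PGPPublicKeyRingCollection(in, new JcaKeyFingerprintCalculator());
        }
    }
}
```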