We hit a bug where we can't partially update documents created in a
mixed cluster between 5.x and 6.x. Although this bug does not affect
7.0 or later, we should have a good test that catches this issue.
Relates #46198
* Pass COMPUTERNAME env var to elasticsearch.bat
When we run bin/elasticsearch with bash, we get a $HOSTNAME builtin that
contains the hostname of the machine the script is running on. When
there's no provided nodename, Elasticsearch uses the HOSTNAME to create
a nodename. On Windows, Powershell provides a $COMPUTERNAME variable for
the same purpose. CMD.EXE provides the same thing, except it's called
%COMPUTERNAME%. bin/elasticsearch.bat sets $HOSTNAME to the value of
$COMPUTERNAME. However, when testclusters invokes bin/elasticsearch.bat,
the COMPUTERNAME variable doesn't get passed in, leaving HOSTNAME null
and breaking an integration test on Windows.
This commit sets COMPUTERNAME in the environment so that our tests get
the value that Elasticsearch would have when bin/elasticsearch.bat is
invoked from the shell.
* Add null check to protect in non-Windows case
What good is it a developer to gain the whole Windows if they forfeit
their Unix? The value that fixes things on Windows is null on
Linux/Darwin, so let's null-check it.
* Override system hostnames for testclusters
Rather than relying on variable system behavior, let's just override
HOSTNAME and COMPUTERNAME and test for correct values in the integration
test that was originally failing.
* Rename constants for clarity
Since we are setting HOSTNAME and COMPUTERNAME regardless of whether the
tests are running on Windows or Linux, we shouldn't imply that constants
are only used in one case or the other.
This commit moves many features of individual distro tests into the base
class so that other test cases can utilize them. It also standardizes
the pattern for tests adding assumptions for the particular
distributions to test.
Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard output. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools.
This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr.
Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).
The java based distribution tests currently have a single Tests class
which encapsulates all of the tests for a particular distribution. The
test task in gradle then depends on all distributions being built, and
each individual tests class looks for the particular distribution it is
trying to test. This means that reproducing a single test failure
triggers all the distributions to be built, even though only one is
needed for the test.
This commit reworks the java distribution tests to pass in a particular
distribution to be tested, and changes the base test classes to be
actual test classes which have assumptions around which distributions
they operate on. For example, the archives tests will be skipped when
run with an rpm distribution, and vice versa for the package tests. This
makes reproduction much more granular. It also also better splitting up
tests around a particular use case. For example, all tests for systemd
behavior can be in one test class, and run independently of all tests
against rpm/deb distributions.
This commit addresses an issue when trying to using Elasticsearch on
systems with Sys V init and the bundled JDK was not being used. Instead,
we were still inadvertently trying to fallback on the path. This commit
removes that fallback as that is against our intentions for 7.x where we
only support the bundled JDK or an explicit JDK via JAVA_HOME.
The system level tests for our distributions have historically be run in
vagrant, and thus the name of the gradle project has been "vagrant".
However, as we move to running these tests in other environments (eg
GCP) the name vagrant no longer makes sense. This commit renames the
project to "os" (short for operating system), since these tests ensure
all of our distributions run correctly on our supported operating
systems.
The bats tests currently require many additional artifacts to be built.
In addition to the current distributions, they need all the plugins to
be installed, as well as a randomly chosen bwc distribution. This commit
splits these two cases into their own bats task, so the dependencies do
not slow down other tasks like distroTests which do not need them.
The distro test plugin was originally designed to be applied within each
subproject, per operating system we run in a VM with vagrant. However,
for efficiency, and also ease of having a single task to run in CI when
launching within individual OS VMs, having the "destructive" tasks in a
single place is more convenient. This commit reworks the distro test
plugin to be applied to the qa/vagrant project, which now creates only
the wrapper tasks in each of the subprojects for each vagrant VM.
The vagrant based tests currently reside in a single project, creating
dozens of tasks to manage starting and stopping the vagrant VM along
with running java and bats tests within each image. This all-in-one
pattern makes parallelizing packaging tests difficult.
This commit rewrites the vagrant testing infrastructure to be
independent of the actual test runners, thus allowing each platform to
be handled in a separate subproject. Additionally, the java and bats
tests are changed to be run through a "destructive" gradle task, which
is run inside the VM. The combination of these will allow
parallelization both locally (through running several VMs at once) as
well as running the destructive tasks in CI machines dedicated to each
platform (thus removing the need for vagrant in CI).
* Restrict which tasks can use testclusters
This PR fixes a problem between the interaction of test-clusters and
build cache.
Before this any task could have used a cluster without tracking it as
input.
With this change a new interface is introduced to track the tasks that
can use clusters and we do consider the cluster as input for all of
them.
This commit applies a normalization process to environment paths, both
in how they are stored internally, also their settings values. This
normalization is done via two means:
- we make the paths absolute
- we remove redundant name elements from the path (what Java calls
"normalization")
This change ensures that when we compare and refer to these paths within
the system, we are using a common ground. For example, prior to the
change if the data path was relative, we would not compare it correctly
to paths from disk usage. This is because the paths in disk usage were
being made absolute.
Today we recover a replica by copying operations from the primary's translog.
However we also retain some historical operations in the index itself, as long
as soft-deletes are enabled. This commit adjusts peer recovery to use the
operations in the index for recovery rather than those in the translog, and
ensures that the replication group retains enough history for use in peer
recovery by means of retention leases.
Reverts #38904 and #42211
Relates #41536
Backport of #45136 to 7.x.
Deprecation logger was filtering log entries by key, that means that if two log messages with the same key are logged from different users, then the second log messages will be filtered.
This change allows to log deprecation message with the same key by different users.
relates #41354
backport #44587
* Mute failing test
tracked in #44552
* mute EvilSecurityTests
tracking in #44558
* Fix line endings in ESJsonLayoutTests
* Mute failing ForecastIT test on windows
Tracking in #44609
* mute BasicRenormalizationIT.testDefaultRenormalization
tracked in #44613
* fix mute testDefaultRenormalization
* Increase busyWait timeout windows is slow
* Mute failure unconfigured node name
* mute x-pack internal cluster test windows
tracking #44610
* Mute JvmErgonomicsTests on windows
Tracking #44669
* mute SharedClusterSnapshotRestoreIT testParallelRestoreOperationsFromSingleSnapshot
Tracking #44671
* Mute NodeTests on Windows
Tracking #44256
In https://github.com/elastic/elasticsearch/pull/41913 setting up the
temp dir for ES was moved from the env script to individual cli scripts.
However, moving it to the windows service cli was missed. This commit
restores setting up the temp dir for the windows service control script.
Test clusters currently has its own set of logic for dealing with
finding different versions of Elasticsearch, downloading them, and
extracting them. This commit converts testclusters to use the
DistributionDownloadPlugin.
This is a refactor to current JSON logging to make it more open for extensions
and support for custom ES log messages used inDeprecationLogger IndexingSlowLog , SearchSLowLog
We want to include x-opaque-id in deprecation logs. The easiest way to have this as an additional JSON field instead of part of the message is to create a custom DeprecatedMessage (extends ESLogMEssage)
These messages are regular log4j messages with a text, but also carry a map of fields which can then populate the log pattern. The logic for this lives in ESJsonLayout and ESMessageFieldConverter.
Similar approach can be used to refactor IndexingSlowLog and SearchSlowLog JSON logs to contain fields previously only present as escaped JSON string in a message field.
closes#41350
backport #41354
* Testclusters: Convert additional projects
Found some more that were not using testclusters from elasticsearch-ci/1
* Allow IOException too
* Make the client more resilient
This test is likely to kill the server in the middle of writing logs. This means that we can end up with logs with partially written json log lines and standard json parsers would fail on this.
This fix is to use regular expressions on json logs.(just like the previous approach on plain text logs)
closes#43413
In hamcrest 2.1 warnings for unchecked varargs were fixed by hamcrest using @SafeVarargs for those matchers where this warning occurred.
This PR is aimed to remove these annotations when Matchers.contains ,Matchers.containsInAnyOrder or Matchers.hasItems was used
backport #41528
We had this as a dependency for legacy dependencies that still needed
the Log4j 1.2 API. This appears to no longer be necessary, so this
commit removes this artifact as a dependency.
To remove this dependency, we had to fix a few places where we were
accidentally relying on Log4j 1.2 instead of Log4j 2 (easy to do, since
both APIs were on the compile-time classpath).
Finally, we can remove our custom Netty logger factory. This was needed
when we were on Log4j 1.2 and handled logging in our own unique
way. When we migrated to Log4j 2 we could have dropped this
dependency. However, even then Netty would still pick up Log4j 1.2 since
it was on the classpath, thus the advantage to removing this as a
dependency now.
ES does not accept doc type starting with underscore until 6.2.0. We
have to use "doc" instead of "_doc" in FullClusterRestartIT if we are
upgrading from a 6.2.0- cluster.
Closes#42581
This change verifies and aborts recovery if source and target have the
same syncId but different sequenceId. This commit also adds an upgrade
test to ensure that we always utilize syncId.
The date_histogram accepts an interval which can be either a calendar
interval (DST-aware, leap seconds, arbitrary length of months, etc) or
fixed interval (strict multiples of SI units). Unfortunately this is inferred
by first trying to parse as a calendar interval, then falling back to fixed
if that fails.
This leads to confusing arrangement where `1d` == calendar, but
`2d` == fixed. And if you want a day of fixed time, you have to
specify `24h` (e.g. the next smallest unit). This arrangement is very
error-prone for users.
This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
Calendar only accepts calendar intervals, fixed accepts any combination of
units (meaning `1d` can be used to specify `24h` in fixed time), and both
are mutually exclusive.
The old interval behavior is deprecated and will throw a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the
old dual-purpose interval will be removed.
The change applies to both REST and java clients.
This commit adds deletion of the bin directory to postrm cleanup. While
the package's bin files are cleaned up by the package manager, plugins
may have created subdirectories under bin. We already cleanup plugins,
but not the extra bin dirs their installation created.
closes#18109
The deb package has been updated several times in the past to contain
overrides in order to pass lintian inspection. However, there have never
been any tests to ensure we do not fallback to failure. This commit
updates the overrides file given things that have changed since 2.x like
adding ML and bundling the jdk.
closes#17185
We have faked some Ivy repositories on a few artifact locations. Today
when Gradle attempts to resolve these artifacts, it follows its default
strategy to search for Gradle metadata, then Maven POM files, then Ivy
descriptors, and finally will fallback to looking directly for the
artifact. This wastes times on remote network calls that will 404 anyway
since these metadata resources will not exist for these fake Ivy
repositories. This commit overrides the Gradle strategy to look directly
for artifacts.
When Elasticsearch is run from a package installation, the running
process does not have permissions to write to the keystore. This is
because of the root:root ownership of /etc/elasticsearch. This is why we
create the keystore if it does not exist during package installation. If
the keystore needs to be upgraded, that is currently done by the running
Elasticsearch process. Yet, as just mentioned, the Elasticsearch process
would not have permissions to do that during runtime. Instead, this
needs to be done during package upgrade. This commit adds an upgrade
command to the keystore CLI for this purpose, and that is invoked during
package upgrade if the keystore already exists. This ensures that we are
always on the latest keystore format before the Elasticsearch process is
invoked, and therefore no upgrade would be needed then. While this bug
has always existed, we have not heard of reports of it in practice. Yet,
this bug becomes a lot more likely with a recent change to the format of
the keystore to remove the distinction between file and string entries.
hamcrest has some improvements in newer versions, like FileMatchers
that make assertions regarding file exists cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
The pid dir for both systemd and init.d is already managed by those
respective systems (tmpfiles.d and the init script, respectively). Since
the /var/run dir is often mounted as tmpfs, it does not make sense to
have the elasticsearch pid dir added by the package installation. This
commit removes that empty dir from deb and rpm.
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
If the relocation is throttled, the subsequent search request on the
target node (i.e., with preference _only_nodes=target_node) will fail
because some shards have not moved to that node yet. With this change,
we will wait for the relocation happens by busily checking the routing
table of the testing index on the target node.
Closes#34950
* Avoid sharing source directories as it breaks intellij
* Subprojects share main project output classes directory
* Fix jar hell
* Fix sql security with ssl integ tests
* Relax dependency ordering rule so we don't explode on cycles