This commit disables caching of BWC snapshot distributions in the "trunk" (aka master) branch.
Since the previous major release branches move quickly we rarely get cache hits for these
tasks, and the artifacts themselves are very large. This means the overhead here is high and
savings basically zero. We conditionally disable task output caching in this scenario in CI to
avoid excessive build cache overhead as well as too much churn in the cache itself, which
would lead to lots of cache entry evictions.
With the next minor release of Elasticsearch we will drop support for
JDK 12 and bump to JDK 13. While we want to use AdoptOpenJDK as the
bundled JDK, we are waiting for a release there. This commit moves to
OpenJDK 13 for now, and we will move to AdoptOpenJDK 13 as soon as it's
available. Since macOS Catalina is delayed until October, we have some
time to update this.
This commit teaches the build how to bundle AdoptOpenJDK with our
artifacts, and switches to AdoptOpenJDK as the bundled JDK. We keep the
functionality to also bundle Oracle OpenJDK distributions.
In some cases (for example some AdoptOpenJDK builds), the java.vendor is
mistakenly populated as "Oracle Corporation" while the real value is
under "java.vendor.version". Since "java.vendor.version" is mandatory
since JDK 10, this commit changes to use "java.vendor.version" as the
favored system property to find the JVM vendor, and we fall back to
"java.vendor" if this is not populated (as happens in some Oracle
builds). Ugh.
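The lookup order described above can be sketched as follows. This is a minimal illustration, not the build's actual code: the class and method names (`JvmVendorLookup`, `findVendor`) are hypothetical, and treating a blank value as "not populated" is an assumption.

```java
import java.util.Properties;

public final class JvmVendorLookup {
    private JvmVendorLookup() {}

    /**
     * Prefers "java.vendor.version" and falls back to "java.vendor"
     * when the preferred property is missing or blank (assumption).
     */
    public static String findVendor(Properties props) {
        String vendorVersion = props.getProperty("java.vendor.version");
        if (vendorVersion != null && !vendorVersion.trim().isEmpty()) {
            return vendorVersion;
        }
        return props.getProperty("java.vendor");
    }

    public static void main(String[] args) {
        // Prints whatever the running JVM reports.
        System.out.println(findVendor(System.getProperties()));
    }
}
```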
Before this change we would run bwc nodes with their bundled jdk if
these supported it, so the passed in runtime JDK was not honored.
This became obvious when running with FIPS.
Closes #41721
In order to track down #46091:
* Enables debug logging in REST tests for `master` and `coordination` packages
since we suspect that issues are caused by failed and then retried publications
Previously we only turned on tests if we saw either `// CONSOLE` or
`// TEST`. These magic comments are difficult for the docs build to deal
with so it has moved away from using them where possible. We should
catch up. This adds another trigger to enable testing: marking a snippet
with the `console` language. It looks like this:
```
[source,console]
----
GET /
----
```
This saves a line which is nice, I guess. But it is more important to me
that this is consistent with the way the docs build works now.
Similarly this enables response testing when you mark a snippet with the
language `console-result`. That looks like:
```
[source,console-result]
----
{
"result": "0.1"
}
----
```
`// TESTRESPONSE` is still available for situations like `// TEST`: when
the response isn't *in* the console-result language (like `_cat`) or
when you want to perform substitutions on the generated test.
Should unblock #46159.
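The triggers described above could be expressed roughly like this. It is only a sketch: `SnippetTestTrigger` and its methods are hypothetical names, and passing the magic comments as a list of lines attached to the snippet is an assumption about how a parser might hand them over.

```java
import java.util.List;

public final class SnippetTestTrigger {
    private SnippetTestTrigger() {}

    /** A snippet becomes a test if it uses the `console` language or carries // CONSOLE or // TEST. */
    public static boolean isTested(String language, List<String> magicComments) {
        if ("console".equals(language)) {
            return true;
        }
        for (String comment : magicComments) {
            String c = comment.trim();
            // "// TEST[...]" carries options; "// TESTRESPONSE" is handled separately.
            if (c.equals("// CONSOLE") || c.equals("// TEST") || c.startsWith("// TEST[")) {
                return true;
            }
        }
        return false;
    }

    /** A snippet is a tested response if it uses `console-result` or carries // TESTRESPONSE. */
    public static boolean isTestedResponse(String language, List<String> magicComments) {
        if ("console-result".equals(language)) {
            return true;
        }
        for (String comment : magicComments) {
            if (comment.trim().startsWith("// TESTRESPONSE")) {
                return true;
            }
        }
        return false;
    }
}
```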
This adds support for verifying that snippets with the `console-result`
language are valid JSON. It also switches the response snippets on the
`docs/get` page from `js` to `console-result` which will allow clients
to provide "alternatives" for them like they can now do with
`// CONSOLE` snippets.
* Pass COMPUTERNAME env var to elasticsearch.bat
When we run bin/elasticsearch with bash, we get a $HOSTNAME builtin that
contains the hostname of the machine the script is running on. When
there's no provided nodename, Elasticsearch uses the HOSTNAME to create
a nodename. On Windows, PowerShell provides a $COMPUTERNAME variable for
the same purpose. CMD.EXE provides the same thing, except it's called
%COMPUTERNAME%. bin/elasticsearch.bat sets $HOSTNAME to the value of
$COMPUTERNAME. However, when testclusters invokes bin/elasticsearch.bat,
the COMPUTERNAME variable doesn't get passed in, leaving HOSTNAME null
and breaking an integration test on Windows.
This commit sets COMPUTERNAME in the environment so that our tests get
the value that Elasticsearch would have when bin/elasticsearch.bat is
invoked from the shell.
* Add null check to protect in non-Windows case
What good is it a developer to gain the whole Windows if they forfeit
their Unix? The value that fixes things on Windows is null on
Linux/Darwin, so let's null-check it.
* Override system hostnames for testclusters
Rather than relying on variable system behavior, let's just override
HOSTNAME and COMPUTERNAME and test for correct values in the integration
test that was originally failing.
* Rename constants for clarity
Since we are setting HOSTNAME and COMPUTERNAME regardless of whether the
tests are running on Windows or Linux, we shouldn't imply that constants
are only used in one case or the other.
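The end state of this series (always override both variables, regardless of platform) might look like the sketch below. The class name, the constant names and the placeholder values are all hypothetical and chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public final class NodeEnvironment {
    private NodeEnvironment() {}

    // Hypothetical placeholder values; an integration test could assert on them.
    public static final String HOSTNAME_OVERRIDE = "test-hostname";
    public static final String COMPUTERNAME_OVERRIDE = "test-computername";

    /** Returns a copy of the base environment with both host name variables forced to known values. */
    public static Map<String, String> withHostOverrides(Map<String, String> base) {
        Map<String, String> env = new HashMap<>(base);
        env.put("HOSTNAME", HOSTNAME_OVERRIDE);         // read by bin/elasticsearch under bash
        env.put("COMPUTERNAME", COMPUTERNAME_OVERRIDE); // read by bin/elasticsearch.bat on Windows
        return env;
    }
}
```

Setting both unconditionally avoids the null check that was needed when the value was read from the host system.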
Since credentials are required to access such a repository, and these
repositories are accessed over an encrypted protocol (https), this
commit adds support to consider S3-backed artifact repositories as
secure. Additionally, we add tests for this functionality.
This commit adds a destructiveDistroTest task which depends on all of
the distribution specific destructive tasks, which can be used by CI.
Closes #45769
The java based distribution tests currently have a single Tests class
which encapsulates all of the tests for a particular distribution. The
test task in gradle then depends on all distributions being built, and
each individual tests class looks for the particular distribution it is
trying to test. This means that reproducing a single test failure
triggers all the distributions to be built, even though only one is
needed for the test.
This commit reworks the java distribution tests to pass in a particular
distribution to be tested, and changes the base test classes to be
actual test classes which have assumptions around which distributions
they operate on. For example, the archives tests will be skipped when
run with an rpm distribution, and vice versa for the package tests. This
makes reproduction much more granular. It also allows better splitting up of
tests around a particular use case. For example, all tests for systemd
behavior can be in one test class, and run independently of all tests
against rpm/deb distributions.
* Add input and output tracking of built bwc versions
This PR adds tracking of the bwc version's git hash as input and all the
expected files as output.
The effect is that `gradlew` is not called at all when the git hash
doesn't change and the version was already built.
Previously gradlew would be called for the bwc version and it would have
to configure the project and go through up-to-date checks to figure out
that nothing changed.
This helps when working on bwc tests locally needing to run the test
multiple times.
This should also help in CI not re-build bwc versions across different
runs.
* Enable caching of bwc builds
This commit adds CNAME reporting for transport.publish_address the same way
it's done for http.publish_address.
Relates #32806
Relates #39970
(cherry picked from commit e0a2558a4c3a6b6fbfc6cd17ed34a6f6ef7b15a9)
Today we shell out to git rev-parse to read the git revision. Forking
another process is slower than reading the revision directly. This
commit changes to directly read the git revision from the repository,
which avoids forking another process.
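Reading the revision without forking can be sketched as below. This is an assumption-laden illustration, not the build's code: `GitRevisionReader` is a hypothetical name, it expects `gitDir` to be a regular `.git` directory, and it does not handle git worktrees, where `.git` is a file pointing elsewhere.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public final class GitRevisionReader {
    private GitRevisionReader() {}

    /** Resolves HEAD to a commit hash by reading files under the given .git directory. */
    public static String readRevision(Path gitDir) throws IOException {
        String head = readTrimmed(gitDir.resolve("HEAD"));
        if (!head.startsWith("ref:")) {
            return head; // detached HEAD: the file holds the hash itself
        }
        String refName = head.substring("ref:".length()).trim();
        Path looseRef = gitDir.resolve(refName);
        if (Files.exists(looseRef)) {
            return readTrimmed(looseRef);
        }
        // Fall back to packed-refs, whose entries look like "<hash> <refname>".
        Path packedRefs = gitDir.resolve("packed-refs");
        if (Files.exists(packedRefs)) {
            for (String line : Files.readAllLines(packedRefs, StandardCharsets.UTF_8)) {
                String[] parts = line.split(" ");
                if (parts.length == 2 && parts[1].equals(refName)) {
                    return parts[0];
                }
            }
        }
        throw new IOException("Cannot resolve " + refName + " in " + gitDir);
    }

    private static String readTrimmed(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
    }
}
```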
The dependency on copying distributions was accidentally masked by an
earlier refactoring. This commit fixes the copyDistributions task to be
run before bats tests run.
The bats tests currently require many additional artifacts to be built.
In addition to the current distributions, they need all the plugins to
be installed, as well as a randomly chosen bwc distribution. This commit
splits these two cases into their own bats task, so the dependencies do
not slow down other tasks like distroTests which do not need them.
The distro test plugin was originally designed to be applied within each
subproject, per operating system we run in a VM with vagrant. However,
for efficiency, and also ease of having a single task to run in CI when
launching within individual OS VMs, having the "destructive" tasks in a
single place is more convenient. This commit reworks the distro test
plugin to be applied to the qa/vagrant project, which now creates only
the wrapper tasks in each of the subprojects for each vagrant VM.
Before #45064, the bats tests skipped the upgrade tests when the random
upgrade version is before 6.3.0. This commit restores that behavior.
Closes #45476
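The restored guard amounts to a version comparison against 6.3.0. A minimal sketch, assuming versions of the form `major.minor.revision` with an optional qualifier; the bats tests themselves are shell scripts, so `UpgradeVersionCheck` is purely illustrative.

```java
public final class UpgradeVersionCheck {
    private UpgradeVersionCheck() {}

    /** Returns true when the given version is 6.3.0 or later, i.e. upgrade tests should run. */
    public static boolean supportsUpgradeTest(String version) {
        // Split on '.' and '-' so that "6.3.0-SNAPSHOT" also parses.
        String[] parts = version.split("[.-]");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        return major > 6 || (major == 6 && minor >= 3);
    }
}
```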
The vagrant based tests currently reside in a single project, creating
dozens of tasks to manage starting and stopping the vagrant VM along
with running java and bats tests within each image. This all-in-one
pattern makes parallelizing packaging tests difficult.
This commit rewrites the vagrant testing infrastructure to be
independent of the actual test runners, thus allowing each platform to
be handled in a separate subproject. Additionally, the java and bats
tests are changed to be run through a "destructive" gradle task, which
is run inside the VM. The combination of these will allow
parallelization both locally (through running several VMs at once) as
well as running the destructive tasks in CI machines dedicated to each
platform (thus removing the need for vagrant in CI).
* Restrict which tasks can use testclusters
This PR fixes a problem in the interaction between testclusters and the
build cache.
Before this any task could have used a cluster without tracking it as
input.
With this change a new interface is introduced to track the tasks that
can use clusters and we do consider the cluster as input for all of
them.
This commit makes the gitRevision property a lazy loaded value by
returning an Object implementing toString(). The Dockerfile template is
also changed to use groovy templates instead of the mavenfilter hack, so
converting to String will not happen until runtime.
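The "Object implementing toString()" trick can be sketched like this. `LazyString` is a hypothetical name; the point is only that the expensive lookup is deferred until something converts the object to a String, and then runs once.

```java
import java.util.function.Supplier;

public final class LazyString {
    private final Supplier<String> supplier;
    private String value;

    public LazyString(Supplier<String> supplier) {
        this.supplier = supplier;
    }

    /** Computes the value on first use and reuses it afterwards. */
    @Override
    public synchronized String toString() {
        if (value == null) {
            value = supplier.get();
        }
        return value;
    }
}
```

String concatenation and template engines call `toString()` on demand, so passing such an object around costs nothing until the value is actually rendered.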
We configure the service ID as the node's toString but this contains
characters that Windows doesn't like.
This PR fixes it by allowing only alphanumeric characters.
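One way to restrict the ID to alphanumeric characters is a simple regex filter. In this sketch, dropping the offending characters (rather than replacing them) is an assumption, and both `ServiceIds` and the sample input are hypothetical.

```java
public final class ServiceIds {
    private ServiceIds() {}

    /** Removes every character that is not an ASCII letter or digit. */
    public static String sanitize(String raw) {
        return raw.replaceAll("[^a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("node{::integTest-0}"));
    }
}
```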
This commit simplifies the handling of git revision in the build. In
particular we remove pushing git revision through the generate build
info and print build info tasks as the git revision does not need to be
cached.
This commit switches to using the full hash to build into the JAR
manifest, which is used in node startup and the REST main action to
display the build hash.
Testclusters currently provides protection from clusters living past the
life of a build by adding a shutdown hook to java. While this works in
some cases, it does not cover all of them, such as when the daemon is
killed with SIGKILL.
To handle these other cases, this commit replaces the shutdown hooks with
a separate process (one per build) that manages reaping external services
if gradle dies.
This commit adds the commit hash to the global build info, and adds the
git revision as an extension. There are a couple motivations for this
change:
- the current mechanism of getting the build hash does not work with
git worktrees (because jgit does not understand them)
- a follow-up will want to use the git revision when building the
Docker images, so we want it available as an extension
- it allows us to simplify our usage around the build hash as we no
longer have to hack around silliness in the info-scm plugin
A follow-up will also stop using the short hash in the product build, so
that we use the full hash there. We already know that short hashes in
our codebase do collide, so we should move to the full hash to avoid
this problem.
In https://github.com/elastic/elasticsearch/pull/41913 setting up the
temp dir for ES was moved from the env script to individual cli scripts.
However, moving it to the windows service cli was missed. This commit
restores setting up the temp dir for the windows service control script.
Backport of #43177 so that VersionProperties is Java 8 compatible and
can be used by https://github.com/elastic/elasticsearch-hadoop
to retrieve snapshot versions for Lucene.
(cherry picked from commit ec3ac9b62452f04ce44dea0a904a6e2b31dd8076)
A tool to work with snapshots.
Co-authored by @original-brownbear.
This commit adds the snapshot tool and its single command, cleanup, which
cleans up orphaned files for S3.
Snapshot tool lives in x-pack/snapshot-tool.
(cherry picked from commit fc4aed44dd975d83229561090f957a95cc76b287)
* Detect process third party audit being killed by OOM
It's very common for the third party audit to be killed by the OOM
killer when the system is running low on memory.
Since the forbidden APIs call is expected to fail, we were ignoring
these and incorrectly interpreting the partial output.
With this change we detect and provide a proper error message when this
happens.
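A check of this kind can be sketched as follows, assuming the common Unix convention that a process terminated by a signal is reported with exit value 128 plus the signal number (137 for SIGKILL). The class and method names are hypothetical.

```java
public final class AuditExitCodes {
    private AuditExitCodes() {}

    // 128 + 9 (SIGKILL); the OOM killer terminates its victims with SIGKILL.
    public static final int SIGKILL_EXIT_VALUE = 137;

    public static boolean wasKilled(int exitValue) {
        return exitValue == SIGKILL_EXIT_VALUE;
    }

    /** Fails loudly instead of letting the caller parse partial output. */
    public static void checkNotKilled(int exitValue) {
        if (wasKilled(exitValue)) {
            throw new IllegalStateException(
                "Third party audit was killed by SIGKILL, possibly by the Linux OOM killer"
            );
        }
    }
}
```

Other non-zero exit values are left alone here because, as noted above, the forbidden APIs call is expected to fail.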
The test EmptyDirTaskTests#testCreateEmptyDirNoPermissions may fail on
Windows. However, the test is only meaningful for Unix permissions
structures, so we should assume a Unix-family OS and skip the test on
Windows.
Fixes #44064
Test clusters currently has its own set of logic for dealing with
finding different versions of Elasticsearch, downloading them, and
extracting them. This commit converts testclusters to use the
DistributionDownloadPlugin.
Due to recent changes made when converting `repository-hdfs` to test
clusters (#41252), the `integTestSecure*` tasks did not depend on
`secureHdfsFixture`, so they would fail when run because the fixture
was not available. This commit adds the dependency of the fixture
to the task.
The `secureHdfsFixture` is an `AntFixture` which spawns a process.
Internally it waits for 30 seconds for the resources to be made available.
On my local machine, it took almost 45 seconds to be available, so I have
added the wait time as an input to the `AntFixture`, which defaults to 30
seconds, and set it to 60 seconds for the secure hdfs fixture.
The integ test for secure hdfs was disabled for a long time and so
the changes done in #42090 to fix the tests are also done in this commit.
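A configurable wait with a 30 second default could look like the sketch below. It is not the `AntFixture` implementation: `FixtureWait`, the polling approach and the parameter names are assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public final class FixtureWait {
    private FixtureWait() {}

    public static final long DEFAULT_MAX_WAIT_SECONDS = 30;

    /** Polls until the condition holds or the time budget is used up. */
    public static boolean waitFor(BooleanSupplier ready, long maxWaitMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (true) {
            if (ready.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Uses the default budget of 30 seconds. */
    public static boolean waitFor(BooleanSupplier ready) {
        return waitFor(ready, TimeUnit.SECONDS.toMillis(DEFAULT_MAX_WAIT_SECONDS), 500);
    }
}
```

A slow fixture would then pass a larger budget, for example `waitFor(check, 60_000, 500)`.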
* Improve how log is tailed in testclusters on failure
- only print last few lines
- print all errors and warnings
- compact repeating errors and warnings
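The three points above can be sketched as one summarizing function. This is an illustration under assumptions: matching on the substrings `ERROR` and `WARN`, and treating only exactly identical lines as repeats, are simplifications, and `LogSummary` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class LogSummary {
    private LogSummary() {}

    /** Returns all distinct error/warning lines (with repeat counts), then the last tailSize lines. */
    public static List<String> summarize(List<String> lines, int tailSize) {
        // LinkedHashMap keeps first-seen order while counting repeats.
        Map<String, Integer> problems = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.contains("ERROR") || line.contains("WARN")) {
                problems.merge(line, 1, Integer::sum);
            }
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : problems.entrySet()) {
            int count = entry.getValue();
            out.add(count > 1 ? entry.getKey() + " (repeated " + count + " times)" : entry.getKey());
        }
        int from = Math.max(0, lines.size() - tailSize);
        out.addAll(lines.subList(from, lines.size()));
        return out;
    }
}
```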
* Test fixtures improvements
Don't disable some of the precommit tasks on fixtures.
This no longer makes sense now that a project can both produce and use a
fixture.
In order for this to be possible, we had to add an additional configuration
to make JarHell class accessible to the task even if it's not a
dependency of the project and fix some of the third party audit fallout
from #43671 which wasn't detected at the time due to the issue being
fixed here.
Closes #43918
* TestClusters: Convert the security plugin
This PR moves security tests to use TestClusters.
The TLS test required support in testclusters itself, so the correct
wait condition is configured based on the cluster settings.
* PR review
Several types of distributions are built and tested in elasticsearch,
ranging from the current version, to building or downloading snapshot or
released versions. Currently tests relying on these have to contain
logic deciding where and how to pull down these distributions.
This commit adds a distribution download plugin for each project to
manage which versions and variants the project needs. It abstracts away
all need for knowing where a particular version comes from, like a local
distribution or bwc project, or pulling from the elastic download
service. This will be used in a followup PR by the testclusters and
vagrant tests.
When starting BWC nodes, it could be that runtime Java home is set. Yet,
runtime Java home can advance beyond what a BWC node might be compatible
with. For example, if runtime Java home is set to JDK 13 and we are
starting a 7.1.2 node, we do not have any guarantees that 7.1.2 is
compatible with JDK 13 (since we never did any work to make it so). This
will continue to be the case as JDK releases advance, but we still need
to test against BWC nodes. This commit stops applying runtime Java home
when starting a BWC node. Instead, we would use the bundled JDK.
We initially added `requireDocker` as a way for tasks to say that they
absolutely must have it, like the build docker image tasks.
Projects using the test fixtures plugin are not in this boat, as the
intent with these is that they will be skipped if docker and docker-compose
are not available.
Before this change we were lenient, the docker image build would succeed
but produce nothing. The implementation was also confusing as it was not
immediately obvious this was the case due to all the indirection in the
code.
The reason we have this leniency is that when we added the docker image
build, docker was a fairly new requirement for us, and we didn't have
it deployed in CI widely enough nor had CI configured to prefer workers
with docker when possible. We are in a much better position now.
The other reason was other stack teams running `./gradlew assemble`
in their respective CI and the possibility of breaking them if docker is
not installed. We have been advocating for building specific distros for
some time now and I will also send out an additional notice.
The PR also removes the use of `requireDocker` from tests that actually
use test fixtures and are ok without it, and fixes a bug in test
fixtures that would cause incorrect configuration and allow some tasks
to run when docker was not available and they shouldn't have.
Closes #42680 and #42829, see also #42719