The testclusters registory is a singleton extension element added to the
root project which tracks which test clusters are used throughout the
multi project. But having the same name as the extension used to
configure test clusters within each subprojects breaks using a single
project for an external plugin. This commit renames the registry
extension to make it unique.
closes#49787
This commit goes from using a JvmArgumentProvider to using the normal
Test task APIs for passing the `HeapDumpOnOutOfMemoryError` JVM
argument. This makes it simpler for subprojects, such as lang-painless
to override this setting if necessary.
Closes#49117
(cherry picked from commit e97c38ff8e862abdc1d7816c66f9869ed216031f)
We have a long history of advancing the required compiler to the newest
JDK. JDK 13 has been with us for awhile, but we were blocked from
upgrading since Gradle was not compatible with JDK 13. With the
advancement in our project to Gradle 6 which supports JDK 13, we can now
advance our minimum compiler version. This commit updates the minimum
compiler version to JDK 13.
The elasticsearch-node tools allow manipulating the on-disk cluster state. The tool is currently
unaware of plugins and will therefore drop custom metadata from the cluster state once the
state is written out again (as it skips over the custom metadata that it can't read). This commit
preserves unknown customs when editing on-disk metadata through the elasticsearch-node
command-line tools.
This upgrade required a few significant changes. Firstly, the build
scan plugin has been renamed, and changed to be a Settings plugin rather
than a project plugin so the declaration of this has moved to our
settings.gradle file. Second, we were using a rather old version of the
Nebula ospackage plugin for building deb and rpm packages, the migration
to the latest version required some updates to get things working as
expected as we had some workarounds in place that are no longer
applicable with the latest bug fixes.
(cherry picked from commit 87f9c16e2f8870e3091062cde37b43042c3ae1c5)
Move BuildParams class to 'minimumRuntime' source set to retain compatibility
with build-tools for builds using a Java 8 runtime.
Closes#49766
(cherry picked from commit 1059f823acdfa7a2f1f9bff21c7256dae4f3e23c)
When external plugin authors use build-tools, their integ tests depend
on the integ-test-zip artifact. However, this dependency was broken in
7.5.0 by accidentally removing the `@zip` qualifier on the maven
dependency, which works around the fact the pom for the integ-test-zip
claims the artifact is a pom instead of zip packaging. This commit
restores the workaround of using `@zip` until the pom can be fixed.
closes#49787
This rewrites long sort as a `DistanceFeatureQuery`, which can
efficiently skip non-competitive blocks and segments of documents.
Depending on the dataset, the speedups can be 2 - 10 times.
The optimization can be disabled with setting the system property
`es.search.rewrite_sort` to `false`.
Optimization is skipped when an index has 50% or more data with
the same value.
Optimization is done through:
1. Rewriting sort as `DistanceFeatureQuery` which can
efficiently skip non-competitive blocks and segments of documents.
2. Sorting segments according to the primary numeric sort field(#44021)
This allows to skip non-competitive segments.
3. Using collector manager.
When we optimize sort, we sort segments by their min/max value.
As a collector expects to have segments in order,
we can not use a single collector for sorted segments.
We use collectorManager, where for every segment a dedicated collector
will be created.
4. Using Lucene's shared TopFieldCollector manager
This collector manager is able to exchange minimum competitive
score between collectors, which allows us to efficiently skip
the whole segments that don't contain competitive scores.
5. When index is force merged to a single segment, #48533 interleaving
old and new segments allows for this optimization as well,
as blocks with non-competitive docs can be skipped.
Backport for #48804
Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
This commit introduces a workaround for an issue related to our recent
notarization of distributions starting with the 6.8.5 release. An
unintended side effect of notarization was that the file entries of the
release tar all have a `./` prefix in the path. This causes a number of
issues, not least of which is that our Gradle extract tasks end up
copying an empty fileset to the destination directory. The workaround
here is imply to remove the leading `./` path segment from each file
when performing the extraction. For more details see this issue:
https://github.com/elastic/elasticsearch/issues/49417
Tasks intending to use a particular java home provided by JAVA<N>_HOME
use the getJavaHome method, which verifies the given java home is
available, or will be if the task will run. However, the verification
logic was broken, in addition to unnecessarily delaying retrieving the
java home until runtime. This commit fixes the verification logic to run
at either config time, delaying verification, or at runtime which
immediately checks if java home is available.
closes#49153
This PR adds build configuration to use the `jdk-download` plugin with
unit tests when no runtime java is configured externally.
It's a first part in a longer chain of changes described in #40531.
The bats tests require several distributions to all be built into a
single directory. The addition of docker packaging tests now cause the
bats tests to depend on docker, even though docker is not used there.
This commit filters out docker distributions from those that bats
depends on.
The RunTask is responsible for logging output from nodes to the console
and also stays active since we want the cluster to keep running.
However, the implementation of the logging and waiting resulted in a
spin loop that continually polls for data to have been written to one
of the nodes' output files. On my laptop, this causes an idle
invocation of `gradle run` to consume an entire core.
The JDK provides a method to be notified of changes to files through
the use of a WatchService. While a WatchService based implementation
for logging and waiting works, a delay of up to ten seconds is
encountered when running on macOS. This is due to the lack of a native
WatchService implementation that uses kqueue or FSEvents; the current
WatchService implementation in the JDK uses polling with a default
interval of ten seconds. While the interval can be changed
programmatically it is not an acceptable solution due to the need to
access the com.sun.nio.file.SensitivityWatchEventModifier enum, which
is in an internal package.
The change in this commit instead introduces a check to see if any data
was available to read and log. If no data is available in any of the
node output files, the thread sleeps for 100ms. This is enough time to
prevent consuming large amounts of cpu while still providing output to
the console in a timely fashion.
Backport of #48849. Update `.editorconfig` to make the Java settings the
default for all files, and then apply a 2-space indent to all `*.gradle`
files. Then reformat all the files.
Backport of #48450.
Make a number of changes so that code in the `server` directory is more
resilient to automatic formatting. This covers:
* Reformatting multiline JSON to embed whitespace in the strings
* Move some comments around to they aren't auto-formatted to a strange
place. This also required moving some `&&` and `||` operators from the
end-of-line to start-of-line`.
* Add helper method `reformatJson()`, to strip whitespace from a JSON
document using XContent methods. This is sometimes necessary where
a test is comparing some machine-generated JSON with an expected
value.
Also, `HyperLogLogPlusPlus.java` is now excluded from formatting because it
contains large data tables that don't reformat well with the current settings,
and changing the settings would be worse for the rest of the codebase.
Backport of #48898.
We no longer configure distributions for prior versions for Docker. This
is because doing so prompts Gradle to try and resolve the Docker
dependencies, which doesn't work as they can't be downloaded via Ivy
(configured in DistributionDownloadPlugin). Since we need these for the
BATS upgrade tests, and those tests only cover .rpm and .deb, it's OK to
omit creating such distributions in the first place. We may need to
revisit this in the future, to allow upgrade testing using Docker
containers.
Backport of #48883.
Per elastic/infra#15864, the Elasticsearch CI images are failing due to
a packer_cache failure. This is because Gradle is trying to resolve
a `.docker` file through the Ivy repository, which doesn't work. Disable
the Docker tests again until we figure out the way forward.
Backport of #46599 and #47640. Add packaging tests for Docker.
* Introduce packaging tests for Docker (#46599)
Closes#37617. Add packaging tests for our Docker images, similar to what
we have for RPMs or Debian packages. This works by running a container and
probing it e.g. via `docker exec`. Test can also be run in Vagrant, by
exporting the Docker images to disk and loading them again in VMs. Docker
is installed via `Vagrantfile` in a selection of boxes.
* Only define Docker pkg tests if Docker is available (#47640)
Closes#47639, and unmutes tests that were muted in b958467.
The Docker packaging tests were being defined irrespective of whether
Docker was actually available in the current environment. Instead,
implement exclude lists so that in environments where Docker is not
available, no Docker packaging tests are defined. For CI hosts, the build
checks `.ci/dockerOnLinuxExclusions`. The Vagrant VMs can defined the
extension property `shouldTestDocker` property to opt-in to packaging
tests.
As part of this, define a seperate utility class for checking Docker,
and call that instead of defining checks in-line in BuildPlugin.groovy
This commit introduces a consistent, and type-safe manner for handling
global build parameters through out our build logic. Primarily this
replaces the existing usages of extra properties with static accessors.
It also introduces and explicit API for initialization and mutation of
any such parameters, as well as better error handling for uninitialized
or eager access of parameter values.
Closes#42042
This commit eliminates some custom logic we have in place for post-hoc
cleanup of POM files generated by Gradle. There were to main issues this
logic was meant to address:
First, for dependencies marked as `transitive = false`, Gradle by
default creates a "wildcard" exclusion in the generated POM file. It
turns out that Ivy didn't handle these types of exclusions well, even
though they are perfectly valid and dealt with by Gradle and Maven as
expected. We've since confirmed that this issues is indeed resolved in
the most recent Ivy release (2.5.0-rc1) so going forward the suggestion
to folks consuming Elasticsearch dependencies with Ivy will be to use
this version.
Second, earlier versions of Gradle would incorrectly assign compile
dependencies to the "runtime" scope in the publish POM file. This could
cause issues if the dependencies were indeed needed at compile time
because their APIs were exposed. This has since been fixed and these
dependencies are correctly marked as "compile" scope in the POM.
Since these two issues have been resolved in their respective projects
we can eliminate this logic and all the supporting code, such as having
to create lots of "internal" configurations for tracking transitive
dependencies.
This commit simplifies and standardizes our usage of the Gradle Shadow
plugin to conform more to plugin conventions. The custom "bundle" plugin
has been removed as it's not necessary and performs the same function
as the Shadow plugin's default behavior with existing configurations.
Additionally, this removes unnecessary creation of a "nodeps" artifact,
which is unnecessary because by default project dependencies will in
fact use the non-shadowed JAR unless explicitly depending on the
"shadow" configuration.
Finally, we've cleaned up the logic used for unit testing, so we are
now correctly testing against the shadow JAR when the plugin is applied.
This better represents a real-world scenario for consumers and provides
better test coverage for incorrectly declared dependencies.
(cherry picked from commit 3698131109c7e78bdd3a3340707e1c7b4740d310)
This commit bumps the bundled JDK to 13.0.1+9. Since AdoptOpenJDK did
not release 13.0.1+9 for Windows, this commit also enables that the
bundled JDK version can vary by platform.
Reverting the change introducing IsoLocal.ROOT and introducing IsoCalendarDataProvider that defaults start of the week to Monday and requires minimum 4 days in first week of a year. This extension is using java SPI mechanism and defaults for Locale.ROOT only.
It require jvm property java.locale.providers to be set with SPI,COMPAT
closes#41670
backport #48209
* Always publish a build scan in CI
This PR changes the build scan configuration to alwasy publisha build
scan when running in our CI.
We should alkready be passing these env vars into the Vagrant VM so this
will make it produce a build scan too.
The old properties to accept build scan ToS on the public server are
thus no longer relevant and will be cleaned up from the Jenkins config
once this is merged.
* Pass env vars to vagrant VM
* Enable running in parallel in the VM
* Add job name and build nomber as custom values
The classpath for some project could outgrow the max allowed command
line on Windows. Using an env var is not fault proof, but give more
breathing room
This commit removes the option to change the netty system properties to
reenable the direct buffer pooling. It also removes the need for us to
disable the buffer pooling in the system properties file. Instead, we
programmatically craete an allocator that is used by our networking
layer.
This commit does introduce an Elasticsearch property which allows the
user to fallback on the netty default allocator. If they choose this
option, they can configure the default allocator how they wish using the
standard netty properties.
Before this change one needed to re-start debugging several times, as we
launched multiple JVMs in debug mode.
With this option the IDE has the option to re-launch and listen for
connections again leading for to a more pleasant experience.
The distribution download plugin which handles finding built
distributions for testing currently only knows how to find locally built
snapshots. When an external Elasticsearch plugin uses build-tools, these
snapshots do not exist. This commit extends the download plugin so it
pulls from the Elastic snapshots service when used outside of the
Elasticsearch repository.
closes#47123
* GlobalBuildInfo plugin searches packed references
In recent versions of Git, references may be packed in a packed-refs
file. If this happens, Gradle will need to look in that file to find
build information.
We fixed warnings related to task input and outputs in #45098.
This particular input was not considered, a warning was present for it
and Gradle didn't use it as part of task inputs.
As soon as we fixed it Gradle started considering it an input and
enforced that it exists.
With this change we make it optional as the task can work both wih and
without this directory.
In order to work with external elasticsearch plugins, some parts of
build-tools need to know when the current build is part of the
elasticsearch project or an external build. We identify these "internal"
builds by a marker in our buildSrc classpath. However, some build-tools
integ tests need to override this flag to test both the external and
internal behavior.
This commit moves the storage of the flag indicating whether we are
running in an internal build to global build info. This will allow
testkit projects to override the flag.
The global build info plugin prints high level information about the
build, including the test seed. However, BuildPlugin sets up the test
seed, which creates an odd implicit dependency on it. This commit moves
the intialization of the testSeed property into the global build info.
This commit adds a thread filter for gradle unit tests which omits
threads gradle may create that we have no control over shutting down.
The current example of this is for project.exec which gradle pools.
closes#47417
* Convert RunTask to use testclusers, remove ClusterFormationTasks
This PR adds a new RunTask and a way for it to start a
testclusters cluster out of band and block on it to replace
the old RunTask that used ClusterFormationTasks.
With this we can now remove ClusterFormationTasks.
* Use versions specific distribution folders so we don't need to clean up (#46539)
* Retry deleting distro dir on windows
When retarting the cluster we clean up old distribution files that might
still be in use by the OS.
Windows closes resources of ded processes async, so we do a couple of
retries to get arround it.
Closes#46014
* Avoid having to delete the distro folder.
* Remove the use of ClusterFormationTasks form RestTestTask (#47022)
This PR removes a use-case of the ClusterFormationTasks and converts a
project that flew under the radar so far.
There's probably more clean-up possible here, but for now the goal is
to be able to remove that code after `RunTask` is also updated.
* Migrate some 7.x only projects
* Bwc testclusters all (#46265)
Convert all bwc projects to testclusters
* Fix bwc versions config
* WIP fix rolling upgrade
* Fix bwc tests on old versions
* Fix rolling upgrade
This PR makes the necesary adaptations to the tests and adds a power shell script to
invoke the OS tests on GCP instances connected as CI workers.
Also noticed that logs were not being produced by the tests and that theses were not using log4j so fixed that too.
One of the difficulties in working on theses tests was that the tests just stalled with no indication where the problem is.
To ease with the debugging, after process explorer suggested that the tests are running some commands, we now have multiple timeouts: one for the tests ( which will generate a thread dump ) and one for individual commands ( that bails with the command being ran and output and error so far ) to make it easier to see what went wrong.
The tests were blocking because apparently the pipes to the sub-process were not closing, thus the threads were blocking on them and we were blocking indefinitely on the join. I'm not sure why this doesn't happen in vagrant, but we now properly deal with it.
* Remove eclipse conditionals
We used to have some meta projects with a `-test` prefix because
historically eclipse could not distinguish between test and main
source-sets and could only use a single classpath.
This is no longer the case for the past few Eclipse versions.
This PR adds the necessary configuration to correctly categorize source
folders and libraries.
With this change eclipse can import projects, and the visibility rules
are correct e.x. auto compete doesn't offer classes from test code or
`testCompile` dependencies when editing classes in `main`.
Unfortunately the cyclic dependency detection in Eclipse doesn't seem to
take the difference between test and non test source sets into account,
but since we are checking this in Gradle anyhow, it's safe to set to
`warning` in the settings. Unfortunately there is no setting to ignore
it.
This might cause problems when building since Eclipse will probably not
know the right order to build things in so more wirk might be necesarry.
* Add support for bwc for testclusters and convert full cluster restart (#45374)
* Testclusters fix bwc (#46740)
Additions to make testclsuters work with lather versions of ES
* Do common node config on bwc tests
Before this PR we always ever ran `ElasticsearchCluster.start` once, and
the common node config was never done.
This becomes apparent in upgrading from `6.x` to `7.x` as the new config
is missing preventing the cluster from starting.
* Do common node config on bwc tests
Before this PR we always ever ran `ElasticsearchCluster.start` once, and
the common node config was never done.
This becomes apparent in upgrading from `6.x` to `7.x` as the new config
is missing preventing the cluster from starting.
* Fix logic to pick up snapshot from 6.x
* Make sure ports are cleared
* Fix test
* Don't clear all the config as we rely on it
* Fix removal of keys
#46180 added support for the `[source,console]`
language for snippets which should be tested.
This removes support for the `// CONSOLE` magic comment,
which serve a similar purpose.
Snippets that include the `// CONSOLE` magic comment will return
an exception.
Currently in production instances of Elasticsearch we set a couple of
system properties by default. We currently do not apply all of these
system properties in tests. This commit applies these properties in the
tests.
This PR adds some restrictions around testfixtures to make sure the same service ( as defiend in docker-compose.yml ) is not shared between multiple projects.
Sharing would break running with --parallel.
Projects can still share fixtures as long as each has it;s own service within.
This is still useful to share some of the setup and configuration code of the fixture.
Project now also have to specify a service name when calling useCluster to refer to a specific service.
If this is not the case all services will be claimed and the fixture can't be shared.
For this reason fixtures have to explicitly specify if they are using themselves ( fixture and tests in the same project ).
This commit disables caching of BWC snapshot distributions in the "trunk" (aka master) branch.
Since the previous major release branches move quickly we rarely get cache hits for these
tasks, and the artifacts themselves are very large. This means the overhead here is high and
savings basically zero. We conditionally disable task output caching in this scenario in CI to
avoid excessive build cache overhead as well as causing too much turn in the cache itself which
would lead to lots of cache entry evictions.
This commit teaches the build how to bundle AdoptOpenJDK with our
artifacts, and switches to AdoptOpenJDK as the bundled JDK. We keep the
functionality to also bundle Oracle OpenJDK distributions.
In some cases (for example some AdoptOpenJDK builds), the java.vendor is
mistakenly populated as "Oracle Corporation" while the real value is
under "java.vendor.version". Since "java.vendor.version" is mandatory
since JDK 10, this commit changes to use "java.vendor.version" as the
favored system property to find the JVM vendor, and we fallback to
"java.vendor" if this is not populated (as happens in some Oracle
builds). Ugh.
Before this change we would run bwc nodes with their bundled jdk if
these supported it, so the passed in runtime JDK was not honored.
This became obvius when running with FIPS.
Closes#41721
In order to track down #46091:
* Enables debug logging in REST tests for `master` and `coordination` packages
since we suspect that issues are caused by failed and then retried publications
Previously we only turned on tests if we saw either `// CONSOLE` or
`// TEST`. These magic comments are difficult for the docs build to deal
with so it has moved away from using them where possible. We should
catch up. This adds another trigger to enable testing: marking a snippet
with the `console` language. It looks like this:
```
[source,console]
----
GET /
----
```
This saves a line which is nice, I guess. But it is more important to me
that this is consistent with the way the docs build works now.
Similarly this enables response testing when you mark a snippet with the
language `console-result`. That looks like:
```
[source,console-result]
----
{
"result": "0.1"
}
----
```
`// TESTRESPONSE` is still available for situations like `// TEST`: when
the response isn't *in* the console-result language (like `_cat`) or
when you want to perform substitutions on the generated test.
Should unblock #46159.
This adds support for verifying that snippets with the `console-result`
language are valid json. It also switches the response snippets on the
`docs/get` page from `js` to `console-result` which will allow clients
to provide "alternatives" for them like they can now do with
`// CONSOLE` snippets.
* Pass COMPUTERNAME env var to elasticsearch.bat
When we run bin/elasticsearch with bash, we get a $HOSTNAME builtin that
contains the hostname of the machine the script is running on. When
there's no provided nodename, Elasticsearch uses the HOSTNAME to create
a nodename. On Windows, Powershell provides a $COMPUTERNAME variable for
the same purpose. CMD.EXE provides the same thing, except it's called
%COMPUTERNAME%. bin/elasticsearch.bat sets $HOSTNAME to the value of
$COMPUTERNAME. However, when testclusters invokes bin/elasticsearch.bat,
the COMPUTERNAME variable doesn't get passed in, leaving HOSTNAME null
and breaking an integration test on Windows.
This commit sets COMPUTERNAME in the environment so that our tests get
the value that Elasticsearch would have when bin/elasticsearch.bat is
invoked from the shell.
* Add null check to protect in non-Windows case
What good is it a developer to gain the whole Windows if they forfeit
their Unix? The value that fixes things on Windows is null on
Linux/Darwin, so let's null-check it.
* Override system hostnames for testclusters
Rather than relying on variable system behavior, let's just override
HOSTNAME and COMPUTERNAME and test for correct values in the integration
test that was originally failing.
* Rename constants for clarity
Since we are setting HOSTNAME and COMPUTERNAME regardless of whether the
tests are running on Windows or Linux, we shouldn't imply that constants
are only used in one case or the other.
Since credentials are required to access such a repository, and these
repositories are accessed over an encrypted protocol (https), this
commit adds support to consider S3-backed artifact repositories as
secure. Additionally, we add tests for this functionality.
This commit adds a destructiveDistroTest task which depends on all of
the distribution specific destructive tasks, which can be used by CI.
closes#45769
The java based distribution tests currently have a single Tests class
which encapsulates all of the tests for a particular distribution. The
test task in gradle then depends on all distributions being built, and
each individual tests class looks for the particular distribution it is
trying to test. This means that reproducing a single test failure
triggers all the distributions to be built, even though only one is
needed for the test.
This commit reworks the java distribution tests to pass in a particular
distribution to be tested, and changes the base test classes to be
actual test classes which have assumptions around which distributions
they operate on. For example, the archives tests will be skipped when
run with an rpm distribution, and vice versa for the package tests. This
makes reproduction much more granular. It also also better splitting up
tests around a particular use case. For example, all tests for systemd
behavior can be in one test class, and run independently of all tests
against rpm/deb distributions.
* Add input and outut tracking of built bwc versions
This PR adds tracking of the bwc versions git has as input and all the
expected files as output.
The effect is that `gradlew` is not called at all when the git has
doesn't change and the version was allready built.
Previusly gradlew would be called for the bwc version and it would have
to configure the project and go trough up to date checks to figure out
that nothing changed.
This helps when working on bwc tests locally needing to run the test
multiple times.
This should also help in CI not re-build bwc versions across different
runs.
* Enable caching of bwc builds
This commit adds CNAME reporting for transport.publish_address same way
it's done for http.publish_address.
Relates #32806
Relates #39970
(cherry picked from commit e0a2558a4c3a6b6fbfc6cd17ed34a6f6ef7b15a9)
Today we shell out to git rev-parse to read the git revision. Forking
another process is slower than reading the revision directly. This
commit changes to directly read the git revision from the repository,
avoiding to fork another process.
The dependency on copying distributions was accidentally masked by an
earlier refactoring. This commit fixes the copyDistributions task to be
run before bats tests run.
The bats tests currently require many additional artifacts to be built.
In addition to the current distributions, they need all the plugins to
be installed, as well as a randomly chosen bwc distribution. This commit
splits these two cases into their own bats task, so the dependencies do
not slow down other tasks like distroTests which do not need them.
The distro test plugin was originally designed to be applied within each
subproject, per operating system we run in a VM with vagrant. However,
for efficiency, and also ease of having a single task to run in CI when
launching within individual OS VMs, having the "destructive" tasks in a
single place is more convenient. This commit reworks the distro test
plugin to be applied to the qa/vagrant project, which now creates only
the wrapper tasks in each of the subprojects for each vagrant VM.
Before #45064, the bats tests skipped the upgrade tests when the random
upgrade version is before 6.3.0. This commit restores that behavior.
closes#45476
The vagrant based tests currently reside in a single project, creating
dozens of tasks to manage starting and stopping the vagrant VM along
with running java and bats tests within each image. This all-in-one
pattern makes parallelizing packaging tests difficult.
This commit rewrites the vagrant testing infrastructure to be
independent of the actual test runners, thus allowing each platform to
be handled in a separate subproject. Additionally, the java and bats
tests are changed to be run through a "destructive" gradle task, which
is run inside the VM. The combination of these will allow
parallelization both locally (through running several VMs at once) as
well as running the destructive tasks in CI machines dedicated to each
platform (thus removing the need for vagrant in CI).
* Restrict which tasks can use testclusters
This PR fixes a problem between the interaction of test-clusters and
build cache.
Before this any task could have used a cluster without tracking it as
input.
With this change a new interface is introduced to track the tasks that
can use clusters and we do consider the cluster as input for all of
them.
This commit makes the gitRevision property a lazy loaded value by
returning an Object implementing toString(). The Dockerfile template is
also changed to use groovy templates instead of the mavenfilter hack, so
converting to String will not happen until runtime.
We configure the service ID as the node's toString but this containes
characters that Windows doesn't like.
This PR fixes it by allowing only alphanumeric characters
This commit simplifies the handling of git revision in the build. In
particular we remove pushing git revision through the generate build
info and print build info tasks as the git revision does not need to be
cached.
This commit switches to using the full hash to build into the JAR
manifest, which is used in node startup and the REST main action to
display the build hash.
Testclusters currently provides protection from clusters living past the
life of a build by adding a shutdown hook to java. While this works in
some cases, it does not cover all cases like where the daemon is killed
with SIGKILL.
To handle these other cases, this commit replaces the shutdown hooks with
a separate process (one per build) that manages reaping external services
if gradle dies.
This commit adds the commit hash to the global build info, and adds the
git revision as an extension. There are a couple motivations for this
change:
- the current mechanism of getting the build hash does not work with
git worktrees (because jgit does not understand them)
- a follow-up will want to use the git revision when building the
Docker images, so we want it available as an extension
- it allows us to simplify our usage around the build hash as we no
longer have to hack around silliness in the info-scm plugin
A follow-up will also stop using the short hash in the product build, so
that we use the full hash there. We already know that short hashes in
our codebase do collide, so we should move to the full hash to avoid
this problem.
In https://github.com/elastic/elasticsearch/pull/41913 setting up the
temp dir for ES was moved from the env script to individual cli scripts.
However, moving it to the windows service cli was missed. This commit
restores setting up the temp dir for the windows service control script.
Backport of #43177 so that VersionProperties is Java 8 compatible and
can be used by https://github.com/elastic/elasticsearch-hadoop
to retrieve snapshot versions for Lucene.
(cherry picked from commit ec3ac9b62452f04ce44dea0a904a6e2b31dd8076)
A tool to work with snapshots.
Co-authored by @original-brownbear.
This commit adds snapshot tool and the single command cleanup, that
cleans up orphaned files for S3.
Snapshot tool lives in x-pack/snapshot-tool.
(cherry picked from commit fc4aed44dd975d83229561090f957a95cc76b287)
* Detect process third party audit being killed by OOM
It's very common for the third party audit to be killed by the OOM
killer when the system is running low on memory.
Since the forbidden APIs call is expected to fail, we were ignoring
these and incorrectly interpreting the partial output.
With this change we detect and provide a proper error message when this
happens.
The test EmptyDirTaskTests#testCreateEmptyDirNoPermissions may fail on
Windows. However, the test is only meaningful for Unix permissions
structures, so we should assume a Unix-family OS and skip the test on
Windows.
Fixes#44064
Test clusters currently has its own set of logic for dealing with
finding different versions of Elasticsearch, downloading them, and
extracting them. This commit converts testclusters to use the
DistributionDownloadPlugin.
Due to recent changes are done for converting `repository-hdfs` to test
clusters (#41252), the `integTestSecure*` tasks did not depend on
`secureHdfsFixture` which when running would fail as the fixture
would not be available. This commit adds the dependency of the fixture
to the task.
The `secureHdfsFixture` is a `AntFixture` which is spawned a process.
Internally it waits for 30 seconds for the resources to be made available.
For my local machine, it took almost 45 seconds to be available so I have
added the wait time as an input to the `AntFixture` defaults to 30 seconds
and set it to 60 seconds in case of secure hdfs fixture.
The integ test for secure hdfs was disabled for a long time and so
the changes done in #42090 to fix the tests are also done in this commit.
* Improoce how log is tailed in testclusters on failure
- only print last few lines
- print all errors and warnings
- compact repeating errors and warnings
* Test fixtures improovements
Don't disable some of the precommit tasks on fixtures.
This no longer makes sense now that a project can both produce and use a
fixture.
In order for this to be possible, had to add an additional configuration
to make JarHell class accessible to the task even if it's not a
dependency of the project and fix some of the third party audit fallout
from #43671 which wasn't detected at the time due to the issue being
fixed here.
Closes#43918
* TestClusters: Convert the security plugin
This PR moves security tests to use TestClusters.
The TLS test required support in testclusters itself, so the correct
wait condition is configgured based on the cluster settings.
* PR review
Several types of distributions are built and tested in elasticsearch,
ranging from the current version, to building or downloading snapshot or
released versions. Currently tests relying on these have to contain
logic deciding where and how to pull down these distributions.
This commit adds an distributiond download plugin for each project to
manage which versions and variants the project needs. It abstracts away
all need for knowing where a particular version comes from, like a local
distribution or bwc project, or pulling from the elastic download
service. This will be used in a followup PR by the testclusters and
vagrant tests.
When starting BWC nodes, it could be that runtime Java home is set. Yet,
runtime Java home can advance beyond what a BWC node might be compatible
with. For example, if runtime Java home is set to JDK 13 and we are
starting a 7.1.2 node, we do not have any guarantees that 7.1.2 is
compatible with JDK 13 (since we never did any work to make it so). This
will continue to be the case as JDK releases advance, but we still need
to test against BWC nodes. This commit stops applying runtime Java home
when starting a BWC node. Instead, we would use the bundled JDK.
We initially added `requireDocker` for a way for tasks to say that they
absolutely must have it, like the build docker image tasks.
Projects using the test fixtures plugin are not in this both, as the
intent with these is that they will be skipped if docker and docker-compose
is not available.
Before this change we were lenient, the docker image build would succeed
but produce nothing. The implementation was also confusing as it was not
immediately obvious this was the case due to all the indirection in the
code.
The reason we have this leniency is that when we added the docker image
build, docker was a fairly new requirement for us, and we didn't have
it deployed in CI widely enough nor had CI configured to prefer workers
with docker when possible. We are in a much better position now.
The other reason was other stack teams running `./gradlew assemble`
in their respective CI and the possibility of breaking them if docker is
not installed. We have been advocating for building specific distros for
some time now and I will also send out an additional notice
The PR also removes the use of `requireDocker` from tests that actually
use test fixtures and are ok without it, and fixes a bug in test
fixtures that would cause incorrect configuration and allow some tasks
to run when docker was not available and they shouldn't have.
Closes #42680 and #42829 see also #42719
Moves the test infrastructure away from using node.max_local_storage_nodes, allowing us in a
follow-up PR to deprecate this setting in 7.x and to remove it in 8.0.
This also changes the behavior of InternalTestCluster so that starting up nodes will not automatically
reuse data folders of previously stopped nodes. If this behavior is desired, it needs to be explicitly
done by passing the data path from the stopped node to the new node that is started.
This commit removes the jdk11 download in vagrant provisioning and
converts it to using the jdk downloader for the system jdk, and sets up
a separate jdk for use by the test (which will be converted to running
gradle in a followup).
This commit adds a guard around reading the spooled LoggedExec output.
It is possible the exec command did not output anything, and failed,
which would trigger a failure to read the output file.
This commit fixes the logging in LoggedExec which uses an in memory
buffer to read from a local reference, instead of with
getStandardOutput() of the Exec task. This is due to gradle internally
wrapping with a TeeOutputStream, breaking our cast.
Previously we used LoggedExec for running the internal bwc builds.
However, this had bad performance implications as all the output was
buffered into memory, thus we changed back to normal Exec. This commit
adds a `spoolOutput` setting to LoggedExec which can be used for
commands with large amounts of output, and switches the bwc builds to
use this flag.
* Fix slow sync test clustres artifacts task
The task was mistakenly adding a combinational explosion of task
actions all doing the same thing.
With this PR this is fixed and each version - distribution pair is only
extracted once.
I appologieze for the SSD wear.
* Look for configurations on the root project
* Add dependency on configurations
* This should be a `copy` so we don't blow away all the other distros
* Don't copy example plugin build directory in integration tests
This commit disables rhel 8 from being tested in vagrant packaging
tests. The vagrant image we use is beta release, but RHEL 8 was just
released, which has caused the package mirrors for the beta to stop
working.
This commit reworks the tests for jdk download to test the old and new
url pattern from oracle. Additionally it limits to one repository
created per version, based on the old or new pattern, and restricts
other repositories from trying to resolve jdks.
closes#41998
We currently download 3 variants of the same version of the jdk for
bundling into the distributions. Additionally, the vagrant images do
their own downloading. This commit moves the jdk downloading into a
utility gradle plugin. This will be used in a future PR by the packaging
tests.
The new plugin exposes a "jdks" project extension which allows creating
named jdks. Once the jdk version and platform are set for a named jdk,
the jdk object may be used as a lazy String for the jdk home path, or a
file collection for copying.
testclusters detect from settings that security is enabled
if a user is not specified using the DSL introduced in this PR, a default one is created
the appropriate wait conditions are used authenticating with the first user defined in the DSL ( or the default user ).
an example DSL to create a user is user username:"test_user" password:"x-pack-test-password" role: "superuser" all keys are optional and default to the values shown in this example
The run task is supposed to run elasticsearch with the given plugin or
module. However, for modules, this is most realistic if using the full
distribution. This commit changes the run setup to use the default or
oss as appropriate.
* Revert "Revert "Clean up clusters between tests (#41187)""
This reverts commit 9efc853aa668e285ede733d37b6fc7a0f4b02041.
* Remove the jdk directory to save space on bwc tests
This PR adresses the same concern as #41187 in a different way.
It removes only the JDK from the distribution once the cluster stops,
so we keep the same disk space requirements as before adding the JDK.
This is still a temporary measure, testclusters already deals with this
by doing the equivalent of `cp -l` instead of an actual copy.
Today we allow adding entries from a file or from a string, yet we
internally maintain this distinction such that if you try to add a value
from a file for a setting that expects a string or add a value from a
string for a setting that expects a file, you will have a bad time. This
causes a pain for operators such that for each setting they need to know
this difference. Yet, we do not need to maintain this distinction
internally as they are bytes after all. This commit removes that
distinction and includes logic to upgrade legacy keystores.
To reduce configuration time, we fork some threads to compute the Java
version for the various configured Javas. However, as the number of
JAVA${N}_HOME variable increases, the current implementation creates as
many threads as there are such variables, which could be more than the
number of physical cores on the machine. It is not likely that we would
see benefits to trying to execute all of these once beyond the number of
physical cores (maybe simultaneous multi-threading helps though, who
knows. Therefore, this commit limits the parallelization here to the
number number of physical cores.
If no Java versions are set then when we size the executor thread pool
we end up with zero threads, which is illegal. This commit avoids that
problem by only starting the executor when needed.
ClusterFormationTasks auto configured these properties for clusters.
This PR adds FIPS specific configuration across all test clusters from
the main build script to prevent coupling betwwen testclusters and the
build plugin.
Closes#40904
This will help with reproduction lines and running tests form IDEs and
other operations that are quick and executed often enough for the
configuration time to matter.
Running Gradle with a FIPS JVM is not supproted, so if the runtime JVM
is the same one, no need to spend time checking for fips support.
Verification of the JAVA<version>_HOME env vars is now async and
parallel so it doesn't block configuration.
This PR adds additional cleanup when stopping the node.
The data dir is excepted because it gets reused in some tests.
Without this cleanup the number of working dir copies could grew to
exhaust all available disk space.
This is related to #36652. We intend to deprecate a number of transport
settings in 7.x and remove them in 8.0. This commit removes the string
usages of these settings.
With the 7.0.0 release, we switched to download the packages instead of
using locally built ones.
This PR fixes the artifact names to include the architecture as
introduced in the 7.0.0 release.
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
This commit sets the version to ensure that we use the bundled Java when
running integration tests for all eligible versions. In particular,
since we started bundling Java with 7.0.0, this commits sets said
version to 7.0.0.
Many gradle projects specifically use the -try exclude flag, because
there are many cases where auto-closeable resource ignore is never
referenced in body of corresponding try statement. Suppressing this
warning specifically in each case that it happens using
`@SuppressWarnings("try")` would be very verbose.
This change removes `-try` from any gradle project and adds it to the
build plugin. Also this change removes exclude flags from gradle projects
that is already specified in build plugin (for example -deprecation).
Relates to #40366
By default, in integ tests we wait for the standalone cluster to start
by using the ant Get task to retrieve the cluster health endpoint.
However the ant task has no facilities for customising the trusted
CAs for a https resource, so if the integ test cluster has TLS enabled
on the http interface (using a custom CA) we need a separate utility
for that purpose.
Backport of: #40573
Replaces the vagrant based kerberos fixtures with docker based test fixtures plugin.
The configuration is now entirely static on the docker side and no longer driven by Gradle,
also two different services are being configured since there are two different consumers of the fixture that can run in parallel and require different configurations.
* Add support for setting and keystore settings
* system properties and env var config
* use testclusters for repository-s3
* Some cleanup of the build.gradle file for plugin-s3
* add runner {} to rest integ test task
The platformTest gradle task was a packaging test meant to ensure unit
tests run on the various supported operating systems without relying on
CI to maintain a full matrix of platforms. Howevever, it never really
worked out as intended and is now additional code in our vagrant setup
to maintain. This commit removes the platformTest task.
* Revert "Configure TMP for test nodes on Windows (#39959)"
This reverts commit 97562a874fcb1f29fb05272ab860a0307e97d1aa.
* Configure a tmp dir without spaces
* Pass on TMP instead of changing it
Here are the highlights of this release:
- Feature variants AKA "optional dependencies"
- Type-safe accessors in Kotlin precompiled script plugins
- Gradle Module Metadata 1.0
For more details see https://docs.gradle.org/5.3/release-notes.html
This commit adds a variant for every official distribution that omits
the bundled jdk. The "no-jdk" naming is conveyed through the package
classifier, alongside the platform. Package tests are also added for
each new distribution.
This breaks on windows where TMP dir default to C:\Windows and startup
fails with a permission error.
I tried to create a tmp dir and pass in `TMP` env, but it lead to a
class not found error, and since testclusers is already independent of
the calling environment I stopped there.
The changes in #39732 mean that nodes in the IntegTest clusters will
now run with whichever java version is defined as `runtime.java` and
not JAVA_HOME anymore.
This means that these nodes will also run in JVM with fips approved
mode enabled and as such, need to have access to the password for the
BCFKS keystore that is used as the default keystore/truststore.
This change sets the two necessary system properties.
Resolves#39855
* Bundle java in distributions
Setting up a jdk is currently a required external step when installing
elasticsearch. This is particularly problematic for the rpm/deb packages
as installing a jdk in the same package installation command does not
guarantee any order, so must be done in separate steps. Additionally,
JAVA_HOME must be set and often causes problems in selecting a correct
jdk when, for example, the system java is an older unsupported version.
This commit bundles platform specific openjdks into each distribution.
In addition to eliminating the issues above, it also presents future
possible improvements like using jlink to build jdk images only
containing modules that elasticsearch uses.
closes#31845
This commit introduces the forget follower API. This API is needed in cases that
unfollowing a following index fails to remove the shard history retention leases
on the leader index. This can happen explicitly through user action, or
implicitly through an index managed by ILM. When this occurs, history will be
retained longer than necessary. While the retention lease will eventually
expire, it can be expensive to allow history to persist for that long, and also
prevent ILM from performing actions like shrink on the leader index. As such, we
introduce an API to allow for manual removal of the shard history retention
leases in this case.
* Back port build changes from #39102
This back-ports how versions are determined and bwc test are set up from
#39102 without enabling the bwc from current version tests so it's
easier/possible to backmerge future buld changes.
It's expected that the tets are lacking many of the required fixes in
this version to enable them.
* methods to run bin script
* Add support for specifying and installing plugins
* Add OS specific distirbution support
* Add test to verify plugin installed
* Remove use of Gradle internal OperatingSystem
* Un-mute and fix BuildExamplePluginsIT
There doesn't seem to be anything wrong with the test iteself.
I think the failure were CI performance related, but while it was muted,
some failures managed to sneak in.
Closes#38784
* PR review
When test clusters are stood up, one of the steps in the wait task is to wait for
ports files to appear. An exception throw was added if this were to time out
instead of failing with no information, but the exception text uses a missing
variable which further obfuscates the problem.
Backports #39321
Packaging tests did not honor bwc tests being off.
This was also the reason for which we were building the BWC versinons
even if the tests are off, so this closes#35347.
This fixes a bug in the sensing of the current OS family in the test cluster
formation code. Previously all builds would assume every environment
was windows and would jump to using the windows zip build. This fixes
the OS sensing code as well as updates some tests to account for
different build flavors.
Backport of #38457
This commit fixes a bug which resulted in every RandomizedTestingTask
generating its own unique test seed rather than using the global test
seed generated by the the RandomizedTestingPlugin.
The issue didn't affect build invocations which explicitly ran with
-Dtest.seed. In those cases, the Ant junit task correctly uses the
global seed.
Since the Ant random testing target looks for an Ant project property
for determining the existing seed we now explicitly set this inside
RandomizedTestingPlugin. It just so happens that, incidentally, Ant will
inherit system properties, thus wy the -Dtest.seed mechanism continued
to work.
In the ClusterConfiguration class of the build source, there is an Ant waitfor block
that runs to ensure that the seed node's transport ports file is created before
trying to read it. If the wait times out, the file read fails with not much helpful info.
This just adds a timeout property to the waitfor block and throws a descriptive
exception instead.
Backports #37574
This commit moves validation logic for ensuring our testclusters
configuration doesn't contain unexpected artifacts into the plugin
itself. This change allows us to remove the custom copy task
implementation altogether.
Additionally, the error message has been improved to display component
ids in addition to the artifacts to make it easier to figure out what
actual dependency is at fault.
Handle the case of `Description` being null which is a valid case as
described in the `HeartBeatEvent`'s javadoc, which previously resulted
in exceptions that "pollute" the build output.
Follows: #28563
Backport: #38799
Recently we changed where we source released artifacts for usage in
backwards compatibility tests. We now source these from
artifacts.elastic.co. To avoid polluting the download stats from builds,
we want to add the X-Elastic-No-KPI header to requests from
artifacts.elastic.co. To do this, we hack the Ivy feature of custom HTTP
header credentials and specify our desired headers.
When we are preparing to release a major version the rules around
"unreleased" versions and branches get a bit more complex.
This change implements the following rules:
- If the tip version on the previous major is a .0 (e.g. 6.7.0) then
the tip of the minor before that (e.g. 6.6.1) must be unreleased.
(This is because 6.7.0 would be "staged" in preparation for release,
but 6.6.1 would be open for bug fixes on the release 6.6.x line)
(in VersionCollection & VersionUtils)
- The "major.x" branch (if it exists) will always point to the latest
minor in that series. Anything that is not the latest minor, must
therefore be on a the "major.minor" branch
For example, if v7.1.0 exists then the "7.x" branch must be 7.1.0,
and 7.0.0 must be on the "7.0" branch
(in VersionCollection)
This commit adds the 7.1 version constant to the 7.x branch.
Co-authored-by: Andy Bristol <andy.bristol@elastic.co>
Co-authored-by: Tim Brooks <tim@uncontended.net>
Co-authored-by: Christoph Büscher <cbuescher@posteo.de>
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
Co-authored-by: markharwood <markharwood@gmail.com>
Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: David Roberts <dave.roberts@elastic.co>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Alpar Torok <torokalpar@gmail.com>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
Co-authored-by: Albert Zaharovits <albert.zaharovits@gmail.com>
Renames the following settings to remove the mention of `zen` in their names:
- `discovery.zen.hosts_provider` -> `discovery.seed_providers`
- `discovery.zen.ping.unicast.concurrent_connects` -> `discovery.seed_resolver.max_concurrent_resolvers`
- `discovery.zen.ping.unicast.hosts.resolve_timeout` -> `discovery.seed_resolver.timeout`
- `discovery.zen.ping.unicast.hosts` -> `discovery.seed_addresses`
Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in
tests, but for 7.x nodes this setting is not required as it has no effect.
This commit removes this setting so that nodes are started with more realistic
configurations, and deprecates it.
The script is used to create a cache on ephemeral CI workers.
Changes:
- create and use a `pullFixture` task that always exists regardless
of docker support
- wire dependencies correctly so any pre fixture setup runs for pull
as well
- set up java env vars so bwc versions can build