The goal of this commit is to address unknown licenses when producing
the dependencies info report. We have two different checks that we run
on licenses. The first check is whether or not we have stashed a copy of
the license text for a dependency in the repository. The second is to
map every dependency to a license type (e.g., BSD 3-clause). The problem
here is that the way we were handling licenses in the second check
differs from how we handle licenses in the first check. The first check
works by finding a license file with the name of the artifact followed
by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact
name to another name used to check for the license (e.g., we map
lucene-.* to lucene, and opensaml-.* to shibboleth. The second check
understood the first way of looking for a license file but not the
second way. So in this commit we teach the second check about the
mappings from artifact names to license names. We do this by copying the
configuration from the dependencyLicenses task to the dependenciesInfo
task and then reusing the code from the first check in the second
check. There were some other challenges here though. For example,
dependenciesInfo was checking too many dependencies. For now, we should
only be checking direct dependencies and leaving transitive dependencies
from another org.elasticsearch artifact to that artifact (we want to do
this differently in a follow-up). We also want to disable
dependenciesInfo for projects that we do not publish, users only care
about licenses they might be exposed to if they use our assembled
products. With all of the changes in this commit we have eliminated all
unknown licenses. A follow-up will enforce that when we add a new
dependency it does not get mapped to unknown, these will be forbidden in
the future. Therefore, with this change and earlier changes are left
having no unknown licenses and two custom licenses; custom here means it
does not map to an SPDX license type. Those two licenses are xz and
ldapsdk. A future change will not allow additional custom licenses
unless they are explicitly whitelisted. This ensures that if a new
dependency is added it is mapped to an SPDX license or mapped to custom
because it does not have an SPDX license.
Most of our license file names strip the version off the artifact name
when deducing the license filename. However, the version on the GCS SDK
(google-api-services-storage) does not match the usual format and
instead starts with a vee. This means that the license filename for this
license ended up carrying the version and we should not do that. This
commit adjusts the regex the deduces the license filename to account for
this case, and adjusts the google-api-services-storage license files
accordingly.
This commit enhances the license detection that we have for various
licenses. Here we improve the detection for all licenses (especially the
Apache 2.0 License), the BSD 2-clause license, the MIT (with
attribution) license, and we add detection for the BSD 3-clause
license. One way that we achieved this improvement is by changing how
the license files are read so that rather than reading them as a
multi-line string which ended up represented as "[line1, line2, line3,
...]" internally, we read the full bytes of the license text and replace
all whitespace with a single space so the license text is now loaded as
"line1 line2 line3". For the MIT license we add the actual license text
and remove the "MIT" string as not all copies of the license clearly
indicate that the text is the MIT license. We take a similar strategy
for the BSD-2 and BSD-3 clause licenses. With this change, we reduce the
number of "custom" licenses in the codebase from 31 to 2. The two
remaining appear to be truly custom licenses, not carrying licenses
identifiable by SPDX. A follow-up will address "unknown" licenses.
Use all running nodes as unicast seeds in the rolling restart tests to
avoid a race between pinging and the tests. Without this if the tests
are too fast then when a new node comes up and pings its single
configured seed node that node *might* not have a ping from the other
running node.
This commit adds a new writeBlobAtomic() method to the BlobContainer
interface that can be implemented by repository implementations which
support atomic writes operations.
When the BlobContainer implementation does not provide a specific
implementation of writeBlobAtomic(), then the writeBlob() method is used.
Related to #30680
This snapshot includes:
- LUCENE-8341: Record soft deletes in SegmentCommitInfo which will resolve#30851
- LUCENE-8335: Enforce soft-deletes field up-front
Today when executing REST tests we take full responsibility for cluster
configuration. Yet, there are use cases for brining your own cluster to
the REST tests. This commit is a small first step towards that effort by
skipping creating the cluster if the tests.rest.cluster and test.cluster
system properties are set. In this case, the user takes full
responsibility for configuring the cluster as expected by the REST
tests. This step is by no means meant to be perfect or complete, only a
baby step.
This commit ensures the delete of the upgrade_is_oss indicator for
the packaging tests is always deleted before each run. It works by
moving the check on version which skips the task into the doFirst block,
replacing the onlyIf.
closes#30682
Ports the first couple tests for archive distributions from the old bats
project to the new java project that includes windows platforms,
consolidating them into one test method that tests that the
distributions can be extracted and their contents verified. Includes the
zip distributions which were not tested in the bats project.
The new snapshot includes LUCENE-8324 which fixes missing checkpoint
after a fully deletes segment is dropped on flush. This snapshot should
resolves failed tests in the CorruptedFileIT suite.
Closes#30741Closes#30577
* disable annotation processor for docs
Could not find evidence that the log4j annotation processor is used.
The compiler flag enables the Gradle 5.0 behavior
Closes#30476
* Disable annotation processors for all tests
* remove redundant `-proc:none` already handled by required plugins
* Revert unintentional changes
Meta plugins existed only for a short time, in order to enable breaking
up x-pack into multiple plugins. However, now that x-pack is no longer
installed as a plugin, the need for them has disappeared. This commit
removes the meta plugins infrastructure.
Adds windows server 2012r2 and 2016 vagrant boxes to packaging tests.
They can only be used if IDs for their images are specified, which are
passed to gradle and then to vagrant via env variables. Adds options
to the project property `vagrant.boxes` to choose between linux and
windows boxes.
Bats tests are run only on linux boxes, and portable packaging tests run
on all boxes. Platform tests are only run on linux boxes since they are
not being maintained.
For #26741
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.
Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
This commit adds the ability to specify a plugin from maven for a
test cluster to use. Currently, only local projects may be used as
plugins, except when testing bwc, where the coordinates of the project
are used. However, that assumes all projects always keep the same
coordinates, or are even still plugins, which is no longer the case for
x-pack. The full cluster and rolling restart tests are changed to use
this new method when pulling x-pack versions before 6.3.0.
We have a pile of documentation describing how to rebuild the built in
language analyzers and, previously, our documentation testing framework
made sure that the examples successfully built *an* analyzer but they
didn't assert that the analyzer built by the documentation matches the
built in anlayzer. Unsuprisingly, some of the examples aren't quite
right.
This adds a mechanism that tests that the analyzers built by the docs.
The mechanism is fairly simple and brutal but it seems to be working:
build a hundred random unicode sequences and send them through the
`_analyze` API with the rebuilt analyzer and then again through the
built in analyzer. Then make sure both APIs return the same results.
Each of these calls to `_anlayze` takes about 20ms on my laptop which
seems fine.
This is a follow-up to our previous change to only fork javac if needed
to respect the Java compiler home (via JAVA_HOME). This commit makes the
same change for groovyc: we only fork groovyc if the JDK for Gradle is
not the JDK specified for the compiler (via JAVA_HOME).
We started forking javac to avoid GC overhead when running builds. Yet,
we do not seem to have this problem anymore and not forking leads to a
substantial speed improvement. This commit stops forking javac.
Upgrade to lucene-7.4.0-snapshot-1ed95c097b
This version contains:
* An Analyzer for Korean
* An IntervalQuery and IntervalsSource that retrieve minimum intervals of positional queries.
* A new API to retrieve matches (offsets and positions) of a query for a single document.
* Support for soft deletes in the index writer.
* A fixed shingle filter that handles index time synonyms.
* Support for emoji sequence in ICUTokenizer (with an upgrade to icu 61.1)
Uses a filter on the copy task for the eclipse settings files to
replace the token @@LICENSE_HEADER_TEXT@@ with the correct licence
header from the new buildSrc/src/main/resources/license-headers
directory
xpack core contains a fork of `Cron` from quartz who's javadoc has a
`<table>` with non-html5 compatible stuff. This html5ifies the table and
switches the `:x-pack:plugin:core` project to building javadoc with
HTML5.
[test] add java packaging test project
Adds a project for building and running packaging tests written in java
for portability. The vagrant tasks use jars on the packagingTest
configuration, which are built in the same project. No tests are added
yet.
Corresponding changes are not made to :x-pack:qa:vagrant because the
java packaging tests will all be consolidated into one project.
For #26741
`javadoc` will switch from detaulting to html4 to html5 in "a future
release". We should get ahead of it so we're not surprised. Also, HTML5
is the future! Er, the present. Anyway, this follows up from #30220 to
make the Javadoc for two of the four remaining projects HTML5
compatible.
This *mostly* silences `javadoc`'s warning about defaulting to
generating html4 files by enabling generating html5 file for the
projects for which that works. It didn't work in a half dozen projects,
about half of which I've fixed in this PR, entirely by replacing
`<tt>thing</tt>` with `{@code thing}`.
There are a few remaining projects that contain javadoc with invalid
html5. I'll fix those projects in a followup.
Add the oss tar distribution to the packaging test plugin. Test the oss
tar distribution in the core packaging tests, and the non-oss tar
distribution in the x-pack packaging tests.
Today we update index settings directly via IndexService instead of the
cluster state in IndexServiceTests. However, those changes will be lost
if there is a cluster state update. In general, we should update index
settings via client and limit the direct usage in only special tests.
This commit replaces direct usages by the updateSettings api of client.
Closes#24491
Today when forking setup commands we do not set JAVA_HOME. This means
that we might not use a version of Java compatible with the version of
Java the command is expecting to run on (for example, 5.6 nodes would
expect JDK 8, and this is true even for their setup commands). This
commit sets JAVA_HOME when configuring setup command tasks.
This commit moves the apache and elastic license files into a new
root level `licenses` directory and rewrites the top level LICENSE.txt
to clarify the repository has a mix of apache and elastic licensed code.
With the switch to X-Pack as a module, we lost production of POMs for
the JARs that we publish, and did not have a license/notice file in the
zip archives nor the exploded module. This commit ensures that we
generate these POMs, and license/notice files.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
This commit moves the checks on JAVAX_HOME (where X is the java version
number) existing to the end of gradle's configuration phase, and based
on whether the tasks needing the java home are configured to execute.
relates #29519
This commit is a minor cleanup of a code block in NodeInfo.groovy. We
remove an unused variable, make the formatting of the code consistent,
and cast a property that is typed as an Object to a String to avoid an
annoying IDE warning.
Some build tasks require older JDKs. For example, the BWC build tasks
for older versions of Elasticsearch require older JDKs. It is onerous to
require these be configured when merely compiling Elasticsearch, the
requirement that they be strictly set to appropriate values should only
be enforced if these tasks are going to be executed. To address this, we
lazy configure these tasks.
Today we have a nodeVersion property on the NodeInfo class that we use
to carry around information about a standalone node that we will start
during tests. This property is a String which we usually end up parsing
to a Version anyway to do various checks on it. This can end up
happening a lot during configuration so it would be more efficient and
safer to have this already be strongly-typed as a Version and parsed
from a String only once for each instance of NodeInfo. Therefore, this
commit makes NodeInfo#nodeVersion strongly-typed as a Version.
There are some scenarios where the license on a source file is one that
is compatible with our projects yet we do not want to add the license to
the list of approved license headers (to keep the number of files with
that compatible license contained). This commit adds the ability to
exclude a file from the license check.