The goal of this commit is to address unknown licenses when producing
the dependencies info report. We have two different checks that we run
on licenses. The first check is whether or not we have stashed a copy of
the license text for a dependency in the repository. The second is to
map every dependency to a license type (e.g., BSD 3-clause). The problem
here is that the way we were handling licenses in the second check
differs from how we handle licenses in the first check. The first check
works by finding a license file with the name of the artifact followed
by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact
name to another name used to check for the license (e.g., we map
lucene-.* to lucene, and opensaml-.* to shibboleth. The second check
understood the first way of looking for a license file but not the
second way. So in this commit we teach the second check about the
mappings from artifact names to license names. We do this by copying the
configuration from the dependencyLicenses task to the dependenciesInfo
task and then reusing the code from the first check in the second
check. There were some other challenges here though. For example,
dependenciesInfo was checking too many dependencies. For now, we should
only be checking direct dependencies and leaving transitive dependencies
from another org.elasticsearch artifact to that artifact (we want to do
this differently in a follow-up). We also want to disable
dependenciesInfo for projects that we do not publish, users only care
about licenses they might be exposed to if they use our assembled
products. With all of the changes in this commit we have eliminated all
unknown licenses. A follow-up will enforce that when we add a new
dependency it does not get mapped to unknown, these will be forbidden in
the future. Therefore, with this change and earlier changes are left
having no unknown licenses and two custom licenses; custom here means it
does not map to an SPDX license type. Those two licenses are xz and
ldapsdk. A future change will not allow additional custom licenses
unless they are explicitly whitelisted. This ensures that if a new
dependency is added it is mapped to an SPDX license or mapped to custom
because it does not have an SPDX license.
This commit adds a check that any class in X-Pack that is a feature
aware custom also implements the appropriate mix-in interface in
X-Pack. These interfaces provide a default implementation of
FeatureAware#getRequiredFeature that returns that x-pack is the required
feature. By implementing this interface, this gives a consistent way for
X-Pack feature aware customs to return the appopriate required feature
and this check enforces that all such feature aware customs return the
appropriate required feature.
* Accept Gradle build scan argreement
Scans will be produced only when passing
`--scan`
* Condition TOS acceptance with property
* Switch to boolean flags
Uses a filter on the copy task for the eclipse settings files to
replace the token @@LICENSE_HEADER_TEXT@@ with the correct licence
header from the new buildSrc/src/main/resources/license-headers
directory
This commit moves the gradle wrapper jar file to a hidden directory, so
that it does not clutter the top level names seen when doing an ls in
the project. The actual jar file is never manually edited, and only
changed by running `./gradlew wrapper ...` so it is not important for
this directory to be "visible".
Adds tasks that check that the all jars that we build have LICENSE.txt
and NOTICE.txt files and that the files are correct. Sets check to
depend on these task.
This is mostly there for extra parnoia because we automatically
configure all Jar tasks to include the LICENSE.txt and NOTICE.txt
files anyway. But it is quite possible to add configuration to those
tasks that would override either file.
This causes check to depend on several more things than it used to.
Take, for example, javadoc:
check depends on the new verifyJavadocJarNotice which depends on
extractJavadocJar which depends on javadocJar which depends on
javadoc, this check now depends on javadoc.
This commit moves the apache and elastic license files into a new
root level `licenses` directory and rewrites the top level LICENSE.txt
to clarify the repository has a mix of apache and elastic licensed code.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
Fails the build if any subprojects of `:libs` have dependencies in `:libs`
except for `:libs:elasticsearch-core`.
Since we now have three places where we resolve project substitutions
I've added `dependencyToProject` to `project.ext` in all projects. It
resolves both `project` style dependencies and "external" style (like
"org.elasticsearch:elasticsearch-core:${version}") dependencies to
`Project`s using the `projectSubstitutions`. I use this new function all
three places where resovle project substitutions.
Finally this pulls `apply plugin: 'elasticsearch.build'` out of
`libs/*/build.gradle` and into a subprojects clause in
`libs/build.gradle`. I do this entirely so that I can call
`tasks.precommit.dependsOn checkDependencies` without waiting for the
subprojects to be evaluated or worrying about whether or not they have
`precommit` set up in a normal way.
Rather than checking a substring match, now that
VersionProperties#elasticsearch is a strongly-typed instance of Version,
we can use the Version#isSnapshot convenience method. This commit
switches the root build file to do this.
Today we have a nodeVersion property on the NodeInfo class that we use
to carry around information about a standalone node that we will start
during tests. This property is a String which we usually end up parsing
to a Version anyway to do various checks on it. This can end up
happening a lot during configuration so it would be more efficient and
safer to have this already be strongly-typed as a Version and parsed
from a String only once for each instance of NodeInfo. Therefore, this
commit makes NodeInfo#nodeVersion strongly-typed as a Version.
* Begin moving XContent to a separate lib/artifact
This commit moves a large portion of the XContent code from the `server` project
to the `libs/xcontent` project. For the pieces that have been moved, some
helpers have been duplicated to allow them to be decoupled from ES helper
classes. In addition, `Booleans` and `CheckedFunction` have been moved to the
`elasticsearch-core` project.
This decoupling is a move so that we can eventually make things like the
high-level REST client not rely on the entire ES jar, only the parts it needs.
There are some pieces that are still not decoupled, in particular some of the
XContent tests still remain in the server project, this is because they test a
large portion of the pluggable xcontent pieces through
`XContentElasticsearchException`. They may be decoupled in future work.
Additionally, there may be more piecese that we want to move to the xcontent lib
in the future that are not part of this PR, this is a starting point.
Relates to #28504
This commit adds intermediate gradle projects for archive based
distributions (zip, tar) and package based distributions (rpm, deb). The
grouping allows the common distribution build file to be considerably
shorter and clearly separated from the common zip/tar and rpm/deb
configuration.
Generalizing BWC building so that there is less code to modify for a release. This ensures we do not
need to think about what major or minor version is in the gradle code. It follows the general rules of the
elastic release structure. For more information on the rules, see the VersionCollection's javadoc.
This also removes the additional bwc snapshots that will never be released, such as 6.0.2, which were
being built and tested against every time we ran bwc tests.
Additionally, it creates 4 new projects that correspond to the different types of snapshots that may exist
for a given version. Its possible to now run those individual tasks to work out bwc logic whereas
previously it was impossible and the entire suite of bwc tests had to be run to work out any logic
changes in the build tools' bwc project. Please note that if the project does not make sense for the
version that is current, that an error will be thrown from that individual project if an attempt is made to
run it.
This should allow for automating the version bumps as well, since it removes all the hardcoded version
logic from the configs.
When elasticsearch was originally moved to gradle, the "provided" equivalent in maven had to be done through a plugin. Since then, gradle added the "compileOnly" configuration. This commit removes the provided plugin and replaces all uses with compileOnly.
The bwc tests can be disabled in order to facilitate commits necessary
to older branches to maintain backcompat. A check already exists to
ensure this flag is not left disabled too long (once a day by the
branchConsistency check). However, it can be surprising if you try
running bwc tests explicitly and they look like nothing is happening.
This commit adds a warning during configuration to ensure it is clear
the bwc tests are disabled and enforces a link to a PR which is in the
process of being backported.
Plugin descriptors currently contain an elasticsearch version,
which the plugin was built against, and a java version, which the plugin
was built with. These versions are read and validated, but not stored.
This commit keeps them in PluginInfo so they can be used later.
While seeing the elasticsearch version is less interesting (since it is
enforced to match that of the running elasticsearc node), the java
version is interesting since we only validate the format, not the actual
version. This also makes PluginInfo have full parity with the plugin
properties file.
These tests were disabled to facilitate backport of a PR which is not
yet complete. These tests were accidentally reenabled after a merge
conflict was resolved in the wrong direction. This commit addresses this
issue.
Adds allow_partial_search_results flag to search requests with default setting = true.
When false, will error if search either timeouts, has partial errors or has missing shards rather
than returning partial search results. A cluster-level setting provides a default for search requests with no flag.
Closes#27435
This is related to #27933. It introduces a jar named elasticsearch-core
in the lib directory. This commit moves the JarHell class from server to
elasticsearch-core. Additionally, PathUtils and some of Loggers are
moved as JarHell depends on them.
We have agreed to introduce the Gradle wrapper to simplify workflows for
developers, and managing infrastructure (e.g., CI, release builds, etc.)
as well as consideration for the fact that other projects in our stack
use Gradle and do not necessarily want to be tied to our Gradle version.
Relates #28065
This commit adds the infrastructure to plugin building and loading to
allow one plugin to extend another. That is, one plugin may extend
another by the "parent" plugin allowing itself to be extended through
java SPI. When all plugins extending a plugin are finished loading, the
"parent" plugin has a callback (through the ExtensiblePlugin interface)
allowing it to reload SPI.
This commit also adds an example plugin which uses as-yet implemented
extensibility (adding to the painless whitelist).
This is related to #27802. This commit adds a jar called
elasticsearch-nio that contains the base nio classes that will be used
for the tcp nio transport and eventually the http nio transport.
The jar does not depend on elasticsearch:core, so all references to core
have been removed.
This change removes the module named aggs-composite and adds the `composite` aggs
as a core aggregation. This allows other plugins to use this new aggregation
and simplifies the integration in the HL rest client.
Projects the depend on the CLI currently depend on core. This should not
always be the case. The EnvironmentAwareCommand will remain in :core,
but the rest of the CLI components have been moved into their own
subproject of :core, :core:cli.
* This change adds a module called `aggs-composite` that defines a new aggregation named `composite`.
The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources.
The sources for each bucket can be defined as:
* A `terms` source, values are extracted from a field or a script.
* A `date_histogram` source, values are extracted from a date field and rounded to the provided interval.
This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets.
A composite buckets is composed of one value per source and is built for each document as the combinations of values in the provided sources.
For instance the following aggregation:
````
"test_agg": {
"terms": {
"field": "field1"
},
"aggs": {
"nested_test_agg":
"terms": {
"field": "field2"
}
}
}
````
... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve **all** the combinations of `field1`, `field2` in the matching documents:
````
"composite_agg": {
"composite": {
"sources": [
{
"field1": {
"terms": {
"field": "field1"
}
}
},
{
"field2": {
"terms": {
"field": "field2"
}
}
},
}
}
````
The response of the aggregation looks like this:
````
"aggregations": {
"composite_agg": {
"buckets": [
{
"key": {
"field1": "alabama",
"field2": "almanach"
},
"doc_count": 100
},
{
"key": {
"field1": "alabama",
"field2": "calendar"
},
"doc_count": 1
},
{
"key": {
"field1": "arizona",
"field2": "calendar"
},
"doc_count": 1
}
]
}
}
````
By default this aggregation returns 10 buckets sorted in ascending order of the composite key.
Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after.
For instance the following aggregation will aggregate all composite keys that sorts after `arizona, calendar`:
````
"composite_agg": {
"composite": {
"after": {"field1": "alabama", "field2": "calendar"},
"size": 100,
"sources": [
{
"field1": {
"terms": {
"field": "field1"
}
}
},
{
"field2": {
"terms": {
"field": "field2"
}
}
}
}
}
````
This aggregation is optimized for indices that set an index sorting that match the composite source definition.
For instance the aggregation above could run faster on indices that defines an index sorting like this:
````
"settings": {
"index.sort.field": ["field1", "field2"]
}
````
In this case the `composite` aggregation can early terminate on each segment.
This aggregation also accepts multi-valued field but disables early termination for these fields even if index sorting matches the sources definition.
This is mandatory because index sorting picks only one value per document to perform the sort.
It is the exciting return of the global checkpoint background
sync. Long, long ago, in snapshot version far, far away we had and only
had a global checkpoint background sync. This sync would fire
periodically and send the global checkpoint from the primary shard to
the replicas so that they could update their local knowledge of the
global checkpoint. Later in time, as we sped ahead towards finalizing
the initial version of sequence IDs, we realized that we need the global
checkpoint updates to be inline. This means that on a replication
operation, the primary shard would piggy back the global checkpoint with
the replication operation to the replicas. The replicas would update
their local knowledge of the global checkpoint and reply with their
local checkpoint. However, this could allow the global checkpoint on the
primary to advance again and the replicas would fall behind in their
local knowledge of the global checkpoint. If another replication
operation never fired, then the replicas would be permanently behind. To
account for this, we added one more sync that would fire when the
primary shard fell idle. However, this has problems:
- the shard idle timer defaults to five minutes, a long time to wait
for the replicas to learn of the new global checkpoint
- if a replica missed the sync, there was no follow-up sync to catch
them up
- there is an inherent race condition where the primary shard could
fall idle mid-operation (after having sent the replication request to
the replicas); in this case, there would never be a background sync
after the operation completes
- tying the global checkpoint sync to the idle timer was never natural
To fix this, we add two additional changes for the global checkpoint to
be synced to the replicas. The first is that we add a post-operation
sync that only fires if there are no operations in flight and there is a
lagging replica. This gives us a chance to sync the global checkpoint to
the replicas immediately after an operation so that they are always kept
up to date. The second is that we add back a global checkpoint
background sync that fires on a timer. This timer fires every thirty
seconds, and is not configurable (for simplicity). This background sync
is smarter than what we had previously in the sense that it only sends a
sync if the global checkpoint on at least one replica is lagging that of
the primary. When the timer fires, we can compare the global checkpoint
on the primary to its knowledge of the global checkpoint on the replicas
and only send a sync if there is a shard behind.
Relates #26591