Backport of #39350
Contains the following:
* LUCENE-8635: Move terms dictionary off-heap for non-primary-key fields in `MMapDirectory`
* LUCENE-8292: `TermsEnum` is fully abstract
* LUCENE-8679: Return WITHIN in `EdgeTree#relateTriangle` only when polygon and triangle share one edge
* LUCENE-8676: Nori tokenizer deals correctly with large buffers
* LUCENE-8697: `GraphTokenStreamFiniteStrings` better handles side paths with gaps
* LUCENE-8664: Add `equals` and `hashCode` to `TotalHits`
* LUCENE-8660: `TopDocsCollector` returns accurate hit counts if the total equals the threshold
* LUCENE-8654: `Polygon2D#relateTriangle` fix for when the polygon is inside the triangle
* LUCENE-8645: `Intervals#fixField` can merge intervals from different fields
* LUCENE-8585: Create jump-tables for DocValues at index time
Initially in #38910, ShardFollowTask was reusing ImmutableFollowParameters'
serialization logic. After merging, bwc tests failed sometimes and
the binary serialization that ShardFollowTask was originally was using
was added back. ImmutableFollowParameters is using optional fields (optional vint)
while ShardFollowTask was not (vint).
* Make pullFixture a task dependency of resolveAllDependencies
With this change we will pull the docker test fixtures ( and thus cache
the images ) for older versions too as the `resolveAllDependencies` is
already being called on the bwc checkouts too.
For some users, the built in authorization mechanism does not fit their
needs and no feature that we offer would allow them to control the
authorization process to meet their needs. In order to support this,
a concept of an AuthorizationEngine is being introduced, which can be
provided using the security extension mechanism.
An AuthorizationEngine is responsible for making the authorization
decisions about a request. The engine is responsible for knowing how to
authorize and can be backed by whatever mechanism a user wants. The
default mechanism is one backed by roles to provide the authorization
decisions. The AuthorizationEngine will be called by the
AuthorizationService, which handles more of the internal workings that
apply in general to authorization within Elasticsearch.
In order to support external authorization services that would back an
authorization engine, the entire authorization process has become
asynchronous, which also includes all calls to the AuthorizationEngine.
The use of roles also leaked out of the AuthorizationService in our
existing code that is not specifically related to roles so this also
needed to be addressed. RequestInterceptor instances sometimes used a
role to ensure a user was not attempting to escalate their privileges.
Addressing this leakage of roles meant that the RequestInterceptor
execution needed to move within the AuthorizationService and that
AuthorizationEngines needed to support detection of whether a user has
more privileges on a name than another. The second area where roles
leaked to the user is in the handling of a few privilege APIs that
could be used to retrieve the user's privileges or ask if a user has
privileges to perform an action. To remove the leakage of roles from
these actions, the AuthorizationService and AuthorizationEngine gained
methods that enabled an AuthorizationEngine to return the response for
these APIs.
Ultimately this feature is the work included in:
#37785#37495#37328#36245#38137#38219Closes#32435
Because concurrent sync requests from a primary to its replicas could be
in flight, it can be the case that an older retention leases collection
arrives and is processed on the replica after a newer retention leases
collection has arrived and been processed. Without a defense, in this
case the replica would overwrite the newer retention leases with the
older retention leases. This commit addresses this issue by introducing
a versioning scheme to retention leases. This versioning scheme is used
to resolve out-of-order processing on the replica. We persist this
version into Lucene and restore it on recovery. The encoding of
retention leases is starting to get a little ugly. We can consider
addressing this in a follow-up.
This commit adapts the version used in StartedShardEntry serialization
after the backport of #37899 and reenables bwc tests.
Related to #37899
Related to #38074
Adds reindex.ssl.* settings for reindex from remote.
This uses the ssl-config/ internal library to parse and load SSL
configuration and files. This is applied when using the low level
rest client to connect to a remote ES node
Relates: #37287Resolves: #29755
Replaces intermediate geo objects built by ShapeBuilders with
objects from the libs/geo hierarchy. This should allow us to build
all geo functionality around a single hierarchy.
Follow up for #35320
Currently integration tests which use either bwc snapshot versions or
the current version of elasticsearch depend on project substitutions to
link to the build of those artifacts. Likewise, vagrant tests use
dependency substitutions to get to bwc snapshots of rpm and debs.
This commit changes those to depend on the relevant project/configuration
and removes the dependency substitutions for distributions we do not
publish.
The rpm, deb and tar distributions were removed some time ago from maven
central. The zip distribution still exists there, but it does not need
to. Instead, this commit sets up an ivy repository with pattern pointing
to the elasticsearch artifacts download service. Note that the
integ-test-zip remains in maven central, since it is not present in the
download service.
- The current version was hard coded in the test, causing it to fail as
we removed the qualifier
- also applied the base plugin to the root project to get the `clean`
task to work as expected. This was preventing the failure from
reproducing locally.
* Manage dependencies for test clusters
Create a configuration and add the distribution to it automatically.
A task is created and added as a dependency to any task that uses a test
cluster.
The task extracts all the zip archives ( only zip support for now )
in the configuration.
We do this only once because most tests mostly use the same distribution
and thus we can avoid extracting it multiple times.
With this we will be able to start the node from the same files which
will most of the time live in OS caches or COW if the
configuration requires it.
* Introduce property to set version qualifier
- VersionProperties.elasticsearch is now a string which can have qualifier
and snapshot too
- The Version class in the build no longer cares about snapshot and
qualifier.
This change introduces stats per processors. Total, time, failed,
current are currently supported. All pipelines will now show all
top level processors that belong to it. Failure processors are not
displayed, however, the time taken to execute the failure chain is part
of the stats for the top level processor.
The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type.
Pipeline processors will have the pipeline name appended to the name of
the name of the processors (before the tag if one exists). If more
then one pipeline is used to process the document, then each pipeline
will carry its own stats. The outer most pipeline will also include the
inner most pipeline stats.
Conditional processors will only included in the stats if the condition evaluates
to true.
With the introduction of immutable workers into our CI, we now have a
problem where we would have to download dependencies on every single
build. To this end our infrastructure team has introduced the
possibility to download dependencies one time and then bake these into
the base immutable worker image. For this, we introduce our version of
this special script which runs a task that downloads all dependencies of
all configurations. With this script, our infrastructure team will be
rebuilding these images on an at-least daily basis. This helps us avoid
having to download the dependencies for every single build.
This reworks how we configure the `shadow` plugin in the build. The major
change is that we no longer bundle dependencies in the `compile` configuration,
instead we bundle dependencies in the new `bundle` configuration. This feels
more right because it is a little more "opt in" rather than "opt out" and the
name of the `bundle` configuration is a little more obvious.
As an neat side effect of this, the `runtimeElements` configuration used when
one project depends on another now contains exactly the dependencies needed
to run the project so you no longer need to reference projects that use the
shadow plugin like this:
```
testCompile project(path: ':client:rest-high-level', configuration: 'shadow')
```
You can instead use the much more normal:
```
testCompile "org.elasticsearch.client:elasticsearch-rest-high-level-client:${version}"
```
Add tests for build-tools to make sure example plugins build stand-alone using it.
This will catch issues such as referencing files from the buildSrc directly, breaking external uses of build-tools.
* Determine the minimum gradle version based on the wrapper
This is restrictive and forces users of the plugin to move together with
us, but without integration tests it's close to impossible to make sure
that the claimed compatability is really there.
If we do want to offer more flexibility, we should add those tests
first.
* Track gradle version in individual file
* PR review