Zen2 is now feature-complete enough to run most ESIntegTestCase tests. The changes in this PR
are as follows:
- ClusterSettingsIT is adapted to not be Zen1 specific anymore (it was using Zen1 settings).
- Some of the integration tests require persistent storage of the cluster state, which is not fully
implemented yet (see #33958). These tests keep running with Zen1 for now but will be switched
over as soon as that is fully implemented.
- Some very few integration tests are not running yet with Zen2 for other reasons, depending on
some of the other open points in #32006.
Elasticsearch node is responsible for storing cluster metadata.
There are 2 types of metadata: global metadata and index metadata.
`GatewayMetaState` implements `ClusterStateApplier` and receives all
`ClusterStateChanged` events and is responsible for storing modified
metadata to disk.
When new `ClusterStateChanged` event is received, `GatewayMetaState`
checks if global metadata has changed and if it's the case writes new
global metadata to disk. After that `GatewayMetaState` checks if index
metadata has changed or there are new indices assigned to this node and
if it's the case writes new index metadata to disk. Atomicity of global
metadata and index metadata writes is ensured by `MetaDataStateFormat`
class.
Unfortunately, there is no atomicity when more than one metadata changes
(global and index, or metadata for two indices). And atomicity is
important for Zen2 correctness.
This commit adds atomicity by adding a notion of manifest file,
represented by `MetaState` class. `MetaState` contains pointers to
current metadata.
More precisely, it stores global state generation as long and map from
`Index` to index metadata generation as long. Atomicity of writes for
manifest file is ensured by `MetaStateFormat` class.
The algorithm of writing changes to the disk would be the following:
1. Write global metadata state file to disk and remember
it's generation.
2. For each new/changed index write state file to disk and remember
it's generation. For each not-changed index use generation from
previous manifest file. If index is removed or this node is no longer
responsible for this index - forget about the index.
3. Create `MetaState` object using previously remembered generations and
write it to disk.
4. Remove old state files for global metadata, indices metadata and
manifest.
Additonally new implementation relies on enhanced `MetaDataStateFormat`
failure semantics, `applyClusterState` throws IOException, whose
descendant `WriteStateException` could be (and should be in Zen2)
explicitly handled.
Today, the bootstrapping of a Zen2 cluster is driven externally, requiring
something else to wait for discovery to converge and then to inject the initial
configuration. This is hard to use in some situations, such as REST tests.
This change introduces the `ClusterBootstrapService` which brings the bootstrap
retry logic within each node and allows it to be controlled via an (unsafe)
node setting.
The docs are not resilient to timing issues where the ILM metadata is not set on newly
created indices, so we shouldn't be so strict on the returned response
The DefaultAuthenticationFailureHandler has a deprecated constructor
that was present to prevent a breaking change to custom realm plugin
authors in 6.x. This commit removes the constructor and its uses.
Refactors and simplifies the logic around stopping nodes, making sure that for a full cluster restart
onNodeStopped is only called after the nodes are actually all stopped (and in particular not while
starting up some nodes again). This change also ensures that a closed node client is not being used
anymore (which required a small change to a test).
Relates to #35049
- The current version was hard coded in the test, causing it to fail as
we removed the qualifier
- also applied the base plugin to the root project to get the `clean`
task to work as expected. This was preventing the failure from
reproducing locally.
The documentation of `search_after` recommends to use the `_id`
field as a tiebreaker for the sort without warning against
the additional memory required. This change changes the recommandation
to use a copy of the `_id` field with doc_values enabled.
* Manage dependencies for test clusters
Create a configuration and add the distribution to it automatically.
A task is created and added as a dependency to any task that uses a test
cluster.
The task extracts all the zip archives ( only zip support for now )
in the configuration.
We do this only once because most tests mostly use the same distribution
and thus we can avoid extracting it multiple times.
With this we will be able to start the node from the same files which
will most of the time live in OS caches or COW if the
configuration requires it.
* Swithc to jcenter
Jcenter suposedly operates on a CDN.
It should be faster and more reliable.
It's the default in Andorid Studio currently
( it switched from mavenCentral ).
This change is safe because jcenter is a superset of mavenCentral.
* update comment
Today, the TransportReplicationAction checks the global level blocks and
the index level blocks before routing the operation to the primary, in the
ReroutePhase, and it happens at the very beginning of the transport
replication action execution. For the upcoming rework of the Close Index
API and in order to deal with primary relocation, we'll need to also check
for blocks before executing the operation on the primary (while holding a
permit) but before routing to the new primary.
This pull request change the AsyncPrimaryAction so that it checks for
replication action's blocks before executing the operation locally or before
routing the primary action to the newly primary shard. The check is done
while holding a PrimaryShardReference.
Related to #33888
For some time, the PutUser REST API has supported storing a pre-hashed
password for a user. The change adds validation and tests around that
feature so that it can be documented & officially supported.
It also prevents the request from containing both a "password" and a "password_hash".
The way ScoreAccessor implements `compareTo()` is problematic because it doesn't
completely follow the Comparable contract, specificaly symmetry (if x is a
ScoreAccessor and y any Number then x.comparTo(y) works, but y.compareTo(x)
generally does not even compile). Fortunately we don't seem to use the fact that
ScoreAccessor is a Comparable anywhere, so we can simply remove it.
Today the `PeerFinder` probes each address it obtains, identifies the node to
which it just connected, and then returns all such nodes. However, this can
lead to duplicates if a node manages to connect to another node via two
distinct addresses. This causes bootstrapping to fail since
`BootstrapConfiguration#resolve` forbids duplicates.
This change alters the behaviour of the `PeerFinder` to remove duplicates in
this situation.
This adds a `wait_for_completion` flag which allows the user to block
the Stop API until the task has actually moved to a stopped state,
instead of returning immediately. If the flag is set, a `timeout` parameter
can be specified to determine how long (at max) to block the API
call. If unspecified, the timeout is 30s.
If the timeout is exceeded before the job moves to STOPPED, a
timeout exception is thrown. Note: this is just signifying that the API
call itself timed out. The job will remain in STOPPING and evenutally
flip over to STOPPED in the background.
If the user asks the API to block, we move over the the generic
threadpool so that we don't hold up a networking thread.
* Adds HLRC docs for put lifecycle policy
* Adds link to docs in client javadocs
* Fixes checkstyle
* Make the documentation use the right ack response
If shutting down half or more of the master-eligible nodes, their votes must
first be explicitly withdrawn to ensure that the cluster doesn't lose its
quorum. This works via _voting tombstones_, stored in the cluster state, which
tell the reconfigurator to remove nodes from the voting configuration.
This change introduces voting tombstones to the cluster state, together with
transport APIs for adding and removing them, and makes use of these APIs in
`InternalTestCluster` to support tests which remove at least half of the
master-eligible nodes at once (e.g. shrinking from two master-eligible nodes to
one).
* [ILM] Add documentation for error handling in ILM
This adds some initial documentation for error handling and retrying failed
steps for index lifecycle management
AbstractComponent was deprecated in #35140 and is looking like it will be
removed at some point by #34888. Today all it does is provide a logger. This
change removes the usages of AbstractComponent that live solely in the zen2
feature branch to avoid some future merge pain, and replaces it where necessary
with some directly-created loggers.