This operator handles nulls in different way than the normal `=`.
If one of the operants is `null` and the other not it returns `false`.
If both operants are `null` it returns `true`. Therefore in contrary to
`=`, which returns `null` if at least one of the operants is `null`, this one
never returns `null` as a result.
Closes: #35871
Code that operates on-top of the engine requires all readers returned to be
unwrapped into ElasticsearchDirectoryReader. The special reader
the FrozenEngine uses wasn't wrapped.
We didn't check that the ExplainLifecycleRequest was constructed with at least
one index before, now that we do we must also make sure the tests
mutateInstance() method used in equals/hashCode checks doesn't accidentally
create an empty index array.
Closes#35822
When there is no persistent tasks metadata we could hit a null pointer
exception when executing a follower stats request. This is because we
inspect the persistent tasks metadata. Yet, if no tasks have been
registered, this is null (as opposed to empty). We need to avoid
de-referencing the persistent tasks metadata in this case. That is what
this commit does, and we add a test for this situation.
By setting the cron to 2017, we ensure it won't trigger. This makes it
easier to test because we know the job will _only_ be in STARTED,
and we can ignore INDEXING states due to transient triggers.
Closes#35779
Today we have a way to atomically persist global MetaData and
IndexMetaData to disk when new ClusterState is received. All other
ClusterState fields are not persisted.
However, there are other parts of ClusterState that should be
persisted, namely:
version
term
lastCommittedConfiguration
lastAcceptedConfiguration
votingTombstones
version is changed frequently, other fields are not. We decided
to group term, lastCommittedConfiguration,
lastAcceptedConfiguration and votingTombstones into
CoordinationMetaData class and make CoordinationMetaData a field
inside MetaData.
MetaData.toXContent and MetaData.fromXContent should take care of
CoordinationMetaData.
version stays as a top level field in ClusterState and will be
persisted as part of Manifest in a follow-up commit.
Also MetaData.isGlobalStateEquals should be extended to include
coordinationMetaData in comparison.
This commit favors exposing getters, such as getTerm directly in
ClusterState to avoid massive code changes.
An example of CoordinationMetaState.toXContent:
{
"term": 1,
"last_committed_config": [
"TiIuBcbBtpuXyDDVHXeD",
"ZIAoVbkjjLPLUuYLaTkw"
],
"last_accepted_config": [
"OwkXbXZNOZPJqccdFHdz",
"LouzsGYwmQzpeQMrboZe",
"fCKGRZdjLTqzXAqPUtGL",
"pLoxshjpJXwDhbgjfYJy",
"SjINLwFIlIEFZCbjrSFo",
"MDkVncJEVyZLJktopWje"
]
}
Move away from performing eager, fail-fast validation of mismatched
mapping to a lazy evaluation based on the fields actually used in the
query. This allows queries to run on the parts of the indices that
"work" which is not just convenient but also a necessity for large
mappings (like logging) where alignment is hard/impossible to achieve.
Fix#35659
Creates the manage_token cluster privilege and adds it to the
kibana_system role. This is required if kibana were to use the token
service for its authenticator process.
Because kibana_system already has manage_saml this effectively
only adds the privilege to create tokens.
Introduce INTERVAL as a DataType
Add INTERVAL to the grammar which supports the standard SQL declaration
(without precision):
> INTERVAL '1 23:45:01.123456789' DAY TO SECOND
but also number for single unit intervals:
> INTERVAL 1 YEAR
as well as the plurals of the units:
> INTERVAL 2 YEARS
Interval are internally supported as just another Literal being backed
by java.time.Period and java.time.Duration
Move JDBC away from JDBCType enum to SQLType interface
Refactor DataType by moving it into server core and adding dedicated (and
much simpler) JDBC driver type
Improve internal JDBC conversion by normalizing on the DataType
Rename JDBC columnInfo to JdbcColumnInfo to differentiate between it and
the SQL ColumnInfo
Fix#29990
This commit removes the parsing code from the PutLicenseResponse server variant, and the toXContent portion from the corresponding client variant.
Relates to #35547
This commit removes the parsing code from the PostStartBasicResponse server variant. It also makes the server response implement StatusToXContent which allows us to save a couple of lines of code in the corredponding REST action.
Relates to #35547
The RestHasPrivilegesAction previously handled its own XContent
generation. This change moves that into HasPrivilegesResponse and
makes the response implement ToXContent.
This allows HasPrivilegesResponseTests to be used to test
compatibility between HLRC and X-Pack internals.
A serialization bug (cluster privs) was also fixed here.
* The port assigned to all loopback interfaces doesn't necessarily have to be the same for ipv4 and ipv6
=> use actual address from profile instead of just port + loopback in test
* Closes#35584
This parameter in the `query_string` query was deprecated in 6.0 and ignored
since then. Its API methods and remaining uses can be removed in the upcoming
major version.
Relates to #35734
This commit adds a rest endpoint for freezing and unfreezing an index.
Among other cleanups mainly fixing an issue accessing package private APIs
from a plugin that got caught by integration tests this change also adds
documentation for frozen indices.
Note: frozen indices are marked as `beta` and available as a basic feature.
Relates to #34352
Currently there is a common NPE in the IndexFollowingIT that does not
indicate the test failing. This is when a cluster state listener is
called and certain index metadata is not yet available.
This commit checks that the metadata is not null before performing the
logic that depends on the metadata.
PR #35242 formalised support for the password_hash field in the body
of the Put User security API.
Since this field is now validated and tested, it can also be
documented.
The Put User API also supports a "refresh" query parameter that was
not documented. This commit adds it to the docs.
Zen2 is now feature-complete enough to run most ESIntegTestCase tests. The changes in this PR
are as follows:
- ClusterSettingsIT is adapted to not be Zen1 specific anymore (it was using Zen1 settings).
- Some of the integration tests require persistent storage of the cluster state, which is not fully
implemented yet (see #33958). These tests keep running with Zen1 for now but will be switched
over as soon as that is fully implemented.
- Some very few integration tests are not running yet with Zen2 for other reasons, depending on
some of the other open points in #32006.
* ML: Removing result_finalization_window && overlapping_buckets
* Reverting bad method deletions
* Setting to current before backport to try and get a green build
* fixing testBuildAutodetectCommand test
* disabling bwc tests for backport
Fix bug in Analyzer that caused it to report unsupported fields only
when declared in projections. The rule has been extended to all field
declarations.
Fix#35673
RolloverStep previously had a name of "attempt_rollover", which was
inconsistent with all other step names due it its use of an underscore
instead of a dash.
This commit switches from using java util's default timezone method to
using joda. The former can cause problems when the string representation
of the timezone is unknown to joda.
closes#35518
Removed extending of AbstractComponent and changed logger usage to
explicit declaration. Abstract classes still have logger
declaration using this.getClass() in order to show implementation class
name in its logs.
See #34488
* Deprecate types in count requests.
* Move RestCountAction to the 'search' package.
* Deprecate types in multi search requests.
* Add tests for types deprecation in the _search endpoint.
RolloverAction will now periodically check the rollover conditions using
the Rollover API with the dry_run option as an AsyncWaitStep, then run
the rollover itself by calling the Rollover API with no conditions,
which will always roll over, as an AsyncActionStep. This will resolve
race condition issues in policies using RolloverAction.
Today, the bootstrapping of a Zen2 cluster is driven externally, requiring
something else to wait for discovery to converge and then to inject the initial
configuration. This is hard to use in some situations, such as REST tests.
This change introduces the `ClusterBootstrapService` which brings the bootstrap
retry logic within each node and allows it to be controlled via an (unsafe)
node setting.
* ML: Adding missing datacheck to datafeedjob
* Adding client side and docs
* Making adjustments to validations
* Making values default to on, having more sensible limits
* Intermittent commit, still need to figure out interval
* Adjusting delayed data check interval
* updating docs
* Making parameter Boolean, so it is nullable
* bumping bwc to 7 before backport
* changing to version current
* moving delayed data check config its own object
* Separation of duties for delayed data detection
* fixing checkstyles
* fixing checkstyles
* Adjusting default behavior so that null windows are allowed
* Mentioning the default value
* Fixing comments, syncing up validations
In the event that the target index does not exist when `CopyExecutionStateStep`
executes, this avoids a `NullPointerException` and provides a more helpful error
to the ILM user.
Resolves#35567
Kibana now uses the tasks API to manage automatic reindexing of the
.kibana index during upgrades.
The implementation of the tasks API requires that
1. the user executing the task can create & write to the ".tasks" index
2. the user checking on the status of the task can read (Get) the
relevant document from the ".tasks" index
Response classes in Elasticsearch (and xpack) only need to implement ToXContent, which is needed to print their output put in the REST layer and return the response in json (or others) format. On the other hand, response classes that are added to the high-level REST client, need to do the opposite: parse xcontent and create a new object based on that.
This commit removes the parsing code from the XPackInfoResponse server variant, and the toXContent portion from the corresponding client variant. It also removes a client specific test class that looks redundant now that we have a single test class for both classes.
This pull request replaces some blocks of code that must be run once
and that are currently based on AtomicBoolean by the convient RunOnce
class added in #35489.
The DefaultAuthenticationFailureHandler has a deprecated constructor
that was present to prevent a breaking change to custom realm plugin
authors in 6.x. This commit removes the constructor and its uses.
For some time, the PutUser REST API has supported storing a pre-hashed
password for a user. The change adds validation and tests around that
feature so that it can be documented & officially supported.
It also prevents the request from containing both a "password" and a "password_hash".
This adds a `wait_for_completion` flag which allows the user to block
the Stop API until the task has actually moved to a stopped state,
instead of returning immediately. If the flag is set, a `timeout` parameter
can be specified to determine how long (at max) to block the API
call. If unspecified, the timeout is 30s.
If the timeout is exceeded before the job moves to STOPPED, a
timeout exception is thrown. Note: this is just signifying that the API
call itself timed out. The job will remain in STOPPING and evenutally
flip over to STOPPED in the background.
If the user asks the API to block, we move over the the generic
threadpool so that we don't hold up a networking thread.
This change adds a special caching reader that caches all relevant
values for a range query to rewrite correctly in a can_match phase
without actually opening the underlying directory reader. This
allows frozen indices to be filtered with can_match and in-turn
searched with wildcards in a efficient way since it allows us to
exclude shards that won't match based on their date-ranges without
opening their directory readers.
Relates to #34352
Depends on #34357
The NPE would occur if should_trim_field was overridden to
true and any field value was completely blank. This change
defends against this situation.
Fixes#35462
Before, moving to a failed step would only change the step info
to be that of the failed step. This means two things.
1. Async Steps would never be triggered to execute
2. If there are inherent problems with the action definition that can
be fixed with a policy update, these changes were not being reflected
by the new execution info.
Changes now
1. Async steps are executed after the move to the failed step in cluster state
2. the lifecycle execution info's phase definition is updated from the current
latest policy definition, even though the index isn't moving to a new phase.
Closes#35397.
avoid the assertions that check the log files, because that does not work on Windows.
The rest of the test is still useful and should work on Windows CI.
Currently on Windows CI this qa module fails because there is just one test and
that test si ignored if OS is Windows.
This change adds a high level freeze API that allows to mark an
index as frozen and vice versa. Indices must be closed in order to
become frozen and an open but frozen index must be closed to be
defrosted. This change also adds a index.frozen setting to
mark frozen indices and integrates the frozen engine with the
SearchOperationListener that resets and releases the directory
reader after and before search phases.
Relates to #34352
Depends on #34357
Validate remote cluster license as part of put auto follow pattern api call
in addition of validation that when auto follow coordinator starts auto
following indices in the leader cluster.
Also added qa module that tests what happens to ccr after downgrading to basic license.
Existing active follow indices should remain to follow,
but the auto follow feature should not pickup new leader indices.
Add `IsNull` node in parser to simplify expressions so that `<value> IS NULL` is
no longer translated internally to `NOT(<value> IS NOT NULL)`
Replace `IsNotNullProcessor` with `CheckNullProcessor` to encapsulate both
isNull and isNotNull functionality.
Closes: #34876Fixes: #35171
* DISCOVERY: Fix RollingUpgradeTests
* Don't manually manage min master nodes if not necessary
* Remove some dead code
* Allow for manually supplying list of seed nodes
* Closes#35178
This documents how to include the search queries in the audit log.
There is a catch, that even if enabling `emit_request_body`, which should
output queries included in request bodies, search queries were not output
because, implicitly, no REST layer audit event type was included.
This folk knowledge is herein imprinted.
Many realm tests were written to use separate setting objects for
"global settings" and "realm settings".
Since #30241 there is no distinction between these settings, so these
tests can be cleaned up to use a single Settings object.
Change `nullable()` logic of AND and OR to false since in the Optimizer
we cannot fold to null as we might be handling and expression in the
SELECT clause.
Introduce folding of null for AND and OR expressions in PruneFilter()
since we now know that we are in HAVING or WHERE clause and we
can fold `null` to `false`
Fixes: #35088
Co-authored-by: Costin Leau <costin.leau@gmail.com>
This is related to #34483. It introduces a namespaced setting for
compression that allows users to configure compression on a per remote
cluster basis. The transport.tcp.compress remains as a fallback
setting. If transport.tcp.compress is set to true, then all requests
and responses are compressed. If it is set to false, only requests to
clusters based on the cluster.remote.cluster_name.transport.compress
setting are compressed. However, after this change regardless of any
local settings, responses will be compressed if the request that is
received was compressed.
Today our OS information returned in node stats only returns a
high-level name of the OS (e.g., "Linux"). Yet, for some uses this is
too high-level and knowing at a finer level of granularity the
underlying OS can be useful. This commit extracts the pretty name on
Linux from /etc/os-release. This pretty name usually includes the Linux
vendor and the Linux vendor version number (e.g., Fedora 28).
- Introduces a transport API for bootstrapping a Zen2 cluster
- Introduces a transport API for requesting the set of nodes that a
master-eligible node has discovered and for waiting until this comprises the
expected number of nodes.
- Alters ESIntegTestCase to use these APIs when forming a cluster, rather than
injecting the initial configuration directly.
Adjust list of dynamic index settings that should be replicated
and added a test that verifies whether builtin dynamic index settings
are classified as replicated or non replicated (whitelisted).
This commit uses the index settings version so that a follower can
replicate index settings changes as needed from the leader.
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
There is no longer a concept of non-global "realm settings". All realm
settings should be loaded from the node's settings using standard
Setting classes.
This change renames the "globalSettings" field and method to simply be
"settings".
The file realm has not supported custom filenames/locations since at
least 5.0, but this test still tried to configure them.
Remove all configuration of file locations, and cleaned up a few other
warnings and deprecations
Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT,
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT,
SmokeTestWatcherTestSuiteIT, WatcherRestIT,
XDocsClientYamlTestSuiteIT, and XPackRestIT
The change here is to throw an `AssertionError` instead of `break;` to
allow the `assertBusy()` to continue to busy wait until the desired
state is reached.
closes#33291, closes#29877, closes#34462, closes#30705, closes#34448
Since it's still possible to shrink an index when replicas are unassigned, we
should not check that all copies are available when performing the shrink, since
we set the allocation requirement for a single node.
Resolves#35321
* [ILM] Check shard and relocation status in AllocationRoutedStep
This is a follow-up from #35161 where we now check for started and relocating
state in `AllocationRoutedStep`.
Resolves#35258
This change adds a `frozen` engine that allows lazily open a directory reader
on a read-only shard. The engine wraps general purpose searchers in a LazyDirectoryReader
that also allows to release and reset the underlying index readers after any and before
secondary search phases.
Relates to #34352
An auto follow pattern:
* cannot start with `_`
* cannot contain a `,`
* can be encoded in UTF-8
* the length of UTF-8 encoded bytes is no longer than 255 bytes
Adds basic rolling upgrade tests to check that lifecycles are still recognizable between rolling cluster upgrades and managed indices stay managed.
This is a placeholder for discussing types of checks so they are ready once we backported
This is a re-boot of the previous PR against index-lifecycle that needed to be
reverted due to CI bwc issues. #32828
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored by the build and red from `META-INF`.
Grammar's identifiers can be completely skipped from counting depths
as they just add another level to the tree and they are always children
of some other expression which gets counted.
Increased maximum depth from 100 to 200. After testing on production
configuration with -Xss1m, depths of at least 250 can be used, so being
conservative we put the limit lower.
Fixes: #35299
The elasticsearch-croneval CLI tool uses local dates to display when
something gets triggered the next time. This is very confusing.
This commit ensures, that UTC and local timezone times will be written
out.
The output looks like this and contains localized dates for each trigger
date as well as for `now`.
Now is [Tue, 28 Aug 2018 17:23:51 +0000] in UTC, local time is [ᏔᎵᏁ, 28 ᎦᎶ 2018 12:23:51 -0500]
Here are the next 10 times this cron expression will trigger:
1. Mon, 2 Jan 2040 11:00:00 +0000
ᏉᏅᎯ, 2 ᎤᏃ 2040 06:00:00 -0500
2. ...
This also removes an old outstanding TODO to use the jopt parsing to
cast the count to an integer instead of doing it ourselves.
In order to start shard follow tasks, the resume follow api already
needs execute N requests to the elected master node.
The pause follow API is also a master node action, which would make
how both APIs execute more consistent.
This is related to #29023. Additionally at other points we have
discussed a preference for removing the need to unnecessarily block
threads for opening new node connections. This commit lays the groudwork
for this by opening connections asynchronously at the transport level.
We still block, however, this work will make it possible to eventually
remove all blocking on new connections out of the TransportService
and Transport.
The remove-ilm-from-index API was using the DELETE http method
to signify that something is being removed. Although, metadata
about ILM for the index is being deleted, no entity/resource
is being deleted during this operation. POST is more in line with
what this API is actually doing, it is modifying the metadata for
an index. As part of this change, `remove` is also appended to the path
to be more explicit about its actions.
Error was thrown if leader index had no soft deletes enabled, but it then continued creating the follower index.
The test caught this bug, but very rarely due to timing issue.
Build failure instance:
```
1> [2018-11-05T20:29:38,597][INFO ][o.e.x.c.LocalIndexFollowingIT] [testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes] before test
1> [2018-11-05T20:29:38,599][INFO ][o.e.c.s.ClusterSettings ] [node_s_0] updating [cluster.remote.local.seeds] from [[]] to [["127.0.0.1:9300"]]
1> [2018-11-05T20:29:38,599][INFO ][o.e.c.s.ClusterSettings ] [node_s_0] updating [cluster.remote.local.seeds] from [[]] to [["127.0.0.1:9300"]]
1> [2018-11-05T20:29:38,609][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s_0] [leader-index] creating index, cause [api], templates [random-soft-deletes-templat
e, one_shard_index_template], shards [2]/[0], mappings []
1> [2018-11-05T20:29:38,628][INFO ][o.e.c.r.a.AllocationService] [node_s_0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[leader-
index][0]] ...]).
1> [2018-11-05T20:29:38,660][INFO ][o.e.x.c.a.TransportPutFollowAction] [node_s_0] [follower-index] creating index, cause [ccr_create_and_follow], shards [2]/[0]
1> [2018-11-05T20:29:38,675][INFO ][o.e.c.s.ClusterSettings ] [node_s_0] updating [cluster.remote.local.seeds] from [["127.0.0.1:9300"]] to [[]]
1> [2018-11-05T20:29:38,676][INFO ][o.e.c.s.ClusterSettings ] [node_s_0] updating [cluster.remote.local.seeds] from [["127.0.0.1:9300"]] to [[]]
1> [2018-11-05T20:29:38,678][INFO ][o.e.x.c.LocalIndexFollowingIT] [testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes] after test
1> [2018-11-05T20:29:38,678][INFO ][o.e.x.c.LocalIndexFollowingIT] [testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes] [LocalIndexFollowingIT#testDoNotCreateFoll
owerIfLeaderDoesNotHaveSoftDeletes]: cleaning up after test
1> [2018-11-05T20:29:38,678][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s_0] [follower-index/TlWlXp0JSVasju2Kr_hksQ] deleting index
1> [2018-11-05T20:29:38,678][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s_0] [leader-index/FQ6EwIWcRAKD8qvOg2eS8g] deleting index
FAILURE 0.23s J0 | LocalIndexFollowingIT.testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes <<< FAILURES!
> Throwable #1: java.lang.AssertionError:
> Expected: <false>
> but: was <true>
> at __randomizedtesting.SeedInfo.seed([7A3C89DA3BCA17DD:65C26CBF6FEF0B39]:0)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.elasticsearch.xpack.ccr.LocalIndexFollowingIT.testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes(LocalIndexFollowingIT.java:83)
> at java.lang.Thread.run(Thread.java:748)
```
Build failure: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.5+intake/46/console
Override `process()` in `BinaryLogicProcessor` which doesn't immediately
return null if left or right argument is null, which is the behaviour of
`process()` of the parent class `BinaryProcessor`.
Also, add more tests for `AND` and `OR` in SELECT clause with literal.
Fixes: #35240
Today if a wildcard, date-math expression or alias expands/resolves
to an index that is search-throttled we still search it. This is likely
not the desired behavior since it can unexpectedly slow down searches
significantly.
This change adds a new indices option that allows `search`, `count`
and `msearch` to ignore throttled indices by default. Users can
force expansion to throttled indices by using `ignore_throttled=true`
on the rest request to expand also to throttled indices.
Relates to #34352
This moves all Realm settings to an Affix definition.
However, because different realm types define different settings
(potentially conflicting settings) this requires that the realm type
become part of the setting key.
Thus, we now need to define realm settings as:
xpack.security.authc.realms:
file.file1:
order: 0
native.native1:
order: 1
- This is a breaking change to realm config
- This is also a breaking change to custom security realms (SecurityExtension)
This is a forward port of some of the changes made in #8445, specifically the change mentioned in https://github.com/elastic/elasticsearch/pull/34023#issuecomment-433212636.
Currently, in master, the `cluster_stats` collector collects _all_ cluster metadata and indexes it into `.monitoring-es-*`. However, per the discussion linked to above, we decided to collect _only_ the `display_name` cluster metadata setting for now. This PR makes this change.
Add NotEquals node in parser to simplify expressions so that <value1> != <value2> is
no longer translated internally to NOT(<value1> = <value2>)
Closes: #35210Fixes: #35233
This adds a new step for checking whether an index is allocated correctly based
on the rules added prior to running the shrink step. It also fixes a bug where
for shrink we are not allowed to have the shards relocating for the shrink step.
This also allows us to simplify AllocationRoutedStep and provide better
feedback in the step info for why either the allocation or the shrink checks
have failed.
Resolves#34938
If the Rollover step would fail due to the next index in sequence
already existing, just skip to the next step instead of going to the
Error step.
This prevents spurious `ResourceAlreadyExistsException`s created by
simultaneous RolloverStep executions from causing ILM to error out
unnecessarily.
This commit removes the Joda time usage from ILM and the HLRC components of ILM.
It also fixes an issue where using the `?human=true` flag could have caused the
parser not to work. These millisecond fields now follow the standard we use
elsewhere in the code, with additional fields added iff the `human` flag is
specified.
This is a breaking change for ILM, but since ILM has not yet been released, no
compatibility shim is needed.
The suite FollowerFailOverIT is failing because some documents are not
replicated to the follower. Maybe the FollowTask is not working as
expected or the background indexers eat all resources while the follower
cluster is trying to reform after a failover; then CI is not fast enough
to replicate all the indexed docs within 60 seconds (sometimes I see 80k
docs on the leader).
This commit limits the number of documents to be indexed into the leader
index by the background threads so that we can eliminate the latter
case. This change also replaces a docCount assertion with a docIds
assertion so we can have more information if these tests fail again.
Relates #33337
Previously, testRunStateChangePolicyWithNextStep asserted that the
ClusterState before and after running the steps were equal. The test
only passed due to a race condition: The latch would be triggered by the
step execution, but the cluster state update thread would continue
running before committing the change to the cluster state. This allowed
the test to read the old cluster state and pass the equality check about
99.99% of the time.
The test now waits for the new cluster state to be committed before
checking that it is _not_ equal to the old cluster state.