When a node is elected as master or receives a join request, we submit a cluster state update task. We should give the node join update task a lower priority than the elect as master to increase the chance it will not be rejected. During master election there is a big chance that these will happen concurrently.
This commit lowers the priority of node joins from IMMEDIATE to URGENT
Closes#7733
Some tests like CorruptedTranslogTests rely on the fact that we
are recovering from translog. In those cases we need to prevent
flushes from happening during indexing. This change adds an optional
flag on the #indexRandom utility to disable flushes.
ActionNamesTests#testIncomingAction rarely uses a random action name to make sure that actions registered via plugins work properly. In some cases the random action would conflict with existing one (e.g. tv) and make the test fail. Fixed also testOutgoingAction although the probability of conflict there is way lower due to longer action names used from 1.4 on.
Most notably the elected_as_master task should run as soon as possible. This is an issue as node join request do use `Priority.IMMEDIATE` and can be unjustly rejected.
Closes#7718
It uses a cluster state update task and it gets rejected if not run on a master node. We should enable running on non-masters if the local flag is set.
Also, report any unexpected error that may happen during this cluster state update task
Closes#7731
Similar to NettyTransport.doStop() all actions which disconnect
from a node (and thus call awaitUnterruptibly) should not be executed
on the I/O thread.
This patch ensures that all disconnects happen in the generic threadpool, trying to avoid unnecessary `disconnectFromNode` calls.
Also added a missing return statement in case the component was not yet
started when catching an exception on the netty layer.
Closes#7726
The Unicast Zen Ping mechanism is configured to ping certain host:port combinations in order to discover other node. Since this is only a ping, we do not setup a full connection but rather do a light connect with one channel. This light connection is closed at the end of the pinging.
During pinging, we may discover disco nodes which are not yet connected (via temporalResponses). UnicastZenPing will setup the same light connection for those node. However, during pinging a cluster state may arrive with those nodes in it. In that case , we will mistakenly believe those nodes are connected and at the end of pinging we will mistakenly disconnect those valid node.
This commit makes sure that all nodes UnicastZenPing connects to have a unique id and can be safely disconnected.
Closes#7719
When cancelling recoveries, we wait for up to 10s for the source node to be notified before continuing. This is not needed in two cases:
1) The source node has been disconnected due to node shutdown (recovery is canceled as a response to cluster state processing)
2) The current thread is the one that will be notifying the source node (happens when one of the calls from the source nodes discoveres the local index is closed)
The first one is especially important as it may delay cluster state update processing with 10s.
Closes#7717
Make the http addresses within the REST client final. It makes no sense to update them before each test if we don't check the version of the nodes again, which would mean adding too much overhead (an additional http call before each test) for no reason. We just reuse the same nodes for the whole suite and check the version once while initializing the client. Would be nice to make the REST client within the execution context final but its initialization still needs to happen after the `ElasticsearchIntegrationTest#beforeInternal` that assigns `GLOBAL_CLUSTER` to `currentCluster`.
Closes#7723
When communicating with 1.3 and earlier nodes, it's possible that the
field data breaker info is not sent at all. When this happens, we should
leave the `breaker` variable as-is (unset) instead of creating an
AllCircuitBreakerStats object with a null fd breaker and fake request &
parent breakers.
The terms aggregation can now support sorting on multiple criteria by replacing the sort object with an array or sort object whose order signifies the priority of the sort. The existing syntax for sorting on a single criteria also still works.
Contributes to #6917
Replaces #7588
Also fix the test
FunctionScoreTests#simpleWeightedFunctionsTestWithRandomWeightsAndRandomCombineMode
which sometimes failed due to rounding issues. Make sure
only floats are returned as scores to assure ratio of
expected and returned score is 1.0f.
The test starts a cluster with random nodes as unicast hosts but *doesn't* use min_master_nodes. If the unicast hosts are started last, nodes may elect themselves as master as they do not have mechanism yet to share information.
ElasticsearchRestTests has now a `restClientSettings` method that can be overriden to provide headers as settings (similarly to what we do with transport client). Those headers will be sent together with every REST requests within the tests.
Closes#7710
Adds a special case to the GeoBoundingBoxFilterParser so that the left of the box is not normalised in the case where left and right are 360 apart. Before this change the left would be normalised to 180 in this case and the filter would only match points with a longitude of 180 (or -180).
Closes#5128
GroupShardsIterator is used in many places like the search execution
to determin which shards to query. This can hold shards of one index
as well as shards of multiple indices. The iteration order is used
to assigne a per-request shard ID for each shard that is used as a
tie-breaker when scores are the same. Today the iteration order is
soely depending on the HashMap iteration order which is undefined or
rather implementation dependent. This causes search requests to return
inconsistent results across requests if for instance different nodes
are coordinating the requests.
Simple queries like `match_all` may return results in arbitrary order
if pagination is used or may even return different results for the same
request even though there hasn't been a refresh call and preferences are
used.
Today if we run into exception like NumberFormatException or IAE
when we try to open a commit point to retrieve checksums and calculate
store metadata we just bubble them up. Yet, those are very likely index
corruptions. In such a case we should really mark the shard as
corrupted.
In the reduce logic of the top_hits aggregation if the first shard result to process contained has no results then the merging of all the shard results can go wrong resulting in an incorrect sorted hits.
This bug can only manifest with a sort other than score.
Closes#7697
If you have previously corrupted files, this method currently builds an
exception like:
```
failed engine [corrupted preexisting index]
failed to start shard
```
Followed by a CorruptIndexException. This commit writes the entire
stacktrace to provide additional information. It also changes the
failure message from `corrupted preexisting index` to `preexisting
corrupted index` to prevent confusion.
Closes#7596
With #7594 we replaced the static `BaseRestHandler#addUsefulHeaders` by introducing the `RestClientFactory` that can be injected and used to register the relevant headers. To simplify things, we can now register relevant headers through the `RestController` and remove the `RestClientFactory` that was just introduced.
Closes#7675
Returns information about settings, aliases, warmers, and mappings. Basically returns the IndexMetadata. This new endpoint replaces the /{index}/_alias|_aliases|_mapping|_mappings|_settings|_warmer|_warmers and /_alias|_aliases|_mapping|_mappings|_settings|_warmer|_warmers endpoints whilst maintaining the same response formats. The only exception to this is on the /_alias|_aliases|_warmer|_warmers endpoint which will now return a section for 'aliases' or 'warmers' even if no aliases or warmers exist. This backwards compatibility change is documented in the reference docs.
Closes#4069
With the change in #7493, we introduced a pinging round when a master nodes goes down. That pinging round helps validating the current state of the cluster and takes, by default, 3 seconds. It may be that during that window, a new node tries to join the cluster and starts pinging (this is typical when you quickly restart the current master). If this node gets elected as the new master it will force recovery from the gateway (it has no in memory cluster state), which in turn will cause a full cluster shard synchronisation. While this is not a problem on it's own, it's a shame. This commit demotes "new" nodes during master election so the will only be elected if really needed.
Closes#7558
By default the reroute API should return the new cluster state, excluding the metadata. It was however it was wrongly using an old parameter (filter_metadata) and thus failed to do so. This commits restores but wiring it to the correct `metric` parameter. We also add an enum representing the possible metrics, to avoid similar future mistakes.
Closes#7520Closes#7523
The functionality of copying headers in the REST layer (from REST requests to transport requests) remains the same. Made it a bit nicer by introducing a RestClientFactory component that is a singleton and allows to register useful headers without requiring static methods.
Plugins just have to inject the RestClientFactory now, and call its `addRelevantHeaders` method that is not static anymore.
Relates to #6513Closes#7594
Also renamed HeadersCopyClientTests since it tests a class that was renamed and removed randomization around client wrapper, as it now needs to be weapped all the time to copy the context, doesn't depend on useful headers that have been registered anymore.
Relates to #7610
Today we throw an error if local transport is configured with BWC tests.
Yet, the BWC test need network to be enabled so test can just set the
required defaults.
Closes#7660
Previously we incorrectly sent them in the wrong order, which can cause
validators not to be run for dynamic settings that have been added
matching a particular wildcard.
Fixes#7651
get rid of readSource() entirely, it sucked, Operations should be able
to provide the source themselves.
No more TranslogStream headers, you are now required to pass an
StreamInput or StreamOutput for all operations, which means no extra
state is needed and no need to construct new versions when detecting the
version.
Read and write translog op sizes in TranslogStreams
Previously we handled these integers outside of the translog stream
itself, which was very unclean because other code had to know about
reading the size, or about writing the correct header sometimes.
There is some additional code in LocalIndexShardGateway to handle the
legacy case for older translogs, because we need to read and discard the
size in order to maintain the compatibility for the streaming
operations (they did not read or write the size for 1.3.x and earlier).
Additionally, we need to handle a case where the header is truncated
when recovering from disk
Use a NoopStreamOutput instead of byte arrays
Instead of writing translog operations to a temporary byte array and
then writing that byte array to the stream, we now write the operation
twice, once to a No-op stream to get the size, then again to the real
size.
This trades a little more CPU usage for less memory usage.
Today we serialize the snashot metadata to a byte array and then copy
the byte array to a stream. Instead this commit moves the serialization
directly to the target stream without the intermediate representation.
Closes#7637
We keep around a noop stats indicating how many update operations ended up not updating the document (typically because it didn't change). However, the TransportUpdateAction update that counter only after returning the result. This can throw off stats check which are done immediately after, potentially causing test failures.
Closes#7639
Today there are two different ways to cleanup search contexts which can
potentially lead to double releasing of a context. This commit unifies
the methods and prevents double closing.
Closes#7625
Updates on the _timestamp field were silently ignored.
Now _timestamp undergoes the same merge as regular
fields. This includes exceptions if a property cannot
be changed.
"path" and "default" cannot be changed.
closes#5772closes#6958closes#7614
partially fixes#777
The terms aggregation can now support sorting on multiple criteria by replacing the sort object with an array or sort object whose order signifies the priority of the sort. The existing syntax for sorting on a single criteria also still works.
Contributes to #6917
Reduce Phases can be expensive and some of them like the aggregations
reduce phase might even execute a one-off call via an internal client
that might cause a deadlock due to execution on the network thread
that is needed to handle the one-off call. This commit dispatches
the reduce phase to the search threadpool to ensure we don't wait
for the current thread to be available.
Closes#7623
Lucene's experimental codecs (from the codecs module) do not provide
backwards compatibility and are free to change from release to
release. When they do change, they typically cannot in general read
older indices and the resulting exceptions look like index corruption.
So, we are removing built-in support for them to prevent applications
from choosing one and then seeing strange exceptions on upgrade.
Closes#7566Closes#7604
Enables filtering the actions on both sides - request and response. Also added a base class for filter implementations (cleans up filters that only need to filter one side)
Also refactored the filter & filter chain methods to more intuitive names
BlobContainer used to provide async APIs which are not used
internally. The implementation of these APIs are also not async
by nature and neither is any of the pluggable BlobContainers. This
commit simplifies the API to a simple input / output stream and
reduces the hierarchy of BlobContainer dramatically.
NOTE: This is a breaking change!
Closes#7551
Similar to the one in `TransportMessage`. Added the `ContextHolder` base class where both `TransportMessage` and `RestRequest` derive from
Now next to the known headers, the context is always copied over from the rest request to the transport request (when the injected client is used)
This adds the ability to the Term Vector API to generate term vectors for
artifical documents, that is for documents not present in the index. Following
a similar syntax to the Percolator API, a new 'doc' parameter is used, instead
of '_id', that specifies the document of interest. The parameters '_index' and
'_type' determine the mapping and therefore analyzers to apply to each value
field.
Closes#7530
The get, put and delete indexed script apis map to get, index and delete api and internally create those corresponding requests. We need to make sure that the original headers are handed over to the new request by passing the original request in the constructor when creating the new one.
Also streamlined the support for version and version_type in the REST layer since the parameters were not consistently parsed and set to the internal java API requests.
Modified the REST delete template and delete script actions to make use of a client instead of using the `ScriptService` directly.
Closes#7569
Removed CHM in favour of an OpenHashMap and synchronized accessor/mutator methods. Also, the context is now lazily inititialied (just like we do with the headers)
The useful headers are now stored into a `Set` instead of an array so we can easily deduplicate them. A set is also returned instead of an array by the `usefulHeaders` static getter.
Relates to #6513Closes#7590
Serialization if "index" setting for boost did not work since
the serialization was just true/false instead of valid options
"no"/"not_analyzed"/"analyzed".
closes#7557
`GetIndexedScriptRequest` now extends `ActionRequest` instead of `SingleShardOperationRequest`, as the index field that was provided with the previous base class is not needed (hardcoded).
Closes#7553
Aggregations are collection-wide statistics, which is incompatible with the
collection mode of search_type=SCAN since it doesn't collect all matches on
calls to the search API.
Close#7429
Aggregations are collection-wide statistics so they would always be the same.
In order to save CPU/bandwidth, we can just return them on the first page.
Same as #1642 but for aggregations.
After a node fails to respond to a ping correctly (master or node fault detection), they are removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled using the generic threadpool to actually disconnect the node. However, in the case of temporary node failures (for example) it may be that the node was re-added by the time the task get executed, causing an untimely disconnect call. Disconnect is cheep and should be done during the UpdateTask.
Closes#7543
Enable lucene verification of checksums on segments before merging them.
This prevents corruption from existing segments from silently slipping into
newer merged segments.
Closes#7360
System properties are typically set via the command line and therefore override the node settings. If one has `node.local=true` or `node.mode=local` it can result in cryptic error messages during the test run.
System properties are typically set via the command line and therefore override the node settings. If one has `node.local=true` or `node.mode=local` it can result in cryptic error messages during the test run.
The global cluster gets created from a static block and shared through all tests in the same jvm. The `buildTestCluster` method can't get called passing in `Scope.GLOBAL`, hence removed its mention from it as it might be misleading. The only two scopes supported within the `buildTestCluster` method are `SUITE` and `TEST`.
Merging the accumulated work from the feautre/improve_zen branch. Here are the highlights of the changes:
__Testing infra__
- Networking:
- all symmetric partitioning
- dropping packets
- hard disconnects
- Jepsen Tests
- Single node service disruptions:
- Long GC / Halt
- Slow cluster state updates
- Discovery settings
- Easy to setup unicast with partial host list
__Zen Discovery__
- Pinging after master loss (no local elects)
- Fixes the split brain issue: #2488
- Batching join requests
- More resilient joining process (wait on a publish from master)
Closes#7493
Previous implementation used a marker interface and had no explicit failure call back for the case update task was run on a non master (i.e., the master stepped down after it was submitted). That lead to a couple of instance of checks.
This approach moves ClusterStateUpdateTask from an interface to an abstract class, which allows adding a flag to indicate whether it should only run on master nodes (defaults to true). It also adds an explicit onNoLongerMaster call back to allow different error handling for that case. This also removed the need for the NoLongerMaster.
Closes#7511
We currently have two ways to randomize the number of shards and replicas: random index template, that stays the same for all indices created under the same scope, and the overridable `indexSettings` method, called by `createIndex` and `prepareCreate` which returns different values each time.
Now that the `randomIndexTemplate` method is not static anymore, we can easily apply the same logic to both. Especially for number of replicas, we used to have slightly different behaviours, where more than one replicas were only rarely used through random index template, which gets now applied to the `indexSettings` method too (might speed up the tests a bit)
Side note: `randomIndexTemplate` had its own logic which didn't depend on `numberOfReplicas` or `maximumNumberOfReplicas`, which was causing bw comp tests failures since in some cases too many copies of the data are requested, which cannot be allocated to older nodes, and the write consistency quorum cannot be met, thus indexing times out.
Closes#7522
Settings that are not default for _size, _index and _timestamp were only build in
toXContent if these fields were actually enabled.
_timestamp, _index and _size can be dynamically enabled or disabled.
Therfore the settings must be kept, even if the field is disabled.
(Dynamic enabling/disabling was intended, see TimestampFieldMapper.merge(..)
and SizeMappingTests#testThatDisablingWorksWhenMerging
but actually never worked, see below).
To avoid that _timestamp is overwritten by a default mapping
this commit also adds a check to mapping merging if the type is already
in the mapping. In this case the default is not applied anymore.
(see
SimpleTimestampTests#testThatUpdatingMappingShouldNotRemoveTimestampConfiguration)
As a side effect, this fixes
- overwriting of paramters from the _source field by default mappings
(see DefaultSourceMappingTests).
- dynamic enabling and disabling of _timestamp and _size ()
(see SimpleTimestampTests#testThatTimestampCanBeSwitchedOnAndOff and
SizeMappingIntegrationTests#testThatTimestampCanBeSwitchedOnAndOff )
Tests:
Enable UpdateMappingOnClusterTests#test_doc_valuesInvalidMappingOnUpdate again
The missing settings in the mapping for _timestamp, _index and _size caused a the
failure: When creating a mapping which has settings other than default and the
field disabled, still empty field mappings were built from the type mappers.
When creating such a mapping, the mapping source on master and the rest of the cluster
can be out of sync for some time:
1. Master creates the index with source _timestamp:{_store:true}
mapper classes are in a correct state but source is _timestamp:{}
2. Nodes update mapping and refresh source which then completely misses _timestamp
3. After a while source is refreshed again also on master and the _timestamp:{}
vanishes there also.
The test UpdateMappingOnCusterTests#test_doc_valuesInvalidMappingOnUpdate failed
because the cluster state was sampled from master between 1. and 3. because the
randomized testing injected a default mapping with disabled _size and _timestamp
fields that have settings which are not default.
The test
TimestampMappingTests#testThatDisablingFieldMapperDoesNotReturnAnyUselessInfo
must be removed because it actualy expected the timestamp to remove
parameters when it was disabled.
closes#7137
The root endpoint returns basic information about this node, like it's name and ES version etc. The cluster name is an important information that belongs in that list.
Closes#7524
The reverse_nested aggregator requires that the emitted doc ids are always in ascending order, which is already enforced on the scorer level,
but this also needs to be enforced on the nested aggrgetor level otherwise incorrect counts are a result.
Closes#7505Closes#7514
During a test run we have a global shared cluster and potentially a suite level or even a test level cluster running. All of those share the same node name pattern (node_#). This can be confusing if you're debugging discovery related tests where those nodes from the different clusters potentially interact (and reject each other). This commit gives each cluster type a unique prefix to make tracing and log filtering simpler.
Closes#7518
Comparisons for the BigArrays breaker use "greater than" instead of
"greater than or equal", which was never an issue before because the
test size was not right on a page boundary. A test with an exactly
divisible page boundary (4mb exactly in this case) caused the sizes to
be equal to, but not exceed, the limit, and never break.
The limit should be smaller than the test increments the breaker anyway.
Today we have logic that removes a shard from the indexservice if
the shard has changed ie. from replica to primary or if it's recovery
source vanished etc. This can cause shards from been not allocated at
all on a nodes causeing delete requests to timeout since we were waiting
for shards on nodes that got dropped due to a IndexShardMissingException
Closes#7509
1) One issue reported by a user is due to the truncation of the geohash string. Added Junit test for this scenario
2) Another suspect piece of code was the “toAutomaton” method that only merged the first of possibly many precisions into the result.
Closes#7368
When we corrupt a file in the snapshot/restore case we have to corrupt
a per-segment file. The .del file might change with the commit / flush
that is triggered by the snapshot operation.
This commit makes the default number of shards for the .scripts index to ````1````, it also
forces the auto_expand replicas to ````1-all````. This change means that script index GET requests to load
scripts from the index should always use the local copy of the scripts index, preventing any network traffic or calls
on script GET.
Today the iteration order of the interfaces might change across JVMs
this commit cleans up the NetworkUtils class and attempts to ensure
consistent iteration order across JVMs.
For reliability and debug purposes each test JVM should use it's own
TCP port range if executed in parallel. This also moves away from the
default port range to prevent conflicts with running ES instance on the local
machine.
Only when segments are merged away due to merging then entries in this cache are cleaned up.
Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the FixedBitSetFilterCache does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default via the FixedBitSetFilterCache.
Closes#7037Closes#7031
ClusterState has a reference to the cluster name since version 1.1.0 (df7474b9fc) . However, if the state was sent from a master of an older version, this name can be set to null. This is an unexpected and can cause bugs. The bad part is that it will never correct it self until a full cluster restart where the cluster state is rebuilt using the code of the latest version.
This commit changes the default to the node's cluster name.
Relates to #7386Closes#7414
RandomScoreFunction previously relied on the order the documents were
iterated in from Lucene. This caused changes in ordering, with the same
seed, if documents moved to different segments. With this change, a
murmur32 hash of the _uid for each document is used as the "random"
value. Also, the hash is adjusted so as to only return values between
0.0 and 1.0 to enable easier manipulation to fit into users' scoring
models.
closes#6907, #7446
At the moment, when a node looses connection to the master (due to a partition or the master was stopped), we ping the unicast hosts in order to discover other nodes and elect a new master or get of another master than has been elected in the mean time. This can go wrong if all unicast targets are on the same side of a minority partition and therefore will never rejoin once the partition is healed.
Closes#7336
Requests are handled by the worked thread pool of the target node instead of the generic thread pool of the source node.
Also this change is required in order to make GC disruption work with local transport. Previously the handling of the a request was performed on on a node that that was being GC disrupted, resulting in some actions being performed while GC was being simulated.
The cluster state version allows resolving the case where a old master node become unresponsive and later wakes up and pings all the nodes in the cluster, allowing the newly elected master to decide whether it should step down or ask the old master to rejoin.
Both master fault detection and nodes fault detection request should also send the cluster name, so that on the receiving side the handling of these requests can be failed with an error. This error can be caught on the sending side and for master fault detection the node can fail the master locally and for nodes fault detection the node can be failed.
Note this validation will most likely never fail in a production cluster, but in during automated tests where cluster / nodes are created and destroyed very frequently.
In large clusters when a new elected master is chosen, there are many join requests to handle. By batching them up the the cluster state doesn't get published for each individual join request, but many handled at the same time, which results into a single new cluster state which ends up be published.
Closes#6984
After master election, nodes send join requests to the elected master. Master is then responsible for publishing a new cluster state which sets the master on the local node's cluster state. If something goes wrong with the cluster state publishing, this process will not successfully complete. We should check it after the join request returns and if it failed, retry pinging.
Closes#6969
Currently, pinging results are only used if the local node is elected master or if they detect another *already* active master. This has the effect that master election requires two pinging rounds - one for the elected master to take is role and another for the other nodes to detect it and join the cluster. We can be smarter and use the election of the first round on other nodes as well. Those nodes can try to join the elected master immediately. There is a catch though - the elected master node may still be processing the election and may reject the join request if not ready yet. To compensate a retry mechanism is introduced to try again (up to 3 times by default) if this happens.
Closes#6943
Reduced expected time to heal to 0 (we interrupt and wait on stop disruption). It was also wrongly indicated in seconds.
We didn't properly wait between slow cluster state tasks
added explicit cleaning of temp unicast ping results
reduce gateway local.list_timeout to 10s.
testVerifyApiBlocksDuringPartition: verify master node has stepped down before restoring partition
Since it runs in a background thread after a node is added, or submits a cluster state update when a node leaves, it may be that by the time it is executed the local node is no longer master.
After a node joins the clusters, it starts pinging the master to verify it's health. Before, the cluster join request was processed async and we had to give some time to complete. With #6480 we changed this to wait for the join process to complete on the master. We can therefore start pinging immediately for fast detection of failures. Similar change can be made to the Node fault detection from the master side.
Closes#6706
* waiting time should be long enough depending on the type of the disruption scheme
* MockTransportService#addUnresponsiveRule if remaining delay is smaller than 0 don't double execute transport logic
This commit adds the notion of ServiceDisruptionScheme allowing for introducing disruptions in our test cluster. This
abstraction as used in a couple of wrappers around the functionality offered by MockTransportService to simulate various
network partions. There is also one implementation for causing a node to be slow in processing cluster state updates.
This new mechnaism is integrated into existing tests DiscoveryWithNetworkFailuresTests.
A new test called testAckedIndexing is added to verify retrieval of documents whose indexing was acked during various disruptions.
Closes#6505
If the master FD flags master as gone while there are still pending cluster states, the processing of those cluster states we re-instate that node a master again.
Closes#6526
The previous default was true, which means that after a node disconnected event we try to connect to it as an extra validation. This can result in slow detection of network partitions if the extra reconnect times out before failure.
Also added tests to verify the settings' behaviour
We have an optimization which compares routing/meta data version of cluster states and tries to reuse the current object if the versions are equal. This can cause rare failures during recovery from a minimum_master_node breach when using the "new light rejoin" mechanism and simulated network disconnects. This happens where the current master updates it's state, doesn't manage to broadcast it to other nodes due to the disconnect and then steps down. The new master will start with a previous version and continue to update it. When the old master rejoins, the versions of it's state can equal but the content is different.
Also improved DiscoveryWithNetworkFailuresTests to simulate this failure (and other improvements)
Closes#6466
When a node steps down from being a master (because, for example, min_master_node is breached), it may still have
cluster state update tasks queued up. Most (but not all) are tasks that should no longer be executed as the node
no longer has authority to do so. Other cluster states updates, like electing the current node as master, should be
executed even if the current node is no longer master.
This commit make sure that, by default, `ClusterStateUpdateTask` is not executed if the node is no longer master. Tasks
that should run on non masters are changed to implement a new interface called `ClusterStateNonMasterUpdateTask`
Closes#6230