For reliability and debug purposes each test JVM should use it's own
TCP port range if executed in parallel. This also moves away from the
default port range to prevent conflicts with running ES instance on the local
machine.
Only when segments are merged away due to merging then entries in this cache are cleaned up.
Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the FixedBitSetFilterCache does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default via the FixedBitSetFilterCache.
Closes#7037Closes#7031
ClusterState has a reference to the cluster name since version 1.1.0 (df7474b9fc) . However, if the state was sent from a master of an older version, this name can be set to null. This is an unexpected and can cause bugs. The bad part is that it will never correct it self until a full cluster restart where the cluster state is rebuilt using the code of the latest version.
This commit changes the default to the node's cluster name.
Relates to #7386Closes#7414
RandomScoreFunction previously relied on the order the documents were
iterated in from Lucene. This caused changes in ordering, with the same
seed, if documents moved to different segments. With this change, a
murmur32 hash of the _uid for each document is used as the "random"
value. Also, the hash is adjusted so as to only return values between
0.0 and 1.0 to enable easier manipulation to fit into users' scoring
models.
closes#6907, #7446
At the moment, when a node looses connection to the master (due to a partition or the master was stopped), we ping the unicast hosts in order to discover other nodes and elect a new master or get of another master than has been elected in the mean time. This can go wrong if all unicast targets are on the same side of a minority partition and therefore will never rejoin once the partition is healed.
Closes#7336
Requests are handled by the worked thread pool of the target node instead of the generic thread pool of the source node.
Also this change is required in order to make GC disruption work with local transport. Previously the handling of the a request was performed on on a node that that was being GC disrupted, resulting in some actions being performed while GC was being simulated.
The cluster state version allows resolving the case where a old master node become unresponsive and later wakes up and pings all the nodes in the cluster, allowing the newly elected master to decide whether it should step down or ask the old master to rejoin.
Both master fault detection and nodes fault detection request should also send the cluster name, so that on the receiving side the handling of these requests can be failed with an error. This error can be caught on the sending side and for master fault detection the node can fail the master locally and for nodes fault detection the node can be failed.
Note this validation will most likely never fail in a production cluster, but in during automated tests where cluster / nodes are created and destroyed very frequently.
In large clusters when a new elected master is chosen, there are many join requests to handle. By batching them up the the cluster state doesn't get published for each individual join request, but many handled at the same time, which results into a single new cluster state which ends up be published.
Closes#6984
After master election, nodes send join requests to the elected master. Master is then responsible for publishing a new cluster state which sets the master on the local node's cluster state. If something goes wrong with the cluster state publishing, this process will not successfully complete. We should check it after the join request returns and if it failed, retry pinging.
Closes#6969
Currently, pinging results are only used if the local node is elected master or if they detect another *already* active master. This has the effect that master election requires two pinging rounds - one for the elected master to take is role and another for the other nodes to detect it and join the cluster. We can be smarter and use the election of the first round on other nodes as well. Those nodes can try to join the elected master immediately. There is a catch though - the elected master node may still be processing the election and may reject the join request if not ready yet. To compensate a retry mechanism is introduced to try again (up to 3 times by default) if this happens.
Closes#6943
Reduced expected time to heal to 0 (we interrupt and wait on stop disruption). It was also wrongly indicated in seconds.
We didn't properly wait between slow cluster state tasks
added explicit cleaning of temp unicast ping results
reduce gateway local.list_timeout to 10s.
testVerifyApiBlocksDuringPartition: verify master node has stepped down before restoring partition
Since it runs in a background thread after a node is added, or submits a cluster state update when a node leaves, it may be that by the time it is executed the local node is no longer master.
After a node joins the clusters, it starts pinging the master to verify it's health. Before, the cluster join request was processed async and we had to give some time to complete. With #6480 we changed this to wait for the join process to complete on the master. We can therefore start pinging immediately for fast detection of failures. Similar change can be made to the Node fault detection from the master side.
Closes#6706