Fix cluster health wait conditions in rolling restart tests

The rolling upgrade tests create an index with replica shards and
require that, in the mixed cluster environment, the cluster health is
green before any other tests are executed.  However, there were two
problems with this.  First, if the replica shard resided on the
restarted node, delayed allocation would kick in and cause the cluster
health request to time out after 1m.  The fix was to drastically lower
the delayed allocation setting
(index.unassigned.node_left.delayed_timeout) to 100ms.  Second, if the
primary is on the higher version node, the replica cannot be assigned
to the lower version node, because a node on the older Lucene version
cannot recover from a primary on the newer one.  The fix here was to
wait for the cluster health to be yellow instead of green in the mixed
cluster environment.  In the fully upgraded cluster, the cluster health
check waits for a green cluster as before.
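
For illustration only, the health check for the fully upgraded cluster
might look like the sketch below.  It is not part of this diff; the
placement of the step and the wait_for_nodes value are assumptions
carried over from the mixed cluster check.

```yaml
# Hypothetical step for the fully upgraded cluster (not taken from this commit).
- do:
    cluster.health:
      # All nodes run the new version, so every replica can be assigned
      # and the cluster is expected to reach green.
      wait_for_status: green
      # Assumed to match the two-node check used in the mixed cluster test.
      wait_for_nodes: 2
```

In the mixed cluster, the same step uses wait_for_status: yellow, since
a replica whose primary sits on the upgraded node may stay unassigned
until the remaining node is upgraded.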

Closes #25185
Ali Beyad 2017-07-06 14:34:14 -04:00
parent e9f6210dac
commit cc1f40ca18
2 changed files with 6 additions and 2 deletions


@@ -2,7 +2,7 @@
 "Index data and search on the mixed cluster":
   - do:
       cluster.health:
-        wait_for_status: green
+        wait_for_status: yellow
         wait_for_nodes: 2
   - do:


@@ -33,7 +33,11 @@
   - do:
       indices.create:
         index: index_with_replicas # dummy index to ensure we can recover indices with replicas just fine
+        body:
+          # if the node with the replica is the first to be restarted, then delayed
+          # allocation will kick in, and the cluster health won't return to GREEN
+          # before timing out
+          index.unassigned.node_left.delayed_timeout: "100ms"
   - do:
       bulk:
         refresh: true