From 8848fcfb2277f5b86b1fe7aa484f25745ad0a476 Mon Sep 17 00:00:00 2001
From: Tanguy Leroux
Date: Fri, 26 Jul 2019 10:12:59 +0200
Subject: [PATCH] Ensure cluster is stable in
 ShrinkIndexIT.testShrinkThenSplitWithFailedNode (#44860)

The test ShrinkIndexIT.testShrinkThenSplitWithFailedNode sometimes fails
because the resize operation is not acknowledged (see #44736). This resize
operation creates a new index "splitagain" and it results in a cluster state
update (TransportResizeAction uses MetaDataCreateIndexService.createIndex()
to create the resized index). This cluster state update is expected to be
acknowledged by all nodes (see IndexCreationTask.onAllNodesAcked()) but this
is not always true: the data node that was just stopped in the test before
executing the resize operation might still be considered as a "faulty" node
(and not yet removed from the cluster nodes) by the FollowersChecker. The
cluster state is then acked on all nodes but one, and it results in a
non-acknowledged resize operation.

This commit adds an ensureStableCluster() check after stopping the node in
the test. The goal is to ensure that the data node has been correctly removed
from the cluster and that all nodes are fully connected to each other before
moving forward with the resize operation.

Closes #44736
---
 .../action/admin/indices/create/ShrinkIndexIT.java | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/server/src/test/java/org/elasticsearch/action/admin/indices/create/ShrinkIndexIT.java b/server/src/test/java/org/elasticsearch/action/admin/indices/create/ShrinkIndexIT.java
index 582ab09a1f8..1ee344e326a 100644
--- a/server/src/test/java/org/elasticsearch/action/admin/indices/create/ShrinkIndexIT.java
+++ b/server/src/test/java/org/elasticsearch/action/admin/indices/create/ShrinkIndexIT.java
@@ -580,7 +580,9 @@ public class ShrinkIndexIT extends ESIntegTestCase {
                 .build()).setResizeType(ResizeType.SHRINK).get());
         ensureGreen();
 
+        final int nodeCount = cluster().size();
         internalCluster().stopRandomNode(InternalTestCluster.nameFilter(shrinkNode));
+        ensureStableCluster(nodeCount - 1);
 
         // demonstrate that the index.routing.allocation.initial_recovery setting from the shrink doesn't carry over into the split index,
         // because this would cause the shrink to fail as the initial_recovery node is no longer present.