Fix Broken Clone Snapshot CS Update (#64116) (#64159)

We must not remove the snapshot from the initializing set
in the `timeout` getter. This was an oversight that went
unnoticed. In rare circumstances (e.g. when a node concurrently
joins the cluster or a routing change happens, as in the linked
test failure) it can lead to the removal of a valid snapshot
clone from the cluster state.

Closes #64115
Armin Braun 2020-10-26 14:32:42 +01:00 committed by GitHub
parent 93b52df8c1
commit e02561476e
1 changed files with 0 additions and 1 deletions

@@ -558,7 +558,6 @@ public class SnapshotsService extends AbstractLifecycleComponent implements Clus
             @Override
             public TimeValue timeout() {
-                initializingClones.remove(snapshot);
                 return request.masterNodeTimeout();
             }
         }, "clone_snapshot [" + request.source() + "][" + snapshotName + ']', listener::onFailure);
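
The hazard fixed here can be illustrated with a minimal, self-contained sketch. It is not Elasticsearch code: `UpdateTask`, `submit`, and the `"clone-1"` entry are hypothetical stand-ins, and the sketch assumes only that whoever receives the task may read its timeout before the task has run or timed out. Under that assumption, a cleanup step placed in the getter fires too early.

```java
import java.util.HashSet;
import java.util.Set;

public class TimeoutGetterSketch {

    /** Hypothetical stand-in for a cluster state update task. */
    interface UpdateTask {
        void execute();
        long timeoutMillis();
    }

    /** Hypothetical stand-in for a task queue that reads the timeout on submission. */
    static void submit(UpdateTask task) {
        // Assumption: the timeout is read up front, e.g. to schedule a timeout check.
        task.timeoutMillis();
        // The task is only queued at this point; execute() has not run yet.
    }

    /** Returns whether the clone is still tracked as initializing after submission. */
    static boolean stillInitializingAfterSubmit(boolean getterHasSideEffect) {
        Set<String> initializingClones = new HashSet<>();
        initializingClones.add("clone-1");
        submit(new UpdateTask() {
            @Override
            public void execute() {
                // Intentionally empty: the sketch never gets as far as running the task.
            }

            @Override
            public long timeoutMillis() {
                if (getterHasSideEffect) {
                    // The pattern removed by this commit: cleanup inside a getter.
                    initializingClones.remove("clone-1");
                }
                return 30_000L;
            }
        });
        return initializingClones.contains("clone-1");
    }

    public static void main(String[] args) {
        System.out.println(stillInitializingAfterSubmit(true));  // prints "false"
        System.out.println(stillInitializingAfterSubmit(false)); // prints "true"
    }
}
```

With the side effect in the getter, the entry is gone as soon as the timeout is read, even though the task is still pending. Keeping the getter free of side effects, as in the diff above, leaves the tracking set untouched until real completion or failure handling runs.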