Remove global checkpoint assertion in index shard

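Because incoming global checkpoint updates can race with recovery stage transitions, the recovery-stage assertion in IndexShard can fail even when nothing is wrong. This commit removes that assertion and adjusts the explanatory comment.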
2025-03-24 17:09:48 +00:00 · 2017-05-04 10:29:00 -04:00 · 50b617f73a
commit 50b617f73a
parent 977016ba25
1 changed file with 10 additions and 10 deletions
--- a/core/src/main/java/org/elasticsearch/index/shard/IndexShard.java
+++ b/core/src/main/java/org/elasticsearch/index/shard/IndexShard.java
@@ -1523,20 +1523,20 @@ public class IndexShard extends AbstractIndexShardComponent implements IndicesCl
        verifyReplicationTarget();
        final SequenceNumbersService seqNoService = getEngine().seqNoService();
        final long localCheckpoint = seqNoService.getLocalCheckpoint();
-        if (globalCheckpoint <= localCheckpoint) {
-            seqNoService.updateGlobalCheckpointOnReplica(globalCheckpoint);
-        } else {
+        if (globalCheckpoint > localCheckpoint) {
            /*
             * This can happen during recovery when the shard has started its engine but recovery is not finalized and is receiving global
-             * checkpoint updates from in-flight operations. However, since this shard is not yet contributing to calculating the global
-             * checkpoint, it can be the case that the global checkpoint update from the primary is ahead of the local checkpoint on this
-             * shard. In this case, we ignore the global checkpoint update. This should only happen if we are in the translog stage of
-             * recovery. Prior to this, the engine is not opened and this shard will not receive global checkpoint updates, and after this
-             * the shard will be contributing to calculations of the the global checkpoint.
+             * checkpoint updates. However, since this shard is not yet contributing to calculating the global checkpoint, it can be the
+             * case that the global checkpoint update from the primary is ahead of the local checkpoint on this shard. In this case, we
+             * ignore the global checkpoint update. This can happen if we are in the translog stage of recovery. Prior to this, the engine
+             * is not opened and this shard will not receive global checkpoint updates, and after this the shard will be contributing to
+             * calculations of the global checkpoint. However, we cannot assert that we are in the translog stage of recovery here as
+             * while the global checkpoint update may have emanated from the primary when we were in that state, we could subsequently move
+             * to recovery finalization, or even finish recovery, before the update arrives here.
             */
-            assert recoveryState().getStage() == RecoveryState.Stage.TRANSLOG
-                    : "expected recovery stage [" + RecoveryState.Stage.TRANSLOG + "] but was [" + recoveryState().getStage() + "]";
+            return;
        }
+        seqNoService.updateGlobalCheckpointOnReplica(globalCheckpoint);
    }

    /**
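
The race described in the comment above can be illustrated with a small standalone sketch. It is not Elasticsearch code: the class ReplicaCheckpointSketch, its Stage enum, and all of its methods are hypothetical names invented for illustration, and keeping the global checkpoint monotonic with a max is an assumption of the sketch. It only shows why a stage check made when the update arrives can disagree with the stage the replica was in when the primary sent the update, and why ignoring the update is safe.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical, simplified stand-in for the replica-side logic in the diff above.
 * None of these names are Elasticsearch APIs.
 */
public class ReplicaCheckpointSketch {

    /** A reduced set of recovery stages, chosen only for this illustration. */
    public enum Stage { TRANSLOG, FINALIZE, DONE }

    private final AtomicLong localCheckpoint = new AtomicLong(-1);
    private final AtomicLong globalCheckpoint = new AtomicLong(-1);
    private volatile Stage stage = Stage.TRANSLOG;

    public void setLocalCheckpoint(long value) { localCheckpoint.set(value); }

    public void setStage(Stage value) { stage = value; }

    public long getGlobalCheckpoint() { return globalCheckpoint.get(); }

    /**
     * Post-commit shape: an update ahead of the local checkpoint is ignored,
     * with no assertion about the recovery stage. Returns true if applied.
     */
    public boolean updateGlobalCheckpoint(long update) {
        if (update > localCheckpoint.get()) {
            return false; // ignore; a later update will carry the value again
        }
        // Assumption of this sketch: never move the global checkpoint backwards.
        globalCheckpoint.accumulateAndGet(update, Math::max);
        return true;
    }

    /**
     * Pre-commit shape, expressed as a predicate: the removed assertion required
     * the stage to be TRANSLOG whenever the update was ahead of the local checkpoint.
     */
    public boolean oldAssertionWouldHold(long update) {
        return update <= localCheckpoint.get() || stage == Stage.TRANSLOG;
    }

    public static void main(String[] args) {
        ReplicaCheckpointSketch replica = new ReplicaCheckpointSketch();
        replica.setLocalCheckpoint(5);

        // The primary sends update 8 while the replica is still in TRANSLOG,
        // but the replica moves on to FINALIZE before the update is handled.
        long inFlightUpdate = 8;
        replica.setStage(Stage.FINALIZE);

        System.out.println("old assertion holds: " + replica.oldAssertionWouldHold(inFlightUpdate));
        System.out.println("update applied: " + replica.updateGlobalCheckpoint(inFlightUpdate));

        // Once the local checkpoint has caught up, the same value is accepted.
        replica.setLocalCheckpoint(10);
        System.out.println("update applied: " + replica.updateGlobalCheckpoint(inFlightUpdate));
        System.out.println("global checkpoint: " + replica.getGlobalCheckpoint());
    }
}
```

In the sketch, the stage is read when the update is handled rather than when it was sent, which is the reason the removed assertion could trip spuriously; returning early leaves the replica's state untouched.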