Roll translog generation on primary promotion

When a primary is promoted, rolling the translog generation makes it
simpler to reason about the relationship between primary terms and
translog generations. Note that this is not strictly necessary for
correctness (e.g., to avoid duplicate operations with the same sequence
number within a single generation).
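
To illustrate why the roll simplifies reasoning, here is a minimal toy
model. It is a sketch under stated assumptions, not the Elasticsearch
translog: ToyTranslog and all of its methods are hypothetical names
invented for this example, and a generation is reduced to the set of
primary terms of the operations written to it. With a roll at promotion
time, no generation mixes operations from the old and the new term.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only (hypothetical): not the Elasticsearch Translog API. */
final class ToyTranslog {
    /** One entry per generation: the primary terms of the operations written to it. */
    private final List<Set<Long>> termsPerGeneration = new ArrayList<>();

    ToyTranslog() {
        termsPerGeneration.add(new HashSet<>()); // generation 0
    }

    /** Index of the generation currently being written to. */
    int currentGeneration() {
        return termsPerGeneration.size() - 1;
    }

    /** Start a new generation; subsequent operations are written there. */
    void rollGeneration() {
        termsPerGeneration.add(new HashSet<>());
    }

    /** Append an operation carrying the given primary term to the current generation. */
    void add(long primaryTerm) {
        termsPerGeneration.get(currentGeneration()).add(primaryTerm);
    }

    /** True if no generation mixes operations from different primary terms. */
    boolean everyGenerationHasAtMostOneTerm() {
        for (Set<Long> terms : termsPerGeneration) {
            if (terms.size() > 1) {
                return false;
            }
        }
        return true;
    }

    /** Write two operations under term 1, optionally roll, then write one under term 2. */
    static ToyTranslog simulatePromotion(boolean rollOnPromotion) {
        ToyTranslog translog = new ToyTranslog();
        translog.add(1L);
        translog.add(1L);
        if (rollOnPromotion) {
            translog.rollGeneration(); // the step this commit adds, in toy form
        }
        translog.add(2L);
        return translog;
    }
}
```

In this model, simulatePromotion(true) ends on generation 1 with one
term per generation, while simulatePromotion(false) stays on generation
0, which then holds operations from both terms.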

Relates #27313
Jason Tedor 2017-11-08 09:14:08 -05:00 committed by GitHub
parent bd5e7002be
commit 927d7f6b6c
2 changed files with 50 additions and 1 deletion

@@ -472,8 +472,12 @@ public class IndexShard extends AbstractIndexShardComponent implements IndicesCl
      * subsequently fails before the primary/replica re-sync completes successfully and we are now being
      * promoted, the local checkpoint tracker here could be left in a state where it would re-issue sequence
      * numbers. To ensure that this is not the case, we restore the state of the local checkpoint tracker by
-     * replaying the translog and marking any operations there are completed.
+     * replaying the translog and marking any operations there are completed. Rolling the translog generation is
+     * not strictly needed here (as we will never have collisions between sequence numbers in a translog
+     * generation in a new primary as it takes the last known sequence number as a starting point), but it
+     * simplifies reasoning about the relationship between primary terms and translog generations.
      */
+    getEngine().rollTranslogGeneration();
     getEngine().restoreLocalCheckpointFromTranslog();
     getEngine().fillSeqNoGaps(newPrimaryTerm);
     getEngine().seqNoService().updateLocalCheckpointForShard(currentRouting.allocationId().getId(),

@@ -508,6 +508,51 @@ public class IndexShardTests extends IndexShardTestCase {
         closeShards(indexShard);
     }
+    public void testPrimaryPromotionRollsGeneration() throws Exception {
+        final IndexShard indexShard = newStartedShard(false);
+        final long currentTranslogGeneration = indexShard.getTranslog().getGeneration().translogFileGeneration;
+        // promote the replica
+        final ShardRouting replicaRouting = indexShard.routingEntry();
+        final ShardRouting primaryRouting =
+                newShardRouting(
+                        replicaRouting.shardId(),
+                        replicaRouting.currentNodeId(),
+                        null,
+                        true,
+                        ShardRoutingState.STARTED,
+                        replicaRouting.allocationId());
+        indexShard.updateShardState(primaryRouting, indexShard.getPrimaryTerm() + 1, (shard, listener) -> {},
+                0L, Collections.singleton(primaryRouting.allocationId().getId()),
+                new IndexShardRoutingTable.Builder(primaryRouting.shardId()).addShard(primaryRouting).build(), Collections.emptySet());
+        /*
+         * This operation completing means that the delay operation executed as part of increasing the primary term has completed and the
+         * gaps are filled.
+         */
+        final CountDownLatch latch = new CountDownLatch(1);
+        indexShard.acquirePrimaryOperationPermit(
+                new ActionListener<Releasable>() {
+                    @Override
+                    public void onResponse(Releasable releasable) {
+                        releasable.close();
+                        latch.countDown();
+                    }
+                    @Override
+                    public void onFailure(Exception e) {
+                        throw new RuntimeException(e);
+                    }
+                },
+                ThreadPool.Names.GENERIC);
+        latch.await();
+        assertThat(indexShard.getTranslog().getGeneration().translogFileGeneration, equalTo(currentTranslogGeneration + 1));
+        closeShards(indexShard);
+    }
     public void testOperationPermitsOnPrimaryShards() throws InterruptedException, ExecutionException, IOException {
         final ShardId shardId = new ShardId("test", "_na_", 0);
         final IndexShard indexShard;