OpenSearch/qa
Boaz Leskes 1ca0b5e9e4 Introduce a History UUID as a requirement for ops based recovery (#26577)
The new ops based recovery, introduced as part of #10708, is based on the assumption that all operations below the global checkpoint known to the replica do not need to be synced with the primary. This relies on the guarantee that all ops below it are available on the primary and are identical to those on the replica. Under normal operations this guarantee holds. Sadly, it can be violated when a primary is restored from an old snapshot. At that point the restored primary can miss operations below the replica's global checkpoint, or even worse may have totally different operations at the same spot. This PR introduces the notion of a history uuid to be able to capture the difference with the restored primary (in a follow up PR).

The History UUID is generated by a primary when it is first created and is synced to the replicas, which are recovered via a file based recovery. The PR adds a requirement to ops based recovery to make sure that the history uuid of the source and the target are equal. Under normal operations, all shard copies will keep that history uuid for the rest of the index's lifetime and thus this is a noop. However, it gives us a place to guarantee we fall back to file based syncing in special events like a restore from snapshot (to be done as a follow up) and when someone calls the truncate translog command, which can go wrong when combined with primary recovery (this is done in this PR).
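
The check described above can be illustrated with a minimal sketch. This is not the code from the PR: the class `HistoryUuidSketch`, the enum `RecoveryMode`, and the methods `newHistoryUuid` and `chooseRecovery` are hypothetical names chosen for illustration, and the `opsAvailableOnSource` flag stands in for whatever conditions ops based recovery already required before this change.

```java
import java.util.UUID;

// Hypothetical illustration only; names and structure are assumptions,
// not the actual recovery code touched by this PR.
class HistoryUuidSketch {

    enum RecoveryMode { OPS_BASED, FILE_BASED }

    // A primary generates a fresh history uuid when it is first created
    // (and, per the commit message, when the translog is truncated).
    static String newHistoryUuid() {
        return UUID.randomUUID().toString();
    }

    // Decide how to recover the target copy from the source copy.
    // opsAvailableOnSource is an assumed stand-in for the pre-existing
    // ops based recovery preconditions.
    static RecoveryMode chooseRecovery(String sourceHistoryUuid,
                                       String targetHistoryUuid,
                                       boolean opsAvailableOnSource) {
        // A target without a history uuid has never been synced: copy files.
        if (targetHistoryUuid == null) {
            return RecoveryMode.FILE_BASED;
        }
        // Different histories mean ops below the global checkpoint may differ,
        // so replaying operations is not safe: fall back to copying files.
        if (!targetHistoryUuid.equals(sourceHistoryUuid)) {
            return RecoveryMode.FILE_BASED;
        }
        return opsAvailableOnSource ? RecoveryMode.OPS_BASED : RecoveryMode.FILE_BASED;
    }
}
```

In this sketch, two copies that share a history uuid behave exactly as before (the check is a noop), while a source that has received a new history uuid forces the target onto a file based recovery, which in turn hands the target the new uuid.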

We previously considered using the translog uuid for this purpose (i.e., syncing it across copies) and thus avoiding an extra identifier. This idea was rejected because it removes the ability to verify that a specific translog really belongs to a specific Lucene index. We also feel that having a history uuid will serve us well in the future.
2017-09-14 21:25:02 +03:00
..
evil-tests Refactor bootstrap check results and error messages 2017-09-13 21:30:27 -04:00
full-cluster-restart Introduce a History UUID as a requirement for ops based recovery (#26577) 2017-09-14 21:25:02 +03:00
mixed-cluster Revert shading for the low level rest client (#26367) 2017-08-25 14:13:12 -05:00
multi-cluster-search Respect cluster alias in `_index` aggs and queries (#25885) 2017-07-26 09:16:52 +02:00
no-bootstrap-tests Fix permissions handling on Windows spawner test 2017-04-07 06:25:24 -04:00
query-builder-bwc Revert shading for the low level rest client (#26367) 2017-08-25 14:13:12 -05:00
reindex-from-old Revert shading for the low level rest client (#26367) 2017-08-25 14:13:12 -05:00
rolling-upgrade Update version to 7.0.0-alpha1 (#25876) 2017-08-01 15:47:48 -04:00
smoke-test-client Use nio transport in test clusters (#25986) 2017-08-01 16:19:31 -05:00
smoke-test-http Revert shading for the low level rest client (#26367) 2017-08-25 14:13:12 -05:00
smoke-test-ingest-disabled Tests: Change rest test extension from .yaml to .yml (#24659) 2017-05-16 17:24:35 -07:00
smoke-test-ingest-with-all-dependencies ScriptService: Replace max compilation per minute setting with max compilation rate (#26399) 2017-09-01 10:15:27 +02:00
smoke-test-multinode Tests: Change rest test extension from .yaml to .yml (#24659) 2017-05-16 17:24:35 -07:00
smoke-test-plugins Tests: Change rest test extension from .yaml to .yml (#24659) 2017-05-16 17:24:35 -07:00
smoke-test-reindex-with-all-modules ScriptService: Replace max compilation per minute setting with max compilation rate (#26399) 2017-09-01 10:15:27 +02:00
smoke-test-tribe-node Tests: Change rest test extension from .yaml to .yml (#24659) 2017-05-16 17:24:35 -07:00
vagrant [Docs] "The the" is a great band, but ... (#26644) 2017-09-14 15:08:20 +02:00
verify-version-constants Fix docs lucene version check error message 2017-06-26 15:45:13 -07:00
wildfly Revert shading for the low level rest client (#26367) 2017-08-25 14:13:12 -05:00