activemq-artemis/docs/user-manual/activation-tools.adoc
Justin Bertram 576622571a ARTEMIS-4559 refactor HA docs & code/module naming
This commit does the following:

 - Updates HA docs including the chapter on network isolation (i.e.
   split brain). The network isolation chapter is now more about
   high-level explanation and the HA doc now has all the configuration
   parameters.
 - Changes references to "pluggable quorum voting" to "pluggable lock
   manager." The pluggable functionality really isn't about voting.
   Conceptually it is much more like the functionality you'd get from a
   distributed lock so this naming is more clear. Both the docs and the
   code have been changed.
 - Reorganize lock manager modules as sub-modules. The API and RI
   modules are renamed, but that should be OK based on the
   "experimental" tag that's been on this feature up to this point.
 - Remove the "experimental" tag from the lock manager.

These changes will not break folks using the standalone broker. However,
they will break folks embedding the broker *if* they are using the
artemis-quorum-ri or artemis-quorum-api modules or the
o.a.a.a.c.c.h.DistributedPrimitiveManagerConfiguration class.

There are no functional changes here. Renaming these modules is more a
conceptual change to facilitate better documentation and increased
adoption.
2024-03-15 10:18:05 -04:00


= Activation Sequence Tools
:idprefix:
:idseparator: -

You can use the Artemis CLI to run activation sequence maintenance and recovery tools for xref:ha.adoc#replication[Replication] with the Pluggable Lock Manager.
The two main commands are `activation list` and `activation set`. Used together, they can recover from a disaster affecting the local or coordinated activation sequences.

Here is a disaster scenario built around the reference implementation (RI), which uses https://zookeeper.apache.org/[Apache ZooKeeper] and https://curator.apache.org/[Apache Curator], to demonstrate how to use these commands.

== ZooKeeper cluster disaster

A proper ZooKeeper cluster should use at least 3 nodes, but what happens if all of these nodes crash, losing the activation state information required to manage replication?

During the disaster (i.e. ZooKeeper nodes are no longer reachable) the following occurs:

* Active brokers shut down (and, if restarted, should hang waiting to reconnect to the ZooKeeper cluster)
* Passive brokers unpair and wait to reconnect to the ZooKeeper cluster

Necessary administrative action:

. Stop all brokers
. Restart ZooKeeper cluster
. Search for brokers with the highest local activation sequence for their `NodeID` by running this command from the `bin` folder of the broker:
+
[,bash]
----
$ ./artemis activation list --local
Local activation sequence for NodeID=7debb3d1-0d4b-11ec-9704-ae9213b68ac4: 1
----
. From the `bin` folder of a broker with the highest local activation sequence, force the coordinated activation sequence to that value:
+
[,bash]
----
# assuming 1 to be the highest local activation sequence obtained at the previous step
# for NodeID 7debb3d1-0d4b-11ec-9704-ae9213b68ac4
$ ./artemis activation set --remote --to 1
Forced coordinated activation sequence for NodeID=7debb3d1-0d4b-11ec-9704-ae9213b68ac4 from 0 to 1
----
. Restart all brokers. The previously active ones should be able to become active again.
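
When several brokers are involved, comparing the local activation sequences by hand is error-prone. The following is a minimal sketch, not part of the Artemis CLI, that assumes you have saved the output of `./artemis activation list --local` from each broker into its own file (the file names and the `highest_local_sequence` helper are hypothetical). It relies only on the output format shown in the example above and prints the file holding the highest sequence together with that sequence.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Artemis.
# Arguments: one file per broker, each containing the output of
#   ./artemis activation list --local
# Prints "<file> <sequence>" for the highest local activation sequence.
# On a tie, the first file listed wins. Exits non-zero if no line matches.
highest_local_sequence() {
  awk -F': ' '
    /^Local activation sequence for NodeID=/ {
      seq = $NF + 0                      # numeric value after the last ": "
      if (!found || seq > max) { max = seq; src = FILENAME; found = 1 }
    }
    END {
      if (!found) exit 1
      print src, max
    }
  ' "$@"
}

# Usage (file names are placeholders):
#   highest_local_sequence broker-a.txt broker-b.txt
```

The broker whose file is reported is the one to run `activation set --remote --to <sequence>` from.
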

The more ZooKeeper nodes there are, the less likely it is that a disaster like this requires administrative intervention, because a larger ZooKeeper cluster can tolerate more failures.
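
To make the relationship between cluster size and fault tolerance concrete: ZooKeeper stays available only while a majority of its nodes are up, so a cluster of `N` nodes can lose at most `(N - 1) / 2` of them (rounded down). The sketch below simply does that arithmetic; the `tolerated_failures` function name is illustrative and is not an Artemis or ZooKeeper command.

```bash
#!/usr/bin/env bash
# Illustrative arithmetic only: how many node failures a majority-quorum
# cluster of the given size can survive while remaining available.
tolerated_failures() {
  local nodes=$1
  echo $(( (nodes - 1) / 2 ))   # integer division rounds down
}

for n in 1 3 5 7; do
  echo "$n nodes: tolerates $(tolerated_failures "$n") failure(s)"
done
```

Note that an even-sized cluster tolerates no more failures than the next smaller odd size, which is why odd sizes are the usual choice.
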