This commit does the following:
- Updates HA docs including the chapter on network isolation (i.e.
split brain). The network isolation chapter is now more about
high-level explanation and the HA doc now has all the configuration
parameters.
- Changes references to "pluggable quorum voting" to "pluggable lock
manager." The pluggable functionality really isn't about voting.
Conceptually it is much more like the functionality you'd get from a
distributed lock, so this naming is clearer. Both the docs and the
code have been changed.
- Reorganizes the lock manager modules as sub-modules. The API and RI
modules are renamed, but that should be OK given the "experimental"
tag that's been on this feature up to this point.
- Removes the "experimental" tag from the lock manager.
These changes will not break folks using the standalone broker. However,
they will break folks embedding the broker *if* they are using the
artemis-quorum-ri or artemis-quorum-api modules or the
o.a.a.a.c.c.h.DistributedPrimitiveManagerConfiguration class.
There are no functional changes here. Renaming these modules is more of
a conceptual change to facilitate better documentation and increased
adoption.
Whenever we create a queue with a filter we're instantiating 3 different
`org.apache.activemq.artemis.core.filter.impl.FilterImpl` objects. This
is wasteful and entirely avoidable.
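As an illustration of the avoidable cost, here is a minimal sketch assuming the FilterImpl.createFilter factory from artemis-server; the helper methods are hypothetical stand-ins for the queue-creation steps. The point is to parse the filter string once and reuse the resulting Filter instance:

```java
import org.apache.activemq.artemis.api.core.ActiveMQException;
import org.apache.activemq.artemis.core.filter.Filter;
import org.apache.activemq.artemis.core.filter.impl.FilterImpl;

public class FilterReuseSketch {
   public static void main(String[] args) throws ActiveMQException {
      // Parse the filter string once...
      Filter filter = FilterImpl.createFilter("color = 'red'");

      // ...and hand the same instance to every step that needs it,
      // instead of calling createFilter three separate times.
      configureQueue(filter);
      configureBinding(filter);
   }

   // Hypothetical helpers standing in for the internal queue-creation steps.
   private static void configureQueue(Filter filter) { }
   private static void configureBinding(Filter filter) { }
}
```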
This fix was based on static analysis of the code and inspection of the
XA specification. There is no test associated with it due to the
difficulty of reproducing the failure. This code has been essentially
the same for a decade, and only now have there been any reports of it
actually sending back the wrong XA code.
Redistribution is not meant for internal queues; this is particularly true for the Mirrored SNF queue. If an internal queue happens to have the same name on another server, it should not trigger redistribution when consumers are removed.
It would be possible to work around this by adding an address-setting specific to the address with redistribution disabled, as sketched below.
ClusteredMirrorSoakTest was intermittently failing because of this. For a few seconds, while the mirror connection is still being made, redistribution could move messages from one node towards another if both have a queue with the same name.
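A hedged sketch of that address-setting workaround on an embedded broker; the address match "some.internal.address" is a placeholder, and a redistribution-delay of -1 means matching addresses are never redistributed:

```java
import org.apache.activemq.artemis.core.server.ActiveMQServer;
import org.apache.activemq.artemis.core.settings.impl.AddressSettings;

public class DisableRedistributionSketch {
   // "some.internal.address" is a placeholder match for the affected address.
   static void disableRedistribution(ActiveMQServer server) {
      server.getAddressSettingsRepository().addMatch(
         "some.internal.address",
         new AddressSettings().setRedistributionDelay(-1)); // -1 = never redistribute
   }
}
```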
Fix intermittent test failures in the pull consumer test by asserting
that the expected number of messages is on the queue before running the
JMS consume cycle that consumes credit and triggers federation credit to
flow.
Tests need to ensure federation links are up before sending to an
address, or the sent message can be discarded before the federation
consumer is there to receive it.
When federation is configured in two directions between nodes for an
address, a message can reflect from one node back to the other if max
hops is not set or is set incorrectly, and in some federation topologies
no max hops value can solve the issue while still resulting in a working
configuration. This reflection should be prevented at the federation
consumer level for address consumers.
These tests use an asynchronous feature, so the check on the log has to
use Wait.assertEquals. I had to make a few changes to the methods to
allow the use of Wait.
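A minimal sketch of the pattern, assuming the test-support Wait utility (org.apache.activemq.artemis.utils.Wait); the counter source and getMessageCount() accessor are illustrative stand-ins for whatever asynchronously updated state the test observes:

```java
import org.apache.activemq.artemis.utils.Wait;

public class AsyncAssertSketch {
   // Illustrative stand-in for the asynchronously updated state under test.
   interface CountSource {
      long getMessageCount();
   }

   static void assertCountEventuallyReaches(long expected, CountSource source) throws Exception {
      // A plain assertEquals would race with the asynchronous update;
      // Wait.assertEquals re-evaluates the condition until it matches or times out.
      Wait.assertEquals(expected, () -> source.getMessageCount());
   }
}
```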
Generate MQTT message IDs from the full allowed range of 1-65535 and
skip currently used values. Do not use an atomic integer for the current
ID, because all accesses and modifications are performed in a
synchronized context.
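A simplified, hypothetical sketch of that allocation scheme (not the actual MQTT session state code): IDs wrap within 1-65535, values still in use are skipped, and a plain int suffices because every access happens under the same lock:

```java
import java.util.HashSet;
import java.util.Set;

public class MessageIdAllocatorSketch {
   private static final int MAX_ID = 65535;

   private final Set<Integer> inUse = new HashSet<>();
   private int currentId = 0; // plain int: all access happens under "this" lock

   // Returns the next free ID in 1..65535, skipping IDs that are still in use.
   public synchronized int nextId() {
      for (int i = 0; i < MAX_ID; i++) {
         currentId = currentId % MAX_ID + 1; // wraps 65535 -> 1, never yields 0
         if (inUse.add(currentId)) {
            return currentId;
         }
      }
      throw new IllegalStateException("all 65535 message IDs are in use");
   }

   // Called when an ID is acknowledged/released and may be reused.
   public synchronized void release(int id) {
      inUse.remove(id);
   }
}
```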
Little tool showing the current folders on the CLI. Useful to figure out where you are when using the shell.
I find myself typing pwd all the time when I am using the CLI, trying to figure out what my current location is and which broker is being used.
This would be useful to make sure you are going to use the right broker when multiple CLI instances are running from multiple terminals.
When Queue consumers attach with filters, use those instead of the Queue
filter to filter the messages that are federated, to avoid stranding
messages on the local broker. This will result in multiple federation
consumers if the various attached local consumers all use different
filters, but it keeps unwanted messages on the remote so that consumers
there can consume them.
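For illustration, a hedged JMS sketch of the scenario (broker URL and queue name are placeholders): two local consumers attach with different selectors, which with this change yields two filtered federation consumers upstream rather than one unfiltered one:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

public class FilteredConsumersSketch {
   public static void main(String[] args) throws Exception {
      // Connects to the local (downstream) broker of a federated pair.
      ConnectionFactory cf = new ActiveMQConnectionFactory("tcp://localhost:61616");
      try (Connection connection = cf.createConnection()) {
         Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
         Queue queue = session.createQueue("orders");

         // Two local consumers with different selectors. The federation consumers
         // created on the remote broker now carry these selectors, so messages
         // matching neither selector stay on the remote instead of being pulled
         // over and stranded locally.
         MessageConsumer red = session.createConsumer(queue, "color = 'red'");
         MessageConsumer blue = session.createConsumer(queue, "color = 'blue'");

         connection.start();
         // ... consume from red/blue ...
      }
   }
}
```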
Under some scenarios federation demand tracking loses track of the total
demand for a federated resource, leading to teardown of federated links
before all local demand has been removed from the resource. This occurs
most often when attempts to establish a federation link are refused
because the resource hasn't yet been created and an eventual attach
succeeds, but it can also occur in combination with a plugin blocking or
not blocking federation link creation in some cases.
Large message support was added to
o.a.a.a.c.s.f.FederatedQueueConsumerImpl#onMessage via cf85d35 for
ARTEMIS-3308. The problem with that change is that when onMessage
returns, o.a.a.a.c.c.i.ClientConsumerImpl#callOnMessage will eventually
call o.a.a.a.c.c.i.ClientLargeMessageImpl#discardBody, which eventually
ends up in o.a.a.a.c.c.i.LargeMessageControllerImpl#popPacket waiting 30
seconds (i.e. the default readTimeout) for more packets to arrive (which
never do). This happens because the FederatedQueueConsumer short-cuts
the "normal" process by using LargeMessageControllerImpl#take.
This commit fixes that by tracking the number of bytes "taken" and then
looking at that value later when discarding the body, effectively
skipping the 30-second wait.
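A greatly simplified, hypothetical sketch of the idea behind the fix (not the actual LargeMessageControllerImpl code): remember how many bytes were already taken so the discard path can return immediately instead of waiting out the read timeout:

```java
// Hypothetical sketch only; class and method names do not mirror the real code.
public class LargeBodyDiscardSketch {
   private final long totalBodySize;
   private long bytesTaken;

   public LargeBodyDiscardSketch(long totalBodySize) {
      this.totalBodySize = totalBodySize;
   }

   // Called when the federation consumer pulls a chunk directly (the "take" path).
   public synchronized void recordTaken(int chunkLength) {
      bytesTaken += chunkLength;
      notifyAll();
   }

   // Called from the discard path. If the whole body was already taken there is
   // nothing left to wait for, so the (default 30 second) readTimeout is skipped.
   public synchronized void discardBody(long readTimeoutMillis) throws InterruptedException {
      long deadline = System.currentTimeMillis() + readTimeoutMillis;
      while (bytesTaken < totalBodySize) {
         long remaining = deadline - System.currentTimeMillis();
         if (remaining <= 0) {
            break; // timed out waiting for packets
         }
         wait(remaining);
      }
   }
}
```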