In some setups, there could be a few hundred thousand queues that are
created due to many consumers that are connecting. However, most of
these are empty and stay empty for the entire day since there aren't
necessarily messages to be sent. The 8K intermediateMessageReferences
instantiates an 64KB buffer (Object[]). This means we have large
allocation and live heap that ultimately remains empty for almost the
entire day.
In this commit, we introduce initial-queue-buffer-size, which defaults
to the current value of 8192. It can be set programmatically via
QueueConfiguration#setInitialQueueBufferSize(int).
Note that this must be a positive power of 2.
If, for whatever reason, the response for a packet sent with blocking
semantics is never returned it's possible that an async response
received in the interventing time will be interpreted as the current
response. This is because ChannelImpl does not verify the correlation
ID set on the response packet when it is received.
Files will leak on the target and messages will not be received after failover.
Also messages will be written to wrong destinations on the replica. The leak is actually between destinations, and the consequence is the file leak.
I usually keep test and fix on the same commit, but in this case I have been heavily validating the server with and without the fix,
so I will open an exception in this case and keep the fix and test separated.
When a bridge is stopped it doesn't wait for pending send
acknowledgements to arrive. However, when a bridge is paused it does
wait. The behavior should be consistent and more importantly
configurable. This commit implements these improvements and generally
refactors BridgeImpl to clarify and simplify the code. In total, this
commit includes the follow changes:
- Removes the hard-coded 60-second timeout for pending acks when
pausing the bridge and adds a new config parameter (i.e.
"pending-ack-timeout").
- Applies the new pending-ack-timeout when the bridge is stopped.
- Updates existing and adds new logging messages for clarity.
- De-duplicates code for sending bridge-related notifications.
- Avoids converting bridge name to/from SimpleString.
- Removes unnecessary comments.
- Renames variables & functions for clarity.
- Replaces the `started`, `stopping`, & `active` booleans with a
single `state` variable which is an enum.
- Adds `final` to a few variables that were functionally final.
- Synchronizes `stop` & `pause` methods to add safety when invoked
concurrently with `handle` (since both deal with `state` and execute
runnables on the ordered executor).
- Reorganizes and removes a few methods for clarity.
- Relocates `connect` method directly into `ConnectRunnable` (mirroring
the structure of the `StopRunnable` and `PauseRunnable`).
- Eliminates unnecessary variables in `ConnectRunnable` and
`ScheduledConnectRunnable`.
- Adds test to verify pending ack timeout works as expected with both
`stop` & `pause` with both regular and large messages.
Updates artemis-log-annotation-processor to use artemis-project so that
artemis-pom can reference artemis-log-annotation-processor without cycle.
Split out its tests to their own module to faciltate, also exercising the
profile mechanism to enable the processor usage with trigger file.
Simplify disabling processing in the module using maven.compiler.proc prop
available since maven-compiler-plugin 3.13.0
Uses a dummy non-processor path at root to 'disable' processsing on JDK < 23,
accounting for Maven 3 not being able to unset maven.compiler.proc from a
parent, and JDKs < 21 requiring newest builds to support -proc:full value
needed otherwise to reenable processing once explicitly disabled.
This commit uses lambdas or method references wherever possible. There
are still a handful of places that appear like they could be changed but
couldn't mainly because they use "this" and the meaning of "this"
changes when using a lambda.
This commit does the following:
- deprecate all QueueConfiguration ctors
- add `of` static factory methods for all the deprecated ctors
- replace any uses of the normal ctors with the `of` counterparts
This makes the code more concise and readable.
This commit does the following:
- deprecate the verbosely named `toSimpleString` static factory
methods
- add `of` static factory methods for all the ctors
- replace any uses of the normal ctors with the `of` counterparts
This makes the code more concise and readable.
This commit includes the following changes:
- Management operations to get sucess & failure counts for authn and
authz along with the corresponding audit logging.
- Export the aforementioned authn & authz metrics.
- Export metrics for the underlying authn & authz caches including the
ability to enable/disable them.
- Update metrics tests to validate tags in addition to keys and values.
- Update documentation to explain new functionality and clarify
existing metric tags.
This is a list of improvements done as part of this commit / task:
* Page Transactions on mirror target are now optional.
If you had an interrupt mirror while the target destination was paging, duplicate detection would be ineffective unless you used paged transactions
Users can now configure the ack manager retries intervals.
Say you need some time to remove a consumer from a target mirror. The delivering references would prevent acks from happening. You can allow bigger retry intervals and number of retries by tinkiering with ack manager retry parameters.
* AckManager restarted independent of incoming acks
The ackManager was only restarted when new acks were coming in. If you stopped receiving acks on a target server and restarted that server with pending acks, those acks would never be exercised. The AckManager is now restarted as soon as the server is started.
This commit does the following:
- Updates HA docs including the chapter on network isolation (i.e.
split brain). The network isolation chapter is now more about
high-level explanation and the HA doc now has all the configuration
parameters.
- Changes references to "pluggable quorum voting" to "pluggable lock
manager." The pluggable functionality really isn't about voting.
Conceptually is much more like the functionality you'd get from a
distributed lock so this naming is more clear. Both the docs and the
code have been changed.
- Reorganize lock manager modules as sub-modules. The API and RI
modules are renamed, but that should be OK based on the
"experimental" tag that's been on this feature up to this point.
- Remove the "experimental" tag from the lock manager.
These changes will not break folks using the standalone broker. However,
they will break folks embedding the broker *if* they are using the
artemis-quorum-ri or artemis-quorum-api modules or the
o.a.a.a.c.c.h.DistributedPrimitiveManagerConfiguration class.
There are no functional changes here. Renaming these modules is more a
conceptual change to facilitate better documentation and increased
adoption.
This fix was based on static analysis of the code and inspection of the
XA specification. There is no test associated with it due to the
difficult nature in reproducing the failure. This code has been
essentially the same for a decade and only now have there been any
reports of it actually sending back the wrong XA code.
Large message support was added to
o.a.a.a.c.s.f.FederatedQueueConsumerImpl#onMessage via cf85d35 for
ARTEMIS-3308. The problem with that change is that when onMessage
returns o.a.a.a.c.c.i.ClientConsumerImpl#callOnMessage will eventually
call o.a.a.a.c.c.i.ClientLargeMessageImpl#discardBody which eventually
ends up in o.a.a.a.c.c.i.LargeMessageControllerImpl#popPacket waiting 30
seconds (i.e. the default readTimeout) for more packets to arrive (which
never do). This happens because the FederatedQueueConsumer short-cuts
the "normal" process by using LargeMessageControllerImpl#take.
This commit fixes that by tracking the number of bytes "taken" and then
looking at that value later when discarding the body effectively
skipping the 30 second wait.