As a follow-up to #3618/dc7de893747b90b627d729f9f18a758bb4dad9d5 update
checkstyle to the latest version, restoring the originally intended
"RightCurly" style, and updating all the code to properly adhere to the
style as enforced by the new checkstyle version.
The version of checkstyle we used before the aforementioned commit had
a bug which didn't properly enforced our intended "RightCurly" style
(see https://github.com/checkstyle/checkstyle/issues/6345). That commit
changed the style to accommodate the handful of unintended style
violations. This commit reverts that change for 2 main reasons:
- The style was always intended to use `alone` for both `METHOD_DEF`
and `CTOR_DEF`.
- There are over 1,000 existing uses of the intended style and around
30 violations of this style which were unintentionally allowed.
Reverting the style back to the original and cleaning up the unintented
violations makes the code more consistent and prevents further style
inconsistencies in the future.
There were a handful of other changes related to checkstyle bugs which
allowed unintended style violations. These were related to indentation
levels.
This closes#3619
(with some minor changes from Robbie to fix remaining violations)
This is a Large commit where I am refactoring largeMessage Body out of CoreMessage
which is now reused with AMQP.
I had also to fix Reference Counting to fix how Large Messages are Acked
And I also had to make sure Large Messages are transversing correctly when in cluster.
If a broker loses its file lock on the journal and doesn't notice (e.g.
network connection failure to an NFS mount) then it can continue to run
after its backup activates resulting in split-brain.
This commit implements periodic journal lock evaluation so that if a live
server loses its lock it will automatically restart itself.
The parameter failoverOnInitialConnection wouldn't seem to be used and
makes no sense any more, because the connectors are retried in a loop.
So someone can just add the backup in the initial connection.
Any failure to deploy an address or queue will short-circuit the broker
initialization process preventing any other addresses or queues from
being deployed as well as other critical resources like acceptors, etc.
The ServerSessionPacketHandler has a close() callback handler which will
delete any pending large messages. However, there is a race where a
large message can be routed, then the close delete the associated large
message resulting in data loss.
Quorum voting is used by both the live and the backup to decide what to do if a replication connection is disconnected.
Basically, the server will request each live server in the cluster to vote as to whether it thinks the server it is replicating to or from is still alive.
You can also configure the time for which the quorum manager will wait for the quorum vote response.
Currently, the value is hardcoded as 30 sec. We should change this 30-second wait to be configurable.
- Added Wait.assert methods, what would make it easier to assert on future conditions
- Moved Wait to artemis-junit, we are now using that module on the testsuite
The Critical analyzer is supposed to catch Deadlocks, and such
can only be resolved by killing the VM.
This test is using shutdown as it would be a bit more complex to
handle Byteman to then halt the VM. This will validate the
CriticalAnalyzer is capturing the event, and in such it would halt the VM.
When calling a consumer to receive message with a timeout
(receive(long timeout), if the consumer's buffer is empty, it sends
a 'forced delivery' the waiting forever, expecting the server to
send back a 'forced delivery" message if the queue is empty.
If the connection is disconnected as the arrived 'forced
delivery' message is corrupted, this 'forced delivery' message
never gets to consumer. After the session is reconnected,
the consumer never knows that and stays waiting.
To fix that we can send a 'forced delivery' to server right
after the session is reconnected.
The MAPPED journal refactoring include:
- simplified lifecycle and logic (eg fixed file size with single mmap memory region)
- supports for the TimedBuffer to coalesce msyncs (via Decorator pattern)
- TLAB pooling of direct ByteBuffer like the NIO journal
- remove of old benchmarks and benchmark dependencies
This method name would clash with ServiceComponent
As the real meaning here on this method is just to failover
So I've renamed the method to avoid the clash with my next commit
(I've done this on a separate commit as you may need to redo this
commit from scratch again in other branches instead of lots of clashes on cherry-pick)
When a large message is being diverted, a new copy of the original
message is created and replicated (if there is a backup) to the backup.
In LargeServerMessageImpl.copy(long) it reuse a byte array to copy
message body. It is possible that one block of date is read into
the byte array before the previous read has been replicated,
causing the replicated bytes to corrupt.
If we make a copy of the byte array before replication, the corruption
of data will be avoided.
When a broken packet arrives at client side it causes decoding error.
Currently artemis doesn't handle it properly. It should catch such
errors and disconnect the underlying connection, logging a proper
warning message
In a cluster if a node is shut down (or crashed) when a
message is being routed to a remote binding, a internal
property may be added to the message and persisted. The
name of the property is like _AMQ_ROUTE_TOsf.my-cluster*.
if the node starts back, it will load and reroute this message
and if it goes to a local consumer, this property won't
get removed and goes to the client.
The fix is to remove this internal property before it
is sent to any client.