Any failure to deploy an address or queue will short-circuit the broker
initialization process, preventing not only the remaining addresses and
queues from being deployed but also other critical resources like acceptors.
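A minimal sketch of one way to isolate such failures (helper and type names
are assumed, not the actual broker code): catch the failure per resource,
log it, and keep deploying the rest.

```java
// Sketch (hypothetical helper names): deploy each address independently so a
// single failure cannot short-circuit the remaining addresses, queues, and
// acceptors during broker initialization.
void deployAddresses(java.util.List<CoreAddressConfiguration> configs) {
   for (CoreAddressConfiguration config : configs) {
      try {
         deployAddress(config);   // assumed per-address deploy helper
      } catch (Exception e) {
         logger.warn("Unable to deploy address " + config.getName(), e);
         // continue with the next address instead of aborting initialization
      }
   }
}
```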
The ServerSessionPacketHandler has a close() callback handler which deletes
any pending large messages. However, there is a race where a large message
can be routed and then the close() callback deletes the associated large
message, resulting in data loss.
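One way to close such a race, sketched here with assumed names (not the
actual fix): routing and cleanup synchronize on the same pending-message map,
and routed messages are removed from it so close() never deletes them.

```java
// Sketch (names assumed): a message is either routed or cleaned up, never both.
private final java.util.Map<Long, LargeServerMessage> pendingLargeMessages =
   new java.util.HashMap<>();

void route(long messageId) throws Exception {
   LargeServerMessage message;
   synchronized (pendingLargeMessages) {
      message = pendingLargeMessages.remove(messageId);
   }
   if (message != null) {
      doRoute(message);   // assumed routing helper
   }
}

void close() {
   synchronized (pendingLargeMessages) {
      for (LargeServerMessage message : pendingLargeMessages.values()) {
         try {
            message.deleteFile();   // only messages that were never routed
         } catch (Exception e) {
            logger.warn("Could not delete pending large message", e);
         }
      }
      pendingLargeMessages.clear();
   }
}
```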
Quorum voting is used by both the live and the backup to decide what to do if a replication connection is disconnected.
Essentially, the server asks each live server in the cluster to vote on whether it believes the server it is replicating to or from is still alive.
You can also configure how long the quorum manager waits for the quorum vote response.
Currently the value is hardcoded at 30 seconds. We should make this 30-second wait configurable.
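A minimal sketch of what the configurable wait could look like (the setting
name is hypothetical; the 30-second default preserves current behavior):

```java
// Sketch (names assumed): replace the hardcoded 30-second wait with a
// configurable value, defaulting to the old behavior.
public class QuorumVoteConfig {
   public static final int DEFAULT_QUORUM_VOTE_WAIT_SECONDS = 30;

   private int quorumVoteWaitSeconds = DEFAULT_QUORUM_VOTE_WAIT_SECONDS;

   public int getQuorumVoteWaitSeconds() {
      return quorumVoteWaitSeconds;
   }

   public void setQuorumVoteWaitSeconds(int seconds) {
      this.quorumVoteWaitSeconds = seconds;
   }
}

// ...and in the quorum manager, wait for the configured time instead of 30:
// voteLatch.await(config.getQuorumVoteWaitSeconds(), TimeUnit.SECONDS);
```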
- Added Wait.assert methods, which make it easier to assert on future conditions (usage sketched below)
- Moved Wait to artemis-junit; the testsuite now uses that module
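For illustration, assuming the Wait API exposes assert methods roughly like
these (exact package and signatures assumed), a test can assert on a
condition that only becomes true later instead of sleeping:

```java
import org.apache.activemq.artemis.utils.Wait;   // package assumed

// Sketch: fail the test only if the condition never holds within the
// default timeout, polling instead of using fixed Thread.sleep calls.
Wait.assertTrue(() -> queue.getMessageCount() == 0);
Wait.assertEquals(10, () -> server.getConnectionCount());
```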
The CriticalAnalyzer is supposed to catch deadlocks, which can only be
resolved by killing the VM.
This test uses shutdown, as it would be a bit more complex to have Byteman
halt the VM. This validates that the CriticalAnalyzer is capturing the
event and that, in that case, it would halt the VM.
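As a rough illustration of the mechanism (a simplified sketch, not the
actual CriticalAnalyzer code): a watchdog that halts the VM when a monitored
component stays stuck past a timeout, since a deadlocked broker cannot
recover on its own.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Simplified illustration: if the monitored component does not report
// progress within the timeout, halt the VM.
public class DeadlockWatchdog {
   private volatile long lastProgress = System.nanoTime();
   private final long timeoutNanos = TimeUnit.SECONDS.toNanos(120);
   private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

   public void reportProgress() {
      lastProgress = System.nanoTime();
   }

   public void start() {
      scheduler.scheduleAtFixedRate(() -> {
         if (System.nanoTime() - lastProgress > timeoutNanos) {
            // halt() skips shutdown hooks; a deadlock cannot be unwound.
            Runtime.getRuntime().halt(70);
         }
      }, 1, 1, TimeUnit.SECONDS);
   }
}
```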
When calling a consumer to receive a message with a timeout
(receive(long timeout)), if the consumer's buffer is empty, it sends
a 'forced delivery' request and then waits, expecting the server to
send back a 'forced delivery' message if the queue is empty.
If the connection is disconnected because the arriving 'forced
delivery' message is corrupted, this 'forced delivery' message
never reaches the consumer. After the session is reconnected,
the consumer never learns of this and keeps waiting.
To fix that, we can send a 'forced delivery' request to the server
right after the session is reconnected.
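A sketch of the fix, with method names assumed from the description: after
the session reconnects, any consumer still blocked in receive re-issues its
forced-delivery request so the server answers again even though the original
response was lost.

```java
// Sketch (names assumed): re-send the forced-delivery request on reconnect
// for every consumer that is still blocked in receive(timeout).
void onSessionReconnected() {
   for (ClientConsumerInternal consumer : consumers) {
      if (consumer.isWaitingForForcedDelivery()) {   // hypothetical flag
         consumer.sendForcedDelivery();              // hypothetical re-request
      }
   }
}
```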
The MAPPED journal refactoring includes:
- simplified lifecycle and logic (e.g. fixed file size with a single mmap memory region)
- support for the TimedBuffer to coalesce msync calls (via the Decorator pattern)
- TLAB-style pooling of direct ByteBuffers, like the NIO journal (see the sketch after this list)
- removal of old benchmarks and benchmark dependencies
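As an illustration of the TLAB-style pooling mentioned above (a simplified
sketch, not the journal code): each thread keeps its own reusable direct
ByteBuffer, so hot paths avoid repeated direct allocations.

```java
import java.nio.ByteBuffer;

// Simplified sketch of thread-local (TLAB-like) pooling of direct
// ByteBuffers: each thread reuses its own buffer, growing it only when a
// larger capacity is requested.
public final class ThreadLocalByteBufferPool {
   private static final ThreadLocal<ByteBuffer> POOL =
      ThreadLocal.withInitial(() -> ByteBuffer.allocateDirect(8 * 1024));

   public static ByteBuffer borrow(int requiredCapacity) {
      ByteBuffer buffer = POOL.get();
      if (buffer.capacity() < requiredCapacity) {
         buffer = ByteBuffer.allocateDirect(requiredCapacity);
         POOL.set(buffer);
      }
      buffer.clear();
      return buffer;
   }
}
```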
This method name would clash with ServiceComponent.
As the real meaning of this method is just to fail over,
I've renamed it to avoid the clash with my next commit.
(I've done this in a separate commit as you may need to redo this
commit from scratch in other branches, instead of getting lots of clashes on cherry-pick.)
When a large message is being diverted, a new copy of the original
message is created and replicated (if there is a backup) to the backup.
In LargeServerMessageImpl.copy(long) a byte array is reused to copy the
message body. It is possible for one block of data to be read into
the byte array before the previous read has been replicated,
corrupting the replicated bytes.
If we make a copy of the byte array before replication, the data
corruption is avoided.
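The fix can be as simple as this sketch (surrounding names assumed): hand
the replicator a defensive copy of each block so the read loop can safely
reuse its buffer.

```java
import java.util.Arrays;

// Sketch (surrounding names assumed): the read loop reuses `buffer`, so
// replication must get its own copy of each block; otherwise the next read
// can overwrite bytes that have not been replicated yet.
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead;
while ((bytesRead = bodyFile.read(buffer)) > 0) {
   byte[] block = Arrays.copyOf(buffer, bytesRead);   // defensive copy
   replicate(block);                                  // assumed async replication
}
```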
When a broken packet arrives at the client side it causes a decoding error.
Currently Artemis doesn't handle this properly. It should catch such
errors and disconnect the underlying connection, logging a proper
warning message.
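A sketch of the suggested handling (names assumed): wrap packet decoding,
log a warning, and tear down the connection instead of letting the error
propagate.

```java
// Sketch (names assumed): decoding failures on the client should not be
// swallowed; log a warning and destroy the connection so the session can
// recover through its normal reconnect path.
void bufferReceived(ActiveMQBuffer buffer) {
   try {
      Packet packet = decoder.decode(buffer);
      handlePacket(packet);
   } catch (Exception e) {
      logger.warn("Corrupted packet received, destroying connection", e);
      connection.destroy();   // disconnect the underlying connection
   }
}
```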
In a cluster, if a node is shut down (or crashes) while a
message is being routed to a remote binding, an internal
property may be added to the message and persisted. The
name of the property looks like _AMQ_ROUTE_TOsf.my-cluster*.
If the node starts back up, it will load and reroute this message,
and if the message goes to a local consumer, this property won't
get removed and is delivered to the client.
The fix is to remove this internal property before the message
is sent to any client.
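A sketch of the fix (helper name assumed): strip any internal routing
properties from the message before it is written to the client.

```java
import java.util.ArrayList;
import org.apache.activemq.artemis.api.core.Message;
import org.apache.activemq.artemis.api.core.SimpleString;

// Sketch (helper name assumed): before delivering to a client, remove any
// internal cluster-routing properties such as _AMQ_ROUTE_TOsf.<cluster-name>.
void stripInternalProperties(Message message) {
   for (SimpleString name : new ArrayList<>(message.getPropertyNames())) {
      if (name.startsWith(SimpleString.toSimpleString("_AMQ_ROUTE_TO"))) {
         message.removeProperty(name);
      }
   }
}
```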
If the broker fails to decode any packet from the buffer, it should
treat it as a critical bug and disconnect immediately.
Currently the broker only logs an error message.