When a large message is replicated to backup, a pendingID is generated
when the large message is finished. This pendingID is generated by a
BatchingIDGenerator at backup.
It is possible that a pendingID generated at backup may be a duplicate
to an ID generated at live server.
This can cause a problem when a large message with a messageID that is
the same as another largemessage's pendingID is replicated and stored
in the backup's journal, and then a deleteRecord for the pendingID
is appended. If backup becomes live and loads the journal, it will
drop the large message add record because there is a deleteRecord of
the same ID (even though it is a pendingID of another message).
As a result the expecting client will never get this large message.
So in summary, the root cause is that the pendingIDs for large
messages are generated at backup while backup is not alive.
The solution to this is that instead of the backup generating
the pendingID, we make them all be generated in advance
at live server and let them replicated to backup whereever needed.
The ID generater at backup only works when backup becomes live
(when it is properly initialized from journal).
(cherry picked from commit d50f577cd5)
When a large message is being diverted, a new copy of the original
message is created and replicated (if there is a backup) to the backup.
In LargeServerMessageImpl.copy(long) it reuse a byte array to copy
message body. It is possible that one block of date is read into
the byte array before the previous read has been replicated,
causing the replicated bytes to corrupt.
If we make a copy of the byte array before replication, the corruption
of data will be avoided.
(cherry picked from commit 045021f7df)
If replication blocked anything on the journal
the processing from clients would be blocked
and nothing would work.
As part of this fix I am using an executor on ServerSessionPacketHandler
which will also scale better as the reader from Netty would be feed immediately.
Before sending of messages to server 0 begins, the test
should wait until consumer is registered at RemoteQueueBindingImpl
on server 0. Otherwise some messages may not be rebalanced
to server 1.
(cherry picked from commit c5a25d3322)
- NIO/ASYNCIO new TimedBuffer with adapting batch window heuristic
- NIO/ASYNCIO improved TimedBuffer write monitoring with
lightweight concurrent performance counters
- NIO/ASYNCIO journal/paging operations benefit from less buffer copy
- NIO/ASYNCIO any buffer copy is always performed with raw batch copy
using SIMD instrinsics (System::arrayCopy) or memcpy under the hood
- NIO improved clear buffers using SIMD instrinsics (Arrays::fill) and/or memset
- NIO journal operation perform by default TLABs allocation pooling (off heap)
retaining only the last max sized buffer
- NIO improved file copy operations using zero-copy FileChannel::transfertTo
- NIO improved zeroing using pooled single OS page buffer to clean the file
+ pwrite (on Linux)
- NIO deterministic release of unpooled direct buffers to avoid OOM errors
due to slow GC
- Exposed OS PAGE SIZE value using Env class
(cherry picked from commit 21c9ed85cf)
Added a wait-for-activation option to shared-store master HA policies.
This option is enabled by default to ensure unchanged server startup behavior.
If this option is enabled, ActiveMQServer.start() with a shared-store master server will not return
before the server has been activated.
If this options is disabled, start() will return after a background activation thread has been started.
The caller can use waitForActivation() to wait until server is activated, or just check the current activation status.
(cherry picked from commit 6017e305d9)
(cherry picked from commit 7aa50546b3)
This is fixing an issue introduced on 4b47461f03 (ARTEMIS-822)
The Transactions were being looked up without the readLock and some of the controls for Read and Write lock
were broken after this.
(cherry picked from commit ddacda5062)
AIOFileLockManager doesn't work on NFS-mounted share store directories.
Since the GFS2 bug https://bugzilla.redhat.com/show_bug.cgi?id=678585
has been fixed end of 2011, the class AIOFileLockManager is no longer needed and I have removed it.
(cherry picked from commit 557f02ba4d)