This is a large commit where I am refactoring the large message body out of CoreMessage,
which is now reused with AMQP.
I also had to fix reference counting to fix how large messages are acked,
and I had to make sure large messages are traversing the cluster correctly.
It uses RandomAccessFile to allow using heap buffers without the additional
copies and/or leaks of direct buffers performed by the JDK's FileChannel
implementation (see https://bugs.openjdk.java.net/browse/JDK-8147468).
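A minimal sketch of the idea, with illustrative names: writing the backing array of a heap ByteBuffer through RandomAccessFile avoids the temporary direct buffer that FileChannel allocates internally for heap buffers.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

// Sketch only: write a heap ByteBuffer via RandomAccessFile so no hidden
// direct-buffer copy is made (the FileChannel path would allocate one).
public final class HeapBufferWriteSketch {

   public static void write(RandomAccessFile file, ByteBuffer heapBuffer) throws IOException {
      assert heapBuffer.hasArray();
      file.write(heapBuffer.array(),
                 heapBuffer.arrayOffset() + heapBuffer.position(),
                 heapBuffer.remaining());
      heapBuffer.position(heapBuffer.limit());
   }
}
```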
Add a max record size check before adding a record, to prevent the broker
from shutting down when a message is sent with one really large header.
Add a message size check before allocating large message resources,
so the message is rejected if it can't be stored.
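A minimal sketch of the kind of up-front check described here; the names maxRecordSize and recordSize are illustrative, not the actual Artemis API.

```java
// Reject an oversized record before it reaches the journal, instead of letting
// the append fail in a way that shuts the broker down.
final class RecordSizeCheckSketch {

   static void checkRecordSize(int maxRecordSize, int recordSize) {
      if (recordSize > maxRecordSize) {
         throw new IllegalArgumentException(
            "Record size " + recordSize + " exceeds the maximum of " + maxRecordSize);
      }
   }
}
```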
Refactored thread-local ByteBuffer pooling, alignment
and zeroing in order to avoid duplicate code and
improve code coverage with tests.
In addition, faster branchless alignment operations and
optional zeroing of pooled ByteBuffers are provided for
both the ASYNCIO and NIO/MAPPED journal types.
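The branchless alignment mentioned above typically looks like the following sketch, assuming the alignment (e.g. a 512-byte or 4 KiB sector size) is a power of two.

```java
// Illustrative branchless alignment helpers for power-of-two alignments.
final class AlignmentSketch {

   // rounds value up to the next multiple of pow2Alignment, with no branches
   static int align(int value, int pow2Alignment) {
      return (value + (pow2Alignment - 1)) & ~(pow2Alignment - 1);
   }

   static boolean isAligned(int value, int pow2Alignment) {
      return (value & (pow2Alignment - 1)) == 0;
   }
}
```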
The large message pendingRecordID is not accessed atomically, leading
to races that produce records that cannot be found on the journal for
deletion; this in turn causes an NPE that leaves the pending tasks on
the current OperationContextImpl uncleaned.
Adding a cleanup on error for those tasks, and avoiding the race with
proper synchronization, will both enforce correct cleanup when something
bad happens and avoid the NPE.
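A minimal sketch of the synchronization idea, not the actual Artemis code: reads and writes of the pending record id go through the same lock, so a concurrent delete can never observe a half-published value.

```java
// Sketch only: publish and read pendingRecordID under one lock.
final class PendingRecordSketch {
   private long pendingRecordID = -1;

   synchronized void setPendingRecordID(long id) {
      this.pendingRecordID = id;
   }

   synchronized long getPendingRecordID() {
      return pendingRecordID;
   }
}
```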
Compaction is now reusing direct ByteBuffers on both
reading and writing, with explicit and deterministic
release to avoid a high peak of native memory utilisation
after compaction.
Compaction cannot free a sliced view of a ByteBuffer on Java >= 9:
the fix is to use the original ByteBuffer instead of the slice
to perform the file write, allowing it to be correctly released by
the Cleaner.
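An illustrative sketch of that fix, not the actual compaction code; Netty's PlatformDependent is used here as a stand-in for whatever deterministic release utility applies.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

import io.netty.util.internal.PlatformDependent;

// A slice() of a direct ByteBuffer has no Cleaner of its own on Java >= 9, so it
// cannot be released explicitly. Writing through the original buffer (with
// position/limit set) keeps the owning buffer the one that gets freed.
final class CompactionWriteSketch {

   static void writeAndRelease(FileChannel channel, ByteBuffer original,
                               int offset, int length) throws IOException {
      original.clear();
      original.limit(offset + length).position(offset);
      channel.write(original);                      // write the original, not original.slice()
      PlatformDependent.freeDirectBuffer(original); // deterministic release
   }
}
```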
The method is named "lookupRecord", which suggests it finds a related record,
but it only checks whether recordsSnapshot contains the id.
Thus, the name "containsRecord" is more intuitive than "lookupRecord".
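A sketch of the renamed method, assuming recordsSnapshot is a map keyed by record id (the concrete collection type in Artemis may differ):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The method only answers a membership question, so containsRecord describes it.
final class RecordsSnapshotSketch {
   private final Map<Long, Object> recordsSnapshot = new ConcurrentHashMap<>();

   public boolean containsRecord(final long id) {
      return recordsSnapshot.containsKey(id);
   }
}
```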
Travis CI has been reporting test failures.
Looking at the logs I could see a critical failure happening, but not much information on why.
This will help identify further issues.
Adding new metrics for tracking message counts and sizes on a Queue.
This includes tracking metrics for pending, delivering and scheduled
messages. The paging store now also tracks message size.
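The shape of such per-state counters could look like this sketch; the class and field names are illustrative, not the actual Artemis metrics classes.

```java
import java.util.concurrent.atomic.LongAdder;

// Sketch only: each state (pending, delivering, scheduled) keeps both a count
// and an aggregate byte size, updated together.
final class QueueMetricsSketch {
   private final LongAdder pendingCount = new LongAdder();
   private final LongAdder pendingSize = new LongAdder();

   void addPending(long messageSize) {
      pendingCount.increment();
      pendingSize.add(messageSize);
   }

   void removePending(long messageSize) {
      pendingCount.decrement();
      pendingSize.add(-messageSize);
   }
}
```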
When the live server starts replication, it must make sure there are
no pending writes in the message & bindings journals, or we may
lose journal records during the initial replication.
So we need to flush the append executor after acquiring the StorageManager's
write lock and before the Journal's write lock.
We also set a 10 second timeout when flushing, the same as
Journal::flushExecutor. If we fail to flush within 10 seconds,
we abort replication and the backup will try again later.
Use OrderedExecutorFactory::flushExecutor to flush the executor.
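A minimal sketch of flushing an executor with a timeout, in the spirit of what is described above: submit a marker task and wait for it, since once the marker has run every task queued before it has completed too.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

// Sketch only: returns false if the executor could not be drained in time,
// in which case the caller would abort replication and retry later.
final class FlushExecutorSketch {

   static boolean flush(Executor executor, long timeout, TimeUnit unit)
         throws InterruptedException {
      CountDownLatch latch = new CountDownLatch(1);
      executor.execute(latch::countDown);
      return latch.await(timeout, unit);
   }
}
```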
The MappedSequentialFile relies on the assumption that writers
won't exceed the maximum capacity of the file, leaving the JVM to crash otherwise.
This commit adds proper bounds checking on write operations (and position changes too)
in order to provide recoverable effects if such a scenario should occur.
In addition, minor fixes are provided on the Mapped and Nio SequentialFile::fill behaviour
to match the original contract.
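An illustrative bounds check with assumed names: without it, a write past the mapped region's capacity would crash the JVM instead of failing with a recoverable exception.

```java
// Sketch only: validate that the write fits within the mapped capacity first.
final class MappedBoundsSketch {

   static void checkCapacity(long position, int bytesToWrite, long capacity) {
      if (position + bytesToWrite > capacity) {
         throw new IllegalStateException(
            "Cannot write " + bytesToWrite + " bytes at position " + position +
            ": mapped capacity is " + capacity + " bytes");
      }
   }
}
```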
- it is now possible to disable the TimedBuffer
- this increases the default libaio maxAIO to 4k
- the auto tuning on the journal will use asynchronous writes to simulate what would happen on faster disks
- if you set datasync=false on the CLI, the system will suggest MAPPED and disable the buffer timeout
This closes #1436
This commit supersedes #1436 since it is now disabling the timed buffer through the CLI
Instead of waiting to flush an executor,
I have added a method isFlushed() which simply translates to the
state on the OrderedExecutor.
In the case another executor is provided (for tests), there is a delegate
into normal executors.
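A sketch of the isFlushed() idea, with illustrative field names: the ordered executor exposes its own state, so callers can simply check whether everything queued has already run instead of submitting a task and waiting on it.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch only: flushed means nothing is executing and nothing is queued.
final class OrderedExecutorSketch {
   private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
   private volatile boolean executing;

   boolean isFlushed() {
      return !executing && tasks.isEmpty();
   }
}
```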
This is replacing an executor on ServerSessionPacketHandler
with an actor.
This is to avoid creating a new runnable per packet received.
Instead of creating a new Runnable, this will use a single static runnable,
and the packet will be sent as a message that is handled by a listener.
Look at ServerSessionPacketHandler on this commit for more information on how it works.
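A minimal sketch of the actor idea, not the actual Artemis Actor class: packets are queued as messages and drained by one reusable runnable, so no new Runnable is allocated per packet.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Sketch only: a single shared drain runnable processes queued messages in order.
final class SimpleActor<T> {
   private final Queue<T> mailbox = new ConcurrentLinkedQueue<>();
   private final AtomicBoolean scheduled = new AtomicBoolean();
   private final Executor executor;
   private final Consumer<T> listener;
   private final Runnable drain = this::drain;   // reused for every batch

   SimpleActor(Executor executor, Consumer<T> listener) {
      this.executor = executor;
      this.listener = listener;
   }

   void act(T message) {
      mailbox.add(message);
      if (scheduled.compareAndSet(false, true)) {
         executor.execute(drain);
      }
   }

   private void drain() {
      T message;
      while ((message = mailbox.poll()) != null) {
         listener.accept(message);
      }
      scheduled.set(false);
      // re-schedule if a message arrived after the last poll but before the reset
      if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
         executor.execute(drain);
      }
   }
}
```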
If replication blocked anything on the journal,
the processing from clients would be blocked
and nothing would work.
As part of this fix I am using an executor on ServerSessionPacketHandler,
which will also scale better as the reader from Netty is fed immediately.
The MAPPED journal refactoring includes:
- simplified lifecycle and logic (e.g. fixed file size with a single mmap memory region)
- support for the TimedBuffer to coalesce msyncs (via the Decorator pattern; see the sketch below)
- TLAB pooling of direct ByteBuffers like the NIO journal
- removal of old benchmarks and benchmark dependencies
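The Decorator mentioned above could be sketched roughly as follows; the interface and names are simplified assumptions, not the actual Artemis types.

```java
import java.nio.ByteBuffer;

// Sketch only: the timed decorator writes through immediately but hands the
// sync to a timer (the TimedBuffer role) so several writes share one msync.
interface JournalFileSketch {
   void write(ByteBuffer bytes, boolean sync);
}

final class TimedJournalFileSketch implements JournalFileSketch {
   private final JournalFileSketch delegate;
   private final Runnable scheduleSync;   // e.g. enqueues a sync on the TimedBuffer

   TimedJournalFileSketch(JournalFileSketch delegate, Runnable scheduleSync) {
      this.delegate = delegate;
      this.scheduleSync = scheduleSync;
   }

   @Override
   public void write(ByteBuffer bytes, boolean sync) {
      delegate.write(bytes, false);
      if (sync) {
         scheduleSync.run();
      }
   }
}
```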