Add max record size check before adding a record to prevent that the
broker shuts down, when there is one really large header sent with the
message. Add message size check before allocating large message resource
if it can't be stored.
Refactored thread local ByteBuffer pooling, alignment
and zeroing in order to avoid duplicate code and
improve code coverage with tests.
In addition are being provided faster branchless
alignment operations and optional zeroing of
pooled ByteBuffers for both ASYNCIO and
NIO/MAPPED journal types.
Large messages pendingRecordID is not accessed atomically, leading
to races that would lead to records that cannot been found on the
journal for deletion: it would lead to cause NPE that won't clean
the pending tasks on the current OperationContextImpl.
Adding a cleanup on error of those tasks and avoiding the race
to happen by adding proper synchronization will both enforce
correct clean up when something bad happen and avoid NPE.
Compaction is now reusing direct ByteBuffers on both
reading and writing with explicit and deterministic
release to avoid high peak of native memory utilisation
after compaction.
Compaction cannot free a sliced view of a ByteBuffer on Java >=9:
the fix is using the original ByteBuffer instead of the slice
to perform a file write and allow it to be correctly released by
the Cleaner.
The method is named "lookupRecord".
"lookupRecord" seems to find a related record.
But the method is checking whether recordsSnapshot contains the id or not.
Thus, the method name "containsRecord" is more intuitive than "lookupRecord".
Travis CI has been reporting test failures.
Looking on logs I could see a critical failure happening but not much information on why.
This will help identify further issues.
Adding new metrics for tracking message counts and sizes on a Queue.
This includes tracking metrics for pending, delivering and scheduled
messages. The paging store also tracks message size now.
When live start replication, it must make sure there is
no pending write in message & bindings journal, or we may
lost journal records during initial replication.
So we need flush append executor after acquire StorageManager's
write lock, before Journal's write lock.
Also we set a 10 seconds timeout when flush, the same as
Journal::flushExecutor. If we failed to flush in 10 seconds,
we abort replication, backup will try again later.
Use OrderedExecutorFactory::flushExecutor to flush executor
The MappedSequentialFile relies on the assumption that any writers
won't exceed the maximum capacity of the file, leaving the JVM to crash otherwise.
This commit adds proper bounds checking on write operations (and position changes too)
in order to provide recoverable effects if such scenario should occour.
In addition are provided minor fixes on Mapped and Nio SequentialFile::fill behaviour
to match the original contract.
- it is now possible to disable the TimedBuffer
- this is increasing the default on libaio maxAIO to 4k
- The Auto Tuning on the journal will use asynchronous writes to simulate what would happen on faster disks
- If you set datasync=false on the CLI, the system will suggest mapped and disable the buffer timeout
This closes#1436
This commit superseeds #1436 since it's now disabling the timed buffer through the CLI
Instead of wait to flush an executor,
I have added a method isFlushed() which will just translate to the
state on the OrderedExecutor.
In the case another executor is provided (for tests) there's a delegate
into normal executors.
This is replacing an executor on ServerSessionPacketHandler
by a this actor.
This is to avoid creating a new runnable per packet received.
Instead of creating new Runnable, this will use a single static runnable
and the packet will be send by a message, which will be treated by a listener.
Look at ServerSessionPacketHandler on this commit for more information on how it works.
If replication blocked anything on the journal
the processing from clients would be blocked
and nothing would work.
As part of this fix I am using an executor on ServerSessionPacketHandler
which will also scale better as the reader from Netty would be feed immediately.
The MAPPED journal refactoring include:
- simplified lifecycle and logic (eg fixed file size with single mmap memory region)
- supports for the TimedBuffer to coalesce msyncs (via Decorator pattern)
- TLAB pooling of direct ByteBuffer like the NIO journal
- remove of old benchmarks and benchmark dependencies
Building on ARTEMIS-905 JCtools ConcurrentMap replacement first proposed but currently parked by @franz1981, replace the collections with primitive key concurrent collections to avoid auto boxing.
The goal of this is to reduce/remove autoboxing on the hot path.
We are just adding jctools to the broker (should not be in client dependencies)
Like wise targeting specific use case with specific implementation rather than a blanket replace all.
Using collections from Bookkeeper, reduces outside tlab allocation, on resizing compared to JCTools, which occurs frequently on testing.
- NIO/ASYNCIO new TimedBuffer with adapting batch window heuristic
- NIO/ASYNCIO improved TimedBuffer write monitoring with
lightweight concurrent performance counters
- NIO/ASYNCIO journal/paging operations benefit from less buffer copy
- NIO/ASYNCIO any buffer copy is always performed with raw batch copy
using SIMD instrinsics (System::arrayCopy) or memcpy under the hood
- NIO improved clear buffers using SIMD instrinsics (Arrays::fill) and/or memset
- NIO journal operation perform by default TLABs allocation pooling (off heap)
retaining only the last max sized buffer
- NIO improved file copy operations using zero-copy FileChannel::transfertTo
- NIO improved zeroing using pooled single OS page buffer to clean the file
+ pwrite (on Linux)
- NIO deterministic release of unpooled direct buffers to avoid OOM errors
due to slow GC
- Exposed OS PAGE SIZE value using Env class
This is fixing an issue introduced on 4b47461f03 (ARTEMIS-822)
The Transactions were being looked up without the readLock and some of the controls for Read and Write lock
were broken after this.