The iterQueue transaction commits are locked by the synchronization
context, so removing all messages from a huge queue creates too many
locked transactions for the paged messages and leads to an OOM.
The iteration over paged messages is now executed outside the iterQueue
synchronization context to avoid locking the transaction commits.
Active Directory servers are unable to handle referrals automatically.
This causes a PartialResultException to be thrown if a referral is
encountered beneath the base search DN, even if the LDAPLoginModule is
set to ignore referrals.
This option may be set to 'true' to ignore these exceptions, allowing
login to proceed with the query results received before the exception
was encountered.
Note: there are no tests for this change as I could not reproduce the
issue with the ApacheDS test server. The issue is specific to directory
servers that don't support the ManageDsaIT control such as Active
Directory.
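The behaviour described above can be sketched with plain JNDI types. This is only an illustration, not the LDAPLoginModule code: the class name, the method name and the flag name are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.PartialResultException;

public class PartialResultSketch {

    /**
     * Collects search results. If the server raises a PartialResultException
     * (for example because of an unfollowed referral) and the flag is set,
     * the results received so far are returned instead of failing the login.
     */
    public static <T> List<T> drain(NamingEnumeration<T> results,
                                    boolean ignorePartialResultException) throws NamingException {
        List<T> collected = new ArrayList<>();
        try {
            while (results.hasMore()) {
                collected.add(results.next());
            }
        } catch (PartialResultException e) {
            if (!ignorePartialResultException) {
                throw e;
            }
            // Flag is set: keep whatever was received before the exception.
        }
        return collected;
    }
}
```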
A new feature to preserve messages sent to an address for queues that will be
created on the address in the future. This is essentially equivalent to the
"retroactive consumer" feature from 5.x. However, it's implemented in a way
that fits with the address model of Artemis.
Improve wildcard support for the key attribute in the roles access
match element and whitelist entry element, allowing prefix match for
the mBean properties.
LargeMessageImpl.copy(long) needs to open the underlying
file in order to read and copy bytes into the new copied message.
However, there is a chance that another thread closes
the file in the middle of the copy, making it fail
with a "channel is null" error.
This happens when a large message is sent to a JMS
topic (multicast address). While the message is being delivered to multiple
subscribers, one consumer finishes its delivery and closes the
underlying file afterwards, while another consumer rolls back
the message, which is eventually moved to the DLQ (which calls
the above copy method). So there is a chance of hitting this bug.
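One way to rule out this kind of race is to guard open, close and copy with the same lock, so that a close from another thread cannot land in the middle of a copy. The sketch below illustrates that idea only. It is not the actual LargeMessageImpl change, the class and method names are hypothetical, and it reads the whole body into memory, which a real large message would not do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class GuardedLargeBody {
    private final Path file;
    private FileChannel channel; // guarded by "this"

    public GuardedLargeBody(Path file) {
        this.file = file;
    }

    public synchronized void open() {
        if (channel == null) {
            try {
                channel = FileChannel.open(file, StandardOpenOption.READ);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public synchronized void close() {
        if (channel != null) {
            try {
                channel.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                channel = null;
            }
        }
    }

    /** Copies the body while holding the lock, so close() has to wait until the copy is done. */
    public synchronized byte[] copyBody() {
        boolean openedHere = (channel == null);
        open();
        try {
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            long position = 0;
            while (buffer.hasRemaining()) {
                int read = channel.read(buffer, position);
                if (read < 0) {
                    break;
                }
                position += read;
            }
            return Arrays.copyOf(buffer.array(), buffer.position());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (openedHere) {
                close(); // only close what this call opened
            }
        }
    }
}
```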
The critical analyser triggers a broker shutdown when
removeAllMessages is called on a huge queue. The iterQueue is split so as
not to hold the lock for too long.
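The general idea of splitting a long iteration so the lock is released between chunks can be sketched as follows. This is a simplified illustration, not the broker's QueueImpl code: the class, the batch size parameter and the callback are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

public class BatchedQueueDrain<T> {
    private final Deque<T> messages = new ArrayDeque<>();
    private final Object lock = new Object();

    public void add(T message) {
        synchronized (lock) {
            messages.add(message);
        }
    }

    /** Removes every message, holding the lock for at most batchSize polls at a time. */
    public int removeAll(int batchSize, Consumer<T> onRemoved) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int removed = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (true) {
            batch.clear();
            synchronized (lock) {
                for (int i = 0; i < batchSize; i++) {
                    T message = messages.poll();
                    if (message == null) {
                        break;
                    }
                    batch.add(message);
                }
            }
            if (batch.isEmpty()) {
                return removed;
            }
            // The lock is released here, so the per-message work (for example a
            // transaction commit) does not block other users of the queue.
            for (T message : batch) {
                onRemoved.accept(message);
            }
            removed += batch.size();
        }
    }
}
```

Keeping the per-message work outside the synchronized block also means a slow callback cannot extend the time the lock is held.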
The LocalMonitor tick log is very useful to establish a "heartbeat" log
statement. It is moved into its own logger from PagingManager logger,
which is too verbose to leave activated indefinitely in production.
After a node is scaled down to a target node, the sf queue in the
target node is not deleted.
Normally this is fine because it may be reused when the scaled down
node is back up.
However, in a cloud environment many drainer pods can be created and
then shut down in order to drain the messages to a live node (pod).
Each drainer pod will have a different node-id. Over time the number of sf
queues in the target broker node grows and those sf queues are
no longer reused.
Although users can use the management API/console to manually delete
them, it would be nice to have an option to automatically delete
those sf queue/address resources after scale down.
This PR adds a boolean configuration parameter called
cleanup-sf-queue to the scale down policy so that if the parameter
is "true" the broker will send a message to the
target broker signalling that the SF queue is no longer
needed and should be deleted.
If the parameter is not defined (default) or is "false"
the scale down won't remove the sf queue.
By default, such a cancelled task is not automatically removed from the
work queue until its delay elapses. This may cause unbounded retention of
cancelled tasks. To avoid this, set the remove-on-cancel policy to true.
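The effect of the policy can be seen with a plain java.util.concurrent.ScheduledThreadPoolExecutor. The sketch below is standalone and not taken from the broker. It assumes the executor in question is a ScheduledThreadPoolExecutor, and the class and method names are made up for the example.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RemoveOnCancelSketch {

    /** Schedules a long-delay task, cancels it, and reports how many tasks are still queued. */
    public static int queuedAfterCancel(boolean removeOnCancel) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        try {
            executor.setRemoveOnCancelPolicy(removeOnCancel);
            Runnable noop = () -> { };
            ScheduledFuture<?> future = executor.schedule(noop, 1, TimeUnit.HOURS);
            future.cancel(false);
            // With the policy off, the cancelled task stays queued until its delay elapses.
            return executor.getQueue().size();
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Calling queuedAfterCancel(false) leaves the cancelled task in the work queue, while queuedAfterCancel(true) removes it at cancellation time.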
The core server session tracks details about producers like what
addresses have had messages sent to them, the most recent message ID
sent to each address, and the number of messages sent to each address.
This information is made available to users via the
listProducersInfoAsJSON method on the various management interfaces
(JMX, web console, etc.). However, in situations where a server session
is long lived (e.g. in a pool) and is used to send to many different
addresses (e.g. randomly named temporary JMS queues) this info can
accumulate to a problematic degree. Therefore, we should limit the
amount of producer details saved by the session.
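One common way to cap this kind of per-address bookkeeping is an access-ordered LinkedHashMap that evicts the least recently used entry. The sketch below only illustrates that approach. The class, its fields and the eviction strategy are assumptions, not the session implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedProducerDetails {

    /** Per-address details; the fields mirror what the text says is tracked. */
    public static final class Details {
        long lastMessageId;
        long messagesSent;
    }

    private final Map<String, Details> detailsByAddress;

    public BoundedProducerDetails(final int maxAddresses) {
        // accessOrder = true: iteration order goes from least to most recently used.
        this.detailsByAddress = new LinkedHashMap<String, Details>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Details> eldest) {
                return size() > maxAddresses;
            }
        };
    }

    public synchronized void recordSend(String address, long messageId) {
        Details details = detailsByAddress.computeIfAbsent(address, a -> new Details());
        details.lastMessageId = messageId;
        details.messagesSent++;
    }

    public synchronized int trackedAddresses() {
        return detailsByAddress.size();
    }

    public synchronized boolean isTracked(String address) {
        return detailsByAddress.containsKey(address);
    }

    public synchronized long messagesSent(String address) {
        Details details = detailsByAddress.get(address);
        return details == null ? 0 : details.messagesSent;
    }
}
```

With a limit of 2, sending to addresses a, b and then c drops the details for a, so a pooled session that sends to many temporary queues keeps a bounded amount of state.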
Certain devices or file systems don't support record-level locking.
For that reason I am changing FileLockNodeManager to use separate files (one for each position) instead of using tryLock(position).
A good example of this is CephFS, where channel.tryLock works on a whole file but fails at the record level.
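The "one file per position" idea can be sketched with java.nio as below. This is an illustration only, not the FileLockNodeManager code: the class name and the lock file naming scheme are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PerPositionFileLock implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private PerPositionFileLock(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    /** Locks the whole file dedicated to a position; returns null if another process holds it. */
    public static PerPositionFileLock tryAcquire(Path directory, int position) {
        // Hypothetical naming scheme: one lock file per logical position.
        Path lockFile = directory.resolve("position-" + position + ".lock");
        FileChannel channel = null;
        try {
            channel = FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            FileLock lock = channel.tryLock(); // whole-file lock, no (position, size) region
            if (lock == null) {
                channel.close();
                return null;
            }
            return new PerPositionFileLock(channel, lock);
        } catch (IOException e) {
            closeQuietly(channel);
            throw new UncheckedIOException(e);
        } catch (RuntimeException e) {
            closeQuietly(channel); // e.g. OverlappingFileLockException within the same JVM
            throw e;
        }
    }

    public boolean isValid() {
        return lock.isValid();
    }

    @Override
    public void close() {
        try {
            try {
                lock.release();
            } finally {
                channel.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void closeQuietly(FileChannel channel) {
        if (channel != null) {
            try {
                channel.close();
            } catch (IOException ignored) {
                // best effort cleanup
            }
        }
    }
}
```

Because each position has its own file, only whole-file locking is needed, which is the operation the text reports as working on CephFS.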
Wait for the netty event loop group shutdown to avoid too many open FDs after
the server stops, when the netty configuration is used. Clear server
activateCallbacks to avoid reactivation of the previous nodeManager and
consequent FD leaks on restart. Fix LargeServerMessageImpl.copy to avoid
FD leaks when a large message expires or is sent to the DLA. Terminate
the HawtDispatcher global queue to avoid pipe and eventpoll leaks after an
MQTT test.
cherry-picking commit 9617058ba0649af4eea15ce8793f86de827c4b7f
NO-JIRA adding check for open FD on the testsuite
cherry-picking commit 0facb7ddf4d3baa14a3add4290684aff7fd46053
NO-JIRA addressing connections leaks on integration tests
If a JMS client (be it openwire, amqp, or core jms) receives a message that
is from a different protocol, the JMSMessageID may be null when the
JMS client expects it.
Add a max record size check before adding a record to prevent the
broker from shutting down when a really large header is sent with the
message. Add a message size check before allocating the large message
resource, so that it is not allocated if the message can't be stored.
AbstractJournalStorageManager::performCachedLargeMessageDeletes
must enforce acquisition of the manager write lock (as documented)
to prevent racing calls of stopReplication from deadlocking
while stopping.
FileConfigurationParserTest was creating a data folder.
This is simply disabling persistence from the configuration used by the server on this test as it is not needed.
There are a few issues with prefixing and compatibility.
This is basically an issue when integrated with Wildfly, or in any other case
where prefixes are activated, when working with older versions.
Page::read allocates a new ChannelBufferWrapper on each
paged message read. To reduce the allocation rate, the wrapper can be
reused until a new wrapped ByteBuffer is created.
This test has been failing as part of the main testsuite
and it should really be a smoke test as it is using a real test.
So I'm moving it to the smoke tests.
This is fixing these tests:
- org.apache.activemq.artemis.tests.integration.paging.PagingOrderTest#testPagingOverCreatedDestinationQueues
- org.apache.activemq.artemis.tests.integration.paging.PagingOrderTest#testPagingOverCreatedDestinationTopics
No additional tests are needed as this change is covered by the current testsuite.