Due to the changes in 6b5fff40cb the
config parameter message-expiry-thread-priority is no longer needed. The
code now uses a ScheduledExecutorService and a thread pool rather than
dedicating a thread 100% to the expiry scanner. The pool's size can be
controlled via scheduled-thread-pool-max-size.
Using a property on AMQPLargeMessage instead of a ThreadLocal
This was causing issues on the journal as the message may transverse different threads on the journal.
There is no guarantee that the encodeSize size is the same in AMQP right after read.
As the protocol may add additional bytes right after decoded such as header, extra properties.. etc.
The drain control has to immediately flush
otherwise a next flow control event may remove the previous status from Proton.
So, this really cannot wait the next executor, and it has to be done immediately.
Historically speaking, all message properties starting with AMQ HDR
would not be passed to OpenWire messages. However, that blocked the
properties from management notifications so ARTEMIS-1209 was raised and
the solution there was to pass properties that started with _AMQ *if*
the consumer was connected to the management notification address.
However, in this case messages are diverted to a different address so
this check fails and the properties are removed. My solution will be to
check the message itself to see if it has the _AMQ_NotifType property
(which all notification messages do) rather than checking where the
consumer is connected.
In case there is a hardware, firewal or any other thing making the UDP connection to go deaf
we will now reopen the connection in an attempt to go over possible issues.
This is also improving locking around DiscoveryGroup initial connection.
This is a Large commit where I am refactoring largeMessage Body out of CoreMessage
which is now reused with AMQP.
I had also to fix Reference Counting to fix how Large Messages are Acked
And I also had to make sure Large Messages are transversing correctly when in cluster.
Update netty version to 4.1.43.Final and netty-tcnative version to 2.0.26.Final.
Change restricted-security-client.policy because Netty 4.1.43.Final requires
access to two more files: /etc/os-release and /usr/lib/os-release.
There is an optimization in AMQP, that properties are only parsed over demand.
It happens that after ARTEMIS-2294 (commit 2dd0671698),
every send would request for the property on the message, resulting the properties to always be parsed upon send.
Even when there's no use of application properties.
This is a surprisingly large change just to fix some log messages, but
the changes were necessary in order to get the relevant data to where it
was being logged. The fact that the data wasn't readily available is
probably why it wasn't logged in the first place.
Add a paged message to the tail, when the QueueIterateAction doesn't handle it, to avoid removing unhandled paged message. Move the refRemoved calls from the QueueIterateActions to the iterQueue to fix the queue stats.
When AMQPMessages are redistributed from one node to
another, the internal property of message is not
cleaned up and this causes a message to be routed
to a same queue more than once, causing duplicated
messages.
This commit introduces the ability to configure a downstream connection
for federation. This works by sending information to the remote broker
and that broker will parse the message and create a new upstream back
to the original broker.
A new feature to preserve messages sent to an address for queues that will be
created on the address in the future. This is essentially equivalent to the
"retroactive consumer" feature from 5.x. However, it's implemented in a way
that fits with the address model of Artemis.
The parameter failoverOnInitialConnection wouldn't seem to be used and
makes no sense any more, because the connectors are retried in a loop.
So someone can just add the backup in the initial connection.
In LargeMessageImpl.copy(long) it need to open the underlying
file in order to read and copy bytes into the new copied message.
However there is a chance that another thread can come in and close
the file in the middle, making the copy failed
with "channel is null" error.
This is happening in cases where a large message is sent to a jms
topic (multicast address). During delivery it to multiple
subscribers, some consumer is doing delivery and closed the
underlying file after. Some other consumer is rolling back
the messages and eventually move it to DLQ (which will call
the above copy method). So there is a chance this bug being hit on.
The crititical analyser trigger the broker shutdown if try to
removeAllMessages with a huge queue. The iterQueue is split so as
not to keep the lock too time.
It use RandomAccessFile to allow using heap buffers without additional
copies and/or leaks of direct buffers, as performed by FileChannel JDK
implementation (see https://bugs.openjdk.java.net/browse/JDK-8147468)
When an openwire client closes the session, the broker doesn't
clean up its server consumer references even though the core
consumers are closed. This results a leak when sessions within
a connection are created and closed when the connection keeps open.
It would fail on cannot destroy queue
as the failure could be asynchronous, introducing a quick race, which is acceptable
you just need to make sure the async operation will finish before removing the queue.
Fix is to introduce a Wait.assertEquals call.
After a node is scaled down to a target node, the sf queue in the
target node is not deleted.
Normally this is fine because may be reused when the scaled down
node is back up.
However in cloud environment many drainer pods can be created and
then shutdown in order to drain the messages to a live node (pod).
Each drainer pod will have a different node-id. Over time the sf
queues in the target broker node grows and those sf queues are
no longer reused.
Although use can use management API/console to manually delete
them, it would be nice to have an option to automatically delete
those sf queue/address resources after scale down.
In this PR it added a boolean configuration parameter called
cleanup-sf-queue to scale down policy so that if the parameter
is "true" the broker will send a message to the
target broker signalling that the SF queue is no longer
needed and should be deleted.
If the parameter is not defined (default) or is "false"
the scale down won't remove the sf queue.
When converting from AMQP to core and back again support annotations that
aren't able to be placed into Core message properties by storing the bytes
from encoding the types to AMQP encodings and then decoding them again
when converting back into AMQP messages.
Requires update to proton-j 0.33.2 for encoding fix
The core server session tracks details about producers like what
addresses have had messages sent to them, the most recent message ID
sent to each address, and the number of messages sent to each address.
This information is made available to users via the
listProducersInfoAsJSON method on the various management interfaces
(JMX, web console, etc.). However, in situations where a server session
is long lived (e.g. in a pool) and is used to send to many different
addresses (e.g. randomly named temporary JMS queues) this info can
accumulate to a problematic degree. Therefore, we should limit the
amount of producer details saved by the session.
this test was basically broken, it was silently failing as it was ignoring results and taking a long time to finish.
As this test is multiplied along many options (Netty, Replicated, JDBC) this was taking considerable extra time
on the testsuite.
Wait netty event loop group shutdown to avoid too many opened FDs after
server stops, when netty configuration is used. Clear server
activateCallbacks to avoid reactivation of previous nodeManager and
consequent FD leaks on restart. Fix LargeServerMessageImpl.copy to avoid
FD leaks when a large message expiry or it is sent to DLA. Terminate
HawtDispatcher global queue to avoid pipes and eventpolls leaks after a
MQTT test.
cherry-picking commit 9617058ba0649af4eea15ce8793f86de827c4b7f
NO-JIRA adding check for open FD on the testsuite
cherry-picking commit 0facb7ddf4d3baa14a3add4290684aff7fd46053
NO-JIRA addressing connections leaks on integration tests
If a jms client (be it openwire, amqp, or core jms) receives a message that
is from a different protocol, the JMSMessageID maybe null when the
jms client expects it.
Add max record size check before adding a record to prevent that the
broker shuts down, when there is one really large header sent with the
message. Add message size check before allocating large message resource
if it can't be stored.
When user attempts unauthorized anonymous sasl the broker can return an
error of 'failed' instead of the security error that is expected in
these cases.
* Upgrading versions
* Adding wildfly-common dependency as jboss-logmanager now depends on it
for simple common operations such as getting hostname or process id
* Updating bootclasspath with wildfly-common
This test was playing with an ignore packet, which does not make any more sense
after the last change.
After a packet loss the bridge will reconnect, and this test makes no more sense.
Add tests
Add fix - if timeout occurs on sending packet, calls same code that is invoked if timeout occurs on during ping aligning logic, and ensuring JMS connection exception listener gets invoked to inform the client logic to react.
This test has been failing as part of the main testsuite
and it should really be a smoke test as it is using a real test.
so, I'm moving it as smoke-test
The changes from ARTEMIS-2189 mean that
o.a.a.a.c.s.i.ServerSessionImpl#deleteQueue
is no longer called from the same ServerSessionImpl instance that
created it which means that TempQueueCleanerUpper instances will leak.
To resolve the leak the client will only create a new session when
necessary instead of every time delete() is invoked.
Implement using the ActiveMQ5 JMSXGroupFirstForConsumer, property as default, but make it possible for future to make it configurable easily. (Not this PR)
Add test
In adding auto-delete queue level feature, its been noticed as some feature bits were added during hot fix branch, that there's api break with the 2.6.x hotfix branch.
This addresses that by fixing this in 2.7.x
Add test that exhibits the issue when sending AMQP (non JMS) to Artemis that one mapping to Core JMS the destination is not resolving as the RoutingType can be missing.
Add fix.
LocalMonitor::under on PagingManagerImpl won't log anymore with a
warning message if the producers got unblocked and with info
if disk it getting freed
Performing direct deliveries of management messages could enter
a code path on QueueImpl::addTail with a NULL pageIterator: performing
a null check will avoid it to throw NPE.
The test may fail if the live crashes too soon and the
message is directly sent to backup and the expected
blocking send will never happen.
To fix that a wait is added to ensure the message
is sent to the live (and intercepted) before
crashing the live.
The Audit log allows user to log some important actions,
such as ones performed via management APIs or clients,
like queue management, sending messages, etc.
The log tries to record who (the user if any) doing what
(like deleting a queue) with arguments (if any) and timestamps.
By default the audit log is disabled. Through configuration can
be easily turned on.
When the MQTT consumer client (cleanSession property set to true) reconnected, there are certain probabilities that these two bugs will occur.
This is because the MQTT consumer client thinks that its connection has been disconnected and triggers reconnection, but the MQTT connection is still alive at Artemis broker. This bug occurs when new and old connections occur while operating the same queue for unsafe behavior.
Multiple consumers using the same clientId in the cluster, the last consumer connection should close the previous consumer connection!
ARTEMIS-2226 last consumer connection should close the previous consumer connection
to address apache-rat-plugin:0.12:check
ARTEMIS-2226 last consumer connection should close the previous consumer connection
to address checkstyle
ARTEMIS-2226 last consumer connection should close the previous consumer connection
adjust the code structure
ARTEMIS-2226 last consumer connection should close the previous consumer connection
adjust the code structure
ARTEMIS-2226 last consumer connection should close the previous consumer connection
adjust the code structure
ARTEMIS-2226 last consumer connection should close the previous consumer connection
adjust the code structure
ARTEMIS-2226 last consumer connection should close the previous consumer connection
adjust the code structure
ARTEMIS-2226 last consumer connection should close the previous consumer connection
add javadoc
Added test reproducer and changed Queue::isDurableMessage usages into
Queue::isDurable to allow acks to hit the journal and being
correctly replicated across nodes.
Add ability to configure when creating auto created queues at the queue level
Add support for configuring message count check
Add test cases
Update docs
Support using group buckets on a queue for better local group scaling
Support disabling message groups on a queue
Support rebalancing groups when a consumer is added.
* Using SpawnedVMSupport (used to be on testsuite, moving it to Utils)
* Building the classpath for ./lib, similar to what happens on Bootstrap
* Using Path as much as possible to avoid issues encoding files
Push isDirectDeliver method from netty impl, to the Connection interface
Add support to InVMConnection for isDirectDeliver flag and ability to set via config, defaulting to false, to keep current default behavior.
Extend DirectDeliverTest to check InVM as well.
Add consumer priority support
Includes refactor of consumer iterating in QueueImpl to its own logical class, to be able to implement.
Add OpenWire JMS Test - taken from ActiveMQ5
Add Core JMS Test
Add AMQP Test
Add Docs
When broker's advisory is disabled (supportAdvisory=false) any
advisory consumer won't get created at broker and the advisory
consumer ID won't be stored.
Legacy openwire clients can have a reference of advisory consumer
regardless broker's settings and therefore when it closes the
advisory consumer the broker has no reference to it.
Therefore broker throws an exception like:
javax.jms.IllegalStateException: Cannot
remove a consumer that had not been registered
If the broker stores the consumer info (even it doesn't create
it) the exception can be avoided.
There's a *slight* semantic change with the behavior of the queue query
and binding query to make them consistent with the address query, namely
that they will return the name of the queue and the name of the address
in every case and the returned names will be not use the FQQN syntax but
will be parsed to reflect their actual names in the broker.
MULTICAST messages forwarded by a core bridge will not be routed to any
ANYCAST queues and vice-versa. Diverts have the ability to configure how
routing-type is treated. Core bridges now support this same kind of
functionality. By default the bridge does not alter the routing-type of
forwarded messages to maintain compatibility with existing behavior.
Large messages pendingRecordID is not accessed atomically, leading
to races that would lead to records that cannot been found on the
journal for deletion: it would lead to cause NPE that won't clean
the pending tasks on the current OperationContextImpl.
Adding a cleanup on error of those tasks and avoiding the race
to happen by adding proper synchronization will both enforce
correct clean up when something bad happen and avoid NPE.
If a client sends a message to a multicast address and using a qpid-jms
client to receive the message from one of the queues using fully
qualified queue name will fail with following error message:
Address xxxx is not configured for queue support
[condition = amqp:illegal-state]
It should be able to receive the message without any error.
These improvements were also part of this task:
- Routing is now cached as much as possible.
- A new Runnable is avoided for each individual message,
since we use the Netty executor to perform delivery
https://issues.apache.org/jira/browse/ARTEMIS-2205
When a receiving transaction is committed in a paging situation,
if a page happens to be completed and it will be deleted in a
transaction operation (PageCursorTx). The other tx operation
RefsOperation needs to access the page (in PageCache) to finish
its job. There is a chance that the PageCursorTx removes the
page before RefsOperation and it will cause the RefsOperation
failed to find a message in a page.
When a node tries to reconnects to another node in a scale down cluster,
the reconnect request gets denied by the other node and keeps retrying,
which causes tasks in the ordered executor accumulate and eventually OOM.
The fix is to change the ActiveMQPacketHandler#handleCheckForFailover
to allow reconnect if the scale down node is the node itself.
Previously the port was always random. This caused problems with
remote JMX connections that needed to overcome firewalls. As of
this patch it's possible to make the RMI port static and whitelist
it in the firewall settings.
With AMQP protocol when some messages are received in a transaction,
calling JMX QueueControl.listDeliveringMessages() returns empty list
before the transaction is committed.
Add test cases
Add GroupSequence to Message Interface
Implement Support closing/reset group in queue impl
Update Documentation (copy from activemq5)
Change/Fix OpenWireMessageConverter to use default of 0 if not set, for OpenWire as per documentation http://activemq.apache.org/activemq-message-properties.html
This test can verify an issue fixed by the commit:
7a463f038a (ARTEMIS-2096)
The issue was reported later by users where messages
sent by amqp clients are having their body size
randomly being zero when received by multiple core
consumers.
This was also fixed on commit 48e0fc8f42 but that
was exclusively on 2.6.x branch.
Implement custom LVQ Key and Non-Destructive in broker - protocol agnostic
Make feature configurable via broker.xml, core apis and activemqservercontrol
Add last-value-key test cases
Add non-destructive with lvq test cases
Add non-destructive with expiry-delay test cases
Update documents
Add new methods to support create, update with new attributes
Refactor to pass through queue-attributes in client side methods to reduce further method changes for adding new attributes in future and avoid methods with endless parameters. (note: in future this should prob be done server side too)
Update existing test cases and fake impls for new methods/attributes
Further fix around network loss.
If network loss (split) and slave activates, for a period its config used when it initializes in initialisePart2 was stale.
Add/Extend test to ensure address-setting change made on live preserved on slave (after activation)
Add/Extend test to ensure security-setting change made on live preserved on backup (after activation)
Major refactoring of the AMQPMessage abstraction to resolve
some issue of message corruption still present in the code and
improve the API handling of message changes and re-encoding.
Improves handling of decoding of message sections limiting the
work to only the portions needed and ensuring the state data
is always updated with what has been done. Fixes issues of
corrupt state on copy of message or other changes in filters.
The waits for expiration are set at the same value as the expiration
interval default to in the test support setup so on some runs there is a
race when waiting for the message to expire and the expiry task.
Once a re-encode of the message is done the buffer is not being marked
as valid and so subsequent checks on the buffer are all assuming the
message data is not valid and re-encoding over and over. This can lead
to poor performance in some cases and corrupted data in others.
Extend test case to reproduce problem of client created queues being incorrectly removed on simple reload of config.
Add a flag/field to the queues created by configuration/broker.xml so we can correctly filter only queues created/managed by config.
Update listConfiguredQueues to use the new queue flag
Add Tests
Add implementation inline with other queue updatable settings.
Enhance tests to ensure queue is not destroyed during config change and messages in queue already are preserved
Revert previous fix
Keep original ConfigChangeTest
Apply new non-destructive fix.
Enhance tests to ensure messages in queues are not lost either on reload when running or when config changed on-restart (e.g. queue i not destroyed)
Ensure the broker looks at local receiver credit when checking for
credit top off threshold and then do a proper top off back to the high
water mark to sync with how client receivers manage their credit.
First, QueueQuery should use address name for address settings
The name used for looking up address settings for a queue now uses the
address name if there is a local queue binding
Second, make sure sent credits to the server is the correct value
In some cases users who migrate from 1.x to 2.x may still want to keep
the legacy prefixes for their JMS destinations (i.e. "jms.queue.",
"jms.topic.", etc.). This commit adds a boolean on our ConnectionFactory
implementation so that it will use the old prefixes when invoking the
queue/topic creation methods on the Session implementation.
Fix checkstyle
Avoid duplicated logic
Ability to filter and group
Instantiate SimpleString property key once
Get property value via getObjectProprty to ensure all special mapped properties such as in AMQPMessage would return
Avoid a custom string to represent null, instead rely on Java's representation "null" by using Objects.toString to get the string value of the property value used to group by.
An OpenWire client can use a compound destination name of the form
"a,b,c..." and consume from, or subscribe to, multiple destinations.
Such a compound destination only works for topics when the subscriber
is non-durable. Attempting to create a durable subscription on a
compound address will end up with an error.
The cause is when creating durable subs to multiple topics/addresses
the broker uses the same name to create internal queues, which
causes duplicate name conflict.
Parameters going into Wait.waitFor were originally wrong, because
`durationMillis: 3, sleepMillis: 100` means you would test the condition
only once. This commit is changing the durationMillis from 3ms to 3s,
swapping the two numbers (duration 100ms, sleep 3ms) would also be reasonable, I think.
Next, Wait.assertEquals is here being used, instead of Assert.assertTrue.
I saw the test fail only once, and never was able to reproduce it again,
but I think this commit does improve the test and so it is worthwhile.
java.lang.AssertionError
at org.apache.activemq.artemis.tests.integration.management.QueueControlTest.assertMetrics(QueueControlTest.java:2651)
at org.apache.activemq.artemis.tests.integration.management.QueueControlTest.assertMessageMetrics(QueueControlTest.java:2615)
at org.apache.activemq.artemis.tests.integration.management.QueueControlTest.testRemoveAllWithPagingMode(QueueControlTest.java:1554)
The occasional assertion error is prevented by using Wait.assertEquals
where Assert.assertEquals was used previously.
java.lang.AssertionError:
Expected :1
Actual :0
[...]
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.activemq.artemis.tests.integration.management.QueueControlTest.testResetMessagesExpired(QueueControlTest.java:2370)
The occasional assertion error is prevented by using Wait.assertEquals
where Assert.assertEquals was used previously.
I did not observe the timing issue on all asserts (only on the first
two), but there is no harm in replacing them all.
java.lang.AssertionError:
Expected :2
Actual :1
The below error is prevented by using Wait.assertEquals
where Assert.assertEquals was used previously.
java.lang.AssertionError:
Expected :2
Actual :1
[...]
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.activemq.artemis.tests.integration.management.QueueControlTest.testListMessagesWithNullFilter(QueueControlTest.java:804)
The below error is prevented by using Wait.assertEquals
where Assert.assertEquals was used previously.
java.lang.AssertionError:
Expected :2
Actual :1
[...]
at org.apache.activemq.artemis.tests.integration.management.QueueControlTest.testListMessagesWithEmptyFilter(QueueControlTest.java:827)
This commit adds support for tracking metrics for bridges for both
normal bridges and bridges that are part of a cluster. The two
statistics added in this commit are messages pending acknowledgement
and messages acknowledged but more can be added later.
The below error is prevented by adding Wait.assertEquals,
where Assert.assertEquals was used previously. Timeout is
set to small increments, since we rarely need to wait more
than 100 ms for the condition to become true.
java.lang.AssertionError: expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.activemq.artemis.tests.integration.client.HeuristicXATest.doRecoverHeuristicCompletedTxWithRestart(HeuristicXATest.java:306)
at org.apache.activemq.artemis.tests.integration.client.HeuristicXATest.testRecoverHeuristicCommitWithRestart(HeuristicXATest.java:251)
Anonymous senders (those created without a target address) are not
blocked when max-disk-usage is reached. The cause is that when such
a sender is created on the broker, the broker doesn't check the
disk/memory usage and gives out the credit immediately.
In a live-backup scenario, if the live is restarted and shutdown too soon,
the client have a chance to fail on failover because it's internal topology
is inconsistent with the final status. The client keeps connecting to live
already shut down, never trying to connect to the backup.
It's a porting from HORNETQ-1572.
Tests should always extend ActiveMQTestBase whenever is possible.
This is because there are a few rules to avoid thread leakages.
The test was also leaking an executor and I believe it was
not always stopping the servers, which I fixed here.
When "large" messages are converted to / from core in order to be stored
in the large message store the type of the AMQP body section is being
lost and reconstituted incorrectly in some cases. The message needs to
be annotated with the original AMQP type for the body and that used to
manage the conversion back to AMQP from Core.