JGroups 3.x hasn't been updated in some time now. The last release was
in April 2020, almost 2 years ago. Many protocols have been updated
and added since then, and users want to use them. There is also
increasing concern about using older components, triggered mainly by
high-profile vulnerabilities recently discovered elsewhere in the wider
Open Source Java community.
This commit bumps JGroups up to the latest release - 5.2.0.Final.
However, there is a cost associated with upgrading.
The old-style properties configuration is no longer supported. I think
it's unlikely that end-users are leveraging this because it is not
exposed via broker.xml. The JGroups XML configuration has been around
for a long time, is widely adopted, and is still supported. I expect
most (if not all) users are using this. However, a handful of tests
needed to be updated and/or removed to deal with this absence.
Some protocols and/or protocol properties are no longer supported. This
means that users may have to change their JGroups stack configurations
when they upgrade. For example, our own clustered-jgroups example had to
be updated or it wouldn't run properly.
MQTT 5 is an OASIS standard which debuted in March 2019. It boasts
numerous improvements over its predecessor (i.e. MQTT 3.1.1) which will
benefit users. These improvements are summarized in the specification
at:
https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901293
The specification describes all the behavior necessary for a client or
server to conform. The spec is highlighted with special "normative"
conformance statements which distill the descriptions into concise
terms. The specification provides a helpful summary of all these
statements. See:
https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901292
This commit implements all of the mandatory elements from the
specification and provides tests which are identified using the
corresponding normative conformance statement. All normative
conformance statements either have an explicit test or are noted in
comments with an explanation of why an explicit test doesn't exist. See
org.apache.activemq.artemis.tests.integration.mqtt5 for all those
details.
This commit also includes documentation about how to configure
everything related to the new MQTT 5 features.
Scenario: avoid paging. If an address is full, chain another broker and
produce to the head while consuming from the tail, using producer and
consumer roles to partition connections. When the tail is drained, drop it.
- adds an option to treat an idle consumer as slow
- adds basic support for credit-based address blocking ARTEMIS-2097
- adds some more visibility to address memory usage and balancer attribute modifier operations
The HTML output methods are hold-overs from way back when the code-base
started off as JBoss Messaging 2 and the broker mainly ran in JBoss AS 4
and 5 which leveraged an HTML-based JMX console where these methods
would be executed and spit out nicely formatted data. That stuff has all
long since been retired so this commit deprecates the HTML-based
management methods so they can be removed completely in a future release.
JSON is a better structured output format for this and most of the
deprecated methods have JSON alternatives.
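For example, a management client can consume the structured JSON from
one of the existing JSON methods directly; here is a minimal sketch
(QueueControl#listMessagesAsJSON is just one such JSON-returning method,
and the surrounding class is illustrative):

```java
import org.apache.activemq.artemis.api.core.management.QueueControl;

public class JsonManagementSketch {
   // Browse a queue as JSON instead of relying on deprecated HTML output.
   static String browseAsJson(QueueControl queueControl) throws Exception {
      return queueControl.listMessagesAsJSON(null); // null filter = all messages
   }
}
```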
Commit 481b73c8ca from ARTEMIS-3502
inadvertently broke this functionality. This commit restores the
original behavior.
autoDeleteAddress was renamed to forceAutoDeleteAddress, which ignores the address settings.
Deleting temporary queues will use forceAutoDeleteAddress=true.
This is done in collaboration with Justin Bertram.
- Avoid blowing up on string bodies of any size if the valueSizeLimit bits are configured to disable the limit
- Don't NPE if an amqp-value + binary body is sent without a content-type, as it always should be.
- Include expected prefix when adding delivery delay and ingress time annotations.
- Use the actual name for ingress time annotation, as with all other annotations.
- Use correct object type when testing equality with content-type value.
- Use consistent case for 'groupId' in different properties.
Casting the result of getPeerCertificates() to X509Certificate[] mirrors
what is done in the ActiveMQ "Classic" code-base.
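For illustration, a minimal sketch of that cast (the helper method and
class here are hypothetical, not the broker's actual code):

```java
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLPeerUnverifiedException;
import javax.net.ssl.SSLSession;

public class PeerCertificateSketch {
   // SSLSession#getPeerCertificates() returns java.security.cert.Certificate[];
   // with JSSE the array is actually an X509Certificate[], so the downcast
   // mirrors what the ActiveMQ "Classic" code-base does.
   static X509Certificate[] peerCertificates(SSLSession session) throws SSLPeerUnverifiedException {
      return (X509Certificate[]) session.getPeerCertificates();
   }
}
```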
A few tests which were imported from ActiveMQ "Classic" to verify our
OpenWire implementation were removed as they relied on a "stub"
implementation of javax.net.ssl.SSLSession that never would have worked
across multiple JDKs once javax.security.cert.X509Certificate[] was
removed. Furthermore, the tests appeared to be related to the OpenWire
*client* and not relevant to our broker-side implementation.
Previously, when one session couldn't be transferred to the new
connection during reconnect, we returned immediately and didn't execute
failover for the other sessions. This produced the issue that for
sessions where no failover was executed, their channels were still
present on the old connection. When the old connection was then
destroyed, these channels were closed although the reconnect was still
ongoing, which led to "dead" sessions.
Now, if a session failover fails, for the remaining sessions the "client-side" part
of failover is executed, which removes the sessions from the old connection so that
they are not closed when the old connection is closed afterwards.
In 73c4e399d9 a description was added to DiskStoreUsage. It incorrectly describes diskStoreUsage as a percentage. This commit changes the description to say fraction, which is what the value actually is (and was, even before the description change). A percentage would arguably be better, since MaxDiskUsage is also specified as a percentage.
The provider of an SSL key/trust store is different from that store's
type. However, the broker currently doesn't differentiate these and uses
the provider for both. Changing this *may* potentially break existing
users who are setting the provider, but I don't see any way to avoid
that. This is a bug that needs to be fixed in order to support use-cases
like PKCS#11.
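To illustrate the distinction in plain JCA terms (the store type and
provider names below are just examples, not broker configuration):

```java
import java.security.KeyStore;

public class KeyStoreTypeVsProviderSketch {
   public static void main(String[] args) throws Exception {
      // Type only: let the JCA pick whichever registered provider supplies "JKS".
      KeyStore byType = KeyStore.getInstance("JKS");
      // Type + provider: explicitly ask the "SUN" provider for a "JKS" store.
      // A PKCS#11 store would similarly pair type "PKCS11" with its own provider.
      KeyStore byTypeAndProvider = KeyStore.getInstance("JKS", "SUN");
      System.out.println(byType.getType() + " / " + byTypeAndProvider.getProvider().getName());
   }
}
```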
Change summary:
- Added documentation.
- Consolidated several 2-way SSL test classes into a single
parameterized test class. All these classes were essentially the same
except for a few key test parameters. Consolidating them avoided
having to update the same code in multiple places.
- Expanded tests to include different providers & types.
- Regenerated all SSL artifacts to allow tests to pass with new
constraints.
- Improved logging for when SSL handler initialization fails.
Change summary:
- Remove the existing Xalan-based XPath evaluator since Xalan appears
to be no longer maintained.
- Implement a JAXP XPath evaluator (from the ActiveMQ 5.x code-base).
- Pull in the changes from https://issues.apache.org/jira/browse/AMQ-5333
to enable configurable XML parser features.
- Add a method to the base Message interface to make it easier to get
the message body as a string. This relieves the filter from having
to deal with message implementation details.
- Update the Qpid JMS client to get the jms.validateSelector parameter.
If an application wants to use a special key/truststore for Artemis but
have the remainder of the application use the default Java store, the
org.apache.activemq.ssl.keyStore needs to take precedence over Java's
javax.net.ssl.keyStore. However, the current implementation takes the
first non-null value from
System.getProperty(JAVAX_KEYSTORE_PATH_PROP_NAME),
System.getProperty(ACTIVEMQ_KEYSTORE_PATH_PROP_NAME),
keyStorePath
So if the default Java property is set, no override is possible. Swap
the order of the JAVAX_... and ACTIVEMQ_... property names so that the
ActiveMQ ones come first (as component-specific overrides), the
standard Java ones come second, and finally a local attribute value
(through Stream.of(...).findFirst()).
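A minimal sketch of the intended lookup order (property names taken from
above; the surrounding method is illustrative, not the actual code):

```java
import java.util.Objects;
import java.util.stream.Stream;

public class KeyStorePathLookupSketch {
   static String resolveKeyStorePath(String localAttributeValue) {
      return Stream.of(
               System.getProperty("org.apache.activemq.ssl.keyStore"), // component-specific override first
               System.getProperty("javax.net.ssl.keyStore"),           // standard Java property second
               localAttributeValue)                                    // local attribute value last
            .filter(Objects::nonNull)
            .findFirst()
            .orElse(null);
   }
}
```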
(In our case the application uses the default Java truststore location
at $JAVA_HOME/lib/security/jssecacerts, and only supplies its password
in javax.net.ssl.trustStorePassword, and then uses a dedicated
truststore for Artemis. Defining both org.apache.activemq.ssl.trustStore
and org.apache.activemq.ssl.trustStorePassword now makes Artemis use the
dedicated truststore (javax.net.ssl.trustStore is not set as we use the
default location, so the second choice
org.apache.activemq.ssl.trustStore applies), but with the Java default
truststore password (first choice javax.net.ssl.trustStorePassword
applies instead of the second choice because it is set for the default
truststore). Obviously, this does not work unless both passwords are
identical!)
Replaces direct JDBC connections with a DBCP2 DataSource. Adds
configuration options to use alternative DataSources and to alter their
parameters. While adding slight overhead, this vastly improves the
management and pooling capabilities of database connections.
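For reference, a minimal DBCP2 pooling sketch (the driver class, URL and
pool size are placeholders, not the broker's actual configuration keys):

```java
import org.apache.commons.dbcp2.BasicDataSource;

public class PooledDataSourceSketch {
   static BasicDataSource createDataSource() {
      BasicDataSource dataSource = new BasicDataSource();
      dataSource.setDriverClassName("org.apache.derby.jdbc.EmbeddedDriver"); // placeholder driver
      dataSource.setUrl("jdbc:derby:data/journal;create=true");              // placeholder URL
      dataSource.setMaxTotal(10);                                            // pool sizing parameter
      return dataSource;
   }
}
```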
This reverts commit dbb3a90fe6.
The org.apache.activemq.artemis.core.server.Queue#getRate method is for
slow-consumer detection and is designed for internal use only.
Furthermore, it's too opaque to be trusted by a remote user as it only
returns the number of messages added to the queue since *the last time
it was called*. The problem here is that the user calling it doesn't
know when it was invoked last. Therefore, they could be getting the
rate of messages added for the last 5 minutes or the last 5
milliseconds. This can lead to inconsistent and misleading results.
There are three main ways for users to track rates of message
production and consumption:
1. Use a metrics plugin. This is the most feature-rich and flexible
way to track broker metrics, although it requires tools (e.g.
Prometheus) to store the metrics and display them (e.g. Grafana).
2. Invoke the getMessageCount() and getMessagesAdded() management
methods and store the returned values along with the time they were
retrieved. A time-series database is a great tool for this job. This is
exactly what tools like Prometheus do. That data can then be used to
create informative graphs, etc. using tools like Grafana. Of course, one
can skip all the tools and just do some simple math to calculate rates
based on the last time the counts were retrieved (see the sketch after
this list).
3. Use the broker's message counters. Message counters are the broker's
simple way of providing historical information about the queue. They
provide similar results to the previous solutions, but with less
flexibility since they only track data while the broker is up and
there aren't really any good options for graphing.
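A minimal sketch of the "simple math" mentioned in item 2 (the sample
values are made up):

```java
public class MessageRateSketch {
   // Approximate "messages added per second" from two samples of getMessagesAdded().
   static double ratePerSecond(long previousCount, long previousTimeMillis,
                               long currentCount, long currentTimeMillis) {
      long elapsedMillis = Math.max(1, currentTimeMillis - previousTimeMillis);
      return (currentCount - previousCount) * 1000.0 / elapsedMillis;
   }

   public static void main(String[] args) {
      // e.g. 1,200 messages added over 5 minutes => 4 messages/second
      System.out.println(ratePerSecond(10_000, 0, 11_200, 300_000));
   }
}
```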
Both authentication and authorization will hit the underlying security
repository (e.g. files, LDAP, etc.). For example, creating a JMS
connection and a consumer will result in 2 hits with the *same*
authentication request. This can cause unwanted (and unnecessary)
resource utilization, especially in the case of a networked configuration
like LDAP.
There is already a rudimentary cache for authorization, but it is
cleared *totally* every 10 seconds by default (controlled via the
security-invalidation-interval setting), and it must be populated
initially which still results in duplicate auth requests.
This commit optimizes authentication and authorization via the following
changes:
- Replace our home-grown cache with Google Guava's cache. This provides
simple caching with both time-based and size-based LRU eviction (see the
sketch after this list). See more at
https://github.com/google/guava/wiki/CachesExplained. I also thought
about using Caffeine, but we already have a dependency on Guava and the
cache implementations look to be negligibly different for this use-case.
- Add caching for authentication. Both successful and unsuccessful
authentication attempts will be cached to spare the underlying security
repository as much as possible. Authenticated Subjects will be cached
and re-used whenever possible.
- Authorization will use Subjects cached during authentication. If the
required Subject is not in the cache it will be fetched from the
underlying security repo.
- Caching can be disabled by setting the security-invalidation-interval
to 0.
- Cache sizes are configurable.
- Management operations exist to inspect cache sizes at runtime.
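A minimal sketch of the kind of Guava cache described in the first
bullet (the key type, size and timeout are placeholders for the
configurable values mentioned above):

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;
import javax.security.auth.Subject;

public class AuthenticationCacheSketch {
   static Cache<String, Subject> createAuthenticationCache(long invalidationIntervalMillis, long maxSize) {
      return CacheBuilder.newBuilder()
            .maximumSize(maxSize)                                                // size-based LRU eviction
            .expireAfterWrite(invalidationIntervalMillis, TimeUnit.MILLISECONDS) // time-based eviction
            .build();
   }
}
```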
This allows journal appends to happen in bursts during replication by
batching replication responses into the network at the end of the
append burst.
1 of 2) - Porting of HORNETMQ-1575
In a live-backup scenario, when live is down and backup becomes live, clients
using HA Connection Factories can fail over automatically. However, if a
client decides to create a new connection by itself (as in the Camel JMS
case) there is a chance that the new connection points to the dead live
and the connection won't be successful. The reason is that if the old
connection is gone, the backup will not get a chance to announce itself
back to the client, so it fails on the initial connection.
The fix is to let CF remember the old topology and use it on any
initial connection attempts.
Since getDiskStoreUsage on the ActiveMQServerControl is converting a
double to a long, the value is always 0 in the management API. It should
return a double instead.
Adding this metric required moving the meter registration code from the
AddressInfo class to the ManagementService in order to get clean access
to both the AddressInfo and AddressControl classes.
The calculation used by
ActiveMQServerControlImpl.getDiskStoreUsagePercentage() is incorrect. It
uses disk space info with global-max-size which is for address memory.
Also, the existing getDiskStoreUsage() method *already* returns a
percentage of total disk store usage so this method seems redundant.
Correct two annotation inconsistencies in the new overload of
ActiveMQServerControl#addAddressSettings introduced by ARTEMIS-2810.
These names don't follow the key names used in getAddressSettingsAsJSON.
ORIG message properties like _AMQ_ORIG_ADDRESS are added to messages
during various broker operations (e.g. diverting a message, expiring a
message, etc.). However, if multiple operations try to set these
properties on the same message (e.g. administratively moving a message
which eventually gets sent to a dead-letter address) then important
details can be lost. This is particularly problematic when using
auto-created dead-letter or expiry resources which use filters based on
_AMQ_ORIG_ADDRESS and can lead to message loss.
This commit simply over-writes the existing ORIG properties rather than
preserving them so that the most recent information is available.
Add the command `check` to the Command Line utility. This command exposes
checks for nodes and queues, most of them implemented using the
management API.
The checks have been implemented to be modular, so each user can compose
their own health check, e.g. to produce and consume from a queue the
command is `artemis check queue --name TEST --produce 1 --consume 1`.
- when sending messages to DLQ or Expiry we now use x-opt legal names
- we now support filtering through annotations if using m. as a prefix.
- enabling hyphenated_props: to allow m. as a prefix
This commit does the following:
- Deprecates existing overloaded createQueue, createSharedQueue,
createTemporaryQueue, & updateQueue methods for ClientSession,
ServerSession, ActiveMQServer, & ActiveMQServerControl where
applicable.
- Deprecates QueueAttributes, QueueConfig, & CoreQueueConfiguration.
- Deprecates existing overloaded constructors for QueueImpl.
- Implements QueueConfiguration with JavaDoc to be the single,
centralized configuration object for both client-side and broker-side
queue creation including methods to convert to & from JSON for use in
the management API.
- Implements new createQueue, createSharedQueue & updateQueue methods
with JavaDoc for ClientSession, ServerSession, ActiveMQServer, &
ActiveMQServerControl as well as a new constructor for QueueImpl, all
using the new QueueConfiguration object (see the sketch after this
list).
- Changes all internal broker code to use the new methods.
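For illustration, a minimal sketch of creating a queue with the new
configuration object from a client session (the queue/address names are
placeholders):

```java
import org.apache.activemq.artemis.api.core.QueueConfiguration;
import org.apache.activemq.artemis.api.core.RoutingType;
import org.apache.activemq.artemis.api.core.client.ClientSession;

public class QueueConfigurationSketch {
   // One configuration object now carries everything createQueue needs.
   static void createOrdersQueue(ClientSession session) throws Exception {
      session.createQueue(new QueueConfiguration("orders")
            .setAddress("orders")
            .setRoutingType(RoutingType.ANYCAST)
            .setDurable(true));
   }
}
```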
Due to the changes in 6b5fff40cb the
config parameter message-expiry-thread-priority is no longer needed. The
code now uses a ScheduledExecutorService and a thread pool rather than
dedicating a thread 100% to the expiry scanner. The pool's size can be
controlled via scheduled-thread-pool-max-size.
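A minimal illustration of the pattern described above (the scan task,
pool size and interval are placeholders; this is not the broker's actual
reaper code):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExpiryScanSketch {
   static void scheduleExpiryScan(Runnable scanTask, long scanPeriodMillis) {
      // A shared pool (sized by scheduled-thread-pool-max-size) runs the scan
      // periodically instead of a thread dedicated 100% to the expiry scanner.
      ScheduledExecutorService pool = Executors.newScheduledThreadPool(5);
      pool.scheduleWithFixedDelay(scanTask, scanPeriodMillis, scanPeriodMillis, TimeUnit.MILLISECONDS);
   }
}
```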
Add a Netty socks proxy handler during channel initialisation to allow
Artemis to communicate via a SOCKS proxy. Supports SOCKS version 4a & 5.
Even if enabled in configuration, the proxy will not be used when the
target host is a loopback address.
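A minimal sketch of the Netty side of this (proxy host/port are
placeholders; version 5 is shown, and Netty's Socks4ProxyHandler covers
4a):

```java
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.proxy.Socks5ProxyHandler;
import java.net.InetSocketAddress;

public class SocksProxySketch {
   // Added first in the pipeline so all outbound traffic is tunnelled
   // through the SOCKS proxy.
   static void addProxyHandler(SocketChannel channel, String proxyHost, int proxyPort) {
      channel.pipeline().addFirst(new Socks5ProxyHandler(new InetSocketAddress(proxyHost, proxyPort)));
   }
}
```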
In case there is a hardware issue, a firewall, or anything else making
the UDP connection go deaf, we will now reopen the connection in an
attempt to get past possible issues.
This also improves locking around the DiscoveryGroup initial connection.
This is a large commit where I am refactoring the large message body out
of CoreMessage, which is now reused with AMQP.
I also had to fix reference counting to fix how large messages are acked,
and I had to make sure large messages traverse the cluster correctly.
- Avoid some Properties Decoding, checking if we need certain properties like scheduled delivery
- Avoid creating some unnecessary SimpleString instances
- Removed some intermediate ActiveMQBuffer allocation
- Removed some intermediate UnreleasableByteBuf allocation
Second attempt to fix the following compiler warning that is reported in Travis builds, this time using the correct cast type `Class<?>[]` which prevents temporary object allocation because of var-args handling:
```
/home/travis/build/apache/activemq-artemis/artemis-core-client/src/main/java/org/apache/activemq/artemis/core/protocol/core/impl/wireformat/FederationStreamConnectMessage.java:151: warning: non-varargs call of varargs method with inexact argument type for last parameter;
return (FederationPolicy) Class.forName(clazz).getConstructor(null).newInstance();
cast to Class<?> for a varargs call
cast to Class<?>[] for a non-varargs call and to suppress this warning
```
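For reference, an illustrative stand-alone version of the corrected call
(the method wrapper is hypothetical; the real code casts the result to
FederationPolicy):

```java
public class NonVarargsCallSketch {
   // The null argument is explicitly cast to Class<?>[] so getConstructor() is a
   // non-varargs invocation; it still resolves the no-arg constructor, and the
   // compiler warning is suppressed.
   static Object instantiate(String clazz) throws Exception {
      return Class.forName(clazz).getConstructor((Class<?>[]) null).newInstance();
   }
}
```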
This fixes the following compiler warning that is reported in Travis builds:
```
/home/travis/build/apache/activemq-artemis/artemis-core-client/src/main/java/org/apache/activemq/artemis/core/protocol/core/impl/wireformat/FederationStreamConnectMessage.java:151: warning: non-varargs call of varargs method with inexact argument type for last parameter;
return (FederationPolicy) Class.forName(clazz).getConstructor(null).newInstance();
cast to Class<?> for a varargs call
cast to Class<?>[] for a non-varargs call and to suppress this warning
```
When AMQP messages are redistributed from one node to
another, the internal property of the message is not
cleaned up, and this causes a message to be routed
to the same queue more than once, causing duplicate
messages.
This commit introduces the ability to configure a downstream connection
for federation. This works by sending information to the remote broker
and that broker will parse the message and create a new upstream back
to the original broker.
A new feature to preserve messages sent to an address for queues that will be
created on the address in the future. This is essentially equivalent to the
"retroactive consumer" feature from 5.x. However, it's implemented in a way
that fits with the address model of Artemis.
The parameter failoverOnInitialConnection doesn't seem to be used and
makes no sense any more because the connectors are retried in a loop,
so someone can just add the backup to the initial connection.
When CoreMessage does copyHeadersAndProperties() it doesn't
make a full copy of its properties (a TypedProperties object).
This can cause problems when multiple threads/parties modify the
properties of messages copied from the same message.
This is particularly bad if the message is a large message,
where moveHeadersAndProperties is being used.
After a node is scaled down to a target node, the sf queue in the
target node is not deleted.
Normally this is fine because it may be reused when the scaled-down
node is back up.
However, in a cloud environment many drainer pods can be created and
then shut down in order to drain the messages to a live node (pod).
Each drainer pod will have a different node-id. Over time the sf
queues in the target broker node grow and those sf queues are
no longer reused.
Although users can use the management API/console to manually delete
them, it would be nice to have an option to automatically delete
those sf queue/address resources after scale down.
This PR adds a boolean configuration parameter called
cleanup-sf-queue to the scale-down policy so that if the parameter
is "true" the broker will send a message to the
target broker signalling that the SF queue is no longer
needed and should be deleted.
If the parameter is not defined (the default) or is "false",
the scale down won't remove the sf queue.