This commit does the following:
- Updates HA docs including the chapter on network isolation (i.e.
split brain). The network isolation chapter is now more about
high-level explanation and the HA doc now has all the configuration
parameters.
- Changes references to "pluggable quorum voting" to "pluggable lock
manager." The pluggable functionality really isn't about voting.
Conceptually is much more like the functionality you'd get from a
distributed lock so this naming is more clear. Both the docs and the
code have been changed.
- Reorganize lock manager modules as sub-modules. The API and RI
modules are renamed, but that should be OK based on the
"experimental" tag that's been on this feature up to this point.
- Remove the "experimental" tag from the lock manager.
These changes will not break folks using the standalone broker. However,
they will break folks embedding the broker *if* they are using the
artemis-quorum-ri or artemis-quorum-api modules or the
o.a.a.a.c.c.h.DistributedPrimitiveManagerConfiguration class.
There are no functional changes here. Renaming these modules is more a
conceptual change to facilitate better documentation and increased
adoption.
This fix was based on static analysis of the code and inspection of the
XA specification. There is no test associated with it due to the
difficult nature in reproducing the failure. This code has been
essentially the same for a decade and only now have there been any
reports of it actually sending back the wrong XA code.
Large message support was added to
o.a.a.a.c.s.f.FederatedQueueConsumerImpl#onMessage via cf85d35 for
ARTEMIS-3308. The problem with that change is that when onMessage
returns o.a.a.a.c.c.i.ClientConsumerImpl#callOnMessage will eventually
call o.a.a.a.c.c.i.ClientLargeMessageImpl#discardBody which eventually
ends up in o.a.a.a.c.c.i.LargeMessageControllerImpl#popPacket waiting 30
seconds (i.e. the default readTimeout) for more packets to arrive (which
never do). This happens because the FederatedQueueConsumer short-cuts
the "normal" process by using LargeMessageControllerImpl#take.
This commit fixes that by tracking the number of bytes "taken" and then
looking at that value later when discarding the body effectively
skipping the 30 second wait.
This commit:
- Eliminates MQTT session storage on every successful connection.
Instead data is only written when subsriptions are created or
destroyed.
- Adds a configuration property for the storage timeout.
- Updates the documentation with relevant information.
- Refactors a few bits of code to eliminate unnecessary variables, etc.
If both scale-down and cluster-connection are using the same JGroups
discovery-group then when the cluster-connection stops it will close the
underlying org.jgroups.JChannel and when the scale-down process tries to
use it to find a server it will fail.
This commit ensures that the JGroupsBroadcastEndpoint implementation of
BroadcastEndpoint#openClient initializes the channel if it has been
closed.
in cluster.
When we know that a node leaves a clustercleanly we shouldn't log WARN
messages about it.
Signed-off-by: Emmanuel Hugonnet <ehugonne@redhat.com>
This commit does the following:
- Replaces non-inclusive terms (e.g. master, slave, etc.) in the
source, docs, & configuration.
- Supports previous configuration elements, but logs when old elements
are used.
- Provides migration documentation.
- Updates XSD with new config elements and simplifies by combining some
overlapping complexTypes.
- Removes ambiguous "live" language that's used with regard to high
availability.
- Standardizes use of "primary," "backup," "active," & "passive" as
nomenclature to describe both configuration & runtime state for high
availability.
Using a prefix "netty.http.header." to be able to define http headers
used for http request from the netty connector.
Issue: https://issues.apache.org/jira/browse/ARTEMIS-4452
Signed-off-by: Emmanuel Hugonnet <ehugonne@redhat.com>
As I worked through implementing a more generic JSON marshaller, I tried using reflection through BeanUtils and other ways
however the endresult was always worse as there were a few caveats that were not as easy to accomplish.
For that reason I went to a declarative appraoch where I define a meta-data object on AddressSettings and AddressSettingsInfo and
reuse the metadata in a few other places.
I was not able to reproduce the actual issue here, but I heavily used this test during debugging.
This will not serve as a reproducer to the Ghost consumer issue, but this is a valid test.
* Since the constructors in RemotingConnectionImpl can be used from the server and the client.
If the server calling code is in a different classloader then the
constructor can't be called.
* same for the 'active' boolean property of ActiveMQChannelHandler
Issue: https://issues.apache.org/jira/browse/ARTEMIS-4467
Signed-off-by: Emmanuel Hugonnet <ehugonne@redhat.com>
When big messages are produced if a consumer receives an expired message, the credits are not updated, so if the consumer is too slow and an expiry delay has been set, we can end up with a situation where there are no more credits which prevents the consumer from receiving any more messages.
Durable subscrption state is part of the MQTT specification which has
not been supported until now. This functionality is implemented via an
internal last-value queue. When an MQTT client creates, updates, or
adds a subscription a message using the client-ID as the last-value is
sent to the internal queue. When the broker restarts this data is read
from the queue and populates the in-memory MQTT data-structures.
Therefore subscribers can reconnect and resume their session's
subscriptions without have to manually resubscribe.
MQTT state is now managed centrally per-broker rather than in the
MQTTProtocolManager since there is one instance of MQTTProtocolManager
for each acceptor allowing MQTT connections. Managing state per acceptor
would allow odd behavior with clients connecting to different acceptors
with the same client ID.
The subscriptions are serialized as raw bytes with a "version" byte for
potential future use, but I intentionally avoided adding complex
scaffolding to support multiple versions. We can add that complexity
later if necessary.
Some tests needed to be changed since instantiating an MQTT protocol
manager now creates an internal queue. A handful of tests assume that no
queues will exist other than the ones they create themselves. I updated
the main test super-class so that an MQTT protocol manager is not
automatically instantiated when configuring a broker for in-vm support.
The exception thrown by serverLocator.connect() should be all you need on such case
and the caller should then be responsible for taking appropriate action.
This commit contains the following changes:
- eliminate used, undeclared dependencies
- eliminate unused, declared dependencies
- fix scope for test dependencies
- eliminate org.hamcrest completely as its use involved deprecated code
as well as dependencies from multiple versions
In rare cases a store operation could silently fails or starves, blocking the
related server session and all delivering messages. Those server sessions can
be closed adding a management method that cleans their operation context
before closing them.
When resource audit logging is enabled STOMP is completely inoperable
due to an NPE during the protocol handshake. Unfortunately the failure
is completely silent. There are no logs to indicate a problem.
This commit fixes this problem via the following changes:
- Mitigate the original NPE via a check for null
- Move the logic necessary to set the "protocol connection" on the
"transport connection" to a class shared by all implementations.
- Add exception handling to log failures like this in the future.
- Add tests to ensure the audit logging is correct.
Currently JavaDoc is generated for many classes that don't need it.
JavaDoc should be reserved for user-facing classes (e.g. those used by
client application developers and developers embedding a broker into
their application). This commit narrows down the configuration to just
the classes that are needed. This will save time during release builds,
and save disk space wherever these files are stored (e.g. Apache
website).
Improve the CORE client failover connecting to other live servers when all
reconnect attempts fails, i.e. in a cluster composed of 2 live servers,
when the server to which the CORE client is connected goes down the CORE
client should reconnect its sessions to the other liver broker.
When sending, for example, to a predefined anycast address and queue
from a multicast (JMS topic) producer, the routed count on the address
is incremented, but the message count on the matching queue is not. No
indication is given at the client end that the messages failed to get
routed - the messages are just silently dropped.
Fixing this problem requires a slight semantic change. The broker is now
more strict in what it allows specifically with regards to
auto-creation. If, for example, a JMS application attempts to send a
message to a topic and the corresponding multicast address doesn't exist
already or the broker cannot automatically create it or update it then
sending the message will fail.
Also, part of this commit moves a chunk of auto-create logic into
ServerSession and adds an enum for auto-create results. Aside from
helping fix this specific issue this can serve as a foundation for
de-duplicating the auto-create logic spread across many of the protocol
implementations.
- interrupted message breaking reference counting
After the server writing to the client is interrupted in AMQP, the reference counting was broken what would require the server restarted
in order to cleanup the files of any interrupted sends.
- Removed consumer during large message delivery damaging large messages
If the consumer failed to deliver messages for any reason, the message on the queue would be duplicated. what would wipe out the body of the message
and other journal errors would happen because of this.
extra debug capabilities added into RefCountMessage as part of ARTEMIS-4206 in order to identify these issues