On completion of a drain the response is not flushed, so the
client can wait a few seconds before another broker task
flushes the work. Flush the connection after marking the
link as drained. Also perform the work with the
connection lock held to prevent concurrent updates of the proton
state.
This replaces the executor on ServerSessionPacketHandler
with an actor.
This is to avoid creating a new Runnable per packet received.
Instead of creating a new Runnable, this uses a single static runnable
and the packet is sent as a message, which is handled by a listener.
Look at ServerSessionPacketHandler on this commit for more information on how it works.
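A minimal sketch of the idea with hypothetical names (PacketActor, send and the listener are illustrative, not the broker's actual classes): packets are queued and one reusable task drains the queue, instead of wrapping every packet in its own Runnable.

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.Executor;
    import java.util.concurrent.atomic.AtomicBoolean;
    import java.util.function.Consumer;

    class PacketActor<T> {
       private final Queue<T> pending = new ConcurrentLinkedQueue<>();
       private final AtomicBoolean scheduled = new AtomicBoolean();
       private final Executor executor;
       private final Consumer<T> listener;
       // the single runnable, created once and reused for every batch
       private final Runnable drainTask = this::drain;

       PacketActor(Executor executor, Consumer<T> listener) {
          this.executor = executor;
          this.listener = listener;
       }

       void send(T packet) {
          pending.add(packet);
          if (scheduled.compareAndSet(false, true)) {
             executor.execute(drainTask);
          }
       }

       private void drain() {
          T packet;
          while ((packet = pending.poll()) != null) {
             listener.accept(packet);
          }
          scheduled.set(false);
          // re-check in case a packet arrived after the last poll but before the flag reset
          if (!pending.isEmpty() && scheduled.compareAndSet(false, true)) {
             executor.execute(drainTask);
          }
       }
    }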
Add krb5sslloginmodule that will populate a userPrincipal that can be mapped to roles independently.
Generalised the callback handlers to take a connection and pull certs or the peer principal based on
the callback. This bubbled up into an API change in the security store and security manager.
If replication blocked anything on the journal,
the processing from clients would be blocked
and nothing would work.
As part of this fix I am using an executor on ServerSessionPacketHandler,
which will also scale better as the reader from Netty is freed immediately.
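A rough illustration of the hand-off with placeholder names: the Netty reader thread only enqueues the work and returns immediately, instead of running the journal-bound processing itself.

    import java.util.concurrent.Executor;
    import java.util.function.Consumer;

    class PacketDispatcher {
       private final Executor packetExecutor;          // e.g. an ordered executor per session (assumption)
       private final Consumer<Object> handler;         // stands in for the real packet handler

       PacketDispatcher(Executor packetExecutor, Consumer<Object> handler) {
          this.packetExecutor = packetExecutor;
          this.handler = handler;
       }

       // called from the Netty reader thread; returns immediately even if the
       // journal (e.g. replication) is momentarily blocked
       void channelRead(Object packet) {
          packetExecutor.execute(() -> handler.accept(packet));
       }
    }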
Core client with Netty connector and acceptor doing Kerberos.
jaas.doAs around SSLEngine init such that the SSL handshake can do Kerberos ticket
generation and validation.
The Kerberos-authenticated user is then validated with the security manager before
being populated into the message userId.
The feature is enabled with the kerb5Config property. When lowercase it is the
principal. With a leading uppercase char it is the login.config entry to use.
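A hedged sketch of the doAs wrapping using only standard JDK APIs ("ExampleEntry" is a stand-in for whatever login.config entry kerb5Config selects):

    import java.security.PrivilegedAction;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.SSLEngine;
    import javax.security.auth.Subject;
    import javax.security.auth.login.LoginContext;

    public class KerberosSslExample {
       public static SSLEngine createEngine() throws Exception {
          // log in with the configured JAAS entry to obtain the Kerberos subject
          LoginContext loginContext = new LoginContext("ExampleEntry");
          loginContext.login();
          Subject subject = loginContext.getSubject();

          // create the SSLEngine as that subject so the TLS handshake can perform
          // Kerberos ticket generation and validation
          return Subject.doAs(subject, (PrivilegedAction<SSLEngine>) () -> {
             try {
                SSLEngine engine = SSLContext.getDefault().createSSLEngine();
                engine.setUseClientMode(true);
                return engine;
             } catch (Exception e) {
                throw new RuntimeException(e);
             }
          });
       }
    }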
The MAPPED journal refactoring includes:
- simplified lifecycle and logic (e.g. fixed file size with a single mmap memory region)
- support for the TimedBuffer to coalesce msyncs (via the Decorator pattern)
- TLAB pooling of direct ByteBuffers like the NIO journal
- removal of old benchmarks and benchmark dependencies
The default id-cache-size is 20000 and the default
confirmation-window-size is 1MB. It turns out the 1MB
size is too small for the id-cache-size.
To fix this we adjust the confirmation-window-size to 10MB. Also
a test is added to guarantee this rule won't be broken when the
default value is changed to any new value.
When a large message is replicated to backup, a pendingID is generated
when the large message is finished. This pendingID is generated by a
BatchingIDGenerator at backup.
It is possible that a pendingID generated at the backup is a duplicate
of an ID generated at the live server.
This can cause a problem when a large message with a messageID that is
the same as another large message's pendingID is replicated and stored
in the backup's journal, and then a deleteRecord for the pendingID
is appended. If the backup becomes live and loads the journal, it will
drop the large message add record because there is a deleteRecord of
the same ID (even though it is a pendingID of another message).
As a result the client expecting it will never get this large message.
So in summary, the root cause is that the pendingIDs for large
messages are generated at the backup while the backup is not live.
The solution is that instead of the backup generating
the pendingID, they are all generated in advance
at the live server and replicated to the backup wherever needed.
The ID generator at the backup only works when the backup becomes live
(when it is properly initialized from the journal).
It fixes compatibility issues with JMS Core clients using the old address model, allowing the client to query JMS temporary queues too.
You would eventually see this issue when using older clients:
AMQ119019: Queue already exists
This method name would clash with ServiceComponent.
As the real meaning of this method is just to fail over,
I've renamed it to avoid the clash with my next commit.
(I've done this in a separate commit as you may need to redo this
commit from scratch in other branches instead of dealing with lots of clashes on cherry-pick.)
When a large message is being diverted, a new copy of the original
message is created and replicated (if there is a backup) to the backup.
In LargeServerMessageImpl.copy(long) it reuses a byte array to copy
the message body. It is possible that one block of data is read into
the byte array before the previous read has been replicated,
causing the replicated bytes to be corrupted.
If we make a copy of the byte array before replication, this
corruption is avoided.
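A simplified sketch of the fix, with made-up readBlock/replicate hooks standing in for the actual file read and replication calls:

    import java.util.Arrays;

    class CopySketch {
       private static final int BLOCK_SIZE = 100 * 1024;
       private final byte[] buffer = new byte[BLOCK_SIZE];   // reused across reads

       void copyBody(BodySource source, Replicator replicator) {
          int read;
          while ((read = source.readBlock(buffer)) > 0) {
             // replicate a private copy: the shared buffer is overwritten by the next
             // read, possibly before the asynchronous replication has consumed it
             replicator.replicate(Arrays.copyOf(buffer, read));
          }
       }

       interface BodySource { int readBlock(byte[] dst); }
       interface Replicator { void replicate(byte[] block); }
    }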
Add extra configuration to address-settings to be able to
control / enable address/queue deletion by pattern,
rather than a global toggle.
Add support in the reload logic to remove addresses
and/or queues if the address matches an address setting
where this is enabled.
Use ActiveMQDestination for subscription naming, fixing and aligning queue naming in the process.
The change is behind a configuration toggle so as to avoid causing any breaking changes for users not expecting it.
In a cluster, if a node is shut down (or crashes) while a
message is being routed to a remote binding, an internal
property may be added to the message and persisted. The
name of the property is like _AMQ_ROUTE_TOsf.my-cluster*.
If the node starts back up, it will load and reroute this message,
and if it goes to a local consumer, this property won't
get removed and goes out to the client.
The fix is to remove this internal property before the message
is sent to any client.
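Illustrative sketch only - a plain map stands in for the message's property set, and only the literal prefix above is assumed:

    import java.util.Map;

    final class RouteToCleanup {
       private static final String INTERNAL_PREFIX = "_AMQ_ROUTE_TO";

       // drop the clustering bookkeeping properties before the message reaches a client
       static void stripInternalRouteProps(Map<String, Object> properties) {
          properties.keySet().removeIf(name -> name.startsWith(INTERNAL_PREFIX));
       }
    }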
Remove custom repo.
Update groupId to match the artifact in Maven Central.
Bump version also to that now deployed to Maven Central.
Bump checkstyle version to 7.7 to make it compatible.
Updated checkstyle.xml to ignore existing issues which are prolific and are now flagged
in the latest version, as some bugs in previous versions meant they weren't detected, e.g. https://github.com/checkstyle/checkstyle/issues/3320.
Fixing some violations which are not too prolific.
This commit has 2 changes for backwards compatibility with older
clients:
1) A "bindings query" will now detect if the client is both JMS and
"old" (i.e. pre-2.0) and will prefix the returned queue names with the
old prefix (i.e. "jms.queue."). This will allow the old client to
properly detect whether or not a queue exists in its auto-creation
logic.
2) When messages are dispatched to a consumer there is logic to detect
if the consumer is both JMS and "old" and will prefix the "address"
on the message with "jms.queue." or "jms.topic." as appropriate
(if it's not already prefixed).
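A rough sketch of the prefixing in case 2, with the "old JMS client" detection reduced to boolean parameters:

    final class OldClientPrefixing {
       static String prefixAddress(String address, boolean oldJmsClient, boolean topic) {
          if (!oldJmsClient) {
             return address;
          }
          String prefix = topic ? "jms.topic." : "jms.queue.";
          // only prefix when it is not already there
          return address.startsWith(prefix) ? address : prefix + address;
       }
    }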
Building on the ARTEMIS-905 JCTools ConcurrentMap replacement first proposed (but currently parked) by @franz1981, replace the collections with primitive-key concurrent collections to avoid autoboxing.
The goal of this is to reduce/remove autoboxing on the hot path.
We are just adding JCTools to the broker (it should not be in client dependencies).
Likewise, targeting specific use cases with specific implementations rather than a blanket replace-all.
Using collections from BookKeeper reduces outside-TLAB allocation on resizing (which occurs frequently in testing) compared to JCTools.
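The autoboxing being removed, in a nutshell (ConcurrentHashMap is shown only to illustrate the cost; the primitive-key collections expose long-keyed accessors instead):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    class BoxingExample {
       private final ConcurrentMap<Long, Object> boxed = new ConcurrentHashMap<>();

       void onMessage(long messageId, Object ref) {
          // every call allocates (or looks up) a Long wrapper for the primitive key
          boxed.put(messageId, ref);
       }
       // a primitive-key map offers e.g. put(long, V) / get(long), so no wrapper is created
    }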
The empty folder artemis-server/artemis-load-generator seems to be a
git submodule link. However, there is no submodule configuration
file (i.e. .gitmodules) in this repository.
This caused the 'git submodule' command to fail with the following
error message:
fatal: no submodule mapping found in .gitmodules for path 'artemis-server/artemis-load-generator'
Added a wait-for-activation option to shared-store master HA policies.
This option is enabled by default to ensure unchanged server startup behavior.
If this option is enabled, ActiveMQServer.start() with a shared-store master server will not return
before the server has been activated.
If this option is disabled, start() will return after a background activation thread has been started.
The caller can use waitForActivation() to wait until the server is activated, or just check the current activation status.
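A rough usage sketch of the non-blocking mode; the timed waitForActivation signature is assumed here and may differ:

    import java.util.concurrent.TimeUnit;
    import org.apache.activemq.artemis.core.server.ActiveMQServer;

    class StartupSketch {
       // assumes wait-for-activation is disabled on the shared-store master policy
       void startNonBlocking(ActiveMQServer server) throws Exception {
          server.start();   // returns once the background activation thread is running

          // block only if/when the caller actually needs the server to be live
          if (!server.waitForActivation(30, TimeUnit.SECONDS)) {
             // not active yet (e.g. still waiting for the lock on the shared store)
          }
       }
    }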
Adding a new ActiveMQServerPlugin interface to support adding custom
behavior to the broker at certain events such as connection or session
creation.
https://issues.apache.org/jira/browse/ARTEMIS-898
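A minimal example of the kind of plugin this enables (the callback names follow the description above; treat the exact signatures as indicative):

    import org.apache.activemq.artemis.core.server.ServerSession;
    import org.apache.activemq.artemis.core.server.plugin.ActiveMQServerPlugin;
    import org.apache.activemq.artemis.spi.core.protocol.RemotingConnection;

    public class LoggingPlugin implements ActiveMQServerPlugin {

       @Override
       public void afterCreateConnection(RemotingConnection connection) {
          // react to a new connection, e.g. log or collect metrics
          System.out.println("connection created: " + connection.getRemoteAddress());
       }

       @Override
       public void afterCreateSession(ServerSession session) {
          System.out.println("session created: " + session.getName());
       }
    }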
Use `long` array for hourly counters instead of `int` array.
Prevents overflow when the number of new messages (a `long`) is added.
Fixes one of the "Implicit narrowing conversion in compound assignment"
alerts on https://lgtm.com/projects/g/apache/activemq-artemis/alerts.
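The underlying pitfall and fix, as a tiny illustration (field names are made up):

    class CounterExample {
       private final long[] messagesAddedPerHour = new long[24];   // was int[]

       void record(int hour, long newMessages) {
          // with an int[] this still compiles: `array[hour] += newMessages` hides an
          // implicit (int) cast of the long result, which can silently overflow
          messagesAddedPerHour[hour] += newMessages;
       }
    }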
Have `AddressControlImpl::getMessageCount` use and return a `long`.
Prevents potential overflow from use of an `int` count variable.
Fixes one of the "Implicit narrowing conversion in compound assignment"
alerts at https://lgtm.com/projects/g/apache/activemq-artemis/alerts.
Instead of going directly into backup mode within the shared-store
live activation, we just change the HA-policy to slave and return
to the caller - ActiveMQServerImpl.internalStart().
The caller will then handle the backup activation as usual
in a separate thread, such that EmbeddedJMS.start() can return.
Also added a related integration test.
sendMessage() may throw an ActiveMQException that causes a CNFE
at the management client. Also it should check whether the headers
in the message are null (to prevent an NPE).
The ActiveMQJAASSecurityManager class uses LoginContext to validate
users and roles. LoginContext loads LoginModule classes defined in
the configuration (login.config) using the current thread's context
classloader.
Normally this wouldn't be a problem but when a caller thread comes
from JMX (for example a client calls QueueControl.sendMessage() via
JMX) the caller thread has a different context class loader.
This will cause the LoginContext to fail to load the LoginModule
class (e.g. PropertiesLoginModule) and the validation will fail
even if correct credentials are supplied.
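The usual workaround, sketched with a placeholder entry name and callback handler: pin the context classloader to one that can see the LoginModule while the LoginContext does its work, then restore it.

    import javax.security.auth.callback.CallbackHandler;
    import javax.security.auth.login.LoginContext;

    final class JaasWithFixedTccl {
       static void login(String entryName, CallbackHandler handler) throws Exception {
          ClassLoader original = Thread.currentThread().getContextClassLoader();
          try {
             // use the classloader that actually contains e.g. PropertiesLoginModule,
             // not whatever classloader the JMX caller thread happens to carry
             Thread.currentThread().setContextClassLoader(JaasWithFixedTccl.class.getClassLoader());
             LoginContext context = new LoginContext(entryName, handler);
             context.login();
          } finally {
             Thread.currentThread().setContextClassLoader(original);
          }
       }
    }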
The broker should support fully qualified queue names (FQQN)
as well as bare queue names. This means that when clients access
a queue they have two equivalent ways to do so: one
is by queue name and the other is by FQQN (i.e. address::qname).
Currently only receiving is supported.
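For example, receiving via a plain JMS client can address the FQQN form directly (names here are illustrative):

    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;

    class FqqnExample {
       static MessageConsumer consume(Session session) throws Exception {
          // "my.address::my.queue" addresses the queue "my.queue" under "my.address"
          Queue queue = session.createQueue("my.address::my.queue");
          return session.createConsumer(queue);
       }
    }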
AIOFileLockManager doesn't work on NFS-mounted shared-store directories.
Since the GFS2 bug https://bugzilla.redhat.com/show_bug.cgi?id=678585
was fixed at the end of 2011, the class AIOFileLockManager is no longer needed and I have removed it.
The broker should support fully qualified queue names (FQQN)
as well as bare queue names. This means that when clients access
a queue they have two equivalent ways to do so: one
is by queue name and the other is by FQQN (i.e. address::qname).
Currently only receiving is supported.
Add tests for this management operation with both core and AMQP encoded
messages. Also fix a few problems with the implementation like not
checking the passed-in headers for null and not counting messages
properly.
When a replica attempts to connect to a live server using a group-name
and there are > 1 servers on the network using that group, there is a
chance it will fail because it doesn't keep track of all of the
topology data it receives. This fix ensures that all the topology data
from the cluster is tracked until it is used and fails, at which point it
is discarded.
When populate-validated-user = true AMQP messages can cause exceptions.
This feature isn't particularly applicable to AMQP so this commit
eliminates the exception and leaves the AMQP messages untouched
even if populate-validated-user = true. In other words,
populate-validated-user + AMQP is not supported.
If the broker fails to decode any packet from the buffer, it should
treat it as a critical bug and disconnect immediately.
Currently the broker only logs an error message.
The following changes are made to support Epoll.
Refactored SharedNioEventLoopGroup into the renamed SharedEventLoopGroup so it is generic (and can be re-used for both NIO and Epoll).
Add support and toggles for Epoll in NettyAcceptor and NettyConnector (with a fallback to NIO if Epoll cannot be loaded), as sketched below.
Removal from the code of PartialPooledByteBufAllocator, which caused bad addresses when doing native transport and is no longer needed - see the JIRA discussion.
New Connector Properties:
useEpoll - toggles whether to use Epoll, default true (but we fail back to NIO gracefully)
remotingThreads - same behaviour as nioRemotingThreads. The previous property is deprecated.
useGlobalWorkerPool - same behaviour as useNioGlobalWorkerPool. The old property is deprecated.
New Acceptor Properties:
useEpoll - toggles whether to use Epoll, default true (but we fail back to NIO gracefully)
useGlobalWorkerPool - same behaviour as useNioGlobalWorkerPool but for Epoll.
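The NIO failback boils down to the standard Netty availability check (a sketch; the real acceptor/connector wiring has more options):

    import io.netty.channel.EventLoopGroup;
    import io.netty.channel.epoll.Epoll;
    import io.netty.channel.epoll.EpollEventLoopGroup;
    import io.netty.channel.nio.NioEventLoopGroup;

    final class EventLoopGroups {
       static EventLoopGroup create(boolean useEpoll, int remotingThreads) {
          if (useEpoll && Epoll.isAvailable()) {
             return new EpollEventLoopGroup(remotingThreads);
          }
          // graceful failback to NIO when the native transport cannot be loaded
          return new NioEventLoopGroup(remotingThreads);
       }
    }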
This closes #1093
Adjust slow-consumer detection logic to use the number of messages in
the queue and not just the number of messages added since the last
check. This means the getRate() method now returns the rate of messages
which it *could* have dispatched since the last check rather than the
rate at which it received messages. This is a more reliable metric to
ensure the slow-consumer detection logic doesn't flag a consumer as
slow unfairly. However, the improved reliability comes at a performance cost
since getMessageCount() must lock the queue.
As part of my refactoring on AMQP, the broker shouldn't rely on Application properties
for any broker semantic changes on delivery.
I am removing any access to those now, so we can properly deal with this post 2.0.0.
Artemis exposes the createQueue() method to management consoles like JON.
If the queue to be created already exists it throws an ActiveMQException
back to the console, which will get a ClassNotFoundException when
deserializing the exception.
The same issue also applies to the AddressControl.sendMessage() method.
To fix this they should throw a common Java exception like IllegalStateException.
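The change is essentially this kind of translation at the management-facing edge (a sketch, with a hypothetical QueueCreator standing in for the management bean internals):

    final class ManagementErrorHandling {
       static void createQueueForConsole(QueueCreator creator, String name) {
          try {
             creator.createQueue(name);
          } catch (Exception e) {   // e.g. ActiveMQException, not on the console's classpath
             // rethrow as a plain JDK exception; pass only the message, since chaining
             // the original exception would reintroduce the ClassNotFoundException
             throw new IllegalStateException(e.getMessage());
          }
       }

       interface QueueCreator { void createQueue(String name) throws Exception; }
    }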
With this we can send and receive messages in their raw format,
without requiring conversions to Core.
- MessageImpl and ServerMessage are removed as part of this
- AMQPMessage and CoreMessage will have the specialized message format for each protocol
- The protocol manager is now responsible for sending the message
- The message will provide an encoder for journal and paging
Due to recent changes, the web component is shut down by the
server, but the shutdown flag is lost, so the web component's
cleanup check method does not get called and the web's tmp
dir is left behind after the user stops the broker (control-c).
The fix is to add a suitable API to allow passing of the
flag so the web component can make sure its tmp dir gets
cleaned up properly before exiting the VM.
In a cluster of replicated live/backup pairs if a backup crashes and
then its live crashes the cluster will retain the topology information
of the live such that when the live server restarts it will check the
cluster to see if its nodeID is present (which it will be) and then it
will activate as a backup rather than a live. To prevent this situation
an additional check is necessary to see if the server with the matching
nodeID is actually active or not which is done by attempting to make a
connection to it.
Remove the reconnect attempt as it serves no purpose since replication has stopped anyway.
Ensure that if the vote fails the latch is counted down to avoid a delay.
If we can't detect that the live has failed, signal failure in replication.
https://issues.apache.org/jira/browse/ARTEMIS-866