The failback process needs to be deterministic rather than relying on various
incarnations of Thread.sleep() at crucial points. Important aspects of this
change include:
1) Make the initial replication synchronization process block at the very
last step and wait for a response from the replica to ensure the replica has
as the necessary data. This is a critical piece of knowledge during the
failback process because it allows the soon-to-become-backup server to know
for sure when it can shut itself down and allow the soon-to-become-live
server to take over. Also, introduce a new configuration element called
"initial-replication-sync-timeout" to conrol how long this blocking will occur.
2) Set the state of the server as 'LIVE' only after the server is fully
started. This is necessary because once the soon-to-be-backup server shuts
down it needs to know that the soon-to-be-live server has started fully before
it restarts itself as the new backup. If the soon-to-be-backup server restarts
before the soon-to-be-live is fully started then it won't actually become a
backup server but instead will become a live server which will break the
failback process.
3) Wait to receive the announcement of a backup server before failing-back.
Netty 4.x uses pooled buffers. These buffers can run out of memory when
transferring large amounts of data over connection. This was causing an
OutOfMemory exception to be thrown on the CoreBridge when tranferring
large messages. Netty provides a callback handler to notify listeners
when a Connection is writable. This patch adds the ability to register
connection writable listeners to the Netty connection and registers the
relevant callback from the Bridge to avoid writing when the buffers are
full.
It is possible for the closure of one resource to potentially impact
another since they are now sharing the same ServerLocator instance.
Keep track of references to avoid this.
In one situation I have seen a failrue on ProducerFlowControl to break everything else from here
This change will both avoid the failure and change the report of leaked threads so we can find them easily on the system.out when it happens (for future debugging)
-Using core api to inspect queue status
-Catch command visit() exceptions in order to
pass it back to client.
-Correct destination add/remove handlings
Inbound sessions are always created from the same ActiveMQConnectionFactory
which means the load-balancing policy is applied to them in the expected
manner. However, outbound sessions are created from independent, unique
ActiveMQConnectionFactory instances which means that the load-balancing
doesn't follow the expected pattern.
This commit changes this behavior by caching each unique
ActiveMQConnectionFactory instance and using it for both inbound and outbound
sessions potentially. This ensures the sessions are load-balanced as
expected.
Currently a cluster bridge will continue to attempt to reconnect to
a node that sends it a DISCONNECT until its reconnect-attempts is
exhausted. A DISCONNECT message indicates that the node is not coming
back so no reconnect attempt should be made and the bridge should be
stopped, the bindings should be cleaned up, etc.
The change to this test exposes this problem.
this is just calling Idea format on all the files using the new style
I am separating manual changes from automatic changes in case I have to repeat the manual changes again
https://issues.apache.org/jira/browse/ARTEMIS-163
On this pass I'm just converting the native layer to a simpler one.
It wasn't very easy to change the alignment at the current framework,
so I did some refactoring simplifying the native layer
The volume of the nubmer of changes here is because:
- The API is changed, we now don't close the libaio queue between files
- The native layer won't use malloc as much as it used to, saving some CPU and memory defragmentation
- I organized the code around nio and libaio
-- added addConnector() method so new connector can be created for tests
-- fix zero-byte key store files that cause test failures
-- fix a invm issue in test
The current Security Manager implementation was returning the username
instead of the default password when validating the default user.
This patch returns the correct value and cleans up the validate method.
https://issues.apache.org/jira/browse/ARTEMIS-138
The list method should return an empty list in case of non existent folders,
So this would unveil whatever is the cause for non existent folders at the next level where it's happening
https://issues.apache.org/jira/browse/ARTEMIS-136
From what I researched from implementers of XA TM if you throw ERR over communication errors the transaction manager will create
an heuristic transaction to be manually dealt with.
Other XA Implementations (such as Oracle JDBC) are return FAIL over communication failures during any XA operation.
https://issues.apache.org/jira/browse/ARTEMIS-135
This is importing a recent fix from the journal on the old version.
If a crash happened between the create file and the fill of the file the
file wouldn't be loaded and the server wouldn't start until you removed the offending file
To reproduce this commit, apply a replace regex rule using:
search regex: /\*\*\n \* Licensed
replace: /\*\n \* Licensed
These files had to be changed manually:
artemis-selector/src/main/javacc/HyphenatedParser.jj
artemis-selector/src/main/javacc/StrictParser.jj
artemis-website/src/main/resources/styles/impact/css/pygmentize.css
artemis-website/src/main/resources/styles/impact/css/site.css
We had a few reported small issues on the codebase from the recent introduced google error prone.
This should eliminate any issues, and I am making sure these won't happen again
If standalone backup server with shared has defined scale-down policy
but it's disabled then backup does not activate. Problem is that
server is checking only whether scale down is defined but if it's
enabled. This causes that server.stop() is called and backup does
not activate.
Lots of work on the test-suite in this commit including:
- Rename ServiceTestBase to ActiveMQTestBase
- Make AddressSettings fluent
- Remove unnecessary tearDown() implementations
- Use ActiveMQTestBase.create*Locator() instead of
ActiveMQClient.createServerLocator*(..)
- Use fluent ServerLocator methods
- Make sure all ActiveMQServers.newActiveMQServer invocations
are surrounded with addServer() where appropriate
- Create a few example tests to be references from hacking-guide
- Update hacking-guide with more info on writing tests
- Refactor config creation methods in ActiveMQTestBase
This has bothered me for awhile, but writing the hacking guide has
given me an opportunity to refactor some of our test-suite to be
simpler, more consistent, and easier to understand. This is
important if we want users to provide well-written tests. Our
test-suite is an important part of the code-base and it should be
easy to write good tests.
Basically I just consolidated CoreUnitTestCase, UnitTestCase, and
ServiceTestBase into a single class named ServiceTestBase. I also
simplified some of the configuration creation methods to reduce
duplicated code.