= Duplicate Message Detection
:idprefix:
:idseparator: -

Apache ActiveMQ Artemis includes powerful automatic duplicate message detection, filtering out duplicate messages without you having to code your own fiddly duplicate detection logic at the application level.
This chapter will explain what duplicate detection is, how Apache ActiveMQ Artemis uses it and how and where to configure it.

When a client sends a message to a server, or one server sends to another, the target server or the connection may fail after the message is sent but before the sender receives a response confirming that the send (or commit) was processed.
In that case the sender cannot know for sure whether the message reached the address.

If the failure occurred after the send was received and processed but before the response was sent back, then the message reached the address.
If the failure occurred before the send was received and fully processed, then the message did not reach the address.
From the sender's point of view it is not possible to distinguish these two cases.

When the server recovers, this leaves the client in a difficult situation.
It knows the target server failed, but it does not know whether the last message reached its destination.
If it decides to resend the last message, then that could result in a duplicate message being sent to the address.
If each message was an order or a trade then this could result in the order being fulfilled twice or the trade being double booked.
This is clearly not a desirable situation.

Sending the message(s) in a transaction does not help either.
If the server or connection fails while the transaction commit is being processed, it is equally indeterminate whether the transaction was successfully committed.

To solve these issues, Apache ActiveMQ Artemis provides automatic duplicate message detection for messages sent to addresses.

== Using Duplicate Detection for Message Sending

Enabling duplicate message detection for sent messages is simple: you just need to set a special property on the message to a unique value.
You can create the value however you like, as long as it is unique.
When the target server receives the message, it checks whether that property is set.
If it is, the server checks its in-memory cache to see whether it has already received a message with the same value of the property.
If it has received a message with the same value before then it will ignore the message.
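
The check described above can be pictured with a small toy model.
This is only a sketch of the idea, not the broker's actual implementation: the class name, the `receive` method, and the use of one plain `Set` per address are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of server-side duplicate detection; not the broker's real code. */
final class DuplicateCheckSketch {
    // One set of already-seen duplicate ids per address (hypothetical structure).
    private final Map<String, Set<String>> seenIdsByAddress = new HashMap<>();

    /**
     * Returns true if the message is accepted and false if it is ignored as a duplicate.
     * A null duplicateId models a message without the property, so no check is made.
     */
    public boolean receive(String address, String duplicateId) {
        if (duplicateId == null) {
            return true;
        }
        Set<String> seen = seenIdsByAddress.computeIfAbsent(address, a -> new HashSet<>());
        // Set.add returns false when the value was already present.
        return seen.add(duplicateId);
    }

    public static void main(String[] args) {
        DuplicateCheckSketch server = new DuplicateCheckSketch();
        System.out.println(server.receive("orders", "id-1")); // true: first send
        System.out.println(server.receive("orders", "id-1")); // false: resend is ignored
        System.out.println(server.receive("trades", "id-1")); // true: different address
    }
}
```

The second call models a sender that resends after losing the response: because the id is unchanged, the resend is recognised and dropped.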

[NOTE]
====


Using duplicate detection to move messages between nodes can give you the same _once and only once_ delivery guarantees as if you were using an XA transaction to consume messages from source and send them to the target, but with less overhead and much easier configuration than using XA.
====

If you're sending messages in a transaction then you don't have to set the property for _every_ message you send in that transaction; you only need to set it once in the transaction.
If the server detects a duplicate message for any message in the transaction, then it will ignore the entire transaction.

The name of the property that you set is given by the value of `org.apache.activemq.artemis.api.core.Message.HDR_DUPLICATE_DETECTION_ID`, which is `_AMQ_DUPL_ID`.

The value of the property can be of type `byte[]` or `SimpleString` if you're using the core API.
If you're using JMS it must be a `String`, and its value should be unique.
An easy way of generating a unique id is by generating a UUID.

Here's an example of setting the property using the core API:

[,java]
----
...

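ClientMessage message = session.createMessage(true);

// Could use a UUID for this
SimpleString myUniqueID = new SimpleString("This is my unique id");

// HDR_DUPLICATE_DETECTION_ID is statically imported from org.apache.activemq.artemis.api.core.Message
message.putStringProperty(HDR_DUPLICATE_DETECTION_ID, myUniqueID);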
----

And here's an example using the JMS API:

[,java]
----
...

Message jmsMessage = session.createMessage();

String myUniqueID = "This is my unique id";   // Could use a UUID for this

jmsMessage.setStringProperty(HDR_DUPLICATE_DETECTION_ID.toString(), myUniqueID);

...
----
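
Both examples use a fixed string, but a UUID is usually a more practical source of unique values.
The following sketch uses only the JDK; the class and method names are hypothetical and not part of any Artemis API.
It produces a `String` form and a 16-byte `byte[]` form of a random UUID.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Hypothetical helper for creating duplicate detection ids from random UUIDs. */
final class DuplicateIdSketch {

    /** String form, for use where the property value must be a String (for example JMS). */
    static String newStringId() {
        return UUID.randomUUID().toString();
    }

    /** 16-byte form of a random UUID, for use where a byte[] value is accepted. */
    static byte[] newBytesId() {
        UUID uuid = UUID.randomUUID();
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        String id = newStringId();
        System.out.println(id + " (" + id.length() + " characters)");
        System.out.println(newBytesId().length + " bytes");
    }
}
```

Whichever form is used, the id should be generated once per message (or once per transaction) and reused unchanged on any resend, because a resend that carries a fresh id cannot be recognised as a duplicate.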

== Configuring the Duplicate ID Cache

The server maintains caches of received values of the `org.apache.activemq.artemis.api.core.Message.HDR_DUPLICATE_DETECTION_ID` property sent to each address.
Each address has its own distinct cache.

The cache is a circular, fixed-size cache.
If the cache has a maximum size of `n` elements, then the ``n + 1``th id stored will overwrite the ``0``th element in the cache.
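
The overwrite behaviour can be illustrated with a minimal circular cache.
This is a simplified model written for this chapter, not the broker's implementation; the class name and its methods are assumptions.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified model of a circular, fixed-size duplicate id cache. */
final class CircularIdCacheSketch {
    private final String[] slots;                       // fixed-size ring of stored ids
    private final Set<String> index = new HashSet<>();  // fast lookup of stored ids
    private int next = 0;                               // slot the next id will occupy

    CircularIdCacheSketch(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        slots = new String[maxSize];
    }

    boolean contains(String id) {
        return index.contains(id);
    }

    /** Stores the id and returns true, or returns false if the id is already cached. */
    boolean add(String id) {
        if (index.contains(id)) {
            return false;
        }
        String evicted = slots[next];
        if (evicted != null) {
            index.remove(evicted); // the oldest id is overwritten once the ring is full
        }
        slots[next] = id;
        index.add(id);
        next = (next + 1) % slots.length;
        return true;
    }

    public static void main(String[] args) {
        CircularIdCacheSketch cache = new CircularIdCacheSketch(3);
        cache.add("a");
        cache.add("b");
        cache.add("c");
        cache.add("d");                          // fourth id overwrites slot 0, which held "a"
        System.out.println(cache.contains("a")); // false
        System.out.println(cache.contains("b")); // true
    }
}
```

Once `"a"` has been overwritten, a resend carrying that id would no longer be detected, which is why the cache size matters.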

The maximum size of the cache is configured by the parameter `id-cache-size` in `broker.xml`.
The default value is `20000` elements.

To set an address-specific `id-cache-size`, add it to the corresponding address-settings section in `broker.xml`.
For messages sent to an address that has its own `id-cache-size` configured, that value takes precedence over the global `id-cache-size`.
Addresses without a specific value use the global setting.

The caches can also be configured to persist to disk or not.
This is configured by the parameter `persist-id-cache`, also in `broker.xml`.
If this is set to `true` then each id will be persisted to permanent storage as it is received.
The default value for this parameter is `true`.
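
As a rough illustration, the cache settings described in this section might appear in `broker.xml` as follows.
The address match `orders.#` and the size `50000` are made-up values, and all unrelated configuration is omitted.

```xml
<core xmlns="urn:activemq:core">
   <!-- Global settings that apply to every address -->
   <id-cache-size>20000</id-cache-size>
   <persist-id-cache>true</persist-id-cache>

   <address-settings>
      <!-- Hypothetical match: overrides the global size for these addresses only -->
      <address-setting match="orders.#">
         <id-cache-size>50000</id-cache-size>
      </address-setting>
   </address-settings>
</core>
```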

[NOTE]
====


When choosing the size of the duplicate id cache, be sure to make it large enough that, if you resend messages, the ids of all the previously sent ones are still in the cache and have not been overwritten.
====

== Duplicate Detection and Bridges

Core bridges can be configured to automatically add a unique duplicate id value (if there isn't already one in the message) before forwarding the message to its target.
This ensures that if the target server crashes or the connection is interrupted and the bridge resends the message, then if it has already been received by the target server, it will be ignored.

To configure a core bridge to add the duplicate id header, set `use-duplicate-detection` to `true` when configuring the bridge in `broker.xml`.

The default value for this parameter is `true`.
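
A bridge definition with the parameter set explicitly might look like the sketch below.
The bridge, queue, address, and connector names are hypothetical placeholders, and the other bridge options are left out.

```xml
<bridges>
   <!-- All names below are placeholders for illustration -->
   <bridge name="my-bridge">
      <queue-name>source-queue</queue-name>
      <forwarding-address>target-address</forwarding-address>
      <use-duplicate-detection>true</use-duplicate-detection>
      <static-connectors>
         <connector-ref>remote-connector</connector-ref>
      </static-connectors>
   </bridge>
</bridges>
```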

For more information on core bridges and how to configure them, please see xref:core-bridges.adoc#core-bridges[Core Bridges].

== Duplicate Detection and Cluster Connections

Cluster connections internally use core bridges to move messages reliably between nodes of the cluster.
Consequently, they can also be configured to insert the duplicate id header for each message they move using their internal bridges.

To configure a cluster connection to add the duplicate id header, set `use-duplicate-detection` to `true` when configuring the cluster connection in `broker.xml`.

The default value for this parameter is `true`.

For more information on cluster connections and how to configure them, please see xref:clusters.adoc#clusters[Clusters].