activemq-artemis/docs/user-manual/en/perf-tuning.xml

<?xml version="1.0" encoding="UTF-8"?>
<!-- ============================================================================= -->
<!-- Copyright © 2009 Red Hat, Inc. and others.                                    -->
<!--                                                                               -->
<!-- The text of and illustrations in this document are licensed by Red Hat under  -->
<!-- a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). -->
<!--                                                                               -->
<!-- An explanation of CC-BY-SA is available at                                    -->
<!--                                                                               -->
<!--            http://creativecommons.org/licenses/by-sa/3.0/.                    -->
<!--                                                                               -->
<!-- In accordance with CC-BY-SA, if you distribute this document or an adaptation -->
<!-- of it, you must provide the URL for the original version.                     -->
<!--                                                                               -->
<!-- Red Hat, as the licensor of this document, waives the right to enforce,       -->
<!-- and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent        -->
<!-- permitted by applicable law.                                                  -->
<!-- ============================================================================= -->

<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "HornetQ_User_Manual.ent">
%BOOK_ENTITIES;
]>
<chapter id="perf-tuning">
    <title>Performance Tuning</title>
    <para>In this chapter we'll discuss how to tune HornetQ for optimum performance.</para>
    <section>
        <title>Tuning persistence</title>
        <itemizedlist>
            <listitem>
                <para>Put the message journal on its own physical volume. If the disk is shared with
                    other processes e.g. transaction co-ordinator, database or other journals which
                    are also reading and writing from it, then this may greatly reduce performance
                    since the disk head may be skipping all over the place between the different
                    files. One of the advantages of an append only journal is that disk head
                    movement is minimised - this advantage is destroyed if the disk is shared. If
                    you're using paging or large messages make sure they're ideally put on separate
                    volumes too.</para>
            </listitem>
            <listitem>
                <para>Minimum number of journal files. Set <literal>journal-min-files</literal> to a
                    number of files that would fit your average sustainable rate. If you see new
                    files being created on the journal data directory too often, i.e. lots of data
                    is being persisted, you need to increase the minimal number of files, this way
                    the journal would reuse more files instead of creating new data files.</para>
            </listitem>
            <listitem>
                <para>Journal file size. The journal file size should be aligned to the capacity of
                    a cylinder on the disk. The default value 10MiB should be enough on most
                    systems.</para>
            </listitem>
            <listitem>
                <para>Use AIO journal. If using Linux, try to keep your journal type as AIO. AIO
                    will scale better than Java NIO.</para>
            </listitem>
            <listitem>
                <para>Tune <literal>journal-buffer-timeout</literal>. The timeout can be increased
                    to increase throughput at the expense of latency.</para>
            </listitem>
            <listitem>
                <para>If you're running AIO you might be able to get some better performance by
                    increasing <literal>journal-max-io</literal>. DO NOT change this parameter if
                    you are running NIO.</para>
            </listitem>
        </itemizedlist>
    </section>
    <section>
        <title>Tuning JMS</title>
        <para>There are a few areas where some tweaks can be done if you are using the JMS
            API</para>
        <itemizedlist>
            <listitem>
                <para>Disable message id. Use the <literal>setDisableMessageID()</literal> method on
                    the <literal>MessageProducer</literal> class to disable message ids if you don't
                    need them. This decreases the size of the message and also avoids the overhead
                    of creating a unique ID.</para>
            </listitem>
            <listitem>
                <para>Disable message timestamp. Use the <literal
                        >setDisableMessageTimeStamp()</literal> method on the <literal
                        >MessageProducer</literal> class to disable message timestamps if you don't
                    need them.</para>
            </listitem>
            <listitem>
                <para>Avoid <literal>ObjectMessage</literal>. <literal>ObjectMessage</literal> is
                    convenient but it comes at a cost. The body of a <literal
                        >ObjectMessage</literal> uses Java serialization to serialize it to bytes.
                    The Java serialized form of even small objects is very verbose so takes up a lot
                    of space on the wire, also Java serialization is slow compared to custom
                    marshalling techniques. Only use <literal>ObjectMessage</literal> if you really
                    can't use one of the other message types, i.e. if you really don't know the type
                    of the payload until run-time.</para>
            </listitem>
            <listitem>
                <para>Avoid <literal>AUTO_ACKNOWLEDGE</literal>. <literal>AUTO_ACKNOWLEDGE</literal>
                    mode requires an acknowledgement to be sent from the server for each message
                    received on the client, this means more traffic on the network. If you can, use
                        <literal>DUPS_OK_ACKNOWLEDGE</literal> or use <literal
                        >CLIENT_ACKNOWLEDGE</literal> or a transacted session and batch up many
                    acknowledgements with one acknowledge/commit. </para>
            </listitem>
            <listitem>
                <para>Avoid durable messages. By default JMS messages are durable. If you don't
                    really need durable messages then set them to be non-durable. Durable messages
                    incur a lot more overhead in persisting them to storage.</para>
            </listitem>
            <listitem>
                <para>Batch many sends or acknowledgements in a single transaction. HornetQ will
                    only require a network round trip on the commit, not on every send or
                    acknowledgement.</para>
            </listitem>
        </itemizedlist>
    </section>
    <section>
        <title>Other Tunings</title>
        <para>There are various other places in HornetQ where we can perform some tuning:</para>
        <itemizedlist>
            <listitem>
                <para>Use Asynchronous Send Acknowledgements. If you need to send durable messages
                    non transactionally and you need a guarantee that they have reached the server
                    by the time the call to send() returns, don't set durable messages to be sent
                    blocking, instead use asynchronous send acknowledgements to get your
                    acknowledgements of send back in a separate stream, see <xref
                        linkend="send-guarantees"/> for more information on this.</para>
            </listitem>
            <listitem>
                <para>Use pre-acknowledge mode. With pre-acknowledge mode, messages are acknowledged
                        <literal>before</literal> they are sent to the client. This reduces the
                    amount of acknowledgement traffic on the wire. For more information on this, see
                        <xref linkend="pre-acknowledge"/>.</para>
            </listitem>
            <listitem>
                <para>Disable security. You may get a small performance boost by disabling security
                    by setting the <literal>security-enabled</literal> parameter to <literal
                        >false</literal> in <literal>hornetq-configuration.xml</literal>.</para>
            </listitem>
            <listitem>
                <para>Disable persistence. If you don't need message persistence, turn it off
                    altogether by setting <literal>persistence-enabled</literal> to false in
                        <literal>hornetq-configuration.xml</literal>.</para>
            </listitem>
            <listitem>
                <para>Sync transactions lazily. Setting <literal
                        >journal-sync-transactional</literal> to <literal>false</literal> in
                        <literal>hornetq-configuration.xml</literal> can give you better
                    transactional persistent performance at the expense of some possibility of loss
                    of transactions on failure. See <xref linkend="send-guarantees"/> for more
                    information.</para>
            </listitem>
            <listitem>
                <para>Sync non transactional lazily. Setting <literal
                        >journal-sync-non-transactional</literal> to <literal>false</literal> in
                        <literal>hornetq-configuration.xml</literal> can give you better
                    non-transactional persistent performance at the expense of some possibility of
                    loss of durable messages on failure. See <xref linkend="send-guarantees"/> for
                    more information.</para>
            </listitem>
            <listitem>
                <para>Send messages non blocking. Setting <literal>block-on-durable-send</literal>
                    and <literal>block-on-non-durable-send</literal> to <literal>false</literal> in
                        <literal>hornetq-jms.xml</literal> (if you're using JMS and JNDI) or
                    directly on the ServerLocator. This means you don't have to wait a whole
                    network round trip for every message sent. See <xref linkend="send-guarantees"/>
                    for more information.</para>
            </listitem>
            <listitem>
                <para>If you have very fast consumers, you can increase consumer-window-size. This
                    effectively disables consumer flow control.</para>
            </listitem>
            <listitem>
                <para>Socket NIO vs Socket Old IO. By default HornetQ uses old (blocking) on the
                    server and the client side (see the chapter on configuring transports for more
                    information <xref linkend="configuring-transports"/>). NIO is much more scalable
                    but can give you some latency hit compared to old blocking IO. If you need to be
                    able to service many thousands of connections on the server, then you should
                    make sure you're using NIO on the server. However, if don't expect many
                    thousands of connections on the server you can keep the server acceptors using
                    old IO, and might get a small performance advantage.</para>
            </listitem>
            <listitem>
                <para>Use the core API not JMS. Using the JMS API you will have slightly lower
                    performance than using the core API, since all JMS operations need to be
                    translated into core operations before the server can handle them. If using the
                    core API try to use methods that take <literal>SimpleString</literal> as much as
                    possible. <literal>SimpleString</literal>, unlike java.lang.String does not
                    require copying before it is written to the wire, so if you re-use <literal
                        >SimpleString</literal> instances between calls then you can avoid some
                    unnecessary copying.</para>
            </listitem>
        </itemizedlist>
    </section>
    <section>
        <title>Tuning Transport Settings</title>
        <itemizedlist>
            <listitem>
                <para>TCP buffer sizes. If you have a fast network and fast machines you may get a
                    performance boost by increasing the TCP send and receive buffer sizes. See the
                        <xref linkend="configuring-transports"/> for more information on this. </para>
                <note>
                    <para> Note that some operating systems like later versions of Linux include TCP
                        auto-tuning and setting TCP buffer sizes manually can prevent auto-tune from
                        working and actually give you worse performance!</para>
                </note>
            </listitem>
            <listitem>
                <para>Increase limit on file handles on the server. If you expect a lot of
                    concurrent connections on your servers, or if clients are rapidly opening and
                    closing connections, you should make sure the user running the server has
                    permission to create sufficient file handles.</para>
                <para>This varies from operating system to operating system. On Linux systems you
                    can increase the number of allowable open file handles in the file <literal
                        >/etc/security/limits.conf</literal> e.g. add the lines
                    <programlisting>
serveruser     soft    nofile  20000
serveruser     hard    nofile  20000</programlisting>
                    This would allow up to 20000 file handles to be open by the user <literal
                        >serveruser</literal>. </para>
            </listitem>
            <listitem>
                <para>Use <literal>batch-delay</literal> and set <literal>direct-deliver</literal>
                    to false for the best throughput for very small messages. HornetQ comes with a
                    preconfigured connector/acceptor pair (<literal>netty-throughput</literal>) in
                        <literal>hornetq-configuration.xml</literal> and JMS connection factory
                        (<literal>ThroughputConnectionFactory</literal>) in <literal
                        >hornetq-jms.xml</literal>which can be used to give the very best
                    throughput, especially for small messages. See the <xref
                        linkend="configuring-transports"/> for more information on this.</para>
            </listitem>
        </itemizedlist>
    </section>
    <section>
        <title>Tuning the VM</title>
        <para>We highly recommend you use the latest Java JVM for the best performance. We test
            internally using the Sun JVM, so some of these tunings won't apply to JDKs from other
            providers (e.g. IBM or JRockit)</para>
        <itemizedlist>
            <listitem>
                <para>Garbage collection. For smooth server operation we recommend using a parallel
                    garbage collection algorithm, e.g. using the JVM argument <literal
                        >-XX:+UseParallelOldGC</literal> on Sun JDKs.</para>
            </listitem>
            <listitem id="perf-tuning.memory">
                <para>Memory settings. Give as much memory as you can to the server. HornetQ can run
                    in low memory by using paging (described in <xref linkend="paging"/>) but if it
                    can run with all queues in RAM this will improve performance. The amount of
                    memory you require will depend on the size and number of your queues and the
                    size and number of your messages. Use the JVM arguments <literal>-Xms</literal>
                    and <literal>-Xmx</literal> to set server available RAM. We recommend setting
                    them to the same high value.</para>
            </listitem>
            <listitem>
                <para>Aggressive options. Different JVMs provide different sets of JVM tuning
                    parameters, for the Sun Hotspot JVM the full list of options is available <ulink
                        url="http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html"
                        >here</ulink>. We recommend at least using <literal
                        >-XX:+AggressiveOpts</literal> and<literal>
                        -XX:+UseFastAccessorMethods</literal>. You may get some mileage with the
                    other tuning parameters depending on your OS platform and application usage
                    patterns.</para>
            </listitem>
        </itemizedlist>
    </section>
    <section>
        <title>Avoiding Anti-Patterns</title>
        <itemizedlist>
            <listitem>
                <para>Re-use connections / sessions / consumers / producers. Probably the most
                    common messaging anti-pattern we see is users who create a new
                    connection/session/producer for every message they send or every message they
                    consume. This is a poor use of resources. These objects take time to create and
                    may involve several network round trips. Always re-use them.</para>
                <note>
                    <para>Some popular libraries such as the Spring JMS Template are known to use
                        these anti-patterns. If you're using Spring JMS Template and you're getting
                        poor performance you know why. Don't blame HornetQ! The Spring JMS Template
                        can only safely be used in an app server which caches JMS sessions (e.g.
                        using JCA), and only then for sending messages. It cannot be safely be used
                        for synchronously consuming messages, even in an app server. </para>
                </note>
            </listitem>
            <listitem>
                <para>Avoid fat messages. Verbose formats such as XML take up a lot of space on the
                    wire and performance will suffer as result. Avoid XML in message bodies if you
                    can.</para>
            </listitem>
            <listitem>
                <para>Don't create temporary queues for each request. This common anti-pattern
                    involves the temporary queue request-response pattern. With the temporary queue
                    request-response pattern a message is sent to a target and a reply-to header is
                    set with the address of a local temporary queue. When the recipient receives the
                    message they process it then send back a response to the address specified in
                    the reply-to. A common mistake made with this pattern is to create a new
                    temporary queue on each message sent. This will drastically reduce performance.
                    Instead the temporary queue should be re-used for many requests.</para>
            </listitem>
            <listitem>
                <para>Don't use Message-Driven Beans for the sake of it. As soon as you start using
                    MDBs you are greatly increasing the codepath for each message received compared
                    to a straightforward message consumer, since a lot of extra application server
                    code is executed. Ask yourself do you really need MDBs? Can you accomplish the
                    same task using just a normal message consumer?</para>
            </listitem>
        </itemizedlist>
    </section>
</chapter>