<chapter id="performance">
|
|
<title>Improving performance</title>
|
|
|
|
<sect1 id="performance-fetching" revision="2">
|
|
<title>Fetching strategies</title>
|
|
|
|
<para>
|
|
A <emphasis>fetching strategy</emphasis> is the strategy Hibernate will use for
|
|
retrieving associated objects if the application needs to navigate the association.
|
|
Fetch strategies may be declared in the O/R mapping metadata, or overridden by a
|
|
particular HQL or <literal>Criteria</literal> query.
|
|
</para>
|
|
|
|
<para>
|
|
Hibernate3 defines the following fetching strategies:
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Join fetching</emphasis> - Hibernate retrieves the
|
|
associated instance or collection in the same <literal>SELECT</literal>,
|
|
using an <literal>OUTER JOIN</literal>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Select fetching</emphasis> - a second <literal>SELECT</literal>
|
|
is used to retrieve the associated entity or collection. Unless
|
|
you explicitly disable lazy fetching by specifying <literal>lazy="false"</literal>,
|
|
this second select will only be executed when you actually access the
|
|
association.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Subselect fetching</emphasis> - a second <literal>SELECT</literal>
|
|
is used to retrieve the associated collections for all entities retrieved in a
|
|
previous query or fetch. Unless you explicitly disable lazy fetching by specifying
|
|
<literal>lazy="false"</literal>, this second select will only be executed when you
|
|
actually access the association.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Batch fetching</emphasis> - an optimization strategy
|
|
for select fetching - Hibernate retrieves a batch of entity instances
|
|
or collections in a single <literal>SELECT</literal>, by specifying
|
|
a list of primary keys or foreign keys.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
Hibernate also distinguishes between:
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Immediate fetching</emphasis> - an association, collection or
|
|
attribute is fetched immediately, when the owner is loaded.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Lazy collection fetching</emphasis> - a collection is fetched
|
|
when the application invokes an operation upon that collection. (This
|
|
is the default for collections.)
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>"Extra-lazy" collection fetching</emphasis> - individual
|
|
elements of the collection are accessed from the database as needed.
|
|
Hibernate tries not to fetch the whole collection into memory unless
|
|
absolutely needed (suitable for very large collections).
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Proxy fetching</emphasis> - a single-valued association is
|
|
fetched when a method other than the identifier getter is invoked
|
|
upon the associated object.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>"No-proxy" fetching</emphasis> - a single-valued association is
|
|
fetched when the instance variable is accessed. Compared to proxy fetching,
|
|
this approach is less lazy (the association is fetched even when only the
|
|
identifier is accessed) but more transparent, since no proxy is visible to
|
|
the application. This approach requires buildtime bytecode instrumentation
|
|
and is rarely necessary.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<emphasis>Lazy attribute fetching</emphasis> - an attribute or single
|
|
valued association is fetched when the instance variable is accessed.
|
|
This approach requires buildtime bytecode instrumentation and is rarely
|
|
necessary.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
We have two orthogonal notions here: <emphasis>when</emphasis> is the association
|
|
fetched, and <emphasis>how</emphasis> is it fetched (what SQL is used). Don't
|
|
confuse them! We use <literal>fetch</literal> to tune performance. We may use
|
|
<literal>lazy</literal> to define a contract for what data is always available
|
|
in any detached instance of a particular class.
|
|
</para>
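
<para>
As a rough sketch of how the two notions are set independently, the mapping fragment
below combines <literal>lazy</literal> (when) and <literal>fetch</literal> (how) on
different associations. The class, property and column names are hypothetical and
serve only as illustration.
</para>

<programlisting><![CDATA[<class name="Cat">
    <id name="id">
        <generator class="native"/>
    </id>

    <!-- when: on first access to the collection; how: a second SELECT -->
    <set name="kittens" lazy="true" fetch="select">
        <key column="motherId"/>
        <one-to-many class="Cat"/>
    </set>

    <!-- how: an outer join in the same SELECT when the Cat is
         retrieved via get() or load() -->
    <many-to-one name="mother" class="Cat" column="motherId" fetch="join"/>

    <!-- when: on first access to the instance variable, without a proxy
         (requires buildtime bytecode instrumentation) -->
    <many-to-one name="owner" class="Person" lazy="no-proxy"/>
</class>]]></programlisting>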
|
|
|
|
<sect2 id="performance-fetching-lazy">
|
|
<title>Working with lazy associations</title>
|
|
|
|
<para>
|
|
By default, Hibernate3 uses lazy select fetching for collections and lazy proxy
|
|
fetching for single-valued associations. These defaults make sense for almost
|
|
all associations in almost all applications.
|
|
</para>
|
|
|
|
<para>
|
|
<emphasis>Note:</emphasis> if you set
|
|
<literal>hibernate.default_batch_fetch_size</literal>, Hibernate will use the
|
|
batch fetch optimization for lazy fetching (this optimization may also be enabled
|
|
at a more granular level).
|
|
</para>
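
<para>
A minimal sketch of the global setting, assuming the property is placed in
<literal>hibernate.cfg.xml</literal>. The value <literal>16</literal> is only an
illustrative choice, not a recommendation derived from this chapter.
</para>

<programlisting><![CDATA[<hibernate-configuration>
    <session-factory>
        <!-- other settings omitted -->

        <!-- illustrative value: load up to 16 uninitialized proxies or
             collections per SELECT -->
        <property name="hibernate.default_batch_fetch_size">16</property>
    </session-factory>
</hibernate-configuration>]]></programlisting>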
|
|
|
|
<para>
|
|
However, lazy fetching poses one problem that you must be aware of. Access to a
|
|
lazy association outside of the context of an open Hibernate session will result
|
|
in an exception. For example:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[Session s = sessions.openSession();
|
|
Transaction tx = s.beginTransaction();
|
|
|
|
User u = (User) s.createQuery("from User u where u.name=:userName")
|
|
.setString("userName", userName).uniqueResult();
|
|
Map permissions = u.getPermissions();
|
|
|
|
tx.commit();
|
|
s.close();
|
|
|
|
Integer accessLevel = (Integer) permissions.get("accounts"); // Error!]]></programlisting>
|
|
|
|
<para>
|
|
Since the permissions collection was not initialized when the
|
|
<literal>Session</literal> was closed, the collection will not be able to
|
|
load its state. <emphasis>Hibernate does not support lazy initialization
|
|
for detached objects</emphasis>. The fix is to move the code that reads
|
|
from the collection to just before the transaction is committed.
|
|
</para>
|
|
|
|
<para>
|
|
Alternatively, we could use a non-lazy collection or association,
|
|
by specifying <literal>lazy="false"</literal> for the association mapping.
|
|
However, it is intended that lazy initialization be used for almost all
|
|
collections and associations. If you define too many non-lazy associations
|
|
in your object model, Hibernate will end up needing to fetch the entire
|
|
database into memory in every transaction!
|
|
</para>
|
|
|
|
<para>
|
|
On the other hand, we often want to choose join fetching (which is non-lazy by
|
|
nature) instead of select fetching in a particular transaction. We'll now see
|
|
how to customize the fetching strategy. In Hibernate3, the mechanisms for
|
|
choosing a fetch strategy are identical for single-valued associations and
|
|
collections.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-fetching-custom" revision="4">
|
|
<title>Tuning fetch strategies</title>
|
|
|
|
<para>
|
|
Select fetching (the default) is extremely vulnerable to N+1 selects problems,
|
|
so we might want to enable join fetching in the mapping document:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<set name="permissions"
|
|
fetch="join">
|
|
<key column="userId"/>
|
|
<one-to-many class="Permission"/>
|
|
</set>]]></programlisting>
|
|
|
|
<programlisting><![CDATA[<many-to-one name="mother" class="Cat" fetch="join"/>]]></programlisting>
|
|
|
|
<para>
|
|
The <literal>fetch</literal> strategy defined in the mapping document affects:
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
retrieval via <literal>get()</literal> or <literal>load()</literal>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
retrieval that happens implicitly when an association is navigated
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>Criteria</literal> queries
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
HQL queries if <literal>subselect</literal> fetching is used
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
No matter what fetching strategy you use, the defined non-lazy graph is guaranteed
|
|
to be loaded into memory. Note that this might result in several immediate selects
|
|
being used to execute a particular HQL query.
|
|
</para>
|
|
|
|
<para>
|
|
Usually, we don't use the mapping document to customize fetching. Instead, we
|
|
keep the default behavior, and override it for a particular transaction, using
|
|
<literal>left join fetch</literal> in HQL. This tells Hibernate to fetch
|
|
the association eagerly in the first select, using an outer join. In the
|
|
<literal>Criteria</literal> query API, you would use
|
|
<literal>setFetchMode(FetchMode.JOIN)</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
If you ever feel like you wish you could change the fetching strategy used by
|
|
<literal>get()</literal> or <literal>load()</literal>, simply use a
|
|
<literal>Criteria</literal> query, for example:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[User user = (User) session.createCriteria(User.class)
|
|
.setFetchMode("permissions", FetchMode.JOIN)
|
|
.add( Restrictions.idEq(userId) )
|
|
.uniqueResult();]]></programlisting>
|
|
|
|
<para>
|
|
(This is Hibernate's equivalent of what some ORM solutions call a "fetch plan".)
|
|
</para>
|
|
|
|
<para>
|
|
A completely different way to avoid problems with N+1 selects is to use the
|
|
second-level cache.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-fetching-proxies" revision="2">
|
|
<title>Single-ended association proxies</title>
|
|
|
|
<para>
|
|
Lazy fetching for collections is implemented using Hibernate's own implementation
|
|
of persistent collections. However, a different mechanism is needed for lazy
|
|
behavior in single-ended associations. The target entity of the association must
|
|
be proxied. Hibernate implements lazy initializing proxies for persistent objects
|
|
using runtime bytecode enhancement (via the excellent CGLIB library).
|
|
</para>
|
|
|
|
<para>
|
|
By default, Hibernate3 generates proxies (at startup) for all persistent classes
|
|
and uses them to enable lazy fetching of <literal>many-to-one</literal> and
|
|
<literal>one-to-one</literal> associations.
|
|
</para>
|
|
|
|
<para>
|
|
The mapping file may declare an interface to use as the proxy interface for that
|
|
class, with the <literal>proxy</literal> attribute. By default, Hibernate uses a subclass
|
|
of the class. <emphasis>Note that the proxied class must implement a default constructor
|
|
with at least package visibility. We recommend this constructor for all persistent classes!</emphasis>
|
|
</para>
|
|
|
|
<para>
|
|
There are some gotchas to be aware of when extending this approach to polymorphic
|
|
classes, e.g.:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<class name="Cat" proxy="Cat">
|
|
......
|
|
<subclass name="DomesticCat">
|
|
.....
|
|
</subclass>
|
|
</class>]]></programlisting>
|
|
|
|
<para>
|
|
Firstly, instances of <literal>Cat</literal> will never be castable to
|
|
<literal>DomesticCat</literal>, even if the underlying instance is an
|
|
instance of <literal>DomesticCat</literal>:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[Cat cat = (Cat) session.load(Cat.class, id); // instantiate a proxy (does not hit the db)
|
|
if ( cat.isDomesticCat() ) { // hit the db to initialize the proxy
|
|
DomesticCat dc = (DomesticCat) cat; // Error!
|
|
....
|
|
}]]></programlisting>
|
|
|
|
<para>
|
|
Secondly, it is possible to break proxy <literal>==</literal>.
|
|
</para>
|
|
|
|
<programlisting><![CDATA[Cat cat = (Cat) session.load(Cat.class, id); // instantiate a Cat proxy
|
|
DomesticCat dc =
|
|
(DomesticCat) session.load(DomesticCat.class, id); // acquire new DomesticCat proxy!
|
|
System.out.println(cat==dc); // false]]></programlisting>
|
|
|
|
<para>
|
|
However, the situation is not quite as bad as it looks. Even though we now have two references
|
|
to different proxy objects, the underlying instance will still be the same object:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[cat.setWeight(11.0); // hit the db to initialize the proxy
|
|
System.out.println( dc.getWeight() ); // 11.0]]></programlisting>
|
|
|
|
<para>
|
|
Third, you may not use a CGLIB proxy for a <literal>final</literal> class or a class
|
|
with any <literal>final</literal> methods.
|
|
</para>
|
|
|
|
<para>
|
|
Finally, if your persistent object acquires any resources upon instantiation (eg. in
|
|
initializers or default constructor), then those resources will also be acquired by
|
|
the proxy. The proxy class is an actual subclass of the persistent class.
|
|
</para>
|
|
|
|
<para>
|
|
These problems are all due to fundamental limitations in Java's single inheritance model.
|
|
If you wish to avoid these problems your persistent classes must each implement an interface
|
|
that declares its business methods. You should specify these interfaces in the mapping file, e.g.:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<class name="CatImpl" proxy="Cat">
|
|
......
|
|
<subclass name="DomesticCatImpl" proxy="DomesticCat">
|
|
.....
|
|
</subclass>
|
|
</class>]]></programlisting>
|
|
|
|
<para>
|
|
where <literal>CatImpl</literal> implements the interface <literal>Cat</literal> and
|
|
<literal>DomesticCatImpl</literal> implements the interface <literal>DomesticCat</literal>. Then
|
|
proxies for instances of <literal>Cat</literal> and <literal>DomesticCat</literal> may be returned
|
|
by <literal>load()</literal> or <literal>iterate()</literal>. (Note that <literal>list()</literal>
|
|
does not usually return proxies.)
|
|
</para>
|
|
|
|
<programlisting><![CDATA[Cat cat = (Cat) session.load(CatImpl.class, catid);
|
|
Iterator iter = session.createQuery("from CatImpl as cat where cat.name='fritz'").iterate();
|
|
Cat fritz = (Cat) iter.next();]]></programlisting>
|
|
|
|
<para>
|
|
Relationships are also lazily initialized. This means you must declare any properties to be of
|
|
type <literal>Cat</literal>, not <literal>CatImpl</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
Certain operations do <emphasis>not</emphasis> require proxy initialization:
|
|
</para>
|
|
|
|
<itemizedlist spacing="compact">
|
|
<listitem>
|
|
<para>
|
|
<literal>equals()</literal>, if the persistent class does not override
|
|
<literal>equals()</literal>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>hashCode()</literal>, if the persistent class does not override
|
|
<literal>hashCode()</literal>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
The identifier getter method
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
Hibernate will detect persistent classes that override <literal>equals()</literal> or
|
|
<literal>hashCode()</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
By choosing <literal>lazy="no-proxy"</literal> instead of the default
|
|
<literal>lazy="proxy"</literal>, we can avoid the problems associated with typecasting.
|
|
However, we will require buildtime bytecode instrumentation, and all operations
|
|
will result in immediate proxy initialization.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-fetching-initialization">
|
|
<title>Initializing collections and proxies</title>
|
|
|
|
<para>
|
|
A <literal>LazyInitializationException</literal> will be thrown by Hibernate if an uninitialized
|
|
collection or proxy is accessed outside of the scope of the <literal>Session</literal>, ie. when
|
|
the entity owning the collection or having the reference to the proxy is in the detached state.
|
|
</para>
|
|
|
|
<para>
|
|
Sometimes we need to ensure that a proxy or collection is initialized before closing the
|
|
<literal>Session</literal>. Of course, we can always force initialization by calling
|
|
<literal>cat.getSex()</literal> or <literal>cat.getKittens().size()</literal>, for example.
|
|
But that is confusing to readers of the code and is not convenient for generic code.
|
|
</para>
|
|
|
|
<para>
|
|
The static methods <literal>Hibernate.initialize()</literal> and <literal>Hibernate.isInitialized()</literal>
|
|
provide the application with a convenient way of working with lazily initialized collections or
|
|
proxies. <literal>Hibernate.initialize(cat)</literal> will force the initialization of a proxy,
|
|
<literal>cat</literal>, as long as its <literal>Session</literal> is still open.
|
|
<literal>Hibernate.initialize( cat.getKittens() )</literal> has a similar effect for the collection
|
|
of kittens.
|
|
</para>
|
|
|
|
<para>
|
|
Another option is to keep the <literal>Session</literal> open until all needed
|
|
collections and proxies have been loaded. In some application architectures,
|
|
particularly where the code that accesses data using Hibernate and the code that
uses it are in different application layers or different physical processes, it
can be a problem to ensure that the <literal>Session</literal> is open when a
collection is initialized. There are three basic ways to deal with this issue:
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
In a web-based application, a servlet filter can be used to close the
|
|
<literal>Session</literal> only at the very end of a user request, once
|
|
the rendering of the view is complete (the <emphasis>Open Session in
|
|
View</emphasis> pattern). Of course, this places heavy demands on the
|
|
correctness of the exception handling of your application infrastructure.
|
|
It is vitally important that the <literal>Session</literal> is closed and the
|
|
transaction ended before returning to the user, even when an exception occurs
|
|
during rendering of the view. The servlet filter has to be able to access the
|
|
<literal>Session</literal> for this approach. We recommend that a
|
|
<literal>ThreadLocal</literal> variable be used to hold the current
|
|
<literal>Session</literal> (see chapter 1,
|
|
<xref linkend="quickstart-playingwithcats"/>, for an example implementation).
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
In an application with a separate business tier, the business logic must
|
|
"prepare" all collections that will be needed by the web tier before
|
|
returning. This means that the business tier should load all the data and
|
|
return all the data already initialized to the presentation/web tier that
|
|
is required for a particular use case. Usually, the application calls
|
|
<literal>Hibernate.initialize()</literal> for each collection that will
|
|
be needed in the web tier (this call must occur before the session is closed)
|
|
or retrieves the collection eagerly using a Hibernate query with a
|
|
<literal>FETCH</literal> clause or a <literal>FetchMode.JOIN</literal> in
|
|
<literal>Criteria</literal>. This is usually easier if you adopt the
|
|
<emphasis>Command</emphasis> pattern instead of a <emphasis>Session Facade</emphasis>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
You may also attach a previously loaded object to a new <literal>Session</literal>
|
|
with <literal>merge()</literal> or <literal>lock()</literal> before
|
|
accessing uninitialized collections (or other proxies). No, Hibernate does not,
|
|
and certainly <emphasis>should</emphasis> not do this automatically, since it
|
|
would introduce ad hoc transaction semantics!
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
Sometimes you don't want to initialize a large collection, but still need some
|
|
information about it (like its size) or a subset of the data.
|
|
</para>
|
|
|
|
<para>
|
|
You can use a collection filter to get the size of a collection without initializing it:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[( (Integer) s.createFilter( collection, "select count(*)" ).list().get(0) ).intValue()]]></programlisting>
|
|
|
|
<para>
|
|
The <literal>createFilter()</literal> method is also used to efficiently retrieve subsets
|
|
of a collection without needing to initialize the whole collection:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[s.createFilter( lazyCollection, "").setFirstResult(0).setMaxResults(10).list();]]></programlisting>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-fetching-batch">
|
|
<title>Using batch fetching</title>
|
|
|
|
<para>
|
|
Hibernate can make efficient use of batch fetching, that is, Hibernate can load several uninitialized
proxies (or collections) if one proxy (or collection) is accessed. Batch fetching is an optimization of
the lazy select fetching strategy. There are two ways you can tune batch fetching: on the class and the
collection level.
|
|
</para>
|
|
|
|
<para>
|
|
Batch fetching for classes/entities is easier to understand. Imagine you have the following situation
|
|
at runtime: You have 25 <literal>Cat</literal> instances loaded in a <literal>Session</literal>, each
|
|
<literal>Cat</literal> has a reference to its <literal>owner</literal>, a <literal>Person</literal>.
|
|
The <literal>Person</literal> class is mapped with a proxy, <literal>lazy="true"</literal>. If you now
|
|
iterate through all cats and call <literal>getOwner()</literal> on each, Hibernate will by default
|
|
execute 25 <literal>SELECT</literal> statements, to retrieve the proxied owners. You can tune this
|
|
behavior by specifying a <literal>batch-size</literal> in the mapping of <literal>Person</literal>:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<class name="Person" batch-size="10">...</class>]]></programlisting>
|
|
|
|
<para>
|
|
Hibernate will now execute only three queries; the pattern is 10, 10, 5.
|
|
</para>
|
|
|
|
<para>
|
|
You may also enable batch fetching of collections. For example, if each <literal>Person</literal> has
|
|
a lazy collection of <literal>Cat</literal>s, and 10 persons are currently loaded in the
|
|
<literal>Session</literal>, iterating through all persons will generate 10 <literal>SELECT</literal>s,
|
|
one for every call to <literal>getCats()</literal>. If you enable batch fetching for the
|
|
<literal>cats</literal> collection in the mapping of <literal>Person</literal>, Hibernate can pre-fetch
|
|
collections:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<class name="Person">
|
|
<set name="cats" batch-size="3">
|
|
...
|
|
</set>
|
|
</class>]]></programlisting>
|
|
|
|
<para>
|
|
With a <literal>batch-size</literal> of 3, Hibernate will load 3, 3, 3, 1 collections in four
|
|
<literal>SELECT</literal>s. Again, the value of the attribute depends on the expected number of
|
|
uninitialized collections in a particular <literal>Session</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
Batch fetching of collections is particularly useful if you have a nested tree of items, ie.
|
|
the typical bill-of-materials pattern. (Although a <emphasis>nested set</emphasis> or a
|
|
<emphasis>materialized path</emphasis> might be a better option for read-mostly trees.)
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-fetching-subselect">
|
|
<title>Using subselect fetching</title>
|
|
|
|
<para>
|
|
If one lazy collection or single-valued proxy has to be fetched, Hibernate loads all of
|
|
them, re-running the original query in a subselect. This works in the same way as
|
|
batch-fetching, without the piecemeal loading.
|
|
</para>
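
<para>
A minimal sketch of enabling subselect fetching for a collection, assuming a
<literal>Person</literal> with a lazy set of <literal>Cat</literal>s. The column name
<literal>ownerId</literal> is hypothetical.
</para>

<programlisting><![CDATA[<class name="Person">
    <!-- when one "cats" collection is initialized, the collections of all
         Person instances returned by the previous query are loaded by a
         single second SELECT -->
    <set name="cats" fetch="subselect">
        <key column="ownerId"/>
        <one-to-many class="Cat"/>
    </set>
</class>]]></programlisting>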
|
|
|
|
<!-- TODO: Write more about this -->
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-fetching-lazyproperties">
|
|
<title>Using lazy property fetching</title>
|
|
|
|
<para>
|
|
Hibernate3 supports the lazy fetching of individual properties. This optimization technique
|
|
is also known as <emphasis>fetch groups</emphasis>. Please note that this is mostly a
|
|
marketing feature, as in practice, optimizing row reads is much more important than
|
|
optimization of column reads. However, only loading some properties of a class might
|
|
be useful in extreme cases, when legacy tables have hundreds of columns and the data model
|
|
can not be improved.
|
|
</para>
|
|
|
|
<para>
|
|
To enable lazy property loading, set the <literal>lazy</literal> attribute on your
|
|
particular property mappings:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<class name="Document">
|
|
<id name="id">
|
|
<generator class="native"/>
|
|
</id>
|
|
<property name="name" not-null="true" length="50"/>
|
|
<property name="summary" not-null="true" length="200" lazy="true"/>
|
|
<property name="text" not-null="true" length="2000" lazy="true"/>
|
|
</class>]]></programlisting>
|
|
|
|
<para>
|
|
Lazy property loading requires buildtime bytecode instrumentation! If your persistent
|
|
classes are not enhanced, Hibernate will silently ignore lazy property settings and
|
|
fall back to immediate fetching.
|
|
</para>
|
|
|
|
<para>
|
|
For bytecode instrumentation, use the following Ant task:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<target name="instrument" depends="compile">
|
|
<taskdef name="instrument" classname="org.hibernate.tool.instrument.InstrumentTask">
|
|
<classpath path="${jar.path}"/>
|
|
<classpath path="${classes.dir}"/>
|
|
<classpath refid="lib.class.path"/>
|
|
</taskdef>
|
|
|
|
<instrument verbose="true">
|
|
<fileset dir="${testclasses.dir}/org/hibernate/auction/model">
|
|
<include name="*.class"/>
|
|
</fileset>
|
|
</instrument>
|
|
</target>]]></programlisting>
|
|
|
|
<para>
|
|
A different (better?) way to avoid unnecessary column reads, at least for
read-only transactions, is to use the projection features of HQL or Criteria
queries. This avoids the need for buildtime bytecode processing and is
certainly a preferred solution.
|
|
</para>
|
|
|
|
<para>
|
|
You may force the usual eager fetching of properties using <literal>fetch all
|
|
properties</literal> in HQL.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="performance-cache" revision="1">
|
|
<title>The Second Level Cache</title>
|
|
|
|
<para>
|
|
A Hibernate <literal>Session</literal> is a transaction-level cache of persistent data. It is
|
|
possible to configure a cluster or JVM-level (<literal>SessionFactory</literal>-level) cache on
|
|
a class-by-class and collection-by-collection basis. You may even plug in a clustered cache. Be
|
|
careful. Caches are never aware of changes made to the persistent store by another application
|
|
(though they may be configured to regularly expire cached data).
|
|
</para>
|
|
|
|
<para>
|
|
By default, Hibernate uses EHCache for JVM-level caching. (JCS support is now deprecated and will
|
|
be removed in a future version of Hibernate.) You may choose a different implementation by
|
|
specifying the name of a class that implements <literal>org.hibernate.cache.CacheProvider</literal>
|
|
using the property <literal>hibernate.cache.provider_class</literal>.
|
|
</para>
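
<para>
For example, a provider from the table below might be selected as follows. This is a
sketch that assumes the property is set in <literal>hibernate.cfg.xml</literal>, and the
choice of OSCache is arbitrary.
</para>

<programlisting><![CDATA[<hibernate-configuration>
    <session-factory>
        <!-- other settings omitted -->

        <!-- any class implementing org.hibernate.cache.CacheProvider -->
        <property name="hibernate.cache.provider_class">org.hibernate.cache.OSCacheProvider</property>
    </session-factory>
</hibernate-configuration>]]></programlisting>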
|
|
|
|
<table frame="topbot" id="cacheproviders" revision="1">
|
|
<title>Cache Providers</title>
|
|
<tgroup cols='5' align='left' colsep='1' rowsep='1'>
|
|
<colspec colname='c1' colwidth="1*"/>
|
|
<colspec colname='c2' colwidth="3*"/>
|
|
<colspec colname='c3' colwidth="1*"/>
|
|
<colspec colname='c4' colwidth="1*"/>
|
|
<colspec colname='c5' colwidth="1*"/>
|
|
<thead>
|
|
<row>
|
|
<entry>Cache</entry>
|
|
<entry>Provider class</entry>
|
|
<entry>Type</entry>
|
|
<entry>Cluster Safe</entry>
|
|
<entry>Query Cache Supported</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>Hashtable (not intended for production use)</entry>
|
|
<entry><literal>org.hibernate.cache.HashtableCacheProvider</literal></entry>
|
|
<entry>memory</entry>
|
|
<entry></entry>
|
|
<entry>yes</entry>
|
|
</row>
|
|
<row>
|
|
<entry>EHCache</entry>
|
|
<entry><literal>org.hibernate.cache.EhCacheProvider</literal></entry>
|
|
<entry>memory, disk</entry>
|
|
<entry></entry>
|
|
<entry>yes</entry>
|
|
</row>
|
|
<row>
|
|
<entry>OSCache</entry>
|
|
<entry><literal>org.hibernate.cache.OSCacheProvider</literal></entry>
|
|
<entry>memory, disk</entry>
|
|
<entry></entry>
|
|
<entry>yes</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SwarmCache</entry>
|
|
<entry><literal>org.hibernate.cache.SwarmCacheProvider</literal></entry>
|
|
<entry>clustered (ip multicast)</entry>
|
|
<entry>yes (clustered invalidation)</entry>
|
|
<entry></entry>
|
|
</row>
|
|
<row>
|
|
<entry>JBoss TreeCache</entry>
|
|
<entry><literal>org.hibernate.cache.TreeCacheProvider</literal></entry>
|
|
<entry>clustered (ip multicast), transactional</entry>
|
|
<entry>yes (replication)</entry>
|
|
<entry>yes (clock sync req.)</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<sect2 id="performance-cache-mapping">
|
|
<title>Cache mappings</title>
|
|
|
|
<para>
|
|
The <literal>&lt;cache&gt;</literal> element of a class or collection mapping has the
|
|
following form:
|
|
</para>
|
|
|
|
<programlistingco>
|
|
<areaspec>
|
|
<area id="cache1" coords="2 70"/>
|
|
</areaspec>
|
|
<programlisting><![CDATA[<cache
|
|
usage="transactional|read-write|nonstrict-read-write|read-only"
|
|
/>]]></programlisting>
|
|
<calloutlist>
|
|
<callout arearefs="cache1">
|
|
<para>
|
|
<literal>usage</literal> specifies the caching strategy:
|
|
<literal>transactional</literal>,
|
|
<literal>read-write</literal>,
|
|
<literal>nonstrict-read-write</literal> or
|
|
<literal>read-only</literal>
|
|
</para>
|
|
</callout>
|
|
</calloutlist>
|
|
</programlistingco>
|
|
|
|
<para>
|
|
Alternatively (preferably?), you may specify <literal>&lt;class-cache&gt;</literal> and
<literal>&lt;collection-cache&gt;</literal> elements in <literal>hibernate.cfg.xml</literal>.
|
|
</para>
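
<para>
A sketch of the configuration-file variant, reusing the <literal>eg.Cat</literal> class
and its <literal>kittens</literal> collection from the examples in this chapter. The
mapping resource path is hypothetical.
</para>

<programlisting><![CDATA[<hibernate-configuration>
    <session-factory>
        <!-- properties and mapping elements come first -->
        <mapping resource="eg/Cat.hbm.xml"/>

        <!-- cache settings kept out of the mapping documents -->
        <class-cache class="eg.Cat" usage="read-write"/>
        <collection-cache collection="eg.Cat.kittens" usage="read-write"/>
    </session-factory>
</hibernate-configuration>]]></programlisting>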
|
|
|
|
<para>
|
|
The <literal>usage</literal> attribute specifies a <emphasis>cache concurrency strategy</emphasis>.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-cache-readonly">
|
|
<title>Strategy: read only</title>
|
|
|
|
<para>
|
|
If your application needs to read but never modify instances of a persistent class, a
|
|
<literal>read-only</literal> cache may be used. This is the simplest and best performing
|
|
strategy. It's even perfectly safe for use in a cluster.
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<class name="eg.Immutable" mutable="false">
|
|
<cache usage="read-only"/>
|
|
....
|
|
</class>]]></programlisting>
|
|
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="performance-cache-readwrite">
|
|
<title>Strategy: read/write</title>
|
|
|
|
<para>
|
|
If the application needs to update data, a <literal>read-write</literal> cache might be appropriate.
|
|
This cache strategy should never be used if serializable transaction isolation level is required.
|
|
If the cache is used in a JTA environment, you must specify the property
|
|
<literal>hibernate.transaction.manager_lookup_class</literal>, naming a strategy for obtaining the
|
|
JTA <literal>TransactionManager</literal>. In other environments, you should ensure that the transaction
|
|
is completed when <literal>Session.close()</literal> or <literal>Session.disconnect()</literal> is called.
|
|
If you wish to use this strategy in a cluster, you should ensure that the underlying cache implementation
|
|
supports locking. The built-in cache providers do <emphasis>not</emphasis>.
|
|
</para>
|
|
|
|
<programlisting><![CDATA[<class name="eg.Cat" .... >
|
|
<cache usage="read-write"/>
|
|
....
|
|
<set name="kittens" ... >
|
|
<cache usage="read-write"/>
|
|
....
|
|
</set>
|
|
</class>]]></programlisting>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-cache-nonstrict">
|
|
<title>Strategy: nonstrict read/write</title>
|
|
|
|
<para>
|
|
If the application only occasionally needs to update data (ie. if it is extremely unlikely that two
|
|
transactions would try to update the same item simultaneously) and strict transaction isolation is
|
|
not required, a <literal>nonstrict-read-write</literal> cache might be appropriate. If the cache is
|
|
used in a JTA environment, you must specify <literal>hibernate.transaction.manager_lookup_class</literal>.
|
|
In other environments, you should ensure that the transaction is completed when
|
|
<literal>Session.close()</literal> or <literal>Session.disconnect()</literal> is called.
|
|
</para>
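
<para>
By analogy with the other strategies, the mapping might look like the following sketch.
The class name <literal>eg.Country</literal> is hypothetical, standing in for rarely
updated reference data.
</para>

<programlisting><![CDATA[<class name="eg.Country">
    <cache usage="nonstrict-read-write"/>
    ....
</class>]]></programlisting>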
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-cache-transactional">
|
|
<title>Strategy: transactional</title>
|
|
|
|
<para>
|
|
The <literal>transactional</literal> cache strategy provides support for fully transactional cache
|
|
providers such as JBoss TreeCache. Such a cache may only be used in a JTA environment and you must
|
|
specify <literal>hibernate.transaction.manager_lookup_class</literal>.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<para>
|
|
None of the cache providers support all of the cache concurrency strategies. The following table shows
|
|
which providers are compatible with which concurrency strategies.
|
|
</para>
|
|
|
|
<table frame="topbot">
|
|
<title>Cache Concurrency Strategy Support</title>
|
|
<tgroup cols='5' align='left' colsep='1' rowsep='1'>
|
|
<colspec colname='c1' colwidth="1*"/>
|
|
<colspec colname='c2' colwidth="1*"/>
|
|
<colspec colname='c3' colwidth="1*"/>
|
|
<colspec colname='c4' colwidth="1*"/>
|
|
<colspec colname='c5' colwidth="1*"/>
|
|
<thead>
|
|
<row>
|
|
<entry>Cache</entry>
|
|
<entry>read-only</entry>
|
|
<entry>nonstrict-read-write</entry>
|
|
<entry>read-write</entry>
|
|
<entry>transactional</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>Hashtable (not intended for production use)</entry>
|
|
<entry>yes</entry>
|
|
<entry>yes</entry>
|
|
<entry>yes</entry>
|
|
<entry></entry>
|
|
</row>
|
|
<row>
|
|
<entry>EHCache</entry>
|
|
<entry>yes</entry>
|
|
<entry>yes</entry>
|
|
<entry>yes</entry>
|
|
<entry></entry>
|
|
</row>
|
|
<row>
|
|
<entry>OSCache</entry>
|
|
<entry>yes</entry>
|
|
<entry>yes</entry>
|
|
<entry>yes</entry>
|
|
<entry></entry>
|
|
</row>
|
|
<row>
|
|
<entry>SwarmCache</entry>
|
|
<entry>yes</entry>
|
|
<entry>yes</entry>
|
|
<entry></entry>
|
|
<entry></entry>
|
|
</row>
|
|
<row>
|
|
<entry>JBoss TreeCache</entry>
|
|
<entry>yes</entry>
|
|
<entry></entry>
|
|
<entry></entry>
|
|
<entry>yes</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
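<para>
If you want to check a provider and strategy combination in code, for example in a start-up
sanity check, the table above can be encoded as a small lookup. The following is only a sketch:
the class <literal>CacheStrategySupport</literal>, its <literal>Strategy</literal> enum and the
provider names used as keys are hypothetical and are not part of Hibernate.
</para>

<programlisting><![CDATA[
```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup mirroring the table above; not part of any Hibernate API. */
final class CacheStrategySupport {

    enum Strategy { READ_ONLY, NONSTRICT_READ_WRITE, READ_WRITE, TRANSACTIONAL }

    private static final Map<String, Set<Strategy>> SUPPORT = new HashMap<>();

    static {
        Set<Strategy> allButTransactional = EnumSet.of(
                Strategy.READ_ONLY, Strategy.NONSTRICT_READ_WRITE, Strategy.READ_WRITE);
        SUPPORT.put("Hashtable", allButTransactional);
        SUPPORT.put("EHCache", allButTransactional);
        SUPPORT.put("OSCache", allButTransactional);
        SUPPORT.put("SwarmCache", EnumSet.of(Strategy.READ_ONLY, Strategy.NONSTRICT_READ_WRITE));
        SUPPORT.put("JBoss TreeCache", EnumSet.of(Strategy.READ_ONLY, Strategy.TRANSACTIONAL));
    }

    /** Returns true if the table lists the provider as supporting the strategy. */
    static boolean supports(String provider, Strategy strategy) {
        Set<Strategy> supported = SUPPORT.get(provider);
        return supported != null && supported.contains(strategy);
    }

    public static void main(String[] args) {
        System.out.println(supports("EHCache", Strategy.READ_WRITE));    // listed as "yes"
        System.out.println(supports("SwarmCache", Strategy.READ_WRITE)); // empty cell
    }
}
```
]]></programlisting>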
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="performance-sessioncache" revision="2">
|
|
<title>Managing the caches</title>
|
|
|
|
<para>
|
|
Whenever you pass an object to <literal>save()</literal>, <literal>update()</literal>
|
|
or <literal>saveOrUpdate()</literal> and whenever you retrieve an object using
|
|
<literal>load()</literal>, <literal>get()</literal>, <literal>list()</literal>,
|
|
<literal>iterate()</literal> or <literal>scroll()</literal>, that object is added
|
|
to the internal cache of the <literal>Session</literal>.
|
|
</para>
|
|
<para>
|
|
When <literal>flush()</literal> is subsequently called, the state of that object will
|
|
be synchronized with the database. If you do not want this synchronization to occur or
|
|
if you are processing a huge number of objects and need to manage memory efficiently,
|
|
the <literal>evict()</literal> method may be used to remove the object and its collections
|
|
from the first-level cache.
|
|
</para>
|
|
|
|
<programlisting><![CDATA[ScrollableResults cats = sess.createQuery("from Cat as cat").scroll(); //a huge result set
|
|
while ( cats.next() ) {
|
|
Cat cat = (Cat) cats.get(0);
|
|
doSomethingWithACat(cat);
|
|
sess.evict(cat);
|
|
}]]></programlisting>
|
|
|
|
<para>
|
|
The <literal>Session</literal> also provides a <literal>contains()</literal> method to determine
|
|
if an instance belongs to the session cache.
|
|
</para>
|
|
|
|
<para>
|
|
To completely evict all objects from the session cache, call <literal>Session.clear()</literal>.
|
|
</para>
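<para>
To see why eviction matters when you process a huge number of objects, consider the following
toy model. It is a sketch under stated assumptions, not Hibernate code: the class
<literal>SessionCacheModel</literal> is hypothetical, it keys entries by a plain identifier, and
it only mimics the <literal>contains()</literal>, <literal>evict()</literal> and
<literal>clear()</literal> operations described above.
</para>

<programlisting><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a Session's first-level cache; the real Session is far richer. */
final class SessionCacheModel {

    private final Map<Long, Object> entries = new HashMap<>();

    /** Simulates loading an object: it is added to the cache if not already there. */
    Object load(long id) {
        return entries.computeIfAbsent(id, key -> "Cat#" + key);
    }

    boolean contains(long id) { return entries.containsKey(id); }

    void evict(long id) { entries.remove(id); }

    void clear() { entries.clear(); }

    int size() { return entries.size(); }

    /** Loads ids 1..count and returns the largest cache size observed along the way. */
    static int peakSize(int count, boolean evictAfterUse) {
        SessionCacheModel session = new SessionCacheModel();
        int peak = 0;
        for (long id = 1; id <= count; id++) {
            session.load(id);
            peak = Math.max(peak, session.size());
            if (evictAfterUse) {
                session.evict(id);
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println(peakSize(1000, false)); // cache grows with every object
        System.out.println(peakSize(1000, true));  // cache stays at one entry
    }
}
```
]]></programlisting>

<para>
Without eviction the modelled cache holds every object that was ever loaded. Evicting each
object after use keeps it at a single entry.
</para>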
|
|
|
|
<para>
|
|
For the second-level cache, there are methods defined on <literal>SessionFactory</literal> for
|
|
evicting the cached state of an instance, entire class, collection instance or entire collection
|
|
role.
|
|
</para>
|
|
|
|
<programlisting><![CDATA[sessionFactory.evict(Cat.class, catId); //evict a particular Cat
|
|
sessionFactory.evict(Cat.class); //evict all Cats
|
|
sessionFactory.evictCollection("Cat.kittens", catId); //evict a particular collection of kittens
|
|
sessionFactory.evictCollection("Cat.kittens"); //evict all kitten collections]]></programlisting>
|
|
|
|
<para>
|
|
The <literal>CacheMode</literal> controls how a particular session interacts with the second-level
|
|
cache.
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<literal>CacheMode.NORMAL</literal> - read items from and write items to the second-level cache
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>CacheMode.GET</literal> - read items from the second-level cache, but don't write to
|
|
the second-level cache except when updating data
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>CacheMode.PUT</literal> - write items to the second-level cache, but don't read from
|
|
the second-level cache
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<literal>CacheMode.REFRESH</literal> - write items to the second-level cache, but don't read from
|
|
the second-level cache, bypass the effect of <literal>hibernate.cache.use_minimal_puts</literal>, forcing
|
|
a refresh of the second-level cache for all items read from the database
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
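<para>
The four modes boil down to two questions: may the session read from the second-level cache, and
does it write items it has just read from the database into the cache? The sketch below
restates the list as a small enum. The names <literal>CacheModeModel</literal> and
<literal>Mode</literal> are hypothetical and this is not <literal>org.hibernate.CacheMode</literal>
itself. Writes caused by updating data, and the <literal>hibernate.cache.use_minimal_puts</literal>
behaviour of <literal>REFRESH</literal>, are deliberately left out.
</para>

<programlisting><![CDATA[
```java
/** Hypothetical restatement of the four modes listed above. */
final class CacheModeModel {

    enum Mode {
        NORMAL(true, true),
        GET(true, false),     // still writes when updating data (not modelled here)
        PUT(false, true),
        REFRESH(false, true); // additionally bypasses use_minimal_puts (not modelled here)

        final boolean readsFromCache;
        final boolean cachesLoadedItems;

        Mode(boolean readsFromCache, boolean cachesLoadedItems) {
            this.readsFromCache = readsFromCache;
            this.cachesLoadedItems = cachesLoadedItems;
        }
    }

    /** Describes where a lookup is served from under the given mode. */
    static String lookup(Mode mode, boolean presentInCache) {
        if (mode.readsFromCache && presentInCache) {
            return "cache";
        }
        return mode.cachesLoadedItems ? "database, then cached" : "database only";
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + lookup(mode, true));
        }
    }
}
```
]]></programlisting>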
|
|
|
|
<para>
|
|
To browse the contents of a second-level or query cache region, use the <literal>Statistics</literal>
|
|
API:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[Map cacheEntries = sessionFactory.getStatistics()
|
|
.getSecondLevelCacheStatistics(regionName)
|
|
.getEntries();]]></programlisting>
|
|
|
|
<para>
|
|
You'll need to enable statistics, and, optionally, force Hibernate to keep the cache entries in a
|
|
more human-understandable format:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[hibernate.generate_statistics true
|
|
hibernate.cache.use_structured_entries true]]></programlisting>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="performance-querycache" revision="1">
|
|
<title>The Query Cache</title>
|
|
|
|
<para>
|
|
Query result sets may also be cached. This is only useful for queries that are run
|
|
frequently with the same parameters. To use the query cache you must first enable it:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[hibernate.cache.use_query_cache true]]></programlisting>
|
|
|
|
<para>
|
|
This setting causes the creation of two new cache regions - one holding cached query
|
|
result sets (<literal>org.hibernate.cache.StandardQueryCache</literal>), the other
|
|
holding timestamps of the most recent updates to queryable tables
|
|
(<literal>org.hibernate.cache.UpdateTimestampsCache</literal>). Note that the query
|
|
cache does not cache the state of the actual entities in the result set; it caches
|
|
only identifier values and results of value type. So the query cache should always be
|
|
used in conjunction with the second-level cache.
|
|
</para>
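<para>
The interplay of the two regions can be pictured with a small model. This is a sketch only:
<literal>QueryCacheModel</literal> and its methods are hypothetical, and it assumes that a cached
result is treated as stale once the table it reads from has been updated after the result was
stored. Note that the model stores identifiers only, which is why the entities themselves should
live in the second-level cache.
</para>

<programlisting><![CDATA[
```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of the two regions described above; every name here is hypothetical. */
final class QueryCacheModel {

    /** A cached result holds identifiers only, plus the time it was stored. */
    private static final class CachedResult {
        final List<Long> ids;
        final long cachedAt;

        CachedResult(List<Long> ids, long cachedAt) {
            this.ids = ids;
            this.cachedAt = cachedAt;
        }
    }

    private final Map<String, CachedResult> results = new HashMap<>(); // query results region
    private final Map<String, Long> lastUpdate = new HashMap<>();      // update timestamps region

    void put(String query, List<Long> ids, long now) {
        results.put(query, new CachedResult(ids, now));
    }

    void tableUpdated(String table, long now) {
        lastUpdate.put(table, now);
    }

    /** Returns the cached identifiers, or null if absent or (by assumption) stale. */
    List<Long> get(String query, String table) {
        CachedResult cached = results.get(query);
        if (cached == null) {
            return null;
        }
        Long updatedAt = lastUpdate.get(table);
        boolean stale = updatedAt != null && updatedAt >= cached.cachedAt;
        return stale ? null : cached.ids;
    }

    public static void main(String[] args) {
        QueryCacheModel cache = new QueryCacheModel();
        cache.put("from Blog", Arrays.asList(1L, 2L), 10L);
        System.out.println(cache.get("from Blog", "BLOG")); // served from the cache
        cache.tableUpdated("BLOG", 20L);
        System.out.println(cache.get("from Blog", "BLOG")); // stale after the update
    }
}
```
]]></programlisting>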
|
|
|
|
<para>
|
|
Most queries do not benefit from caching, so by default queries are not cached. To
|
|
enable caching, call <literal>Query.setCacheable(true)</literal>. This call allows
|
|
the query to look for existing cache results or add its results to the cache when
|
|
it is executed.
|
|
</para>
|
|
|
|
<para>
|
|
If you require fine-grained control over query cache expiration policies, you may
|
|
specify a named cache region for a particular query by calling
|
|
<literal>Query.setCacheRegion()</literal>.
|
|
</para>
|
|
|
|
<programlisting><![CDATA[List blogs = sess.createQuery("from Blog blog where blog.blogger = :blogger")
|
|
.setEntity("blogger", blogger)
|
|
.setMaxResults(15)
|
|
.setCacheable(true)
|
|
.setCacheRegion("frontpages")
|
|
.list();]]></programlisting>
|
|
|
|
<para>
|
|
If the query should force a refresh of its query cache region, you should call
|
|
<literal>Query.setCacheMode(CacheMode.REFRESH)</literal>. This is particularly useful
|
|
in cases where underlying data may have been updated via a separate process (i.e.,
|
|
not modified through Hibernate) and allows the application to selectively refresh
|
|
particular query result sets. This is a more efficient alternative to eviction of
|
|
a query cache region via <literal>SessionFactory.evictQueries()</literal>.
|
|
</para>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="performance-collections">
|
|
<title>Understanding Collection performance</title>
|
|
|
|
<para>
|
|
We've already spent quite some time talking about collections.
|
|
In this section we will highlight a couple more issues about
|
|
how collections behave at runtime.
|
|
</para>
|
|
|
|
<sect2 id="performance-collections-taxonomy">
|
|
<title>Taxonomy</title>
|
|
|
|
<para>Hibernate defines three basic kinds of collections:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>collections of values</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>one to many associations</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>many to many associations</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
This classification distinguishes the various table and foreign key
|
|
relationships but does not tell us quite everything we need to know
|
|
about the relational model. To fully understand the relational structure
|
|
and performance characteristics, we must also consider the structure of
|
|
the primary key that is used by Hibernate to update or delete collection
|
|
rows. This suggests the following classification:
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>indexed collections</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>sets</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>bags</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
All indexed collections (maps, lists, arrays) have a primary key consisting
|
|
of the <literal><key></literal> and <literal><index></literal>
|
|
columns. In this case collection updates are usually extremely efficient -
|
|
the primary key may be efficiently indexed and a particular row may be efficiently
|
|
located when Hibernate tries to update or delete it.
|
|
</para>
|
|
|
|
<para>
|
|
Sets have a primary key consisting of <literal><key></literal> and element
|
|
columns. This may be less efficient for some types of collection element, particularly
|
|
composite elements or large text or binary fields; the database may not be able to index
|
|
a complex primary key as efficiently. On the other hand, for one to many or many to many
|
|
associations, particularly in the case of synthetic identifiers, it is likely to be just
|
|
as efficient. (Side-note: if you want <literal>SchemaExport</literal> to actually create
|
|
the primary key of a <literal><set></literal> for you, you must declare all columns
|
|
as <literal>not-null="true"</literal>.)
|
|
</para>
|
|
|
|
<para>
|
|
<literal><idbag></literal> mappings define a surrogate key, so they are
|
|
always very efficient to update. In fact, they are the best case.
|
|
</para>
|
|
|
|
<para>
|
|
Bags are the worst case. Since a bag permits duplicate element values and has no
|
|
index column, no primary key may be defined. Hibernate has no way of distinguishing
|
|
between duplicate rows. Hibernate resolves this problem by completely removing
|
|
(in a single <literal>DELETE</literal>) and recreating the collection whenever it
|
|
changes. This might be very inefficient.
|
|
</para>
|
|
|
|
<para>
|
|
Note that for a one-to-many association, the "primary key" may not be the physical
|
|
primary key of the database table - but even in this case, the above classification
|
|
is still useful. (It still reflects how Hibernate "locates" individual rows of the
|
|
collection.)
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-collections-mostefficientupdate">
|
|
<title>Lists, maps, idbags and sets are the most efficient collections to update</title>
|
|
|
|
<para>
|
|
From the discussion above, it should be clear that indexed collections
|
|
and (usually) sets allow the most efficient operation in terms of adding,
|
|
removing and updating elements.
|
|
</para>
|
|
|
|
<para>
|
|
There is, arguably, one more advantage that indexed collections have over sets for
|
|
many to many associations or collections of values. Because of the structure of a
|
|
<literal>Set</literal>, Hibernate doesn't ever <literal>UPDATE</literal> a row when
|
|
an element is "changed". Changes to a <literal>Set</literal> always work via
|
|
<literal>INSERT</literal> and <literal>DELETE</literal> (of individual rows). Once
|
|
again, this consideration does not apply to one to many associations.
|
|
</para>
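<para>
The effect on a set can be illustrated by comparing the old and new contents. The following
sketch is not Hibernate's implementation: <literal>SetChangePlanner</literal> is a hypothetical
helper that lists the row-level statements implied by a change, under the assumption that each
element maps to exactly one collection row. A "changed" element shows up as one
<literal>DELETE</literal> plus one <literal>INSERT</literal>, never as an
<literal>UPDATE</literal>.
</para>

<programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: derives row-level statements from the old and new set contents. */
final class SetChangePlanner {

    /** Elements only in 'before' are deleted; elements only in 'after' are inserted. */
    static List<String> plan(Set<String> before, Set<String> after) {
        List<String> statements = new ArrayList<>();
        for (String element : before) {
            if (!after.contains(element)) {
                statements.add("DELETE " + element);
            }
        }
        for (String element : after) {
            if (!before.contains(element)) {
                statements.add("INSERT " + element);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        Set<String> before = new LinkedHashSet<>(Arrays.asList("a", "b"));
        Set<String> after = new LinkedHashSet<>(Arrays.asList("a", "c")); // "b" changed to "c"
        System.out.println(plan(before, after)); // [DELETE b, INSERT c]
    }
}
```
]]></programlisting>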
|
|
|
|
<para>
|
|
After observing that arrays cannot be lazy, we would conclude that lists, maps and
|
|
idbags are the most performant (non-inverse) collection types, with sets not far
|
|
behind. Sets are expected to be the most common kind of collection in Hibernate
|
|
applications. This is because the "set" semantics are most natural in the relational
|
|
model.
|
|
</para>
|
|
|
|
<para>
|
|
However, in well-designed Hibernate domain models, we usually see that most collections
|
|
are in fact one-to-many associations with <literal>inverse="true"</literal>. For these
|
|
associations, the update is handled by the many-to-one end of the association, and so
|
|
considerations of collection update performance simply do not apply.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-collections-mostefficentinverse">
|
|
<title>Bags and lists are the most efficient inverse collections</title>
|
|
|
|
<para>
|
|
Just before you ditch bags forever, there is a particular case in which bags (and also lists)
|
|
are much more performant than sets. For a collection with <literal>inverse="true"</literal>
|
|
(the standard bidirectional one-to-many relationship idiom, for example) we can add elements
|
|
to a bag or list without needing to initialize (fetch) the bag elements! This is because
|
|
<literal>Collection.add()</literal> or <literal>Collection.addAll()</literal> must always
|
|
return true for a bag or <literal>List</literal> (unlike a <literal>Set</literal>). This can
|
|
make the following common code much faster.
|
|
</para>
|
|
|
|
<programlisting><![CDATA[Parent p = (Parent) sess.load(Parent.class, id);
|
|
Child c = new Child();
|
|
c.setParent(p);
|
|
p.getChildren().add(c); //no need to fetch the collection!
|
|
sess.flush();]]></programlisting>
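<para>
The reasoning behind this can be shown with two toy wrappers. They are hypothetical and are not
Hibernate's persistent collection classes. The bag-like wrapper can answer
<literal>add()</literal> with <literal>true</literal> without knowing the existing rows, whereas
the set-like wrapper has to load them first so that it can report duplicates. The
<literal>fetches</literal> counter stands in for the <literal>SELECT</literal> that initializes
the collection.
</para>

<programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy lazy collection wrappers; hypothetical, for illustration only. */
final class LazyAddModel {

    /** Bag-like wrapper: add() always succeeds, so existing rows are never needed. */
    static final class LazyBag {
        int fetches = 0; // never incremented: no initialization is required
        final List<String> queuedAdds = new ArrayList<>();

        boolean add(String element) {
            queuedAdds.add(element);
            return true;
        }
    }

    /** Set-like wrapper: add() must report duplicates, so it loads the contents first. */
    static final class LazySet {
        int fetches = 0;
        private Set<String> loaded;

        boolean add(String element) {
            if (loaded == null) {
                fetches++; // simulated SELECT of the existing rows
                loaded = new HashSet<>(Arrays.asList("kitten-1", "kitten-2"));
            }
            return loaded.add(element);
        }
    }

    public static void main(String[] args) {
        LazyBag bag = new LazyBag();
        bag.add("kitten-3");
        LazySet set = new LazySet();
        set.add("kitten-3");
        System.out.println("bag fetches: " + bag.fetches + ", set fetches: " + set.fetches);
    }
}
```
]]></programlisting>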
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-collections-oneshotdelete">
|
|
<title>One shot delete</title>
|
|
|
|
<para>
|
|
Occasionally, deleting collection elements one by one can be extremely inefficient. Hibernate
|
|
isn't completely stupid, so it knows not to do that in the case of a newly-empty collection
|
|
(if you called <literal>list.clear()</literal>, for example). In this case, Hibernate will
|
|
issue a single <literal>DELETE</literal> and we are done!
|
|
</para>
|
|
|
|
<para>
|
|
Suppose we add a single element to a collection of size twenty and then remove two elements.
|
|
Hibernate will issue one <literal>INSERT</literal> statement and two <literal>DELETE</literal>
|
|
statements (unless the collection is a bag). This is certainly desirable.
|
|
</para>
|
|
|
|
<para>
|
|
However, suppose that we remove eighteen elements, leaving two, and then add three new elements.
|
|
There are two possible ways to proceed:
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>delete eighteen rows one by one and then insert three rows</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>remove the whole collection (in one SQL <literal>DELETE</literal>) and insert
|
|
all five current elements (one by one)</para>
|
|
</listitem>
|
|
</itemizedlist>
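<para>
Counting statements makes the difference between the two options concrete. The helper below is
a hypothetical sketch that only does the arithmetic for a non-bag collection. It assumes one
statement per row for row-by-row changes and a single <literal>DELETE</literal> for removing the
whole collection.
</para>

<programlisting><![CDATA[
```java
/** Hypothetical helper that counts SQL statements for the two options above. */
final class OneShotDeleteCost {

    /** Row by row: one DELETE per removed row plus one INSERT per added row. */
    static int rowByRow(int removed, int added) {
        return removed + added;
    }

    /** One shot: a single DELETE, then one INSERT for every element now in the collection. */
    static int oneShot(int originalSize, int removed, int added) {
        int remaining = originalSize - removed;
        return 1 + remaining + added;
    }

    public static void main(String[] args) {
        // Twenty elements, eighteen removed, three added.
        System.out.println(rowByRow(18, 3));    // 21
        System.out.println(oneShot(20, 18, 3)); // 6
    }
}
```
]]></programlisting>

<para>
For the earlier case of one addition and two removals the picture is reversed: three statements
row by row against twenty for the one-shot approach.
</para>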
|
|
|
|
<para>
|
|
Hibernate isn't smart enough to know that the second option is probably quicker in this case.
|
|
(And it would probably be undesirable for Hibernate to be that smart; such behaviour might
|
|
confuse database triggers, etc.)
|
|
</para>
|
|
|
|
<para>
|
|
Fortunately, you can force this behaviour (i.e. the second strategy) at any time by discarding
|
|
(i.e. dereferencing) the original collection and returning a newly instantiated collection with
|
|
all the current elements. This can be very useful and powerful from time to time.
|
|
</para>
|
|
|
|
<para>
|
|
Of course, one-shot-delete does not apply to collections mapped <literal>inverse="true"</literal>.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="performance-monitoring" revision="1">
|
|
<title>Monitoring performance</title>
|
|
|
|
<para>
|
|
Optimization is not much use without monitoring and access to performance numbers.
|
|
Hibernate provides a full range of figures about its internal operations.
|
|
Statistics in Hibernate are available per <literal>SessionFactory</literal>.
|
|
</para>
|
|
|
|
<sect2 id="performance-monitoring-sf" revision="2">
|
|
<title>Monitoring a SessionFactory</title>
|
|
|
|
<para>
|
|
You can access <literal>SessionFactory</literal> metrics in two ways.
|
|
Your first option is to call <literal>sessionFactory.getStatistics()</literal> and
|
|
read or display the <literal>Statistics</literal> yourself.
|
|
</para>
|
|
|
|
<para>
|
|
Hibernate can also use JMX to publish metrics if you enable the
|
|
<literal>StatisticsService</literal> MBean. You may enable a single MBean for all your
|
|
<literal>SessionFactory</literal> instances or one per factory. See the following code for
|
|
minimalistic configuration examples:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[// MBean service registration for a specific SessionFactory
|
|
Hashtable tb = new Hashtable();
|
|
tb.put("type", "statistics");
|
|
tb.put("sessionFactory", "myFinancialApp");
|
|
ObjectName on = new ObjectName("hibernate", tb); // MBean object name
|
|
|
|
StatisticsService stats = new StatisticsService(); // MBean implementation
|
|
stats.setSessionFactory(sessionFactory); // Bind the stats to a SessionFactory
|
|
server.registerMBean(stats, on); // Register the MBean on the server]]></programlisting>
|
|
|
|
|
|
<programlisting><![CDATA[// MBean service registration for all SessionFactory instances
|
|
Hashtable tb = new Hashtable();
|
|
tb.put("type", "statistics");
|
|
tb.put("sessionFactory", "all");
|
|
ObjectName on = new ObjectName("hibernate", tb); // MBean object name
|
|
|
|
StatisticsService stats = new StatisticsService(); // MBean implementation
|
|
server.registerMBean(stats, on); // Register the MBean on the server]]></programlisting>
|
|
|
|
<para>
|
|
In the first case, the MBean is bound directly to a <literal>SessionFactory</literal>. In the second case, you must give
the JNDI name under which the session factory is held before using the MBean. Use
<literal>hibernateStatsBean.setSessionFactoryJNDIName("my/JNDI/Name")</literal>.
|
|
</para>
|
|
<para>
|
|
You can (de)activate the monitoring for a <literal>SessionFactory</literal>:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
at configuration time, set <literal>hibernate.generate_statistics</literal> to <literal>false</literal>
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
at runtime: <literal>sf.getStatistics().setStatisticsEnabled(true)</literal>
|
|
or <literal>hibernateStatsBean.setStatisticsEnabled(true)</literal>
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
Statistics can be reset programmatically using the <literal>clear()</literal> method.
|
|
A summary can be sent to a logger (info level) using the <literal>logSummary()</literal>
|
|
method.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="performance-monitoring-metrics" revision="1">
|
|
<title>Metrics</title>
|
|
|
|
<para>
|
|
Hibernate provides a number of metrics, from very basic to the specialized information
|
|
only relevant in certain scenarios. All available counters are described in the
|
|
<literal>Statistics</literal> interface API, in three categories:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Metrics related to the general <literal>Session</literal> usage, such as
|
|
number of open sessions, retrieved JDBC connections, etc.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Metrics related to the entities, collections, queries, and caches as a
|
|
whole (aka global metrics).
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Detailed metrics related to a particular entity, collection, query or
|
|
cache region.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
For example, you can check the cache hit, miss, and put ratio of entities, collections
|
|
and queries, and the average time a query needs. Beware that the number of milliseconds
|
|
is subject to approximation in Java. Hibernate is tied to the JVM precision; on some
|
|
platforms this might even only be accurate to 10 seconds.
|
|
</para>
|
|
|
|
<para>
|
|
Simple getters are used to access the global metrics (i.e. not tied to a particular entity,
|
|
collection, cache region, etc.). You can access the metrics of a particular entity, collection
|
|
or cache region through its name, and through its HQL or SQL representation for queries. Please
|
|
refer to the <literal>Statistics</literal>, <literal>EntityStatistics</literal>,
|
|
<literal>CollectionStatistics</literal>, <literal>SecondLevelCacheStatistics</literal>,
|
|
and <literal>QueryStatistics</literal> API Javadoc for more information. The following
|
|
code shows a simple example:
|
|
</para>
|
|
|
|
<programlisting><![CDATA[Statistics stats = HibernateUtil.sessionFactory.getStatistics();
|
|
|
|
double queryCacheHitCount = stats.getQueryCacheHitCount();
|
|
double queryCacheMissCount = stats.getQueryCacheMissCount();
|
|
double queryCacheHitRatio =
|
|
queryCacheHitCount / (queryCacheHitCount + queryCacheMissCount);
|
|
|
|
log.info("Query Hit ratio:" + queryCacheHitRatio);
|
|
|
|
EntityStatistics entityStats =
|
|
stats.getEntityStatistics( Cat.class.getName() );
|
|
long changes =
|
|
entityStats.getInsertCount()
|
|
+ entityStats.getUpdateCount()
|
|
+ entityStats.getDeleteCount();
|
|
log.info(Cat.class.getName() + " changed " + changes + " times" );]]></programlisting>
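<para>
One detail worth guarding against: when neither a hit nor a miss has been recorded yet, a ratio
computed as in the example above is <literal>0.0 / 0.0</literal>, which is <literal>NaN</literal>
in Java floating-point arithmetic. A small helper avoids this. The class
<literal>HitRatio</literal> is hypothetical, and reporting <literal>0.0</literal> for an unused
cache is an assumption you may want to change.
</para>

<programlisting><![CDATA[
```java
/** Hypothetical helper for computing a cache hit ratio from two counters. */
final class HitRatio {

    /** Returns hits / (hits + misses), or 0.0 when nothing has been counted yet. */
    static double of(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        System.out.println(of(3, 1)); // 0.75
        System.out.println(of(0, 0)); // 0.0
    }
}
```
]]></programlisting>

<para>
With a <literal>Statistics</literal> instance in hand you would pass
<literal>stats.getQueryCacheHitCount()</literal> and
<literal>stats.getQueryCacheMissCount()</literal> as the two arguments.
</para>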
|
|
|
|
<para>
|
|
To work on all entities, collections, queries and region caches, you can retrieve
|
|
the list of names of entities, collections, queries and region caches with the
|
|
following methods: <literal>getQueries()</literal>, <literal>getEntityNames()</literal>,
|
|
<literal>getCollectionRoleNames()</literal>, and
|
|
<literal>getSecondLevelCacheRegionNames()</literal>.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
</chapter> |