<chapter id="batch">

<title>Batch processing</title>

<para>
A naive approach to inserting 100 000 rows in the database using Hibernate might
look like this:
</para>

<programlisting><![CDATA[Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
}
tx.commit();
session.close();]]></programlisting>

<para>
This would fail with an <literal>OutOfMemoryError</literal> somewhere
around the 50 000th row, because Hibernate caches every newly inserted
<literal>Customer</literal> instance in the session-level cache.
</para>

<para>
This chapter shows how to avoid this problem. First, however, if you are doing
batch processing, you must enable JDBC batching to achieve reasonable performance.
Set the JDBC batch size to a reasonable number (say, 10-50):
</para>

<programlisting><![CDATA[hibernate.jdbc.batch_size 20]]></programlisting>

<para>
You may also want to do this kind of work in a process where interaction with
the second-level cache is completely disabled:
</para>

<programlisting><![CDATA[hibernate.cache.use_second_level_cache false]]></programlisting>
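
<para>
The two settings above can also be collected programmatically. The following is a
minimal sketch that uses only <literal>java.util.Properties</literal>. The class and
method names (<literal>BatchSettingsSketch</literal>, <literal>batchProperties</literal>)
are illustrative assumptions; only the two property keys come from this chapter.
</para>

<programlisting><![CDATA[
```java
import java.util.Properties;

// Hypothetical helper, not part of Hibernate: gathers the batch-related
// settings discussed in this chapter into one Properties object.
class BatchSettingsSketch {

    static Properties batchProperties(int jdbcBatchSize) {
        if (jdbcBatchSize <= 0) {
            throw new IllegalArgumentException("batch size must be positive");
        }
        Properties props = new Properties();
        // Enable JDBC batching with the chosen batch size.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(jdbcBatchSize));
        // Disable interaction with the second-level cache.
        props.setProperty("hibernate.cache.use_second_level_cache", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = batchProperties(20);
        // Print each setting in the same "key value" form used in this chapter.
        props.forEach((key, value) -> System.out.println(key + " " + value));
    }
}
```
]]></programlisting>

<para>
Such a <literal>Properties</literal> object could then be passed to whatever mechanism
the application already uses to configure Hibernate at startup.
</para>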

<sect1 id="batch-inserts">
<title>Batch inserts</title>

<para>
When making new objects persistent, you must <literal>flush()</literal> and
then <literal>clear()</literal> the session regularly, to control the size of
the first-level cache.
</para>

<programlisting><![CDATA[Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( (i + 1) % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();]]></programlisting>
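
<para>
To see how the flush condition interacts with the batch size, the following minimal
sketch counts how many objects are pending at each flush. It does not use Hibernate at
all: <literal>BatchFlushSketch</literal> and <literal>flushSizes</literal> are
hypothetical names, and an integer counter stands in for the first-level cache.
</para>

<programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation; no Hibernate classes are involved.
class BatchFlushSketch {

    // Returns the number of pending objects at each flush point.
    // countFromOne = true  flushes when (i + 1) % batchSize == 0
    // countFromOne = false flushes when  i      % batchSize == 0
    static List<Integer> flushSizes(int rows, int batchSize, boolean countFromOne) {
        List<Integer> sizes = new ArrayList<>();
        int pending = 0; // objects currently held in the simulated session cache
        for (int i = 0; i < rows; i++) {
            pending++; // stands in for session.save(customer)
            int position = countFromOne ? i + 1 : i;
            if (position % batchSize == 0) {
                sizes.add(pending); // stands in for session.flush()
                pending = 0;        // stands in for session.clear()
            }
        }
        if (pending > 0) {
            sizes.add(pending); // leftovers are flushed when the transaction commits
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(flushSizes(100, 20, true));  // prints [20, 20, 20, 20, 20]
        System.out.println(flushSizes(100, 20, false)); // prints [1, 20, 20, 20, 20, 19]
    }
}
```
]]></programlisting>

<para>
With 100 rows and a batch size of 20, testing <literal>(i + 1)</literal> flushes five
full batches, whereas testing <literal>i</literal> flushes a single object first and
leaves 19 objects for commit time.
</para>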

</sect1>

<sect1 id="batch-update" >
<title>Batch updates</title>

<para>
For retrieving and updating data, the same ideas apply. In addition, you need to
use <literal>scroll()</literal> to take advantage of server-side cursors for
queries that return many rows of data.
</para>

<programlisting><![CDATA[Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

ScrollableResults customers = session.getNamedQuery("GetCustomers")
    .setCacheMode(CacheMode.IGNORE)
    .scroll(ScrollMode.FORWARD_ONLY);
int count=0;
while ( customers.next() ) {
    Customer customer = (Customer) customers.get(0);
    customer.updateStuff(...);
    if ( ++count % 20 == 0 ) {
        //flush a batch of updates and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();]]></programlisting>

</sect1>

<sect1 id="batch-direct" revision="1">
<title>DML-style operations</title>

<para>
As already discussed, automatic and transparent object/relational mapping is concerned
with the management of object state. This implies that the object state is available
in memory, so manipulating data directly in the database (using the SQL
<literal>Data Manipulation Language</literal> (DML) statements <literal>INSERT</literal>,
<literal>UPDATE</literal>, and <literal>DELETE</literal>) will not affect in-memory state.
However, Hibernate provides methods for bulk SQL-style DML statement execution, which are
performed through the Hibernate Query Language (<xref linkend="queryhql">HQL</xref>).
</para>

<para>
The pseudo-syntax for <literal>UPDATE</literal> and <literal>DELETE</literal> statements
is: <literal>( UPDATE | DELETE ) FROM? EntityName (WHERE where_conditions)?</literal>. Some
points to note:
</para>

<itemizedlist spacing="compact">
<listitem>
<para>
In the from-clause, the FROM keyword is optional.
</para>
</listitem>
<listitem>
<para>
Only a single entity can be named in the from-clause; it can optionally be
aliased. If the entity name is aliased, then any property references must
be qualified using that alias; if the entity name is not aliased, then it is
illegal for any property references to be qualified.
</para>
</listitem>
<listitem>
<para>
No joins (either implicit or explicit) can be specified in a bulk HQL query. Subqueries
may be used in the where-clause; the subqueries themselves can contain joins.
</para>
</listitem>
<listitem>
<para>
The where-clause is also optional.
</para>
</listitem>
</itemizedlist>
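
<para>
The aliasing rule can be illustrated with a small string-building sketch. It is not a
Hibernate API: <literal>HqlAliasSketch</literal> and <literal>renameUpdate</literal> are
hypothetical names, and the sketch assumes the <literal>Customer</literal> entity has a
<literal>name</literal> property.
</para>

<programlisting><![CDATA[
```java
// Hypothetical helper; it only builds HQL text and never talks to Hibernate.
class HqlAliasSketch {

    // Builds a bulk UPDATE that renames entities.
    // With an alias, every property reference is qualified by it;
    // without an alias, no property reference is qualified.
    static String renameUpdate(String entityName, String alias) {
        boolean aliased = alias != null && !alias.isEmpty();
        String target = aliased ? entityName + " " + alias : entityName;
        String qualifier = aliased ? alias + "." : "";
        return "update " + target
                + " set " + qualifier + "name = :newName"
                + " where " + qualifier + "name = :oldName";
    }

    public static void main(String[] args) {
        System.out.println(renameUpdate("Customer", "c"));
        System.out.println(renameUpdate("Customer", null));
    }
}
```
]]></programlisting>

<para>
Mixing the two forms, such as aliasing the entity but leaving a property reference
unqualified, is what the rule above forbids.
</para>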

<para>
As an example, to execute an HQL <literal>UPDATE</literal>, use the
<literal>Query.executeUpdate()</literal> method (named to be familiar to
those who know JDBC's <literal>PreparedStatement.executeUpdate()</literal>):
</para>

<programlisting><![CDATA[Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

String hqlUpdate = "update Customer c set c.name = :newName where c.name = :oldName";
// or String hqlUpdate = "update Customer set name = :newName where name = :oldName";
int updatedEntities = session.createQuery( hqlUpdate )
        .setString( "newName", newName )
        .setString( "oldName", oldName )
        .executeUpdate();
tx.commit();
session.close();]]></programlisting>

<para>
To execute an HQL <literal>DELETE</literal>, use the same <literal>Query.executeUpdate()</literal>
method:
</para>

<programlisting><![CDATA[Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

String hqlDelete = "delete Customer c where c.name = :oldName";
// or String hqlDelete = "delete Customer where name = :oldName";
int deletedEntities = session.createQuery( hqlDelete )
        .setString( "oldName", oldName )
        .executeUpdate();
tx.commit();
session.close();]]></programlisting>

<para>
The <literal>int</literal> value returned by the <literal>Query.executeUpdate()</literal>
method indicates the number of entities affected by the operation. Note that this may or may not
correlate to the number of rows affected in the database. An HQL bulk operation might result in
multiple actual SQL statements being executed (for joined-subclass mappings, for example). The returned
number indicates the number of actual entities affected by the statement. Going back to the
example of joined-subclass, a delete against one of the subclasses may actually result
in deletes against not just the table to which that subclass is mapped, but also the "root"
table and potentially joined-subclass tables further down the inheritance hierarchy.
</para>
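
<para>
As a worked illustration of entities versus rows, the sketch below assumes a hypothetical
joined-subclass hierarchy in which each deleted entity owns exactly one row in the root
table and one row in its subclass table. The table names and figures are invented for
illustration, and nothing here calls Hibernate.
</para>

<programlisting><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical arithmetic only; table names are invented for illustration.
class BulkCountSketch {

    // Rows removed per table when each entity owns one row in every listed table.
    static Map<String, Integer> rowsPerTable(int entities, String... tables) {
        Map<String, Integer> rows = new LinkedHashMap<>();
        for (String table : tables) {
            rows.put(table, entities);
        }
        return rows;
    }

    // Total rows touched across all tables.
    static int totalRows(Map<String, Integer> rowsPerTable) {
        int total = 0;
        for (int count : rowsPerTable.values()) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        int entities = 3; // the entity count a bulk delete would report under this assumption
        Map<String, Integer> rows = rowsPerTable(entities, "PARTY", "CUSTOMER");
        System.out.println("entities=" + entities + " rows=" + totalRows(rows));
    }
}
```
]]></programlisting>

<para>
Under these assumptions, three entities correspond to six deleted rows, which is why the
returned count should be read as entities rather than rows.
</para>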

<para>
The pseudo-syntax for <literal>INSERT</literal> statements is:
<literal>INSERT INTO EntityName (properties_list)? select_statement</literal>. Some
points to note:
</para>

<itemizedlist spacing="compact">
<listitem>
<para>
Only the INSERT INTO ... SELECT ... form is supported; the INSERT INTO ... VALUES ... form is not.
</para>
<para>
The properties_list is optional. It is analogous to the <literal>column specification</literal>
in the SQL <literal>INSERT</literal> statement. If omitted, all "eligible" (see next) properties are
automatically included.
</para>
</listitem>
<listitem>
<para>
For entities involved in mapped inheritance, only properties directly defined on that
given class-level can be used in the properties_list. Superclass properties are not
allowed, and subclass properties do not make sense. In other words, <literal>INSERT</literal>
statements are inherently non-polymorphic.
</para>
<para>
select_statement can be any valid HQL select query, with the caveat that the return types
must match the types expected by the insert. Currently, this is checked during query
compilation rather than being left to the database. Note, however,
that this might cause problems between Hibernate <literal>Type</literal>s which are
<emphasis>equivalent</emphasis> as opposed to <emphasis>equal</emphasis>. For example, it might cause
a mismatch between a property defined as a <literal>org.hibernate.type.DateType</literal>
and a property defined as a <literal>org.hibernate.type.TimestampType</literal>, even though the
database might not make a distinction or might be able to handle the conversion.
</para>
</listitem>
</itemizedlist>

<para>
An example HQL <literal>INSERT</literal> statement execution:
</para>

<programlisting><![CDATA[Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

String hqlInsert = "insert into DelinquentAccount (id, name) select c.id, c.name from Customer c where ...";
int createdEntities = session.createQuery( hqlInsert )
        .executeUpdate();
tx.commit();
session.close();]]></programlisting>

</sect1>
</chapter>