Transactions And Concurrency
The most important point about Hibernate and concurrency control is that it is very
easy to understand. Hibernate directly uses JDBC connections and JTA resources without
adding any additional locking behavior. We highly recommend you spend some time with the
JDBC, ANSI, and transaction isolation specification of your database management system.
Hibernate only adds automatic versioning but does not lock objects in memory or change the
isolation level of your database transactions. Basically, use Hibernate like you would
use direct JDBC (or JTA/CMT) with your database resources.
However, in addition to automatic versioning, Hibernate also offers a (minor) API for
pessimistic locking of rows, using the SELECT FOR UPDATE syntax. This API
is discussed later in this chapter.
We start the discussion of concurrency control in Hibernate with the granularity of
Configuration, SessionFactory, and
Session, as well as database and long application transactions.
Session and transaction scopes
A SessionFactory is an expensive-to-create, threadsafe object
intended to be shared by all application threads. It is created once, usually on
application startup, from a Configuration instance.
A Session is an inexpensive, non-threadsafe object that should be
used once, for a single business process, a single unit of work, and then discarded.
A Session will not obtain a JDBC Connection
(or a Datasource) unless it is needed, so you may safely open
and close a Session even if you are not sure that data access will
be needed to serve a particular request. (This becomes important as soon as you are
implementing some of the following patterns using request interception.)
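For example, a minimal sketch of these two scopes, assuming a hibernate.cfg.xml on the classpath and the usual org.hibernate imports:

// Created once at startup and shared by all application threads (expensive, threadsafe)
SessionFactory sessionFactory =
        new Configuration().configure().buildSessionFactory();

// Opened and closed per unit of work (inexpensive, not threadsafe)
Session session = sessionFactory.openSession();
try {
    // ... data access for this request ...
} finally {
    // a JDBC connection was only obtained if data access actually happened
    session.close();
}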
To complete this picture you also have to think about database transactions. A
database transaction has to be as short as possible, to reduce lock contention in
the database. Long database transactions will prevent your application from scaling
to highly concurrent load.
What is the scope of a unit of work? Can a single Hibernate Session
span several database transactions or is this a one-to-one relationship of scopes? When
should you open and close a Session and how do you demarcate the
database transaction boundaries?
Unit of work
First, don't use the session-per-operation antipattern, that is,
don't open and close a Session for every simple database call in
a single thread! Of course, the same is true for database transactions. Database calls
in an application are made using a planned sequence; they are grouped into atomic
units of work. (Note that this also means that auto-commit after every single
SQL statement is useless in an application; this mode is intended for ad-hoc SQL
console work. Hibernate disables auto-commit mode immediately, or expects the
application server to do so.)
The most common pattern in a multi-user client/server application is
session-per-request. In this model, a request from the client
is sent to the server (where the Hibernate persistence layer runs), a new Hibernate
Session is opened, and all database operations are executed in this unit
of work. Once the work has been completed (and the response for the client has been prepared),
the session is flushed and closed. You would also use a single database transaction to
serve the client's request, starting and committing it when you open and close the
Session. The relationship between the two is one-to-one and this
model is a perfect fit for many applications.
The challenge lies in the implementation: not only do the Session
and transaction have to be started and ended correctly, but they also have to be accessible for
data access operations. The demarcation of a unit of work is ideally implemented using an
interceptor that runs when a request hits the server and before the response is sent (i.e.
a ServletFilter). We recommend binding the Session to
the thread that serves the request, using a ThreadLocal variable. This allows
easy access (like accessing a static variable) in all code that runs in this thread. Depending
on the database transaction demarcation mechanism you choose, you might also keep the transaction
context in a ThreadLocal variable. The implementation patterns for this
are known as ThreadLocal Session and Open Session in View.
You can easily extend the HibernateUtil helper class shown earlier in this
documentation to implement this. Of course, you'd have to find a way to implement an interceptor
and set it up in your environment. See the Hibernate website for tips and examples.
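As an illustration, here is a minimal sketch of such a ThreadLocal-based helper; this is an assumed variant of the HibernateUtil class, and the method names are illustrative rather than part of any Hibernate API:

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class HibernateUtil {

    private static final SessionFactory sessionFactory =
            new Configuration().configure().buildSessionFactory();

    private static final ThreadLocal threadSession = new ThreadLocal();

    // Returns the Session bound to the current thread, opening one lazily
    public static Session getSession() {
        Session s = (Session) threadSession.get();
        if (s == null) {
            s = sessionFactory.openSession();
            threadSession.set(s);
        }
        return s;
    }

    // Called by the interceptor (e.g. a ServletFilter) when the request ends
    public static void closeSession() {
        Session s = (Session) threadSession.get();
        threadSession.set(null);
        if (s != null) s.close();
    }
}

A ServletFilter would typically call closeSession() in a finally block once the request has been processed.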
Application transactions
The session-per-request pattern is not the only useful concept you can use to design
units of work. Many business processes require a whole series of interactions with the user
interleaved with database accesses. In web and enterprise applications it is
not acceptable for a database transaction to span a user interaction. Consider the following
example:
The first screen of a dialog opens, the data seen by the user has been loaded in
a particular Session and database transaction. The user is free to
modify the objects.
The user clicks "Save" after 5 minutes and expects his modifications to be made
persistent; he also expects that he was the only person editing this information and
that no conflicting modification can occur.
We call this unit of work, from the point of view of the user, a long running
application transaction. There are many ways to implement
this in your application.
A first naive implementation might keep the Session and database
transaction open during user think time, with locks held in the database to prevent
concurrent modification, and to guarantee isolation and atomicity. This is of course
an anti-pattern, since lock contention would not allow the application to scale with
the number of concurrent users.
Clearly, we have to use several database transactions to implement the application
transaction. In this case, maintaining isolation of business processes becomes the
partial responsibility of the application tier. A single application transaction
usually spans several database transactions. It will be atomic if only one of
these database transactions (the last one) stores the updated data, all others
simply read data (e.g. in a wizard-style dialog spanning several request/response
cycles). This is easier to implement than it might sound, especially if
you use Hibernate's features:
Automatic Versioning - Hibernate can do automatic
optimistic concurrency control for you, it can automatically detect
if a concurrent modification occurred during user think time.
Detached Objects - If you decide to use the already
discussed session-per-request pattern, all loaded instances
will be in detached state during user think time. Hibernate allows you to
reattach the objects and persist the modifications; the pattern is called
session-per-request-with-detached-objects. Automatic
versioning is used to isolate concurrent modifications.
Long Session - The Hibernate Session may
be disconnected from the underlying JDBC connection after the database transaction
has been committed, and reconnected when a new client request occurs. This pattern
is known as session-per-application-transaction and makes
even reattachment unnecessary. Automatic versioning is used to isolate
concurrent modifications.
Both session-per-request-with-detached-objects and
session-per-application-transaction have advantages and disadvantages;
we discuss them later in this chapter in the context of optimistic concurrency control.
Considering object identity
An application may concurrently access the same persistent state in two
different Sessions. However, an instance of a persistent class
is never shared between two Session instances. Hence there are
two different notions of identity:
Database Identity
foo.getId().equals( bar.getId() )
JVM Identity
foo==bar
For objects attached to a particular Session
(i.e. in the scope of a Session) the two notions are equivalent, and
JVM identity for database identity is guaranteed by Hibernate. However, while the application
might concurrently access the "same" (persistent identity) business object in two different
sessions, the two instances will actually be "different" (JVM identity). Conflicts are
resolved using automatic versioning at flush/commit time, using an optimistic approach.
This approach leaves Hibernate and the database to worry about concurrency; it also provides
the best scalability, since guaranteeing identity only within single-threaded units of work
doesn't need expensive locking or other means of synchronization. The application never needs to
synchronize on any business object, as long as it sticks to a single thread per
Session. Within a Session the application may safely use
== to compare objects.
However, an application that uses == outside of a Session,
might see unexpected results. This might occur even in some unexpected places, for example,
if you put two detached instances into the same Set. Both might have the same
database identity (i.e. they represent the same row), but JVM identity is by definition not
guaranteed for instances in detached state. The developer has to override the equals()
and hashCode() methods in persistent classes and implement
his own notion of object equality. There is one caveat: Never use the database
identifier to implement equality, use a business key, a combination of unique, usually
immutable, attributes. The database identifier will change if a transient object is made
persistent. If the transient instance (usually together with detached instances) is held in a
Set, changing the hash code breaks the contract of the Set.
Attributes for business keys don't have to be as stable as database primary keys; you only
have to guarantee stability as long as the objects are in the same Set. See
the Hibernate website for a more thorough discussion of this issue. Also note that this is not
a Hibernate issue, but simply how Java object identity and equality has to be implemented.
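For example, a sketch for a hypothetical User class whose business key is an immutable username:

public class User {

    private Long id;          // database identifier, never used for equality
    private String username;  // business key: unique and practically immutable

    // ...

    public boolean equals(Object other) {
        if (this == other) return true;
        if ( !(other instanceof User) ) return false;
        final User that = (User) other;
        return this.username.equals( that.getUsername() );
    }

    public int hashCode() {
        return username.hashCode();
    }

    public String getUsername() { return username; }
}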
Common issues
Never use the anti-patterns session-per-user-session or
session-per-application (of course, there are rare exceptions to
this rule). Note that some of the following issues might also appear with the recommended
patterns; make sure you understand the implications before making a design decision:
A Session is not thread-safe. Things which are supposed to work
concurrently, like HTTP requests, session beans, or Swing workers, will cause race
conditions if a Session instance is shared. If you keep your
Hibernate Session in your HttpSession (discussed
later), you should consider synchronizing access to your HttpSession. Otherwise,
a user that clicks reload fast enough may use the same Session in
two concurrently running threads.
An exception thrown by Hibernate means you have to rollback your database transaction
and close the Session immediately (discussed later in more detail).
If your Session is bound to the application, you have to stop
the application. Rolling back the database transaction doesn't put your business
objects back into the state they were at the start of the transaction. This means the
database state and the business objects do get out of sync. Usually this is not a
problem, because exceptions are not recoverable and you have to start over after
rollback anyway.
The Session caches every object that is in persistent state (watched
and checked for dirty state by Hibernate). This means it grows endlessly until you
get an OutOfMemoryError, if you keep it open for a long time or simply load too
much data. One solution for this is to call clear() and evict()
to manage the Session cache, but you most likely should consider a
Stored Procedure if you need mass data operations. Some solutions are shown in
the chapter on batch processing. Keeping a Session open for the duration
of a user session also means a high probability of stale data.
Database transaction demarcation
Database (or system) transaction boundaries are always necessary. No communication with
the database can occur outside of a database transaction (this seems to confuse many developers
who are used to the auto-commit mode). Always use clear transaction boundaries, even for
read-only operations. Depending on your isolation level and database capabilities this might not
be required but there is no downside if you always demarcate transactions explicitly.
A Hibernate application can run in non-managed (i.e. standalone, simple Web- or Swing applications)
and managed J2EE environments. In a non-managed environment, Hibernate is usually responsible for
its own database connection pool. The application developer has to set transaction
boundaries manually, in other words, begin, commit, or roll back database transactions himself. A managed environment
usually provides container-managed transactions, with the transaction assembly defined declaratively
in deployment descriptors of EJB session beans, for example. Programmatic transaction demarcation is
then no longer necessary, even flushing the Session is done automatically.
However, it is often desirable to keep your persistence layer portable. Hibernate offers a wrapper
API called Transaction that translates into the native transaction system of
your deployment environment. This API is optional (using database transactions is not!) and you don't
have to use it if database portability provided by Hibernate is all you need.
Usually, ending a Session involves four distinct phases:
flush the session
commit the transaction
close the session
handle exceptions
Flushing the session has been discussed earlier; we'll now have a closer look at transaction
demarcation and exception handling in both managed and non-managed environments.
Non-managed environment
If a Hibernate persistence layer runs in a non-managed environment, database connections
are either handled by Hibernate's pooling mechanism or provided by the developer (this
case has other implications, especially with regard to caching):
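A sketch of this idiom, handling the JDBC transaction manually and flushing the Session explicitly; the factory parameter stands for the SessionFactory created at startup, and the usual org.hibernate and java.sql imports are assumed:

public void doSomeWork(SessionFactory factory) throws SQLException {
    Session session = factory.openSession();
    try {
        // ... load and modify objects ...

        session.flush();                    // synchronize state with the database
        session.connection().commit();      // commit the JDBC transaction
    }
    catch (RuntimeException e) {
        session.connection().rollback();    // an error occurred, roll back
        throw e;                            // and let a higher layer handle it
    }
    finally {
        session.close();                    // always close the Session
    }
}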
Note that you will very likely never see this piece of code in a normal application;
fatal (system) exceptions should always be caught at the "top". In other words, the
code that executes Hibernate calls (in the persistence layer) and the code that handles
RuntimeException (and usually can only clean up and exit) are in
different layers. This can be a challenge to design yourself and you should use J2EE/EJB
container services whenever they are available. Exception handling is discussed later in
this chapter.
We recommend using the Transaction API even if persistence layer
portability is not your primary concern:
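The same unit of work, sketched with the Transaction API (again, factory stands for the application's SessionFactory):

public void doSomeWork(SessionFactory factory) {
    Session session = factory.openSession();
    Transaction tx = null;
    try {
        tx = session.beginTransaction();

        // ... load and modify objects ...

        tx.commit();                     // flushes the Session and commits
    }
    catch (RuntimeException e) {
        if (tx != null) tx.rollback();   // roll back on any failure
        throw e;                         // or display an error message
    }
    finally {
        session.close();
    }
}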
Note that you don't have to flush() the Session
explicitly; the call to commit() automatically triggers the
synchronization. This piece of code is now portable and runs in non-managed and JTA
environments. See the configuration chapter for
the configuration options of the Transaction API and how it can
be mapped to the underlying resource transaction system.
A call to close() marks the end of a session. The main implication
of close() is that the JDBC connection will be relinquished by the session.
If you provided your own connection, close() returns a reference
to it, so you can manually close it or return it to the pool. Otherwise close()
returns it to the pool.
Using JTA
If your persistence layer runs in an application server (e.g. behind EJB session beans),
transaction boundaries are defined in deployment descriptors. Every datasource connection
obtained by Hibernate will automatically be part of a global JTA transaction. Hibernate
simply joins this transaction, or if a particular session bean method has no mandatory
transaction, Hibernate will tell the application server to start and end a transaction
directly. (The latter should be considered a very rare case and is offered for consistency
reasons. Note that your container might not allow mixed CMT and BMT behavior.)
If you set the properties hibernate.transaction.flush_before_completion
and hibernate.transaction.auto_close_session to true,
Hibernate will also automatically flush and close the Session for you.
The only thing left is exception handling and rollback of the database transaction.
Fortunately, even this happens automatically, since an unhandled RuntimeException
thrown by a session bean method tells the container to set the global transaction to
rollback.
In other words, all you have to do in a managed environment is to get a Session
from the SessionFactory (usually bound to JNDI), do your data access
work, and leave the rest to the container. Transaction boundaries are set declaratively in
the deployment descriptors of your session bean.
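For example, a sketch of a CMT session bean method; the JNDI name is only an example, the two properties mentioned above are assumed to be enabled, and the javax.naming imports are required:

public void doSomeWork() throws NamingException {
    // the SessionFactory is assumed to be bound to this (example) JNDI name
    SessionFactory factory = (SessionFactory)
            new InitialContext().lookup("java:hibernate/SessionFactory");

    Session session = factory.openSession();

    // ... load and modify objects; the container has already started the JTA
    // transaction, and an unhandled RuntimeException marks it for rollback ...

    // no flush(), commit(), or close() is needed here, since
    // flush_before_completion and auto_close_session are enabled
}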
Exception handling
If the Session throws an exception (including any
SQLException), you should immediately rollback the database
transaction, call Session.close() and discard the
Session instance. Certain methods of Session
will not leave the session in a consistent state. No
exception thrown by Hibernate can be treated as recoverable. Ensure that the
Session will be closed by calling close()
in a finally block.
The HibernateException, which wraps most of the errors that
can occur in a Hibernate persistence layer, is an unchecked exception (it wasn't
in older versions of Hibernate). In our opinion, we shouldn't force the application
developer to catch an unrecoverable exception at a low layer. In most systems, unchecked
and fatal exceptions are only caught at the highest level of the call stack and an
error message is presented to the application user (or some other appropriate action
is taken). Note that Hibernate might also throw other unchecked exceptions (e.g.
when detecting stale data in version checks) which are not a
HibernateException. These are, again, not recoverable and appropriate
action should be taken.
Optimistic concurrency control
The only approach that is consistent with high concurrency and high
scalability is optimistic concurrency control with versioning. Version
checking uses version numbers, or timestamps, to detect conflicting updates
(and to prevent lost updates). Hibernate provides for three possible approaches
to writing application code that uses optimistic concurrency. The use cases
we show are in the context of long application transactions but version checking
also has the benefit of preventing lost updates in single database transactions.
Application version checking
In an implementation without much help from Hibernate, each interaction with the
database occurs in a new Session and the developer is responsible
for reloading all persistent instances from the database before manipulating them.
This approach forces the application to carry out its own version checking to ensure
application transaction isolation. This approach is the least efficient in terms of
database access. It is the approach most similar to entity EJBs.
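A sketch of this idiom for a hypothetical Foo entity; getKey(), getVersion(), and setProperty() are illustrative accessors, and factory stands for the SessionFactory:

// foo was loaded in an earlier Session; its version has been remembered
int oldVersion = foo.getVersion();

Session session = factory.openSession();
Transaction tx = session.beginTransaction();

Foo current = (Foo) session.load( Foo.class, foo.getKey() );   // reload current state
if ( oldVersion != current.getVersion() ) {
    // somebody else changed the row during user think time
    throw new StaleObjectStateException( Foo.class.getName(), foo.getKey() );
}
current.setProperty("bar");   // apply the user's changes to the fresh instance

tx.commit();
session.close();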
The version property is mapped using <version>,
and Hibernate will automatically increment it during flush if the entity is
dirty.
Of course, if you are operating in a low-data-concurrency environment and don't
require version checking, you may use this approach and just skip the version
check. In that case, last commit wins will be the default
strategy for your long application transactions. Keep in mind that this might
confuse the users of the application, as they might experience lost updates without
error messages or a chance to merge conflicting changes.
Clearly, manual version checking is only feasible in very trivial circumstances
and not practical for most applications. Often not only single instances, but
complete graphs of modified objects have to be checked. Hibernate offers automatic
version checking with either long Session or detached instances
as the design paradigm.
Long session and automatic versioning
A single Session instance and its persistent instances are
used for the whole application transaction. Hibernate checks instance versions
at flush time, throwing an exception if concurrent modification is detected.
It's up to the developer to catch and handle this exception (common options
are the opportunity for the user to merge changes or to restart the business
process with non-stale data).
The Session is disconnected from any underlying JDBC connection
when waiting for user interaction. This approach is the most efficient in terms
of database access. The application need not concern itself with version checking or
with reattaching detached instances, nor does it have to reload instances in every
database transaction.
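A sketch of one request/response cycle in this pattern; foo is an instance loaded earlier by this same, meanwhile disconnected, Session, and setProperty() is an illustrative accessor:

// a new client request arrives
session.reconnect();                       // obtain a new JDBC connection

Transaction tx = session.beginTransaction();
foo.setProperty("bar");                    // the change is tracked by the Session
tx.commit();                               // flush: version check and UPDATE happen here

session.disconnect();                      // return the connection to the pool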
The foo object still knows which Session it was
loaded in. Session.reconnect() obtains a new connection (or you
may supply one) and resumes the session. The method Session.disconnect() will disconnect the session from
the JDBC connection and return the connection to the pool (unless you provided the
connection). After reconnection, to force a version check on data you aren't updating, you
may call Session.lock() with LockMode.READ on any
objects that might have been updated by another transaction. You don't need to lock any
data that you are updating.
This pattern is problematic if the Session is too big to
be stored during user think time, e.g. an HttpSession should
be kept as small as possible. As the Session is also the
(mandatory) first-level cache and contains all loaded objects, we can probably
use this strategy only for a few request/response cycles. This is indeed
recommended, as the Session will soon also have stale data.
Also note that you should keep the disconnected Session close
to the persistence layer. In other words, use an EJB stateful session bean to
hold the Session and don't transfer it to the web layer (or
even serialize it to a separate tier) to store it in the HttpSession.
Detached objects and automatic versioning
Each interaction with the persistent store occurs in a new Session.
However, the same persistent instances are reused for each interaction with the database.
The application manipulates the state of detached instances originally loaded in another
Session and then reattaches them using Session.update(),
Session.saveOrUpdate(), or Session.merge().
Again, Hibernate will check instance versions during flush, throwing an
exception if conflicting updates occurred.
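A sketch of this idiom; foo is a detached instance from a previous Session, setProperty() is an illustrative accessor, and factory stands for the SessionFactory:

// the user modified the detached instance during think time
foo.setProperty("bar");

Session session = factory.openSession();   // a new Session for this interaction
Transaction tx = session.beginTransaction();

session.saveOrUpdate(foo);   // reattach; use merge() if the Session might already
                             // contain an instance with the same identifier

tx.commit();                 // flush: the version check happens here
session.close();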
You may also call lock() instead of update()
and use LockMode.READ (performing a version check, bypassing all
caches) if you are sure that the object has not been modified.
Customizing automatic versioning
You may disable Hibernate's automatic version increment for particular properties and
collections by setting the optimistic-lock mapping attribute to
false. Hibernate will then no longer increment versions if the
property is dirty.
Legacy database schemas are often static and can't be modified. Or, other applications
might also access the same database and don't know how to handle version numbers or
even timestamps. In both cases, versioning can't rely on a particular column in a table.
To force a version check without a version or timestamp property mapping, with a
comparison of the state of all fields in a row, turn on optimistic-lock="all"
in the <class> mapping. Note that this conceptually only works
if Hibernate can compare the old and new state, i.e. if you use a single long
Session and not session-per-request-with-detached-objects.
Sometimes concurrent modification can be permitted as long as the changes that have been
made don't overlap. If you set optimistic-lock="dirty" when mapping the
<class>, Hibernate will only compare dirty fields during flush.
In both cases, with dedicated version/timestamp columns or with full/dirty field
comparison, Hibernate uses a single UPDATE statement (with an
appropriate WHERE clause) per entity to execute the version check
and update the information. If you use transitive persistence to cascade reattachment
to associated entities, Hibernate might execute unnecessary updates. This is usually
not a problem, but ON UPDATE triggers in the database might be
executed even when no changes have been made to detached instances. You can customize
this behavior by setting select-before-update="true" in the
<class> mapping, forcing Hibernate to SELECT
the instance to ensure that changes did actually occur, before updating the row.
Pessimistic locking
It is not intended that users spend much time worrying about locking strategies. It's usually
enough to specify an isolation level for the JDBC connections and then simply let the
database do all the work. However, advanced users may sometimes wish to obtain
exclusive pessimistic locks, or re-obtain locks at the start of a new transaction.
Hibernate will always use the locking mechanism of the database, never lock objects
in memory!
The LockMode class defines the different lock levels that may be acquired
by Hibernate. A lock is obtained by the following mechanisms:
LockMode.WRITE is acquired automatically when Hibernate updates or inserts
a row.
LockMode.UPGRADE may be acquired upon explicit user request using
SELECT ... FOR UPDATE on databases which support that syntax.
LockMode.UPGRADE_NOWAIT may be acquired upon explicit user request using a
SELECT ... FOR UPDATE NOWAIT under Oracle.
LockMode.READ is acquired automatically when Hibernate reads data
under Repeatable Read or Serializable isolation level. May be re-acquired by explicit user
request.
LockMode.NONE represents the absence of a lock. All objects switch to this
lock mode at the end of a Transaction. Objects associated with the session
via a call to update() or saveOrUpdate() also start out
in this lock mode.
The "explicit user request" is expressed in one of the following ways:
A call to Session.load(), specifying a LockMode.
A call to Session.lock().
A call to Query.setLockMode().
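For example, a sketch of these three forms, using a hypothetical Item entity and identifier:

// load with an exclusive pessimistic lock (SELECT ... FOR UPDATE)
Item item = (Item) session.load( Item.class, itemId, LockMode.UPGRADE );

// re-obtain a lock (with a version check) for an instance loaded earlier
session.lock( item, LockMode.UPGRADE_NOWAIT );

// lock the rows returned by a query
Query q = session.createQuery("from Item i");
q.setLockMode( "i", LockMode.UPGRADE );
List items = q.list();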
If Session.load() is called with UPGRADE or
UPGRADE_NOWAIT, and the requested object was not yet loaded by
the session, the object is loaded using SELECT ... FOR UPDATE.
If load() is called for an object that is already loaded with
a less restrictive lock than the one requested, Hibernate calls
lock() for that object.
Session.lock() performs a version number check if the specified lock
mode is READ, UPGRADE or
UPGRADE_NOWAIT. (In the case of UPGRADE or
UPGRADE_NOWAIT, SELECT ... FOR UPDATE is used.)
If the database does not support the requested lock mode, Hibernate will use an appropriate
alternate mode (instead of throwing an exception). This ensures that applications will
be portable.