Transactions And Concurrency

The most important point about Hibernate and concurrency control is that it is very easy to understand. Hibernate directly uses JDBC connections and JTA resources without adding any additional locking behavior. We highly recommend you spend some time with the JDBC, ANSI, and transaction isolation specification of your database management system. Hibernate only adds automatic versioning; it does not lock objects in memory or change the isolation level of your database transactions. Basically, use Hibernate as you would use direct JDBC (or JTA/CMT) with your database resources.

However, in addition to automatic versioning, Hibernate also offers a (minor) API for pessimistic locking of rows, using the SELECT FOR UPDATE syntax. This API is discussed later in this chapter.

We start the discussion of concurrency control in Hibernate with the granularity of Configuration, SessionFactory, and Session, as well as database and long application transactions.

Session and transaction scopes

A SessionFactory is an expensive-to-create, threadsafe object intended to be shared by all application threads. It is created once, usually on application startup, from a Configuration instance.

A Session is an inexpensive, non-threadsafe object that should be used once, for a single business process (a single unit of work), and then discarded. A Session will not obtain a JDBC Connection (or a Datasource) unless it is needed, so you may safely open and close a Session even if you are not sure that data access will be needed to serve a particular request. (This becomes important as soon as you implement some of the following patterns using request interception.)

To complete this picture you also have to think about database transactions. A database transaction has to be as short as possible, to reduce lock contention in the database. Long database transactions will prevent your application from scaling to highly concurrent load.
What is the scope of a unit of work? Can a single Hibernate Session span several database transactions, or is this a one-to-one relationship of scopes? When should you open and close a Session, and how do you demarcate the database transaction boundaries?

Unit of work

First, don't use the session-per-operation antipattern; that is, don't open and close a Session for every simple database call in a single thread! Of course, the same is true for database transactions. Database calls in an application are made in a planned sequence; they are grouped into atomic units of work. (Note that this also means that auto-commit after every single SQL statement is useless in an application; this mode is intended for ad-hoc SQL console work. Hibernate disables auto-commit mode immediately, or expects the application server to do so.)

The most common pattern in a multi-user client/server application is session-per-request. In this model, a request from the client is sent to the server (where the Hibernate persistence layer runs), a new Hibernate Session is opened, and all database operations are executed in this unit of work. Once the work has been completed (and the response for the client has been prepared), the session is flushed and closed. You would also use a single database transaction to serve the client's request, starting and committing it when you open and close the Session. The relationship between the two is one-to-one, and this model is a perfect fit for many applications.

The challenge lies in the implementation: not only do the Session and transaction have to be started and ended correctly, they also have to be accessible for data access operations. The demarcation of a unit of work is ideally implemented using an interceptor that runs when a request hits the server and before the response is sent (e.g. a servlet filter). We recommend binding the Session to the thread that serves the request, using a ThreadLocal variable.
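The thread-bound Session idea can be sketched in plain Java, so that it runs without Hibernate on the classpath. This is only an illustration under stated assumptions: FakeSession, SessionHolder, serveRequest, and sessionSeenByOtherThread are hypothetical names invented here. FakeSession stands in for a Hibernate Session, and serveRequest plays the role of the request interceptor. In a real application the holder would presumably open sessions from the shared SessionFactory.

```java
import java.util.function.Supplier;

class ThreadLocalSessionSketch {

    /** Hypothetical stand-in for a Hibernate Session; not a Hibernate class. */
    static class FakeSession {
        private boolean open = true;
        boolean isOpen() { return open; }
        void close() { open = false; }
    }

    /** Keeps at most one session per thread and opens it lazily. */
    static class SessionHolder {
        private final ThreadLocal<FakeSession> current = new ThreadLocal<>();
        private final Supplier<FakeSession> opener;

        SessionHolder(Supplier<FakeSession> opener) {
            this.opener = opener;
        }

        /** Returns this thread's session, opening one on first access. */
        FakeSession currentSession() {
            FakeSession session = current.get();
            if (session == null) {
                session = opener.get();
                current.set(session);
            }
            return session;
        }

        /** Unbinds and closes this thread's session, if one was opened. */
        void closeSession() {
            FakeSession session = current.get();
            current.remove();
            if (session != null) {
                session.close();
            }
        }
    }

    /** Plays the role of the request interceptor: run the work, then always clean up. */
    static void serveRequest(SessionHolder holder, Runnable work) {
        try {
            work.run();
        } finally {
            holder.closeSession();
        }
    }

    /** Demo helper: reports which session a different thread is given (and closes it). */
    static FakeSession sessionSeenByOtherThread(SessionHolder holder) {
        FakeSession[] seen = new FakeSession[1];
        Thread worker = new Thread(() -> {
            seen[0] = holder.currentSession();
            holder.closeSession();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        SessionHolder holder = new SessionHolder(FakeSession::new);
        FakeSession[] seen = new FakeSession[2];
        serveRequest(holder, () -> {
            // Any code running in this thread reaches the same session.
            seen[0] = holder.currentSession();
            seen[1] = holder.currentSession();
        });
        System.out.println("same session within request: " + (seen[0] == seen[1]));
        System.out.println("closed after request: " + !seen[0].isOpen());
        System.out.println("other thread gets its own: "
                + (sessionSeenByOtherThread(holder) != seen[0]));
    }
}
```

Opening the session lazily mirrors the point made earlier that a Session is cheap and obtains a connection only when needed; removing the ThreadLocal entry in a finally block matters because server threads are typically pooled and reused across requests.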
A thread-bound Session allows easy access (like accessing a static variable) in all code that runs in this thread. Depending on the database transaction demarcation mechanism you choose, you might also keep the transaction context in a ThreadLocal variable. The implementation patterns for this are known as ThreadLocal Session and Open Session in View. You can easily extend the helper class shown in the first chapter to implement this. Of course, you'd have to find a way to implement an interceptor and set it up in your environment. See the Hibernate website for tips and examples.

Application transactions

The session-per-request pattern is not the only useful concept you can use to design units of work. Many business processes require a whole series of interactions with the user interleaved with database accesses. In web and enterprise applications it is not acceptable for a database transaction to span a user interaction. Consider the following example:

- The first screen of a dialog opens; the data seen by the user has been loaded in a particular Session and database transaction. The user is free to modify the objects.
- The user clicks "Save" after 5 minutes and expects the modifications to be made persistent. The user also expects to have been the only person editing this information, and that no conflicting modification can occur.

We call this unit of work, from the point of view of the user, a long running application transaction. There are many ways to implement this in your application.

A first naive implementation might keep the Session and database transaction open during user think time, with locks held in the database to prevent concurrent modification and to guarantee isolation and atomicity. This is of course an anti-pattern, since lock contention would not allow the application to scale with the number of concurrent users.

Clearly, we have to use several database transactions to implement the application transaction.
In this case, maintaining isolation of business processes becomes the partial responsibility of the application tier. A single application transaction usually spans several database transactions. It will be atomic if only one of these database transactions (the last one) stores the updated data, while all others simply read data (e.g. in a wizard-style dialog spanning several request/response cycles). This is easier to implement than it might sound, especially if you use Hibernate's features:

- Automatic Versioning: Hibernate can do automatic optimistic concurrency control for you; it can automatically detect whether a concurrent modification occurred during user think time.
- Detached Objects: If you decide to use the already discussed session-per-request pattern, all loaded instances will be in detached state during user think time. Hibernate allows you to reattach the objects and persist the modifications; the pattern is called session-per-request-with-detached-objects. Automatic versioning is used to isolate concurrent modifications.
- Long Session: The Hibernate Session may be disconnected from the underlying JDBC connection after the database transaction has been committed, and reconnected when a new client request arrives. This pattern is known as session-per-application-transaction and makes even reattachment unnecessary. Automatic versioning is used to isolate concurrent modifications.

Both session-per-request-with-detached-objects and session-per-application-transaction have advantages and disadvantages; we discuss them later in this chapter in the context of optimistic concurrency control.

Considering object identity

An application may concurrently access the same persistent state in two different Sessions. However, an instance of a persistent class is never shared between two Session instances.
Hence there are two different notions of identity:

- Database identity: foo.getId().equals( bar.getId() )
- JVM identity: foo==bar

For objects attached to a particular Session, the two notions are equivalent. However, while the application might concurrently access the "same" (persistent identity) business object in two different sessions, the two instances will actually be "different" (JVM identity).

This approach leaves Hibernate and the database to worry about concurrency. The application never needs to synchronize on any business object, as long as it sticks to a single thread per Session or object identity. Within a Session the application may safely use == to compare objects.

Note that an application that uses == outside of a Session might see unexpected results. You have to override the equals() and hashCode() methods in your persistent classes and implement your own notion of object equality if, for example, you keep detached instances from two Sessions in the same Set. Never use the database identifier to implement equality; use a business key, a combination of unique, usually immutable, attributes. The database identifier will change if a transient object is made persistent, which changes the hash code and breaks the contract of any Set that already holds the object. Attributes for business keys don't have to be as stable as database primary keys; you only have to guarantee stability as long as the objects are in the same Set. See the Hibernate website for a more thorough discussion of this issue. Also note that this is not a Hibernate problem, but simply how Java object identity and equality have to be implemented.

Common issues

Never use the anti-patterns session-per-user-session or session-per-application (of course, there are rare exceptions to this rule). Note that some of the following issues might also appear with the recommended patterns, so make sure you understand the implications before making a design decision:

A Session is not thread-safe.
Things which are supposed to work concurrently, like HTTP requests, session beans, or Swing workers, will cause race conditions if they share a Session instance. If you keep your Hibernate Session in your HTTP user session, you should consider synchronizing access to your HTTP session. Otherwise, a user who clicks reload fast enough may use the same Session in two concurrently running threads.

An exception thrown by Hibernate means you have to roll back your database transaction and close the Session immediately (discussed later in more detail). If your Session is bound to the application, you have to stop the application. Rolling back the database transaction doesn't put your business objects back into the state they were in at the start of the transaction. This means the database state and the business objects do get out of sync. Usually this is not a problem, because exceptions are not recoverable and you have to start over after rollback anyway.

The Session caches every object that is in persistent state (watched and checked for dirty state by Hibernate). This means it grows endlessly until you get an OutOfMemoryError if you keep it open for a long time or simply load too much data. One solution for this is to call clear() and evict() to manage the Session cache, but you most likely should consider a stored procedure if you need mass data operations. Keeping a Session open for the duration of a user session also means a high probability of stale data.

Database transaction demarcation

Database (or system) transaction boundaries are always necessary. No communication with the database can occur outside of a database transaction (this seems to confuse many developers who are used to the auto-commit mode). Always use clear transaction boundaries, even for read-only operations (depending on your isolation level, this might not be required, but there is no downside if you always demarcate transactions explicitly).

A Hibernate application can run in non-managed (e.g.
standalone, simple web or Swing applications) and managed J2EE environments. In a non-managed environment, Hibernate is usually responsible for its own database connection pool. The application developer has to set transaction boundaries manually; in other words, begin, commit, or roll back database transactions in application code. A managed environment usually provides container-managed transactions, with the transaction assembly defined declaratively, for example in the deployment descriptors of EJB session beans. Programmatic transaction demarcation is then no longer necessary; even flushing the Session is done automatically.

However, it is often desirable to keep your persistence layer portable. Hibernate offers a wrapper API called Transaction that translates into the native transaction system of your deployment environment. This API is optional (using database transactions is not!) and you don't have to use it if the database portability provided by Hibernate is all you need.

Usually, ending a Session involves four distinct phases:

- flush the session
- commit the transaction
- close the session
- handle exceptions

Flushing the session has been discussed earlier; we'll now have a closer look at transaction demarcation and exception handling in both managed and non-managed environments.

Non-managed environment

If a Hibernate persistence layer runs in a non-managed environment, database connections are either handled by Hibernate's pooling mechanism or provided by the developer (this case has other implications, especially with regard to caching).

We recommend the Transaction API even if persistence layer portability is not your primary concern. Note that you don't have to flush() the Session explicitly; the call to commit() automatically triggers the synchronization. Code written against the Transaction API is portable and runs in non-managed and JTA environments. See the reference documentation for more information about the Transaction API.

A call to close() marks the end of a session.
The main implication of close() is that the JDBC connection will be relinquished by the session. If you provided your own connection, close() returns a reference to it, so you can manually close it or return it to the pool. Otherwise close() returns it to the pool.

Using JTA

If your persistence layer runs in an application server (e.g. behind EJB session beans), transaction boundaries are defined in deployment descriptors. Every datasource connection obtained by Hibernate will automatically be part of a global JTA transaction. Hibernate simply joins this transaction or, if a particular session bean method has no transaction, Hibernate will tell the application server to start and end a transaction directly.

If you set the properties hibernate.transaction.flush_before_completion and hibernate.transaction.auto_close_session to true, Hibernate will also automatically flush and close the Session for you. The only thing left is exception handling and rollback of the database transaction. Fortunately, even this happens automatically, since an unhandled RuntimeException thrown by a session bean method tells the container to set the global transaction to rollback.

In other words, all you have to do in a managed environment is to get a Session from the SessionFactory (usually bound to JNDI), do your data access work, and leave the rest to the container. Transaction boundaries are set declaratively in the deployment descriptors of your session bean.

Exception handling

If the Session throws an exception (including any SQLException), you should immediately roll back the database transaction, call Session.close(), and discard the Session instance. Certain methods of Session will not leave the session in a consistent state. No exception thrown by Hibernate can be treated as recoverable.

The HibernateException, which wraps most of the errors that can occur in a Hibernate persistence layer, is an unchecked exception (it wasn't in older versions of Hibernate).
In our opinion, the application developer should not be forced to catch an unrecoverable exception at a low layer. In most systems, unchecked and fatal exceptions are only caught at the highest level of the call stack, where an error message is presented to the application user (or some other appropriate action is taken). Note that Hibernate might also throw other unchecked exceptions (e.g. when detecting stale data in version checks) which are not a HibernateException.

TODO: document new SQLException converter
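The begin, commit, roll back on failure, and always close sequence described in this chapter can be sketched as follows. To keep the example runnable without Hibernate, it uses hypothetical stand-ins: Sess and Tx are interfaces invented here that only borrow the method names beginTransaction(), commit(), rollback(), and close(); RecordingSession and execute are likewise illustrative and not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class UnitOfWorkSketch {

    /** Hypothetical stand-in for a transaction handle. */
    interface Tx {
        void commit();
        void rollback();
    }

    /** Hypothetical stand-in for a session. */
    interface Sess {
        Tx beginTransaction();
        void close();
    }

    /** Records every call so the order of operations can be inspected. */
    static class RecordingSession implements Sess {
        final List<String> log = new ArrayList<>();

        @Override
        public Tx beginTransaction() {
            log.add("begin");
            return new Tx() {
                @Override public void commit() { log.add("commit"); }
                @Override public void rollback() { log.add("rollback"); }
            };
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    /** Runs one unit of work inside one transaction and always closes the session. */
    static void execute(Sess session, Runnable work) {
        Tx tx = null;
        try {
            tx = session.beginTransaction();
            work.run();
            tx.commit();
        } catch (RuntimeException e) {
            // Nothing is recoverable here: roll back and let the caller decide what to show.
            if (tx != null) {
                tx.rollback();
            }
            throw e;
        } finally {
            // The session is discarded whether the work succeeded or failed.
            session.close();
        }
    }

    public static void main(String[] args) {
        RecordingSession ok = new RecordingSession();
        execute(ok, () -> { /* data access work would go here */ });
        System.out.println(ok.log);

        RecordingSession failing = new RecordingSession();
        try {
            execute(failing, () -> { throw new IllegalStateException("simulated failure"); });
        } catch (IllegalStateException e) {
            System.out.println(failing.log);
        }
    }
}
```

Rethrowing the exception rather than swallowing it keeps the handling decision at the top of the call stack, which matches the argument above for unchecked exceptions.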