Working with objects

Hibernate is a full object/relational mapping solution that not only shields the developer from the details of the underlying database management system, but also offers state management of objects. In contrast to the management of SQL statements in common JDBC/SQL persistence layers, this is a very natural object-oriented view of persistence in Java applications.

In other words, Hibernate application developers should always think about the state of their objects, and not necessarily about the execution of SQL statements. That part is taken care of by Hibernate and is only relevant for the application developer when tuning the performance of the system.

Hibernate object states

Hibernate defines and supports the following object states:

Transient - an object is transient if it has just been instantiated using the new operator and is not associated with a Hibernate Session. It has no persistent representation in the database and no identifier value has been assigned. Transient instances will be destroyed by the garbage collector if the application no longer holds a reference to them. Use the Hibernate Session to make an object persistent (and let Hibernate take care of the SQL statements that need to be executed for this transition).

Persistent - a persistent instance has a representation in the database and an identifier value. It might just have been saved or loaded; either way, it is by definition in the scope of a Session. Hibernate will detect any changes made to an object in persistent state and synchronize the state with the database when the unit of work completes. Developers don't execute manual UPDATE statements, or DELETE statements when an object should be made transient.

Detached - a detached instance is an object that has been persistent, but its Session has been closed. The reference to the object is still valid, of course, and the detached instance might even be modified in this state.
A detached instance can be reattached to a new Session at a later point in time, making it (and all the modifications) persistent again. This feature enables a programming model for long running units of work that require user think-time. We call them application transactions, i.e. a unit of work from the point of view of the user.

We'll now discuss the states and state transitions (and the Hibernate methods that trigger a transition) in more detail.

Making objects persistent

Newly instantiated instances of a persistent class are considered transient by Hibernate. We can make a transient instance persistent by associating it with a session, for example by passing it to save().

If Cat has a generated identifier, the identifier is generated and assigned to the cat when save() is called. If Cat has an assigned identifier, or a composite key, the identifier should be assigned to the cat instance before calling save(). You may also use persist() instead of save(), with the semantics defined in the EJB3 early draft.

Alternatively, you may assign the identifier using an overloaded version of save().

If the object you make persistent has associated objects (e.g. the kittens collection of a Cat), these objects may be made persistent in any order you like unless you have a NOT NULL constraint upon a foreign key column. There is never a risk of violating foreign key constraints. However, you might violate a NOT NULL constraint if you save() the objects in the wrong order.

Usually you don't bother with this detail, as you'll very likely use Hibernate's transitive persistence feature to save the associated objects automatically. Then even NOT NULL constraint violations don't occur, because Hibernate takes care of the ordering. Transitive persistence is discussed later in this chapter.

Loading an object

The load() methods of Session give you a way to retrieve a persistent instance if you already know its identifier.
load() takes a class object and will load the state into a newly instantiated instance of that class, in persistent state. Alternatively, you can load state into a given instance.

Note that load() will throw an unrecoverable exception if there is no matching database row. If the class is mapped with a proxy, load() just returns an uninitialized proxy and does not actually hit the database until you invoke a method of the proxy. This behavior is very useful if you wish to create an association to an object without actually loading it from the database. It also allows multiple instances to be loaded as a batch if batch-size is defined for the class mapping.

If you are not certain that a matching row exists, you should use the get() method, which hits the database immediately and returns null if there is no matching row.

You may even load an object using an SQL SELECT ... FOR UPDATE, using a LockMode. See the API documentation for more information. Note that any associated instances or contained collections are not selected FOR UPDATE, unless you decide to specify lock or all as a cascade style for the association.

It is possible to re-load an object and all its collections at any time, using the refresh() method. This is useful when database triggers are used to initialize some of the properties of the object.

An important question usually appears at this point: How much does Hibernate load from the database and how many SQL SELECTs will it use? This depends on the fetching strategy, which is explained elsewhere in this documentation.

Querying

If you don't know the identifiers of the objects you are looking for, you need a query. Hibernate supports an easy-to-use but powerful object-oriented query language (HQL). For programmatic query creation, Hibernate supports a sophisticated Criteria and Example query feature (QBC and QBE). You may also express your query in the native SQL of your database, with optional support from Hibernate for result set conversion into objects.
Executing queries

HQL and native SQL queries are represented with an instance of org.hibernate.Query. This interface offers methods for parameter binding, result set handling, and for the execution of the actual query. You always obtain a Query using the current Session.

A query is usually executed by invoking list(); the result of the query will be loaded completely into a collection in memory. Entity instances retrieved by a query are in persistent state. The uniqueResult() method offers a shortcut if you know your query will only return a single object.

Iterating results

Occasionally, you might be able to achieve better performance by executing the query using the iterate() method. This will usually only be the case if you expect that the actual entity instances returned by the query will already be in the session or second-level cache. If they are not already cached, iterate() will be slower than list() and might require many database hits for a simple query: usually 1 for the initial select, which only returns identifiers, and n additional selects to initialize the actual instances.

Queries that return tuples

Hibernate queries sometimes return tuples of objects, in which case each tuple is returned as an array.

Scalar results

Queries may specify a property of a class in the select clause. They may even call SQL aggregate functions. Properties or aggregates are considered "scalar" results (and not entities in persistent state).

Bind parameters

Methods on Query are provided for binding values to named parameters or JDBC-style ? parameters. Contrary to JDBC, Hibernate numbers parameters from zero. Named parameters are identifiers of the form :name in the query string.
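The relationship between the two parameter styles can be pictured with a small plain-Java sketch. It is only a toy model, not Hibernate's query parser: the class NamedParameterSketch and its fields are hypothetical names introduced for illustration, and the simple regular expression assumes that a colon followed by an identifier always denotes a parameter. The sketch rewrites each :name occurrence to a ? placeholder and records the zero-based position of every occurrence.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy model only: hypothetical helper, not part of Hibernate. */
final class NamedParameterSketch {
    // Assumption: ":" followed by an identifier is always a named parameter.
    private static final Pattern NAMED = Pattern.compile(":([A-Za-z_][A-Za-z0-9_]*)");

    /** The query with every :name replaced by a positional ? placeholder. */
    final String positionalQuery;

    /** For each parameter name, the zero-based positions where it occurs. */
    final Map<String, List<Integer>> positions = new LinkedHashMap<>();

    NamedParameterSketch(String query) {
        Matcher m = NAMED.matcher(query);
        StringBuffer out = new StringBuffer();
        int next = 0; // positions are counted from zero, as the text describes
        while (m.find()) {
            positions.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(next++);
            m.appendReplacement(out, "?");
        }
        m.appendTail(out);
        positionalQuery = out.toString();
    }
}
```

For the query string "from Cat c where c.name = :name and c.mate.name = :name and c.weight > :w", the sketch yields three ? placeholders, maps name to positions 0 and 1, and maps w to position 2. A single named parameter therefore stands in for several positional ones.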
The advantages of named parameters are:

- named parameters are insensitive to the order they occur in the query string
- they may occur multiple times in the same query
- they are self-documenting

Pagination

If you need to specify bounds upon your result set (the maximum number of rows you want to retrieve and/or the first row you want to retrieve), you should use the setFirstResult() and setMaxResults() methods of the Query interface. Hibernate knows how to translate this limit query into the native SQL of your DBMS.

Scrollable iteration

If your JDBC driver supports scrollable ResultSets, the Query interface may be used to obtain a ScrollableResults object, which allows flexible navigation of the query results.

Note that an open database connection (and cursor) is required for this functionality. Use setMaxResults()/setFirstResult() if you need offline pagination functionality.

Externalizing named queries

You may also define named queries in the mapping document. (Remember to use a CDATA section if your query contains characters that could be interpreted as markup.)

Parameter binding and execution are done programmatically.

Note that the actual program code is independent of the query language that is used. You may also define native SQL queries in metadata, or migrate existing queries to Hibernate by placing them in mapping files.

Filtering collections

A collection filter is a special type of query that may be applied to a persistent collection or array. The query string may refer to this, meaning the current collection element.

The returned collection is considered a bag, and it's a copy of the given collection. The original collection is not modified (this is contrary to the implication of the name "filter", but consistent with expected behavior).

Observe that filters do not require a from clause (though they may have one if required). Filters are not limited to returning the collection elements themselves.
Even an empty filter query is useful, e.g. to load a subset of elements in a huge collection.

Criteria queries

HQL is extremely powerful, but some developers prefer to build queries dynamically, using an object-oriented API, rather than building query strings. Hibernate provides an intuitive Criteria query API for these cases.

The Criteria and the associated Example API are discussed in more detail elsewhere in this documentation.

Queries in native SQL

You may express a query in SQL, using createSQLQuery(), and let Hibernate take care of the mapping from result sets to objects. Note that you may at any time call session.connection() and use the JDBC Connection directly. If you choose to use the Hibernate API, you must enclose SQL aliases in braces.

SQL queries may contain named and positional parameters, just like Hibernate queries. More information about native SQL queries in Hibernate can be found elsewhere in this documentation.

Modifying persistent objects

Transactional persistent instances (i.e. objects loaded, saved, created or queried by the Session) may be manipulated by the application, and any changes to persistent state will be persisted when the Session is flushed (discussed later in this chapter). There is no need to call a particular method (like update(), which has a different purpose) to make your modifications persistent. So the most straightforward way to update the state of an object is to load() it, and then manipulate it directly, while the Session is open.

Sometimes this programming model is inefficient, since it would require both an SQL SELECT (to load an object) and an SQL UPDATE (to persist its updated state) in the same session. Therefore Hibernate offers an alternate approach, using detached instances.

Note that Hibernate does not offer its own API for direct execution of UPDATE or DELETE statements. Hibernate is a state management service; you don't have to think in statements to use it.
JDBC is a perfect API for executing SQL statements; you can get a JDBC Connection at any time by calling session.connection(). Furthermore, the notion of mass operations conflicts with object/relational mapping for online transaction processing-oriented applications. Future versions of Hibernate may, however, provide special mass operation functions. Some possible batch operation tricks are described elsewhere in this documentation.

Modifying detached objects

Many applications need to retrieve an object in one transaction, send it to the UI layer for manipulation, then save the changes in a new transaction. Applications that use this kind of approach in a high-concurrency environment usually use versioned data to ensure isolation for the "long" unit of work.

Hibernate supports this model by providing for reattachment of detached instances using the Session.update() or Session.merge() methods.

If the Cat with identifier catId had already been loaded by secondSession when the application tried to reattach it, an exception would have been thrown.

Use update() if you are sure that the session does not contain an already persistent instance with the same identifier, and merge() if you want to merge your modifications at any time without consideration of the state of the session. In other words, update() is usually the first method you would call in a fresh session, ensuring that reattachment of your detached instances is the first operation that is executed.

The application should individually update() detached instances reachable from the given detached instance if and only if it wants their state also updated. This can be automated, of course, using transitive persistence, which is discussed later in this chapter.

The lock() method also allows an application to reassociate an object with a new session. However, the detached instance has to be unmodified!

Note that lock() can be used with various LockModes; see the API documentation and the chapter on transaction handling for more information. Reattachment is not the only use case for lock().
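The difference between update() and merge() described above can be pictured with a toy model written in plain Java. This is a sketch under stated assumptions, not Hibernate's implementation: ToyCat and ToySession are hypothetical classes, the "session" is reduced to a map from identifier to instance, the load() helper takes the database value as an argument instead of reading a database, and IllegalStateException merely stands in for whatever exception Hibernate actually raises.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a mapped entity; not a real Hibernate class. */
final class ToyCat {
    final long id;
    String name;

    ToyCat(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Toy model of the instances associated with one session, keyed by identifier. */
final class ToySession {
    private final Map<Long, ToyCat> persistent = new HashMap<>();

    /** Simulates loading: registers and returns the session's instance for this id. */
    ToyCat load(long id, String nameInDatabase) {
        return persistent.computeIfAbsent(id, k -> new ToyCat(k, nameInDatabase));
    }

    /** Reattaches the given instance itself; fails if a different instance already holds its id. */
    void update(ToyCat detached) {
        ToyCat existing = persistent.get(detached.id);
        if (existing != null && existing != detached) {
            throw new IllegalStateException(
                "a different instance with id " + detached.id + " is already associated");
        }
        persistent.put(detached.id, detached);
    }

    /**
     * Copies state onto the session's own instance and returns it.
     * The argument is never associated with the session.
     * Simplification: a real session would first try to load a missing instance from the database.
     */
    ToyCat merge(ToyCat detached) {
        ToyCat managed = persistent.computeIfAbsent(detached.id, k -> new ToyCat(k, detached.name));
        managed.name = detached.name;
        return managed;
    }

    /** True only if this exact instance is the one associated with the session. */
    boolean contains(ToyCat cat) {
        return persistent.get(cat.id) == cat;
    }
}
```

In this model, calling update() in a fresh ToySession associates the detached object itself. Calling it after load() has already registered another instance with the same id fails, whereas merge() succeeds in both situations and returns the session's instance while leaving the argument detached.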
Other models for long units of work are discussed elsewhere in this documentation.

Automatic state detection

Hibernate users have requested a general purpose method that either saves a transient instance by generating a new identifier or updates/reattaches the detached instances associated with its current identifier. The saveOrUpdate() method implements this functionality.

The usage and semantics of saveOrUpdate() seem to be confusing for new users. Firstly, so long as you are not trying to use instances from one session in another new session, you should not need to use update(), saveOrUpdate(), or merge(). Some whole applications will never use any of these methods.

Usually update() or saveOrUpdate() are used in the following scenario:

- the application loads an object in the first session
- the object is passed up to the UI tier
- some modifications are made to the object
- the object is passed back down to the business logic tier
- the application persists these modifications by calling update() in a second session

saveOrUpdate() does the following:

- if the object is already persistent in this session, do nothing
- if another object associated with the session has the same identifier, throw an exception
- if the object has no identifier property, save() it
- if the object's identifier has the value assigned to a newly instantiated object, save() it
- if the object is versioned (by a <version> or <timestamp>), and the version property value is the same value assigned to a newly instantiated object, save() it
- otherwise update() the object

and merge() is very different:

- if there is a persistent instance with the same identifier currently associated with the session, copy the state of the given object onto the persistent instance
- if there is no persistent instance currently associated with the session, try to load it from the database, or create a new persistent instance
- the persistent instance is returned
- the given instance does not become associated with the session; it remains detached

Deleting persistent objects

Session.delete() will remove an object's state from the database. Of course, your application might still hold a reference to a deleted object. It's best to think of delete() as making a persistent instance transient.

You may delete objects in any order you like, without risk of foreign key constraint violations. It is still possible to violate a NOT NULL constraint on a foreign key column by deleting objects in the wrong order, e.g. if you delete the parent, but forget to delete the children.

Replicating objects between two different datastores

It is occasionally useful to be able to take a graph of persistent instances and make them persistent in a different datastore, without regenerating identifier values.

The ReplicationMode determines how replicate() will deal with conflicts with existing rows in the database.

ReplicationMode.IGNORE - ignore the object when there is an existing database row with the same identifier

ReplicationMode.OVERWRITE - overwrite any existing database row with the same identifier

ReplicationMode.EXCEPTION - throw an exception if there is an existing database row with the same identifier

ReplicationMode.LATEST_VERSION - overwrite the row if its version number is earlier than the version number of the object, or ignore the object otherwise

Use cases for this feature include reconciling data entered into different database instances, upgrading system configuration information during product upgrades, rolling back changes made during non-ACID transactions, and more.

Flushing the Session

From time to time the Session will execute the SQL statements needed to synchronize the JDBC connection's state with the state of objects held in memory.
This process, flush, occurs by default at the following points:

- before some query executions
- from org.hibernate.Transaction.commit()
- from Session.flush()

The SQL statements are issued in the following order:

1. all entity insertions, in the same order the corresponding objects were saved using Session.save()
2. all entity updates
3. all collection deletions
4. all collection element deletions, updates and insertions
5. all collection insertions
6. all entity deletions, in the same order the corresponding objects were deleted using Session.delete()

(An exception is that objects using native ID generation are inserted when they are saved.)

Except when you explicitly flush(), there are absolutely no guarantees about when the Session executes the JDBC calls, only the order in which they are executed. However, Hibernate does guarantee that Query.list(..) will never return stale data, nor will it return the wrong data.

It is possible to change the default behavior so that flush occurs less frequently. The FlushMode class defines three different modes: only flush at commit time (and only when the Hibernate Transaction API is used), flush automatically using the explained routine, or never flush unless flush() is called explicitly. The last mode is useful for long running units of work, where a Session is kept open and disconnected for a long time.

During flush, an exception might occur (e.g. if a DML operation violates a constraint). Since handling exceptions involves some understanding of Hibernate's transactional behavior, we discuss it elsewhere in this documentation.

Transitive persistence

It is quite cumbersome to save, delete, or reattach individual objects, especially if you deal with a graph of associated objects. A common case is a parent/child relationship. Consider the following example:

If the children in a parent/child relationship are value typed (e.g. a collection of addresses or strings), their lifecycle depends on the parent and no further action is required for convenient "cascading" of state changes. When the parent is saved, the value-typed child objects are saved as well; when the parent is deleted, the children are deleted, and so on. This even works for operations such as the removal of a child from the collection; Hibernate will detect this and, since value-typed objects can't have shared references, delete the child from the database.

Now consider the same scenario with parent and child objects being entities, not value types (e.g. categories and items, or parent and child cats). Entities have their own lifecycle, support shared references (so removing an entity from the collection does not mean it can be deleted), and there is by default no cascading of state from one entity to any other associated entities. Hibernate does not implement persistence by reachability by default.

For each basic operation of the Hibernate session - including persist(), merge(), saveOrUpdate(), delete(), lock(), refresh(), evict(), replicate() - there is a corresponding cascade style. Respectively, the cascade styles are named create, merge, save-update, delete, lock, refresh, evict, replicate. If you want an operation to be cascaded along an association, you must indicate that in the mapping document, using the cascade attribute of the association mapping.

Cascade styles may be combined in a single cascade attribute, as a comma-separated list.

You may even use cascade="all" to specify that all operations should be cascaded along the association. The default cascade="none" specifies that no operations are to be cascaded.

A special cascade style, delete-orphan, applies only to one-to-many associations, and indicates that the delete() operation should be applied to any child object that is removed from the association.

Recommendations:

- It doesn't usually make sense to enable cascade on a <many-to-one> or <many-to-many> association. Cascade is often useful for <one-to-one> and <one-to-many> associations.
- If the child object's lifespan is bounded by the lifespan of the parent object, make it a lifecycle object by specifying cascade="all,delete-orphan".
- Otherwise, you might not need cascade at all. But if you think that you will often be working with the parent and children together in the same transaction, and you want to save yourself some typing, consider using cascade="create,merge,save-update".

Mapping an association (either a single valued association, or a collection) with cascade="all" marks the association as a parent/child style relationship where save/update/delete of the parent results in save/update/delete of the child or children.

Furthermore, a mere reference to a child from a persistent parent will result in save/update of the child. This metaphor is incomplete, however. A child which becomes unreferenced by its parent is not automatically deleted, except in the case of a <one-to-many> association mapped with cascade="delete-orphan". The precise semantics of cascading operations for a parent/child relationship are as follows:

- If a parent is passed to persist(), all children are passed to persist()
- If a parent is passed to merge(), all children are passed to merge()
- If a parent is passed to save(), update() or saveOrUpdate(), all children are passed to saveOrUpdate()
- If a transient or detached child becomes referenced by a persistent parent, it is passed to saveOrUpdate()
- If a parent is deleted, all children are passed to delete()
- If a child is dereferenced by a persistent parent, nothing special happens - the application should explicitly delete the child if necessary - unless cascade="delete-orphan", in which case the "orphaned" child is deleted.

Using metadata

Hibernate requires a very rich meta-level model of all entity and value types. From time to time, this model is very useful to the application itself.
For example, the application might use Hibernate's metadata to implement a "smart" deep-copy algorithm that understands which objects should be copied (e.g. mutable value types) and which should not (e.g. immutable value types and, possibly, associated entities).

Hibernate exposes metadata via the ClassMetadata and CollectionMetadata interfaces and the Type hierarchy. Instances of the metadata interfaces may be obtained from the SessionFactory.
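The "smart" deep-copy idea can be sketched in plain Java. This is a toy model under stated assumptions, not code against Hibernate's metadata API: SmartCopySketch and its Kind enum are hypothetical, the classification of each property is passed in by hand (a real application would derive it from the metadata interfaces named above), and java.util.Date is the only mutable value type the sketch knows how to duplicate.

```java
import java.util.Date;

/** Toy model only: hypothetical helper, not part of Hibernate. */
final class SmartCopySketch {
    /** Hypothetical classification that an application would derive from mapping metadata. */
    enum Kind { MUTABLE_VALUE, IMMUTABLE_VALUE, ENTITY }

    /**
     * Copies an array of property values.
     * Mutable values are duplicated; immutable values and associated entities are shared.
     * Simplification: Date is the only mutable value type handled here.
     */
    static Object[] copyValues(Object[] values, Kind[] kinds) {
        Object[] copy = new Object[values.length];
        for (int i = 0; i < values.length; i++) {
            if (kinds[i] == Kind.MUTABLE_VALUE && values[i] instanceof Date) {
                copy[i] = new Date(((Date) values[i]).getTime()); // independent duplicate
            } else {
                copy[i] = values[i]; // safe to share the reference
            }
        }
        return copy;
    }
}
```

The point of the design is that the copy routine never inspects the objects to decide what to do; it relies entirely on the per-property classification, which is the kind of information a meta-level model makes available.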