Finding slow queries in SQL Server

When you want to optimise your app, one of the most important steps is to find your slowest database queries, so that you can see whether indexes are missing or whether the queries should be rewritten. SQL Server stores statistics for its queries in the dynamic management view dm_exec_query_stats. In the book “T-SQL Querying”, Itzik Ben-Gan suggests the following as a good query to extract and format the information. This query sums the total elapsed time for queries of the same type:

select top 10
    max(query) as sample_query,
    sum(execution_count) as cnt,
    sum(total_worker_time) as cpu,
    sum(total_physical_reads) as reads,
    sum(total_logical_reads) as logical_reads,
    sum(total_elapsed_time) as duration
from (select qs.*,
        substring(st.text, (qs.statement_start_offset/2) + 1,
            ((case qs.statement_end_offset when -1
                then datalength(st.text)
                else qs.statement_end_offset end
                - qs.statement_start_offset)/2) + 1)
        as query
    from sys.dm_exec_query_stats as qs
    cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
    cross apply sys.dm_exec_plan_attributes(qs.plan_handle) as pa
    where pa.attribute = 'dbid'
    and pa.value = db_id('your-database')
) as d
group by query_hash
order by duration desc

If you look on the web, you will sometimes see variations on this query that attempt to filter by calling the db_name() function on the database id stored with the statement text. However, this will only work for stored procedures: the database id will be null for an ordinary SQL statement. This makes sense, because running the same statement against different databases, even if they have the same tables, could result in completely different query plans. Hence it is the query plan that is linked to the database. As you can see from the query above, you can safely filter by checking the dbid stored in the plan attributes.

Posted in Performance, SQL Server | Tagged | Leave a comment

Writing a DSL with Scala Parser Combinators

Scala parser combinators provide a very easy way to construct your own domain specific languages (DSLs). Let’s look at a simple example. Suppose that you work on a large application which produces various report files on a scheduled basis. You decide that you want to give the users of your application the ability to control which directories different file types are moved to, and to send e-mail alerts when specific files are generated. You could do this by allowing a power user to type in the instructions using a little language such as the following:
if ($FILENAME contains "finance") {
	MOVE "finance"
	EMAIL "finance-team@somecompany.com"
}
else if ($FILENAME endsWith "xlsx") {
	MOVE "spreadsheets"
}
else if ($FILENAME startsWith "daily") {
	MOVE "daily-reports"
	EMAIL "report-team@somecompany.com"
}
else {
	MOVE "additional"
}
In this scenario, I’m assuming that this routine is invoked once for each file that is output, and the name of the file is put into the variable $FILENAME.

So how do Scala parser combinators work and how do they compare to the traditional way of constructing a DSL? Before looking at the Scala approach, it is worth clarifying that whenever you talk about a DSL, you need to be clear whether you are talking about an internal or external DSL. An internal DSL is one that is simply written in a normal programming language, but the objects and methods provided have been created to give the appearance of a domain specific language. This tends not to be done much in Java, since Java does not have a very flexible language syntax. It works better in languages such as Ruby or Scala where you have more freedom about how you define methods, such as being able to use spaces and symbols in the names, and being able to change the order of objects and the methods you are invoking on them.

An external DSL is one where the code is written in its own format, and this code is parsed to understand its meaning. Traditionally this is done by writing the rules of the language in a formal grammar, usually Backus-Naur Form (BNF). The BNF grammar is then read in by a parser generator such as ANTLR or JavaCC, which generates a parser capable of understanding the language.

This approach works well, but does have problems. One problem is debugging. If something goes wrong, a developer may have to debug through auto-generated code. A second annoyance is that whenever you update your grammar, you have to run the parser generation step, rather than just compiling some code. This is where parser combinators come in, as an alternative approach.

So, how do you use Scala parser combinators to parse a small language like the one above? In short, rather than writing the BNF grammar into a grammar file, and then generating a parser, you write a Scala class in which each BNF rule becomes a method. The Scala parser combinator library provides parsers, matchers and methods that correspond to all of the elements in a grammar. Let’s look at how this would work for the example above. I’ll start off with a few code snippets, and then give the full class, so you can run it yourself. If you haven’t run any Scala code before, using Eclipse with the Scala plugin is probably the easiest way to get started. Please note that although Scala parser combinators used to be included in the standard Scala library, from 2.11 onwards they are a separate download from http://www.scala-lang.org/.

To give your class access to the parser combinator code, the easiest thing to do is extend one of the parser traits, so I’ll start with this:

import scala.util.parsing.combinator.JavaTokenParsers

class FileHandler extends JavaTokenParsers {

}
 

Now we need to work on the grammar rules that will define the language. The easiest statements to define first will be the MOVE and EMAIL commands. They are simply the name of the command, followed by a string. Hence we could start by defining a string, and these two rules:

def string : Parser[Any] = """".*"""".r
def email : Parser[Any] = "EMAIL"~string	
def move : Parser[Any] = "MOVE"~string 

Here, I have used the .r method on a string to define a regular expression saying that in our little language, a string is a double quote, followed by any characters, followed by another double quote. The whole thing is enclosed in triple double quotes because, in Scala, this means that you are defining a raw string literal, so you do not have to escape special characters in the string. For me, this is more readable than escaping the special characters, because the triple quotes sit at the start and end of the string, whereas when you escape special characters, you end up with backslashes intermingled with the regular expression you are trying to write, and it can be easy to mistype the regex. Once we have defined a string, we can define the email and move rules by saying that they are the word EMAIL or MOVE respectively, followed by a string. The tilde symbol ~ is used whenever you want to combine two parsers sequentially. Note that we don't have to worry about whitespace; the parser library skips it automatically.

Now we can start to define the if else statement. The if statement uses an expression, which tests the $FILENAME variable with various operators. In a larger language, we might need to define the $FILENAME as a variable. Here, because I know this is the only "variable" in the language, I'm not going to bother doing that, I'll just write it into the expression rule:

def operator : Parser[Any] = "contains" | "endsWith" | "startsWith"
def expr : Parser[Any] = "$FILENAME"~operator~string

Here we have used the pipe symbol to say that an operator is any one of the three strings specified, and an expression is the $FILENAME, an operator and a string.

To build up the rules for the if statement, I'm going to say that:

  • We need the concept of a block - which is a pair of curly braces, with code in between.
  • A single if clause is the word "if", followed by an expression in brackets, followed by a block.
  • An else if clause is the word "else", followed by an if clause.
  • An else clause is the word "else", just followed by a block.
  • The entire if statement is an if clause, optionally followed by one or more "if-else" clauses, optionally followed by an "else".
One way to embody this in code is:

def block : Parser[Any] = "{"~rep(email | move)~"}"
def ifClause : Parser[Any] = "if"~"("~expr~")"~block
def elseIfClause : Parser[Any] = "else"~ifClause
def elseClause : Parser[Any] = "else"~block
def ifElse : Parser[Any] = ifClause~opt(rep(elseIfClause))~opt(elseClause)


Here you can see some additional parser combinator syntax: the rep method denotes repetition, and the opt method denotes an optional element. The only thing that remains now is to define the "top level" rule, which will be the starting point for parsing code written in our little language. Based on the above language sample, a sensible choice for this rule is just to say that text written using this language is composed of a list of commands, where each command is a MOVE, an EMAIL, or an if-else statement:

def commandList = rep(email | move | ifElse)


So now the entire class looks like this:

import scala.util.parsing.combinator.JavaTokenParsers

class FileHandler extends JavaTokenParsers {
  	def string : Parser[Any] = """".*"""".r
	def email : Parser[Any] = "EMAIL"~string
	def move : Parser[Any] = "MOVE"~string 
	
	def operator : Parser[Any] = "contains" | "endsWith" | "startsWith"
	def expr : Parser[Any] = "$FILENAME"~operator~string
	def block : Parser[Any] = "{"~rep(email | move)~"}"
	def ifClause :Parser[Any] = "if"~"("~expr~")"~block
	def elseIfClause : Parser[Any] = "else"~ifClause
	def elseClause : Parser[Any] = "else"~block
	def ifElse : Parser[Any] = ifClause~opt(rep(elseIfClause))~opt(elseClause)
	
	def commandList = rep(email | move | ifElse)
}


Now we can test it. Create a file with the original language sample in it. I've called mine file-handler.txt. Then we can create a parser by subclassing our parser code, then pointing it at this file as input, and invoking our top level method "commandList":

import java.io.FileReader
import java.io.FileInputStream

object FileHandlerTest extends FileHandler {

  def main(args: Array[String]): Unit = {
    val reader = new FileReader("/path-to-your-file/file-handler.txt")
    println(parseAll(commandList, reader))
  }

}


The output will show the code being parsed (line breaks added for clarity):


parsed: List(( 
((((((if~()~(($FILENAME~contains)~"finance"))~))
~(({~List((MOVE~"finance"), (EMAIL~"finance-team@somecompany.com")))~}))
~Some(List((else~((((if~()~(($FILENAME~endsWith)~"xlsx"))~))
~(({~List((MOVE~"spreadsheets")))~}))), 
(else~((((if~()~(($FILENAME~startsWith)~"daily"))~))
~(({~List((MOVE~"daily-reports"), 
(EMAIL~"report-team@somecompany.com")))~}))))))
~Some((else~(({~List((MOVE~"additional")))~})))))


Note that with the parser as written above, the language does not support nesting, because a block is defined to be composed only of MOVE or EMAIL statements, i.e. it cannot contain a nested "if" statement. The generic way to change your grammar rules to support nesting is to change the definition of a block to recursively point back to your top level rule, which in our case is the one called "commandList":

def block : Parser[Any] = "{"~commandList~"}"


You might like to make this change, then update your example input to include a nested if statement, and confirm that it can be parsed correctly.

For more information on Scala parser combinators, check out chapter 31 of "Programming in Scala", available online here: http://www.artima.com/pins1ed/combinator-parsing.html

Posted in Scala | Tagged | 2 Comments

Optimistic locking and versioning with Hibernate

Just put a small example of Hibernate optimistic locking and versioning on github:

https://github.com/hedleyproctor/HibernateVersioningExample

Hibernate uses an optimistic locking strategy, based on versioning. This means it tries to avoid taking a lock on data and instead relies on having a version attribute for each entity (assuming you have configured one). When it performs an update, it includes the version attribute in the where clause of the update. If the database row has already been updated by another thread, then no rows will be updated. Since most databases return the number of rows affected by an update statement, Hibernate can check if this value is zero and then throw an exception.

To enable optimistic versioning, simply add a version attribute to each entity. You can choose either a timestamp or a number, but a number may be safer as using a timestamp could fail if multiple computers are updating data and their clocks are out of sync. To add a version attribute, just annotate with @Version.
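As a sketch of what this looks like, here is a hypothetical entity (the class name and fields are made up for illustration) using the standard javax.persistence annotations, which would need the JPA API on the classpath:

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Version;

// Hypothetical entity sketch. The @Version field is maintained by Hibernate
// and included in the where clause of every update, along the lines of:
//   update item set name=?, version=? where id=? and version=?
@Entity
public class Item {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    // A numeric version, incremented by Hibernate on each update.
    // A timestamp would also work, but is less safe if clocks are out of sync.
    @Version
    private int version;

    public Long getId() { return id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getVersion() { return version; }
}
```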

To demonstrate that attempting to update data with an out of date version does indeed throw an exception, in the example, I’ve created two threads. The first one is given a reference to the data, but it waits until the second thread has updated the data before attempting its own update. You will see that this throws a stale object exception. You can also see from the SQL that the where clause includes the version number.
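The underlying mechanism can be simulated in plain Java, without Hibernate. In this sketch (VersionedStore is a made-up class, not a Hibernate API), an update only succeeds when the caller's expected version still matches the stored one, mirroring an update statement whose where clause includes the version and which affects zero rows when the data is stale:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal simulation of version-based optimistic locking: an update succeeds
// only if the expected version still matches, mirroring
// "update ... set version = version + 1 where id = ? and version = ?".
class VersionedStore {
    // row id -> current version number
    private final Map<Long, Integer> versions = new HashMap<>();

    public void insert(long id) {
        versions.put(id, 0);
    }

    public int currentVersion(long id) {
        return versions.get(id);
    }

    // Throws if the row was already updated by another thread, i.e. the
    // "where" clause matched zero rows - analogous to Hibernate's
    // StaleObjectStateException.
    public void update(long id, int expectedVersion) {
        Integer current = versions.get(id);
        if (current == null || current != expectedVersion) {
            throw new IllegalStateException(
                "Row was updated or deleted by another transaction (stale state)");
        }
        versions.put(id, current + 1); // version bumped on successful update
    }
}
```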

This article is part of a series on Hibernate. You might be interested in other articles in the series:

Hibernate query limitations and correlated sub-queries
Bespoke join conditions with Hibernate JoinFormula
Inheritance and polymorphism
One-to-many associations
One-to-one associations
Many-to-many associations

Posted in Hibernate | Tagged | Leave a comment

Distinct, order by and Hibernate HQL

Recently someone posted a question on stackoverflow explaining that they were having a problem with a Hibernate HQL query that involved both order by and a distinct in the select. The query was across five linked entities – a many to many association, mapped by an entity called PreferenceDateETL, which links the Preference and DateETL entities, and then two properties on the join entity – Employee and Corporation. The person asking the question explained that they wanted to write a query like the following:

select distinct pd.preference
from PreferenceDateETL pd
where pd.corporation.id=:corporationId
and pd.preference.employee.deleted=false
and pd.deleted=false
and pd.preference.deleted=false
and pd.dateETL.localDate>=:startDM
and pd.dateETL.localDate<=:endDM
and pd.preference.approvalStatus!=:approvalStatus
order by pd.preference.dateCreated

However, when trying to write this, he got an exception saying that the dateCreated column is not in the select list. What is going on here, and how can it be fixed? Well, the order by clause is always evaluated last in a database query, so when you use select distinct, you can only order on columns that appear in the select list. Although it appears as though the above query orders by a column it has selected, in fact Hibernate ends up with two instances of the preference table in its query, and the order by clause refers to a different instance from the one in the select, hence the exception. The generated SQL has the form:

select distinct preference1_.id as id1_76_,
    ....things...
from preference_date_etl preference0_
inner join preference preference1_
    on preference0_.preference_id=preference1_.id
cross join preference preference2_
cross join employee employee3_
cross join date_etl dateetl5_
where
    ...things...
order by preference2_.date_created

The problem with this query is really that using distinct is unnecessary. It is required in the above form of the query, because the query has been written to pick out instances of the many to many entity that we are interested in, then from those, navigate to the preference objects they link to. By definition, because the many to many entities are representing a many to many relationship, this can give you a list that includes duplicates. You can avoid this by restructuring the query so that the query on the many to many entities becomes a subquery. i.e. you pick out the many to many entities you are interested in, and then say "now get me all of the preference objects that are in the list of preference objects referred to by the PreferenceDateETL objects". This query doesn't have any duplicates, hence you do not need the distinct, and the order by clause will work correctly. Here is the HQL:

from Preference p where p in
(select pd.preference from PreferenceDateETL pd
where pd.corporation.id=:corporationId
and pd.deleted=false
and pd.dateETL.localDate>=:startDM
and pd.dateETL.localDate<=:endDM)
and p.employee.deleted=false
and p.deleted=false
and p.approvalStatus != :approvalStatus
order by p.dateCreated

If you want to read the original question, it is here:

http://stackoverflow.com/questions/25545569/hibernate-using-multiple-joins-on-same-table

This article is part of a series of articles on Hibernate. Others in the series include:

Hibernate query limitations and correlated sub-queries
Bespoke join conditions with Hibernate JoinFormula
Inheritance and polymorphism
One-to-many associations
One-to-one associations
Many-to-many associations

Posted in Hibernate | Tagged | 1 Comment

Understanding the Hibernate session cache

When you use Hibernate, you do so via a session. It is well known that the session acts as a first level cache, but sometimes this can cause confusing behaviour. Let’s look at a simple example. Suppose I have Customer and Order objects. An order has a reference to the customer who made the order, and similarly, a customer has a set of orders.

Suppose I create a single customer with three associated orders as follows:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

Customer customer = new Customer();
customer.setForename("Hedley");
customer.setSurname("Proctor");
session.save(customer);

Order order1 = new Order();
order1.setOrderTotal(29.99);
order1.setCustomer(customer);

Order order2 = new Order();
order2.setOrderTotal(8.99);
order2.setCustomer(customer);

Order order3 = new Order();
order3.setOrderTotal(15.99);
order3.setCustomer(customer);

session.save(order1);
session.save(order2);
session.save(order3);

session.flush();

As you can see, I’ve called flush() on the session, so the data has been sent to the database. Now let’s write a query that will pick out this customer, and assert that they have three orders:

Criteria criteria = session.createCriteria(Customer.class);
criteria.add(Restrictions.eq("forename","Hedley"));
criteria.add(Restrictions.eq("surname","Proctor"));
List results = criteria.list();
Customer customer1 = (Customer) results.get(0);
assertEquals(customer1.getOrders().size(),3);

This assertion will fail! Why? Because this criteria query just returns the Customer object that you already have, which is in the session. Even though the orders table in the database really does have the correct foreign keys to the customer, Hibernate has flushed all of the objects to the database, so it doesn’t consider the session dirty, and it will happily return the Customer object that is in the session, whose set of orders was never populated. What should you do to prevent confusion like this? The answer suggested by Java Persistence with Hibernate is that when you have a bidirectional association, you should always have a helper method for adding items that sets both sides of the association, so you never get this discrepancy between the session objects and the contents of the database.
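The shape of such a helper, sketched here with plain classes (the Hibernate annotations and other fields are omitted), is:

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of the bidirectional Customer/Order association.
class Order {
    private Customer customer;
    public Customer getCustomer() { return customer; }
    public void setCustomer(Customer customer) { this.customer = customer; }
}

class Customer {
    private final Set<Order> orders = new HashSet<>();
    public Set<Order> getOrders() { return orders; }

    // Helper method: always sets both sides of the association, so the
    // objects in the session stay consistent with the database rows.
    public void addOrder(Order order) {
        orders.add(order);
        order.setCustomer(this);
    }
}
```

With this helper in place, the customer picked out of the session cache has its orders collection populated, and the assertion in the example passes.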

This article is part of a series on Hibernate. You might be interested in some of the others:

Hibernate query limitations and correlated sub-queries
Bespoke join conditions with Hibernate JoinFormula
Inheritance and polymorphism
One-to-many associations
One-to-one associations
Many-to-many associations
Posted in Hibernate | Tagged | Leave a comment

Understanding Hibernate session flushing

When you interact with Hibernate, you do so via a Hibernate session. Hibernate sessions are flushed to the database in three situations:
  • When you commit a (Hibernate) transaction.
  • Before you run a query.
  • When you call session.flush().

However, it is extremely important to understand the second of these situations – session flushing prior to running a query. This does not happen before every query! Remember, the purpose of the Hibernate session is to minimise the number of writes to the database, so it will avoid flushing to the database if it thinks that it isn’t needed. Hibernate will try to work out if the objects currently in your session are relevant to the query that is running, and it will only flush them if it thinks they are. You can easily see the effects of this behaviour. For example, suppose I have Customer and Order objects. A customer has a set of orders. To begin with, I’ll just create the orders without any reference to the customers:

Order order1 = new Order();
order1.setOrderTotal(29.99);

Order order2 = new Order();
order2.setOrderTotal(8.99);

Order order3 = new Order();
order3.setOrderTotal(15.99);

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

session.save(order1);
session.save(order2);
session.save(order3);

At this stage, the three objects will be saved to the database. Now let’s create a customer and save them:

Customer customer = new Customer();
customer.setForename("Hedley");
customer.setSurname("Proctor");
session.save(customer);

Okay, now the customer is in the database. Let’s now create the link between the customer and their orders:

order1.setCustomer(customer);
order2.setCustomer(customer);
order3.setCustomer(customer);

Now the Hibernate session is dirty, meaning that the objects in the session have changes that haven’t been persisted to the database yet. If we run an SQL query, Hibernate doesn’t know to flush the session, and the results you get back do not reflect the current states of the objects:

Query query = session.createSQLQuery("select id,customer_id from orders");
List results = query.list();

Here are the results of this query:

Hibernate: select id,customer_id from orders
1,null
2,null
3,null

Even running a Hibernate query in HQL or with a criteria won’t cause the session to flush if Hibernate doesn’t think the query concerns the objects that are dirty. e.g.

Query query = session.createQuery("from Customer");

This won’t cause a session flush because it is the Order objects that are dirty. Hence, a Hibernate query involving these objects will cause a session flush. e.g.

Query query = session.createQuery("from Order");

If you want to see this code running, it is available from github:

https://github.com/hedleyproctor/hibernate-mapping-and-query-examples

This article is part of a series on Hibernate. You might be interested in some of the others:

Hibernate query limitations and correlated sub-queries
Bespoke join conditions with Hibernate JoinFormula
Inheritance and polymorphism
One-to-many associations
One-to-one associations
Many-to-many associations
Posted in Hibernate | Tagged | 2 Comments

Hibernate query limitations and correlated sub queries

It’s well known that one of the key limitations of writing queries in Hibernate, using either HQL or criteria, is that you cannot write queries that have a join in the from clause. What is less well known is that for some simple queries that involve a join in the from clause, you can rewrite them as correlated subqueries, to make them runnable by Hibernate. Let’s look at an example. Imagine we have a customer table and an orders table. The orders table has a foreign key to the customer table. Suppose you want to get a report of all customers that have three or more orders. If you were to write this in SQL using a join, you would write:
select c.id from
customer c
    join
(select customer_id,count(*) as num_orders from orders group by customer_id) co
    on c.id = co.customer_id
where co.num_orders > 2

To convert this to an SQL correlated subquery, the join is removed and the subquery gets moved to the where clause. The conditions on the join get pushed down into the subquery. Since they refer to the columns from the outer query (on customer), this is what makes the subquery a correlated subquery:

select id,forename,surname from
customer c 
where (select count(*) from orders o where o.customer_id = c.id) > 2

In HQL, this becomes:

select id,forename,surname from Customer c where (select count(*) from Order o where o.customer.id = c.id) > 2

Now suppose we want the most recent order for each customer. In the most basic variation, where you just want to get the timestamp of the most recent order for each customer id, you get both columns out of your grouping query:

select max(timestamp),customer_id from orders group by customer_id

In the more general case, where you want to get further details of either the customer or the order, you have to join this query to the customer or orders table respectively. For example, to get the order id and total as well:

select o.id,o.timestamp,o.order_total from 
orders o 
join 
(select max(timestamp) as max_ts,customer_id from orders group by customer_id) as x 
on o.customer_id = x.customer_id and o.timestamp = x.max_ts

To convert this to a correlated subquery, the subquery gets moved from the join into the where clause, and the additional condition on the join becomes part of the subquery where clause. Note that the group by can be removed, since the restriction on customer_id has become part of the correlated subquery where clause.

select o1.id,o1.timestamp,o1.order_total from 
orders o1 
where o1.timestamp in 
(select max(timestamp) from orders o2 where o1.customer_id = o2.customer_id)


In HQL this would become:

select o1.id,o1.timestamp,o1.orderTotal from Order o1 where o1.timestamp in (select max(timestamp) from Order o2 where o1.customer.id = o2.customer.id)

Suppose we want to add some details of the customer as well. In sql we would join this to the customer table:

select c.id as cust_id,c.forename,c.surname,o1.id,o1.timestamp,o1.order_total 
from orders o1
join customer c on o1.customer_id = c.id
where o1.timestamp in 
(select max(timestamp) from orders o2 where o1.customer_id = o2.customer_id)


In HQL, we simply navigate the association from order to customer to get the customer info:

select o1.customer.id,o1.customer.forename,o1.customer.surname, o1.id,o1.timestamp,o1.orderTotal from Order o1 where o1.timestamp in (select max(timestamp) from Order o2 where o1.customer.id = o2.customer.id)

So, are these queries equivalent to the corresponding join queries, and should you always rewrite them in this format? The answer to the first question is that technically, a subquery could get a different execution plan at the database. In theory, evaluating a correlated subquery could mean that the database evaluates the correlated subquery separately for every single row in the outer query. Clearly, this would be a very expensive operation. However, for correlated subqueries that are actually equivalent to a join query, a good database query optimiser will recognise this, and follow the same execution plan as it would do for the join query, i.e. it will perform a join operation to get a result set that contains all of the rows in the subquery as a single operation, rather than evaluating the query once for every row in the outer query. Certainly, if I run queries like the above on SQL Server and look at the execution plan, I can see it is equivalent to the corresponding join query.

What about the second question? Even if you can rewrite these sorts of query, should you? Well, this probably comes down to the normal questions surrounding Hibernate: the Hibernate version of the query won't need updating if you change the tables or columns in the database, whereas the SQL version would. If you want a query that returns objects, the Hibernate version will make this easier for you. As a general rule, the Hibernate versions are going to be easier to use and maintain.

This article is part of a series on Hibernate. You might be interested in other articles in the series:

Bespoke join conditions with Hibernate JoinFormula
Inheritance and polymorphism
One-to-many associations
One-to-one associations
Many-to-many associations
Posted in Hibernate | Tagged | 1 Comment

Bespoke join conditions with Hibernate JoinFormula

Recently someone posted a question on stackoverflow asking how to deal with a database join, where the foreign key could reside in one of two different columns. This situation is sometimes found in a legacy database schema, where someone has chosen to use two different columns, because they want them to represent different sorts of relationship, but relationships that join to the same kind of entity. In the example on stackoverflow, a customer has to link to a UserAccount object, but this could either be their own, using a foreign key in the “accountId” column, or a partner account, with the foreign key in the “partnerId” column. How do you deal with this?

The answer is to use a join formula to pick out the non-null column and use it to join:

@ManyToOne
@JoinColumnsOrFormulas(
	{@JoinColumnOrFormula(
		formula = @JoinFormula(
			value = "case when accountId is not null then accountId " +
				"when partnerId is not null then partnerId end",
			referencedColumnName = "id"))}
)
public UserAccount getUserAccount() {
	return userAccount;
}
If you want to see the full working example, it is on github:

https://github.com/hedleyproctor/hibernate-mapping-and-query-examples

The Customer class shows the join formula and you can see a test which demonstrates it in the Tests class, in the joinFormula() test method.
Posted in Hibernate | Tagged | Leave a comment

Hibernate example 4 – many to many associations

I’ve put some code on Github that shows three ways of modelling many to many associations with JPA / Hibernate:
  1. Using a join table that is not mapped as either an entity or an embedded component type.
  2. Mapping the join table as an entity in its own right, so you can add properties to it.
  3. Using a collection of embedded components, to simplify the management of the persistence of the association.
The code is available from:

https://github.com/hedleyproctor/hibernate-many-to-many

Or the direct download link for the zip is:

https://github.com/hedleyproctor/hibernate-many-to-many/archive/master.zip

I’ll explain how each of these ways of modelling the association works.

Using a join table not mapped as an entity or component

If you simply want a many to many association, without any properties attached to the association, then this is a good choice. You use a join table, and you can navigate from either end of the association. In my example, the Category and Item classes have this association, and you can see it mapped in the Category class:
	@ManyToMany(cascade = CascadeType.ALL)
	@JoinTable(
		name = "CATEGORY_ITEM",
		joinColumns = { @JoinColumn(name = "CATEGORY_ID")},
		inverseJoinColumns = {@JoinColumn(name = "ITEM_ID")}
		)
	public Set<Item> getItems() {
		return items;
	}
Since the join table isn’t an entity, it doesn’t have a corresponding Java class.

Mapping the join table as an entity in its own right

If you want to add properties to the association, then you have to map the join table as an entity. Typical examples might be the username of the person who created the association, or the time it was created. In this case, the join table is represented by a class which has a primary key composed of the two foreign keys to the entities it is linking. I won’t show all of the code here, just the start of the class:
@Entity
@Table(name = "CATEGORY_ITEM_RELATIONSHIP")
public class CategoryItemRelationship {
	
	@EmbeddedId
	private Id id = new Id();
	@ManyToOne
	@JoinColumn(name = "CATEGORY_ID", insertable = false, updatable = false)
	private Category2 category;
	@ManyToOne
	@JoinColumn(name = "ITEM_ID", insertable = false, updatable = false)
	private Item2 item;
	private Date dateAdded;

	@Embeddable
	public static class Id implements Serializable {
		@Column(name = "CATEGORY_ID")
		private Long categoryId;
		@Column(name = "ITEM_ID")
		private Long itemId;
		
		public Id() {}

		public Id(Long categoryId, Long itemId) {
			this.categoryId = categoryId;
			this.itemId = itemId;
		}
Because the id is a composite, it is represented by a static inner class and mapped using the @EmbeddedId annotation. You can see that the database columns for the category and item foreign keys are updated by this static inner Id class. Hence when the item and category instance variables in the outer class are mapped as a join, they are marked as not insertable or updatable. Similarly, when we make the association bi-directional, so that we can navigate from the Item2 and Category2 classes across the association, we have to declare that the association is mapped and controlled by the intermediate CategoryItemRelationship class. For example, in the Category2 class:
	@OneToMany(mappedBy = "category", cascade = CascadeType.ALL)
	public Set<CategoryItemRelationship> getCategoryItemRelationships() {
		return categoryItemRelationships;
	}
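The Item2 side would be the mirror image. Since the CategoryItemRelationship class maps its Item2 reference in an instance variable called item, the mappedBy value on this end is "item" (a sketch, assuming Item2 has a matching categoryItemRelationships field):

```java
	// In the Item2 class. mappedBy refers to the "item"
	// field of CategoryItemRelationship shown above.
	@OneToMany(mappedBy = "item", cascade = CascadeType.ALL)
	public Set<CategoryItemRelationship> getCategoryItemRelationships() {
		return categoryItemRelationships;
	}
```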

Using a collection of components

This is a less common choice. When you use a component, it does not have its own lifecycle. Rather, it is owned by an entity. It also does not have an id, so you cannot retrieve it from the database using an id. (Obviously in database terms a component stored in its own table will have a primary key, but this is never represented in the component class as an id field.) Finally, because a component can only have one owning entity (since its lifecycle is tied to the owning entity), you cannot enable bidirectionality on an association mapped using components. However, using a component won’t require configuring any cascade settings, so could be an appropriate choice if you have an association that you think will generally be accessed and administered from one side. In my example, this component is the CategorizedItem class:
@Embeddable
public class CategorizedItem {
	private String username;
	private Date dateAdded = new Date();
	private Item3 item;
	private Category3 category;
	// Hibernate requires a no-argument constructor for embeddable
	// classes; shown here for completeness (it may simply have been
	// elided from the original listing)
	CategorizedItem() {}

	public CategorizedItem(String username,
		Category3 category,
		Item3 item) {
		this.username = username;
		this.category = category;
		this.item = item;
	}
...
You can see that because it is a component rather than an entity, it is given the @Embeddable annotation. Since it does not need an identifier, it doesn’t require the static inner Id class used by the entity approach. At a database level, the primary key of the categorized item table will be a composite of all of its columns. To link the Category3 class to the CategorizedItem, we use the @ElementCollection and @CollectionTable annotations:
@Entity
public class Category3 {
	private Long id;
	private String name;
	private Set<CategorizedItem> categorizedItems = new HashSet<CategorizedItem>();

	@ElementCollection
	@CollectionTable(name = "CATEGORIZED_ITEMS", joinColumns = @JoinColumn(name = "CATEGORY_ID"))
	public Set<CategorizedItem> getCategorizedItems() {
		return categorizedItems;
	}
	public void setCategorizedItems(Set<CategorizedItem> categorizedItems) {
		this.categorizedItems = categorizedItems;
	}

...
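One practical point with this approach: because a component has no id and is held in a Set, Hibernate compares instances by value, so CategorizedItem should override equals and hashCode across its fields. A minimal plain-Java sketch of what that looks like; the Item3 and Category3 references are simplified to ids here, which is an assumption for illustration:

```java
import java.util.Objects;

// Simplified stand-in for the CategorizedItem component. A value-typed
// embeddable is identified by its field values, not by an id, so
// equals and hashCode should cover all of them.
class CategorizedItem {
	private final String username;
	private final Long categoryId;
	private final Long itemId;

	CategorizedItem(String username, Long categoryId, Long itemId) {
		this.username = username;
		this.categoryId = categoryId;
		this.itemId = itemId;
	}

	@Override
	public boolean equals(Object o) {
		if (this == o) return true;
		if (!(o instanceof CategorizedItem)) return false;
		CategorizedItem other = (CategorizedItem) o;
		return Objects.equals(username, other.username)
				&& Objects.equals(categoryId, other.categoryId)
				&& Objects.equals(itemId, other.itemId);
	}

	@Override
	public int hashCode() {
		return Objects.hash(username, categoryId, itemId);
	}
}
```

With this in place, adding the "same" association twice leaves only one element in the set, which matches the composite primary key at the database level.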
Posted in Hibernate, Java

Hibernate example 3 – one to one associations

I’ve just put a Hibernate example of the three different ways to create one-to-one associations on GitHub: https://github.com/hedleyproctor/hibernate-one-to-one It contains a Customer entity, which has one-to-one associations to a UserProfile, MarketingPreferences and a Wishlist.
  1. Foreign key relationship. Customer has a foreign key to UserProfile.
  2. Shared primary key. The MarketingPreferences entity is set to use the same primary key as the customer, so its primary key is also a foreign key to the customer.
  3. Join table. Customer and wishlist are linked by a join table, appropriate if the association is optional.
The code is built using Maven, so you can run the tests with: mvn test. I’ll quickly explain how each type of association is mapped. The relationship between Customer and UserProfile uses a foreign key, which is the default for a one-to-one association, so you simply need the @OneToOne annotation in the Customer class:
	private UserProfile userProfile;

	@OneToOne(cascade = CascadeType.ALL)
	public UserProfile getUserProfile() {
		return userProfile;
	}
You can see I’ve also set a cascade on this mapping, so that if you create a new customer and user profile, the user profile will be saved automatically, but this isn’t related to how the mapping itself works.

Then, if you want to make the association bidirectional, add a Customer instance variable to the UserProfile class, but specify that the association is mapped and controlled by the userProfile instance variable in the Customer class. This is the standard setup for bidirectional associations – you must not define the association independently on both sides, as that would actually create two separate associations. The code in UserProfile is:

	@OneToOne(mappedBy = "userProfile")
	public Customer getCustomer() {
		return customer;
	}
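Note that mappedBy only tells Hibernate which side owns the mapping; it does not keep the two Java references in sync for you. A common pattern is a convenience setter on the owning side that updates both ends. Here is a plain-Java sketch with stub classes; the JPA annotations and other fields are omitted, and the field names are assumptions:

```java
// Stub classes illustrating how to keep both ends of a
// bidirectional one-to-one consistent in memory.
class Customer {
	UserProfile userProfile;

	// Convenience setter: updates both sides of the association
	void setUserProfile(UserProfile profile) {
		this.userProfile = profile;
		profile.customer = this;
	}
}

class UserProfile {
	Customer customer;
}
```

With this pattern, code that only calls customer.setUserProfile(profile) still ends up with a consistent object graph on both sides.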
For MarketingPreferences, the primary key is set to always take the same value as the id of its corresponding Customer instance. Hence the primary key of MarketingPreferences is also a foreign key to the Customer. In the MarketingPreferences class this is mapped as:
	private Long id;
	private Customer customer;

	@Id
	public Long getId() {
		return id;
	}
	
	@MapsId
	@OneToOne
	public Customer getCustomer() {
		return customer;
	}
In most classes, the id would be generated by the database, so the @Id annotation would be accompanied by @GeneratedValue. Here, the id must come from the associated Customer instead. We have an instance variable of type Customer, and the @MapsId annotation tells Hibernate not to store the foreign key to the Customer in a separate column, but to let the primary key column double as that foreign key, copying the id value from the associated Customer.

To make this bidirectional, on the Customer side, you must tell Hibernate to use the primary key of MarketingPreferences when it joins the database tables:

	@OneToOne(cascade = CascadeType.ALL)
	@PrimaryKeyJoinColumn
	public MarketingPreferences getMarketingPreferences() {
		return marketingPreferences;
	}
Finally there is the association between Customer and Wishlist, which is mapped by an intermediate join table. This approach is useful for two reasons. Firstly, when the association is optional – although you could use a nullable foreign key column, this isn’t best practice. Secondly, if you want to add properties to the association, the join table gives you a place to put them. The mapping in the Customer class is:
	@OneToOne(cascade = CascadeType.ALL)
	@JoinTable( name = "CUSTOMER_WISHLIST", 
		joinColumns = @JoinColumn( name = "CUSTOMER_ID"),
		inverseJoinColumns = @JoinColumn( name = "WISHLIST_ID"))
	public Wishlist getWishlist() {
		return wishlist;
	}
To make the association bidirectional, add a Customer instance variable to the Wishlist, and specify that the mapping is done by the Customer:
	@OneToOne(mappedBy = "wishlist")
	public Customer getCustomer() {
		return customer;
	}
For more info about one to one associations, check out chapter 8 of the Hibernate docs: http://docs.jboss.org/hibernate/orm/4.2/manual/en-US/html/ch08.html
Posted in Hibernate, Java