Recently someone posted a question on stackoverflow explaining that they were having a problem with a Hibernate HQL query that involved both order by and a distinct in the select. The query was across five linked entities – a many to many association, mapped by an entity called PreferenceDateETL, which links the Preference and DateETL entities, and then two properties on the join entity – Employee and Corporation. The person asking the question explained that they wanted to write a query like the following:
select distinct pd.preference
from PreferenceDateETL pd
where pd.corporation.id=:corporationId
and pd.preference.employee.deleted=false
and pd.deleted=false
and pd.preference.deleted=false
and pd.dateETL.localDate>=:startDM
and pd.dateETL.localDate<=:endDM
and pd.preference.approvalStatus!=:approvalStatus
order by pd.preference.dateCreated
However, when trying to write this, he got an exception saying that the date created is not in the select list. What is going on here and how can it be fixed? Well, the order by clause is always evaluated last in a database query, so you can only order on columns you have selected. Although it appears as though the above query will have all of the columns, in fact, Hibernate ends up having two instances of the preference table in its query, and the order by clause refers to a different one to the one in the select, hence the exception. The generated SQL has the form:
select
distinct preference1_.id as id1_76_,
....things...
from
preference_date_etl preference0_
inner join
preference preference1_
on preference0_.preference_id=preference1_.id cross
join
preference preference2_ cross
join
employee employee3_ cross
join
date_etl dateetl5_
where
...things...
order by
preference2_.date_created
The problem with this query is really that using distinct is unnecessary. It is required in the above form of the query, because the query has been written to pick out instances of the many to many entity that we are interested in, then from those, navigate to the preference objects they link to. By definition, because the many to many entities are representing a many to many relationship, this can give you a list that includes duplicates. You can avoid this by restructuring the query so that the query on the many to many entities becomes a subquery. i.e. you pick out the many to many entities you are interested in, and then say "now get me all of the preference objects that are in the list of preference objects referred to by the PreferenceDateETL objects". This query doesn't have any duplicates, hence you do not need the distinct, and the order by clause will work correctly. Here is the HQL:
from Preference p where p in
(select pd.preference from PreferenceDateETL pd
where pd.corporation.id=:corporationId
and pd.deleted=false
and pd.dateETL.localDate>=:startDM
and pd.dateETL.localDate<=:endDM)
and p.employee.deleted=false
and p.deleted=false
and p.approvalStatus != :approvalStatus
order by p.dateCreated
If you want to read the original question, it is here:
http://stackoverflow.com/questions/25545569/hibernate-using-multiple-joins-on-same-table
This article is part of a series of articles on Hibernate. Others in the series include:
Hibernate query limitations and correlated sub-queries
Bespoke join conditions with Hibernate JoinFormula
Inheritance and polymorphism
One-to-many associations
One-to-one associations
Many-to-many associations
1 Response to Distinct, order by and Hibernate HQL