2
votes

With Spring Data JPA (4.0.8 currently) and Hibernate (4.2.16 currently) I'm having a problem getting pagination to carry down to the subselect hits of JPA OneToMany lists. In the inner list, the subselect selects ALL of the elements to combine with the main, not only the ones for a given page.

For example:

Main JPA entity:

@Entity
@NamedQuery(name="Message.findAll", query="SELECT m FROM Message m")
public class Message implements Serializable {
...
//uni-directional many-to-one association to Message_Review
@OneToMany(cascade={CascadeType.ALL}, fetch=FetchType.EAGER)
@JoinColumn(name="message_id", referencedColumnName="message_id", insertable=false, updatable=false)
@Fetch(FetchMode.SUBSELECT) 
private Set<Message_Review> messageReviews;

The SQL being generated to obtain the primary list of JPA entities is as such:

SELECT TOP ? MESSAGE0_.MESSAGE_ID AS MESSAGE_1_0_, MESSAGE0_....... AND NOT (EXISTS (SELECT MESSAGEREV1_.MESSAGE_ID, MESSAGEREV1_.......))

After the base list of JPA entities, Spring Data JPA (or Hibernate) then queries for the OneToMany lists (per the @FetchMode.SUBSELECT). It generates the SQL as such:

SELECT messagerev0_.message_id as message_2_0_1_, 
.....
where messagetop0_.message_id in 
(select message0_.message_id 
from .......
)

Notice the inner select on the OneToMany hit does not have a TOP associated with it. Therefore, it is pulling back ALL of the message reviews (instead of just the pertainent ones that JPA will need to match them up). Normally this is not a big issue, but with substantially large datasets in the OneToMany lists, the payload may return back millions of items from the database, when it only needs a very small subset for a given page.

Any ideas on how to get Paging to carry down to OneToMany lists in a JPA entity when utilizing @FetchMode.SubSelect to limit the number of SQL hits?

1

1 Answers

0
votes

Hello this is very normal behavior when you have Subselect fetching. This is how it is supposed to be. If you want more scrollable behavior you can use @BatchSize(size=5) annotation where the size tells you how many collections will be fetched at once.

So lets say you have something like Category -> Item relationship and you do:

select * from Category

then you start iterating through categories. when you access item it will return you 5 relevant collections. Not only for the Category that you have navigated through but for the next 5.

the query fir Item will look like:

select * from Item where categoryID in (1,2,3,4,5)

In my opinion what you need is @BatchSize in combination with FetchType.Select.

I am not very good in explanations so I am also sending you a link. Note that Subselect is very early used.http://www.mkyong.com/hibernate/hibernate-fetching-strategies-examples/

Pros and cons of BatchSize vs Subselect:

The bigger the batch size the bigger the memory consumption, the more CPU you need. I guess keeping the batch size low is good for devices with limited resources. If you open the Hibernate tutorial you will see that the @BatchSize is described in the performance tunning chapter. Same as the Subselect strategy with this hint that the subselect can be extremely heavy over large datasets. In this regards using BatchSize is a good compromise in regards of the N+1 select problem and at the same time keeping the memory consumption low.