5
votes

Hibernate Pagination Issue

I have an issue which is related to Hibernate Pagination and to some extent this has been explained in

Mysql Pagination Optimization

Using Hibernate's ScrollableResults to slowly read 90 million records

Hibernate - HQL pagination

Issues with Pagination and Sorting

Hibernate Row Pagination

Details

HQL Query from Application:

Query q = session.createQuery("from RequestDao r order by r.id desc");
            q.setFirstResult(0);
            q.setMaxResults(50);

Query returns 3 million records and for pagination we are setting only 50 of those records, pagination page is very slow because on every refresh we are calling the query which get 3 millions records and out of those we only set 50 records.

My main question is

Does HQL always goes and hits database or does it go and hit session or memory to look for the data and if it goes everytime to hit database and get resultset then it is very proper from performance point of view, what would be best solutions to improve it?

Using HQL in hibernate is there a way we can query database and get only 50 records out first and then get other records as required by the user. This challenge is really bogging down application and so what would be best way to solve this problem?

HQL Query generated in logs

from com.delta.dao.RequestDao r order by r.id desc

Hibernate Generated Query

select
    getrequest0_.ID as ID24_,
    getrequest0_.TIME as START3_24_,
    getrequest0_.STAT as STATUS24_,
    getrequest0_.SUM as SUMMARY24_,
    getrequest0_.OUTNAME as OUTPUT7_24_,
    getrequest0_.INPNAME as INPUT8_24_,
    getrequest0_.REQUEST_DATE as requestT9_24_,
    getrequest0_.PARENT_ID as PARENT10_24_,
    getrequest0_.INTER_TYPE as INTERPO60_24_,
    getrequest0_.OPEN_INT as OPEN61_24_,
    getrequest0_.SOURCE_TYPE as SOURCE62_24_,
    getrequest0_.TARGET_TYPE as TARGET20_24_,
    getrequest0_.SOURCE as SOURCE14_24_,
    getrequest0_.COPY_DATA as COPY16_24_,
    getrequest0_.CURVE as GENERATE63_24_,
    getrequest0_.TITLE as TITLE24_,
    getrequest0_.TIME_ID as TIMESERIES12_24_,
    getrequest0_.TASK_NAME as TASK51_24_ 
from
    REQUEST getrequest0_ 
where
    getrequest0_.KIND='csv' 
order by
    getrequest0_.ID desc

Here is the Explain Plan for the query:


 | id | select_type | table        | type | possible_keys  | key        | key_len | ref          |  rows    | filtered | Extra       | 
 |  1 | SIMPLE      | getrequest0_ | ref  | TR_KIND_ID     | TR_KIND_ID | 6       | const        | 1703018  |   100.00 | Using where |

Additional information: Query run time with and without order by clause on 50 records limit


If i run query with order clause then query takes 0.0012s with setting LIMIT 50 and without order clause, same query takes 0.0032s with same LIMIT 50.


Also how can we find if:

  1. Particular HQL Query is hitting database and not cache or getting information from session?
  2. Is it true that HQL Query will always go and hit database to get result out and Criteria would go and hit session or cache and get results from it?
  3. Also in my below mentioned query:

    a) Query q = session.createQuery("from RequestDao r order by r.id desc");
    b) q.setFirstResult(0);
    c) q.setMaxResults(50);
    

at a, is it true that we get result from database and store it in memory or where if not and at this time we have 3 million results in result set and then at b and c we set offset value and limit so on page we would only see 50 results so now where are remaining 3 million records and on our second call to this query do we again go and hit database and get 3 million records and put them in memory and then at c again we set 50 records and go on an on.

This issue is not clear to me and so would highly appreciate if someone can provide clear and detailed explanation as how this is working and what would be best solution for this problem.

Update

As it turns out, issue am having is not related to display of records on the page but i have filter on that page and on every request am getting all drop down values again from database and there are some funky things going on in there that is causing rise in page load time.

I am making multiple nested hibernate queries to database and getting results back, what would be an optimal solution for this problem?

2
this question is open as i have yet not got an answer to my questions, do add answers if you have any information.Rachel
I just edited my answer to reflect the update.Olaf

2 Answers

5
votes

Your query tells database to sort all records that satisfy the WHERE clause. It potentially may sort millions of records before returning you top 50.

EDIT 1/26: Now that the specific questions were clarified, I'll try to respond more specifically.

  1. Every time you execute query like that, Hibernate goes to the database. Even more, it would flush all new/updated data in the session to disk. If this is your situation, this behavior might contribute to the slowness.

  2. Using Hibernate Query API usually performs quite well in most situations and is compatible with a broad variety of the database platforms. If you are really concerned about squeezing last drop of performance out of your data access layer, you can write your own native SQL query to select top 50 results. But as soon as you do that, you'll almost certainly loose database independence. So, evaluate your costs vs. benefits.

Your query run times appear to be in the single milliseconds range. This is usually as good as it gets with the relational databases that store data on disk. So you might want to evaluate whether you indeed have a performance problem.

EDIT 1/27: OK. It looks like the problem in the overall design of the page. I had been using AJAX for last 7 years or so, so I don't usually have to wait for the filtering UI controls to redraw when going through pages of a table. I guess, switching application UI frameworks is not an option in your case. You have to figure out how to optimize loading of the data for the dropdowns and such. Does this data change frequently? Can you cache it somewhere in the application? If you have to load them every time, can you get away with just getting the display strings instead of entire objects?

1
votes

If you set the first result and the maximum number of results Hibernate will only retrieve those entries from the database. If that is slow, set max results to 1 and enable SQL logging to see which associations are loaded as well.

Hibernate also supports several caches: * first level cache: will only be used during one Hibernate session which often corresponds to one transaction * second level cache: used over session/transaction boundaries, but mostly only by find by id methods * query cache: if enabled Hibernate will lookup the query results from there first. However, the query must be the same in terms of HQL and parameters so each page might be loaded from the DB once and then be cached. (For more information have a look here: http://docs.jboss.org/hibernate/core/3.5/reference/en/html/performance.html#performance-querycache)

Please note that caching requires heap memory and depending on the size of your entities caching 3 million entities might result in a huge memory hit and thus increased garbage collection which would hit performance again.