14
votes

I want to fetch all records from a very big table using Slick. If I try to do this with foreach, a for comprehension, or list, I get an OutOfMemoryError.

Is there any way to use "cursors" with Slick, or some form of lazy loading that only fetches each object when it is needed, to reduce the amount of memory used?

3
Not sure why foreach would result in an OOM; it should only process one element at a time. You can instead try elements(), which will return a CloseableIterator. If that also results in an OOM, post the rest of your code. - Saish

3 Answers

5
votes

Not sure what you mean by cursors, but you can fetch partial data using pagination:

query.drop(0).take(1000) will take the first 1000 records

query.drop(1000).take(1000) will take records 1001 to 2000 of the table.

How efficient these queries are depends on your database: whether it supports offset/limit natively and whether the table is properly indexed.
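To make the pagination idea concrete, here is a minimal sketch of a loop that walks a table page by page. It is library-free so it can run on its own: `foreachPaged` and `fetchPage` are hypothetical names, and `fetchPage(offset, limit)` stands in for something like query.drop(offset).take(limit).list. It also assumes the query has a stable ordering, otherwise pages may overlap or skip rows.

```scala
object PagedFetch {
  // Hypothetical helper (not part of Slick): calls `fetchPage(offset, limit)`
  // repeatedly and hands each row to `handle`. Only one page is held in
  // memory at a time. Returns the total number of rows processed.
  def foreachPaged[A](pageSize: Int)(fetchPage: (Int, Int) => Seq[A])(handle: A => Unit): Int = {
    require(pageSize > 0, "pageSize must be positive")
    var offset = 0
    var done = false
    while (!done) {
      val page = fetchPage(offset, pageSize)
      page.foreach(handle)
      offset += page.size
      // A short (or empty) page means there are no more rows.
      done = page.size < pageSize
    }
    offset
  }

  def main(args: Array[String]): Unit = {
    // In-memory stand-in for the table, for demonstration only.
    val table = (1 to 2500).toVector
    var sum = 0L
    val total = foreachPaged[Int](1000)((off, lim) => table.slice(off, off + lim)) { row =>
      sum += row
    }
    println(s"processed $total rows, sum = $sum")
  }
}
```

With a real query, each call to `fetchPage` would be a separate database round trip, so the page size trades memory against the number of queries.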

1
votes

You could combine iterator(), which returns an iterator over the results instead of loading them all:

 val objects = Objects.where(...).map(w => w).iterator()

with grouped, which splits the iterator into fixed-size chunks:

val chunkSize = 1000
val groupedObjects = objects.grouped(chunkSize)
groupedObjects.foreach { chunk => chunk.par.map(h => doJob(h)) }

as suggested in this answer
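The chunking step can be tried without a database. The following is a minimal sketch that uses a plain Scala Iterator as a stand-in for the iterator returned by the query; `processInChunks` and `doJob` are hypothetical names, and the `.par` call is left out so the snippet does not depend on the separate parallel collections module needed in newer Scala versions.

```scala
object ChunkedIteration {
  // Hypothetical per-row job, standing in for real work.
  def doJob(n: Int): Int = n * 2

  // Consumes `rows` lazily in chunks of `chunkSize` and returns how many
  // rows were processed. Only the current chunk is materialized.
  def processInChunks(rows: Iterator[Int], chunkSize: Int): Int = {
    var processed = 0
    rows.grouped(chunkSize).foreach { chunk =>
      val results = chunk.map(doJob)
      processed += results.size
    }
    processed
  }

  def main(args: Array[String]): Unit = {
    // Iterator.range produces elements on demand, like a result-set iterator.
    val rows = Iterator.range(0, 10000)
    println(s"processed ${processInChunks(rows, 1000)} rows")
  }
}
```

Whether this keeps memory flat with a real database also depends on the JDBC driver streaming rows rather than buffering the whole result set, which is an assumption here.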

0
votes

dirceusemighini's answer is correct. I ran into a similar issue a few days ago due to a wrong assumption about Query.list(), so I can give some more context. From the Slick reference:

"Queries are executed using methods defined in the Invoker trait (or UnitInvoker for the parameterless versions). There is an implicit conversion from Query, so you can execute any Query directly. The most common usage scenario is reading a complete result set into a strict collection with a specialized method such as list or the generic method to which can build any kind of collection"

It is indeed true that Query.list() loads the complete result set into memory. With this in mind, there are several approaches to your problem.