5
votes

I've got a very hard time understanding of lazyness work and how the cache is working.

I think that a step-by-step example of a lazy-seq at work could really help here. For example I've read the following question:

Clojure lazy sequence usage

but it's still not clear.

My question would be how the call determines if another call is "equal" to a cached call and how long does it stays in the cache? I tried to (source lazy-seq) but apparently it's in Java land so I'm out of luck here.

For a simple lazy-seq, taking just one argument (like say the list of powers of two), what if I call it with 5 and then 8? Are just these two values cached?

And what is the point in creating and caching an infinite list to get an infinite structure if I'm going to ruin the memory by caching every input I already called the lazy function with?

Because it says it's caching the result on every subsequent calls... With an 's'.

1: result for argument being '1' cached 2: result for argument being '2' cached 3: result for argument being '3' cached ... 230: I counted up to 230 and it's great because I'm lazy and all, but now there's a 2**30 cache in memory caching all the previous calls for all the subsequent calls.

or is it just the last call that is cached?

What if I write a lazy function taking trees as arguments? Does it run equals? on the argument passed to know if there needs to be a new evaluation?

Can this behavior somehow be traced at run-time?

2

2 Answers

8
votes

The 'cache' in a lazy sequence is not a mutable cache that expires things as you would use in a webapp, it is a cache of size one and there is one in each cell in the list. that 'cache' either contains a value, or the code to compute the value, and never both. once it computes the value it caches the value (in that cell/entry) and if anyone reads the cell again it gives them the value directly instead of calling the code.

here is a simplified imaginary repl session to illustrate the point:

user> (def a (range))
a = [code-for-rest]
user> (first a)
a = [code-for-first, code-for-rest]
a = [0, code-for-rest]
result=> 0
user> (first a)
a = [0, code-for-rest]
result=> 0
user> (nth a 10)
a = [0]->[1]->[2]->[3]->[4]->[5]->[6]->[7]->[8]->[9, code-for-rest]
result=> 4

In this example each cell initially contains (and this is a simplification to illustrate just this point) the code to generate the value and the code to generate the rest of the list (or nil if this is the end of the list). once that cell is realized (made unlazy) then it replaces it's contents with the actual value, so it now contains the value and the code to generate the rest of the sequence. When the next cell in the list is read it will first be generated by code-for-rest (as contained in the cell) then the code-for-nth in the new cell will produce the value for that cell.

1
votes

Here you have a toy example that shows what's happening at run-time:

(defn times-two[number]
 (print "- ")
 (* 2 number))

(def powers-of-two (lazy-cat [1 2] (map times-two (rest powers-of-two))))

(println (take 10 powers-of-two))
(println (take 12 powers-of-two))

The output should be:

(1 - 2 - 4 - 8 - 16 - 32 - 64 - 128 - 256 512)

(1 2 4 8 16 32 64 128 256 - 512 - 1024 2048)