
I have a lazy-seq where each item takes some time to calculate:

(defn gen-lazy-seq [size]
  (for [i (range size)]
    (do
      (Thread/sleep 1000)
      (rand-int 10))))

Is it possible to evaluate this sequence step by step and print the results as they become available? When I try to process it with for or doseq, Clojure always realizes the whole lazy-seq before printing anything:

(doseq [item (gen-lazy-seq 10)]
  (println item))

(for [item (gen-lazy-seq 10)]
  (println item))

Both expressions will wait for 10 seconds before printing anything out. I have looked at doall and dorun as a solution, but they require that the lazy-seq producing function contain the println. I would like to define a lazy-seq producing function and lazy-seq printing function separately and make them work together item by item.

Motivation for trying to do this: I have messages coming in over a network, and I want to start processing them before all have been received. At the same time it would be nice to save all messages corresponding to a query in a lazy-seq.

Edit 1:

JohnJ's answer shows how to create a lazy-seq that will be evaluated step by step. I would like to know how to evaluate any lazy-seq step by step.

I'm confused because (chunked-seq? (gen-lazy-seq 10)) returns false both for gen-lazy-seq as defined above and for the version in JohnJ's answer. So the problem can't be that one creates a chunked sequence and the other doesn't.

In this answer, a function seq1 is shown which turns a chunked lazy-seq into a non-chunked one. Trying that function still gives the same problem with delayed output. I thought that maybe the delay had to do with some sort of buffering in the REPL, so I also tried printing the time at which each item in the seq is realized:

(defn seq1 [s]
  (lazy-seq
    (when-let [[x] (seq s)]
      (cons x (seq1 (rest s))))))

(let [start-time (java.lang.System/currentTimeMillis)]
  (doseq [item (seq1 (gen-lazy-seq 10))]
    (let [elapsed-time (- (java.lang.System/currentTimeMillis) start-time)]
      (println "time: " elapsed-time "item: " item))))

; output:
time:  10002 item:  1
time:  10002 item:  8
time:  10003 item:  9
time:  10003 item:  1
time:  10003 item:  7
time:  10003 item:  2
time:  10004 item:  0
time:  10004 item:  3
time:  10004 item:  5
time:  10004 item:  0

Doing the same thing with JohnJ's version of gen-lazy-seq works as expected:

; output:
time:  1002 item:  4
time:  2002 item:  1
time:  3002 item:  6
time:  4002 item:  8
time:  5002 item:  8
time:  6002 item:  4
time:  7002 item:  5
time:  8002 item:  6
time:  9003 item:  1
time:  10003 item:  4

Edit 2:

It's not only sequences generated with for that have this problem. This sequence, generated with map, cannot be processed step by step either, regardless of seq1 wrapping:

(defn gen-lazy-seq [size]
  (map (fn [_] 
         (Thread/sleep 1000)
         (rand-int 10))
       (range 0 size)))

But this sequence, also created with map, works:

(defn gen-lazy-seq [size] 
  (map (fn [_] 
         (Thread/sleep 1000)
         (rand-int 10))
       (repeat size :ignored)))
I also tried seq1 and didn't manage to get it to work with Clojure 1.5 or with 1.2.1. It may be that you are up against particularities in the implementation of the for macro, which is anything but transparent, and which you cannot simply turn off by wrapping it in your own lazy seq. - JohnJ
Thanks for testing. It seems reasonable that the problem is due to for. I tried creating a lazy-seq with map and that worked as expected. - snowape
Testing some more, it seems that I have the same problem with sequences generated by map. - snowape

1 Answer


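Clojure's lazy sequences are often chunked: range, for example, yields its elements in chunks of 32, and for and map compute a whole chunk at a time when given a chunked input. You can see the chunking at work in your example if you take larger sizes (it helps to reduce the thread sleep time in this case). See also these related SO posts.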
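
To make the chunking visible without waiting on Thread/sleep, here is a minimal sketch that counts how many items have been computed after asking for only the first one. The names realized and counting are made up for this illustration and do not appear in the question. It also suggests why chunked-seq? returned false in the question: the check has to be made on the realized seq rather than on the lazy wrapper that map returns.

```clojure
;; Hypothetical counter, used only to observe how many items get computed.
(def realized (atom 0))

(defn counting [x]
  (swap! realized inc)
  x)

;; range yields a chunked seq, so map computes a whole chunk at once.
(def chunked (map counting (range 100)))
(first chunked)
(println "after first on range-backed map:" @realized)   ; 32

;; repeat is not chunked, so map computes one item per step.
(reset! realized 0)
(def unchunked (map counting (repeat 100 :ignored)))
(first unchunked)
(println "after first on repeat-backed map:" @realized)  ; 1

;; chunked-seq? must be asked of the realized seq, not the LazySeq wrapper.
(println (chunked-seq? chunked))        ; false
(println (chunked-seq? (seq chunked)))  ; true
```

With only 10 items, the whole sequence fits in the first chunk, which would explain the 10 second wait before the first line is printed.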

Though for over a range is chunked, the following is not and works as desired:

(defn gen-lazy-seq [size]
  (take size (repeatedly #(do (Thread/sleep 1000)
                              (rand-int 10)))))

(doseq [item (gen-lazy-seq 10)]
  (println item)) 
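
If you would rather keep map and range, another option (a sketch, not the only way to do it) is to apply seq1 from the question to the source sequence instead of to the result. Wrapping the result comes too late, because map has already computed a whole chunk by the time seq1 sees the first element. The names computed and slow-item are hypothetical helpers for this illustration, and the sleep is shortened to 100 ms.

```clojure
;; seq1 as given in the question: re-wraps a seq one element at a time.
(defn seq1 [s]
  (lazy-seq
    (when-let [[x] (seq s)]
      (cons x (seq1 (rest s))))))

;; Hypothetical counter, used only to observe how many items were computed.
(def computed (atom 0))

(defn slow-item [_]
  (Thread/sleep 100)        ; shortened from the question's 1000 ms
  (swap! computed inc)
  (rand-int 10))

;; seq1 goes around the source (range), not around the result of map.
(defn gen-lazy-seq [size]
  (map slow-item (seq1 (range size))))

;; Each line is printed as soon as its item has been computed.
(doseq [item (gen-lazy-seq 5)]
  (println "computed so far:" @computed "item:" item))
```

The counter printed next to each item should go up by one per line, showing that only one item is computed per step.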

"I have messages coming in over a network, and I want to start processing them before all have been received." Chunked or no, this should actually be the case if you process them lazily.