9
votes

I have a simple prime number calculator in clojure (an inefficient algorithm, but I'm just trying to understand the behavior of recur for now). The code is:

(defn divisible [x,y] (= 0 (mod x y)))

(defn naive-primes [primes candidates] 
  (if (seq candidates)
      (recur  (conj primes (first candidates)) 
              (remove (fn [x] (divisible x (first candidates))) candidates))
      primes)
)

This works as long as I am not trying to find too many numbers. For example

(print (sort (naive-primes [] (range 2 2000))))

works. For anything requiring more recursion, I get an overflow error.

    (print (sort (naive-primes [] (range 2 20000))))

will not work. In general, whether I use recur or call naive-primes again without the attempt at TCO doesn't appear to make any difference. Why am I getting errors for large recursions while using recur?

1
Does recur require loop to get tail recursion? I don't see loop in your code. I'd make this an answer, but I'm still learning Clojure. - octopusgrabbus
Your code works for me in Clojure 1.2.1 and 1.3. The only error I eventually get is an OutOfMemoryError when finding primes up to 200,000. - Jon Gauthier
@octopusgrabbus, no, recur can be used in this fashion (just within a function body) as well. See clojure.org/special_forms#recur. - Jon Gauthier
@HansEngel running at the repl in 1.3, I get a stack overflow error when finding primes up to 200,000. Explanation below. - Retief
My apologies for not finding this before posting. My question was similar to stackoverflow.com/questions/2946764/… - DanB

1 Answers

18
votes

recur always uses tail recursion, regardless of whether you are recurring to a loop or a function head. The issue is the calls to remove. remove calls first to get the element from the underlying seq and checks to see if that element is valid. If the underlying seq was created by a call to remove, you get another call to first. If you call remove 20000 times on the same seq, calling first requires calling first 20000 times, and none of the calls can be tail recursive. Hence, the stack overflow error.

Changing (remove ...) to (doall (remove ...)) fixes the problem, since it prevents the infinite stacking of remove calls (each one gets fully applied immediately and returns a concrete seq, not a lazy seq). I think this method only ever keeps one candidates list in memory at one time, though I am not positive about this. If so, it isn't too space inefficient, and a bit of testing shows that it isn't actually much slower.