11
votes

Learning Clojure and trying to understand the implementation:

What's the difference from:

(def factorial
  (fn [n]
    (loop [cnt n acc 1]
       (if (zero? cnt)
            acc
          (recur (dec cnt) (* acc cnt))
; in loop cnt will take the value (dec cnt)
; and acc will take the value (* acc cnt)
))))

and the following C-like pseudocode

function factorial (n)
    for( cnt = n,  acc = 1) {
        if (cnt==0) return acc;
        cnt = cnt-1;
        acc = acc*cnt;
    }
// in loop cnt will take the value (dec cnt)
// and acc will take the value (* acc cnt)

Are clojure's "loop" and "recur", forms specifically designed to code a simple imperative loop ? (assuming pseudocode's "for" creates it's own scope, so cnt and acc exists only inside the loop)

2

2 Answers

30
votes

Are Clojure's loop and recur forms specifically designed to code a simple imperative loop?

Yes.

In functional terms:

  • A loop is a degenerate form of recursion called tail-recursion.
  • The 'variables' are not modified in the body of the loop. Instead, they are re-incarnated whenever the loop is re-entered.

Clojure's recur makes a tail-recursive call to the surrounding recursion point.

  • It re-uses the one stack frame, so working faster and avoiding stack overflow.
  • It can only happen as the last thing to do in any call - in so-called tail position.

Instead of being stacked up, each successive recur call overwrites the last.

A recursion point is

  • a fn form, possibly disguised in defn or letfn OR
  • a loop form, which also binds/sets-up/initialises the locals/variables.

So your factorial function could be re-written

(def factorial
  (fn [n]
    ((fn fact [cnt acc]
      (if (zero? cnt)
        acc
        (fact (dec cnt) (* acc cnt))))
     n 1)))

... which is slower, and risks stack overflow.

Not every C/C++ loop translates smoothly. You can get trouble from nested loops where the inner loop modifies a variable in the outer one.


By the way, your factorial function

  • will cause integer overflow quite quickly. If you want to avoid this, use 1.0 instead of 1 to get floating point (double) arithmetic, or use *' instead of * to get Clojure's BigInt arithmetic.
  • will loop endlessly on a negative argument.

A quick fix for the latter is

(def factorial
  (fn [n]
    (loop [cnt n acc 1]
      (if (pos? cnt)
        (recur (dec cnt) (* acc cnt))
        acc))))
; 1

... though it would be better to return nil or Double.NEGATIVE_INFINITY.

9
votes

One way to look at loop/recur is that it lets you write code that is functional, but where the underlying implementation ends up essentially being an imperative loop.

To see that it is functional, take your example

(def factorial
  (fn [n]
    (loop [cnt n acc 1]
      (if (zero? cnt)
        acc
        (recur (dec cnt) (* acc cnt))))))

and rewrite it so that the loop form is broken out to a separate helper function:

(def factorial-helper
  (fn [cnt acc]
    (if (zero? cnt)
      acc
      (recur (dec cnt) (* acc cnt)))))

(def factorial'
  (fn [n]
    (factorial-helper n 1)))

Now you can see that the helper function is simply calling itself; you can replace recur with the function name:

(def factorial-helper
  (fn [cnt acc]
    (if (zero? cnt)
      acc
      (factorial-helper (dec cnt) (* acc cnt)))))

You can look at recur, when used in factorial-helper as simply making a recursive call, which is optimized by the underlying implementation.

I think an important idea is that it allows the underlying implementation to be an imperative loop, but your Clojure code still remains functional. In other words, it is not a construct that allows you to write imperative loops that involve arbitrary assignment. But, if you structure your functional code in this way, you can gain the performance benefit associated with an imperative loop.

One way to successfully transform an imperative loop to this form is to change the imperative assignments into expressions that are "assigned to" the argument parameters of the recursive call. But, of course, if you encounter an imperative loop that makes arbitrary assignments, you may not be able to translate it into this form. In this view, loop/recur is a much more constrained construct.