4
votes

http://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_need says:
"Call-by-need is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for subsequent uses. [...] Haskell is the most well-known language that uses call-by-need evaluation."

However, the value of a computation is not always stored for faster access (for example consider a recursive definition of fibonacci numbers). I asked someone on #haskell and the answer was that this memoization is done automatically "only in one instance, e.g. if you have `let foo = bar baz', foo will be evaluated once".

My questions is: What does instance exactly mean, are there other cases than let in which memoization is done automatically?

3
Are there cases in which it's unclear whether memoization will be done? Just wondering.n. 1.8e9-where's-my-share m.
@byrondrossos No, GHC generally doesn't do Common Subexpression Elimination or anything like that, it can have negative effects (e.g. space leaks if you keep a billion-element list alive just because some unevaluated thunk uses its head).user395760
@delnan Thanks for the link, deleted my comment so that it wont be confusing.byrondrossos

3 Answers

7
votes

Describing this behavior as "memoization" is misleading. "Call by need" just means that a given input to a function will be evaluated somewhere between 0 and 1 times, never more than once. (It could be partially evaluated as well, which means the function only needed part of that input.) In contrast, "call by name" is simply expression substitution, which means if you give the expression 2 + 3 as an input to a function, it may be evaluated multiple times if the input is used more than once. Both call by need and call by name are non-strict: if the input is not used, then it is never evaluated. Most programming languages are strict, and use a "call by value" approach, which means that all inputs are evaluated before you begin evaluating the function, whether or not the inputs are used. This all has nothing to do with let expressions.

Haskell does not perform any automatic memoization. Let expressions are not an example of memoization. However, most compilers will evaluate let bindings in a call-by-need-esque fashion. If you model a let expression as a function, then the "call by need" mentality does apply:

let foo = expression one in expression two that uses foo
==>
(\foo -> expression two that uses foo) (expression one)

This doesn't correctly model recursive bindings, but you get the idea.

6
votes

The haskell language definition does not define when, or how often, code is invoked. Infinite loops are defined in terms of 'the bottom' (written ⊥), which is a value (which exists within all types) that represents an error condition. The compiler is free to make its own decisions regarding when and how often to evaluate things as long as the program (and presence/absence of error conditions, including infinite loops!) behaves according to spec.

That said, the usual way of doing this is that most expressions generate 'thunks' - basically a pointer to some code and some context data. The first time you attempt to examine the result of the expression (ie, pattern match it), the thunk is 'forced'; the pointed-to code is executed, and the thunk overwritten with real data. This in turn can recursively evaluate other thunks.

Of course, doing this all the time is slow, so the compiler usually tries to analyze when you'd end up forcing a thunk right away anyway (ie, when something is 'strict' on the value in question), and if it finds this, it'll skip the whole thunk thing and just call the code right away. If it can't prove this, it can still make this optimization as long as it makes sure that executing the thunk right away can't crash or cause an infinite loop (or it handles these conditions somehow).

0
votes

If you don't want to have to get very technical about this, the essential point is that when you have an expression like some_expensive_computation of all these arguments, you can do whatever you want with it; store it in a data structure, create a list of 53 copies of it, pass it to 6 other functions, etc, and then even return it to your caller for the caller to do whatever it wants with it.

What Haskell will (mostly) do is evaluate it at most once; if it the program ever needs to know something about what that expression returned in order to make a decision, then it will be evaluated (at least enough to know which way the decision should go). That evaluation will affect all the other references to the same expression, even if they are now scattered around in data structures and other not-yet-evaluated expressions all throughout your program.