3
votes

I'm trying to understand what's happening under the hood in terms of lazy evaluation in Haskell.

If I have a function call like:

negate $ 5 * sqrt 16

My understanding is that Haskell will process the sqrt 16 first, creating a thunk allowing the value to be calculated when it's needed.

But is sqrt 16 evaluated when it's passed to the multiplication or only when it's outputted to the console in some way?

In other words, in what order would each part of the expression be evaluated when entered into GHCi (for example)?

5
Have you read though the Wiki? haskell.org/haskellwiki/Performance/LazinessDevin M
Thanks for the link. Very helpful info in thereTom Savage
lots of references by DonS, and OP helpful: reddit.com/r/programming/comments/i4jb1/…Gene T

5 Answers

8
votes

By default, every function and constructor call becomes a thunk. So, in this case, the evaluation happens like this:

evaluate "negate $ 5 * sqrt 16" -> <thunk> $ <thunk>
 evaluate "negate"               -> <func>
 evaluate "5 * sqrt 16"          -> <thunk> * <thunk>
  evaluate "5"                    -> 5.0
  evaluate "sqrt 16"              -> 4.0

<thunk> means something that is a thunk, and <func> means it's a function value that can't be represented as a string.

Indentation means that the "parent" will evaluate the "children" before it evaluates itself.

So, if you write print (negate $ 5 * sqrt 16), the runtime will perform these steps:

eval thunk:
  <thunk 1> $ <thunk 2>
eval thunk 1:
  <func> $ <thunk 2>
eval thunk 2:
  <func> $ <thunk 3> * <thunk 4>
(cheating a little here, because (*) is strict, so these three are actually
 one step:)
eval thunk 3:
  <func> $ 5 * <thunk 4>
eval thunk 4 by applying sqrt:
  <func> $ 5 * 4
apply (*):
  <func> $ 20
apply ($):
  -20
6
votes

You can think of it as going outside-in, i.e. first negate is called. It will then force evaluation of its argument, which might force the evaluation of other expressions and so on. You can play with Debug.Trace.trace, which prints its first argument while returning its second when evaluated, to see exactly in which order things happen in GHCi:

> trace "A" (negate (trace "B" (5 * (trace "C" (sqrt 16)))))
A
B
C
-20.0

However, note that the compiler is allowed to perform optimizations that may change the order in which expressions get evaluated, which is why we use the IO monad when the order matters.

4
votes

The expression is evaluated when that value is needed. Therefore if you write:

f = negate $ 5 * sqrt 16

It is only until you use f that you evaluate. negate will need 5 * sqrt 16, which in turn will need sqrt 16. The evaluation continues unfolding until it reaches the base case, which will be evaluated, and then that will be used for the previous/higher expression (going backwards now), until the whole expression is evaluated.

4
votes

First of all, the thunk for the whole expression is created. * is strict, so the thunk for sqrt 16 might or might not get created inside it, depending on optimizations (the direct call to sqrt might be used). Then when it is forced (its value is needed), negate will force its argument, which is the * expression, and being strict, the * will force both its arguments and produce the value, 20.

By the way I think when you speak of Haskell, it is the non-strict semantics that you should be talking about. "Thunk" and "lazy" belongs to the implementation specifics.

2
votes

The way of thinking about this that I read somewhere that really helped me on this front is to look at it this way:

  • Every expression produces a thunk.
  • Thunks will be forced when their value is "needed."
  • The notion of "need" is what you may call "conditional forcing": "Assuming that thunk T is being forced, which other thunks will this cause to be forced?" A function is strict in its argument if forcing a thunk that applies that function causes its argument's thunk to be forced.

So, values are printed to the console by invoking the appropriate show method on them; i.e., printing to the console forces an expression of form show x (for some x). Forcing show x forces x. Suppose x is negate $ 5 * sqrt 16; since negate is strict in its argument, forcing the thunk also forces the thunk for 5 * sqrt 16. Likewise, * and sqrt are both strict in their arguments, so the thunks for 5, sqrt 16 and 16 must also be forced.

The other thing to understand is how data constructors and pattern matching affect thunking. Basically, unless there are special strictness anotations, constructors are like non-strict functions, in that forcing the thunk doesn't force the constructor's arguments. Unless the special lazy pattern matching syntax is used, matching against a constructor forces the argument thunk. So we have:

identity x = x  -- irrefutable pattern; `x` is not forced

uncons (x:xs) = (x, xs)  -- match against (:) constructor; argument 
                         -- must be forced, but x and xs aren't forced

foo (x:x':xs) = (x, x', xs) -- two matches against (:) constructor;
                            -- the argument thunk is forced, as is its
                            -- tail thunk.