69
votes

I often read that lazy is not the same as non-strict but I find it hard to understand the difference. They seem to be used interchangeably but I understand that they have different meanings. I would appreciate some help understanding the difference.

I have a few questions which are scattered about this post. I will summarize those questions at the end of this post. I have included a few example snippets; I did not test them and present them only as concepts. I have added quotes to save you from looking them up. Maybe it will help someone later on with the same question.

Non-Strict Def:

A function f is said to be strict if, when applied to a nonterminating expression, it also fails to terminate. In other words, f is strict iff the value of f bot is _|_. For most programming languages, all functions are strict. But this is not so in Haskell. As a simple example, consider const1, the constant 1 function, defined by:

const1 x = 1

The value of const1 bot in Haskell is 1. Operationally speaking, since const1 does not "need" the value of its argument, it never attempts to evaluate it, and thus never gets caught in a nonterminating computation. For this reason, non-strict functions are also called "lazy functions", and are said to evaluate their arguments "lazily", or "by need".

-A Gentle Introduction To Haskell: Functions

I really like this definition. It seems the best one I could find for understanding strict. Is const1 x = 1 lazy as well?

Non-strictness means that reduction (the mathematical term for evaluation) proceeds from the outside in,

so if you have (a+(b*c)) then first you reduce the +, then you reduce the inner (b*c).

-Haskell Wiki: Lazy vs non-strict

The Haskell Wiki really confuses me. I understand what they are saying about order, but I fail to see how (a+(b*c)) would evaluate non-strictly if it were passed _|_.

In non-strict evaluation, arguments to a function are not evaluated unless they are actually used in the evaluation of the function body.

Under Church encoding, lazy evaluation of operators maps to non-strict evaluation of functions; for this reason, non-strict evaluation is often referred to as "lazy". Boolean expressions in many languages use a form of non-strict evaluation called short-circuit evaluation, where evaluation returns as soon as it can be determined that an unambiguous Boolean will result — for example, in a disjunctive expression where true is encountered, or in a conjunctive expression where false is encountered, and so forth. Conditional expressions also usually use lazy evaluation, where evaluation returns as soon as an unambiguous branch will result.

-Wikipedia: Evaluation Strategy

Lazy Def:

Lazy evaluation, on the other hand, means only evaluating an expression when its results are needed (note the shift from "reduction" to "evaluation"). So when the evaluation engine sees an expression it builds a thunk data structure containing whatever values are needed to evaluate the expression, plus a pointer to the expression itself. When the result is actually needed the evaluation engine calls the expression and then replaces the thunk with the result for future reference. ...

Obviously there is a strong correspondence between a thunk and a partly-evaluated expression. Hence in most cases the terms "lazy" and "non-strict" are synonyms. But not quite.

-Haskell Wiki: Lazy vs non-strict

This seems like a Haskell-specific answer. I take it that lazy means thunks and non-strict means partial evaluation. Is that comparison too simplified? Does lazy always mean thunks and non-strict always mean partial evaluation?

In programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until its value is actually required (non-strict evaluation) and also avoids repeated evaluations (sharing).

-Wikipedia: Lazy Evaluation

Imperative Examples

I know most people say forget imperative programming when learning a functional language. However, I would like to know if these qualify as non-strict, lazy, both or neither? At the very least it would provide something familiar.

Short Circuiting

f1() || f2()

C#, Python and other languages with "yield"

public static IEnumerable Power(int number, int exponent)
{
    int counter = 0;
    int result = 1;
    while (counter++ < exponent)
    {
        result = result * number;
        yield return result;
    }
}

-MSDN: yield (c#)

Callbacks

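int f1(void) { return 1; }
int f2(void) { return 2; }

/* Receives the callbacks unevaluated and calls only the one it needs. */
int lazy(int (*cb1)(void), int (*cb2)(void), int x) {
    if (x == 0)
        return cb1();
    else
        return cb2();
}

/* Receives plain values: the caller has already computed both. */
int eager(int e1, int e2, int x) {
    if (x == 0)
        return e1;
    else
        return e2;
}

int main(void) {
    int x = 0;            /* arbitrary value, chosen only for illustration */
    lazy(f1, f2, x);      /* only f1 is ever called */
    eager(f1(), f2(), x); /* f1 and f2 both run before eager is entered */
    return 0;
}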

Questions

I know the answer is right in front of me with all those resources, but I can't grasp it. It all seems like the definition is too easily dismissed as implied or obvious.

I know I have a lot of questions. Feel free to answer whatever questions you feel are relevant. I added those questions for discussion.

  • Is const1 x = 1 also lazy?
  • How is evaluating from the outside "inward" non-strict? Is it because working inward lets unnecessary expressions go unreduced, as in const1 x = 1? Reductions seem to fit the definition of lazy.
  • Does lazy always mean thunks and non-strict always mean partial evaluation? Is this just a generalization?
  • Are the following imperative concepts Lazy, Non-Strict, Both or Neither?
    • Short Circuiting
    • Using yield
    • Passing Callbacks to delay or avoid execution
  • Is lazy a subset of non-strict or vice versa, or are they mutually exclusive? For example, is it possible to be non-strict without being lazy, or lazy without being non-strict?
  • Is Haskell's non-strictness achieved by laziness?

Thank you SO!

6 Answers

61
votes

Non-strict and lazy, while informally interchangeable, apply to different domains of discussion.

Non-strict refers to semantics: the mathematical meaning of an expression. The world to which non-strict applies has no concept of the running time of a function, memory consumption, or even a computer. It simply talks about what kinds of values in the domain map to which kinds of values in the codomain. In particular, a strict function must map the value ⊥ ("bottom" -- see the semantics link above for more about this) to ⊥; a non-strict function is not required to do this.
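To make the semantic definition concrete, here is a minimal sketch, assuming GHC and using `undefined` as a stand-in for ⊥. `const1` is the function from the question; `succInt` is a hypothetical name introduced only for contrast. Note that `undefined` raises an exception rather than looping, so this only approximates nontermination.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Non-strict: the argument is never inspected, so bottom is ignored.
const1 :: a -> Int
const1 _ = 1

-- Strict: the result cannot be computed without the argument's value.
-- (succInt is a hypothetical helper, not from the original post.)
succInt :: Int -> Int
succInt x = x + 1

main :: IO ()
main = do
  print (const1 undefined)
  -- Forcing succInt undefined hits bottom; catch it so the program continues.
  r <- try (evaluate (succInt undefined)) :: IO (Either SomeException Int)
  putStrLn (either (const "succInt undefined is bottom") show r)
```

Nothing here says how `const1` avoids touching its argument; the sketch only observes which inputs map to which outputs, which is all that strictness talks about.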

Lazy refers to operational behavior: the way code is executed on a real computer. Most programmers think of programs operationally, so this is probably what you are thinking. Lazy evaluation refers to implementation using thunks -- pointers to code which are replaced with a value the first time they are executed. Notice the non-semantic words here: "pointer", "first time", "executed".

Lazy evaluation gives rise to non-strict semantics, which is why the concepts seem so close together. But as FUZxxl points out, laziness is not the only way to implement non-strict semantics.
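The operational side can be observed, roughly, with `Debug.Trace`. This is a sketch assuming GHC's usual thunk-based evaluation; the name `shared` and the message text are made up for illustration, and the trace output goes to stderr.

```haskell
import Debug.Trace (trace)

-- A top-level thunk: the trace message fires when the thunk is first forced.
shared :: Int
shared = trace "evaluating shared" (2 + 3)

main :: IO ()
main = do
  putStrLn "shared is defined but not yet evaluated"
  print (shared + shared)  -- first use forces the thunk
  print shared             -- later uses reuse the stored value
```

Under GHC the trace message should appear only once even though `shared` is used three times, which is the "replaced with a value the first time" behaviour described above. A different implementation of the same non-strict semantics would be free to behave differently here.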

If you are interested in learning more about this distinction, I highly recommend the link above. Reading it was a turning point in my conception of the meaning of computer programs.

18
votes

An example of an evaluation model that is neither strict nor lazy: optimistic evaluation, which gives some speedup as it can avoid a lot of "easy" thunks:

Optimistic evaluation means that even if a subexpression may not be needed to evaluate the superexpression, we still evaluate some of it using some heuristics. If the subexpression doesn't terminate quickly enough, we suspend its evaluation until it's really needed. This gives us an advantage over lazy evaluation if the subexpression is needed later, as we don't need to generate a thunk. On the other hand, we don't lose too much if the expression doesn't terminate, as we can abort it quickly enough.

As you can see, this evaluation model is not strict: If something that yields _|_ is evaluated, but not needed, the function will still terminate, as the engine aborts the evaluation. On the other hand, it may be possible that more expressions than needed are evaluated, so it's not completely lazy.

6
votes

Yes, there is some unclear use of terminology here, but the terms coincide in most cases regardless, so it's not too much of a problem.

One major difference is when terms are evaluated. There are multiple strategies for this, ranging on a spectrum from "as soon as possible" to "only at the last moment". The term eager evaluation is sometimes used for strategies leaning toward the former, while lazy evaluation properly refers to a family of strategies leaning heavily toward the latter. The distinction between "lazy evaluation" and related strategies tends to involve when and where the result of evaluating something is retained, vs. being tossed aside. The familiar memoization technique in Haskell of assigning a name to a data structure and indexing into it is based on this. In contrast, a language that simply spliced expressions into each other (as in "call-by-name" evaluation) might not support this.

The other difference is which terms are evaluated, ranging from "absolutely everything" to "as little as possible". Since any value actually used to compute the final result can't be ignored, the difference here is how many superfluous terms are evaluated. As well as reducing the amount of work the program has to do, ignoring unused terms means that any errors they would have generated won't occur. When a distinction is being drawn, strictness refers to the property of evaluating everything under consideration (in the case of a strict function, for instance, this means the terms it's applied to; it doesn't necessarily mean sub-expressions inside the arguments), while non-strict means evaluating only some things (either by delaying evaluation, or by discarding terms entirely).

It should be easy to see how these interact in complicated ways; decisions are not at all orthogonal, as the extremes tend to be incompatible. For instance:

  • Very non-strict evaluation precludes some amount of eagerness; if you don't know whether a term will be needed, you can't evaluate it yet.

  • Very strict evaluation makes non-eagerness somewhat irrelevant; if you're evaluating everything, the decision of when to do so is less significant.

Alternate definitions do exist, though. For instance, at least in Haskell, a "strict function" is often defined as one that forces its arguments sufficiently that the function will evaluate to _|_ ("bottom") whenever any argument does; note that by this definition, id is strict (in a trivial sense), because forcing the result of id x will have exactly the same behavior as forcing x alone.
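That definition can be probed with a small sketch. It assumes GHC; `isBottom` and `strictConst1` are hypothetical helpers written for this illustration, and `isBottom` only detects bottoms that raise an exception (such as `undefined`), not genuine nontermination.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: True when forcing the value to weak head normal
-- form raises an exception. It cannot detect an infinite loop.
isBottom :: a -> IO Bool
isBottom v = do
  r <- try (evaluate v)
  return (either caught (const False) r)
  where
    caught :: SomeException -> Bool
    caught _ = True

-- A strict variant of the constant function: seq forces x before returning 1.
strictConst1 :: a -> Int
strictConst1 x = x `seq` 1

main :: IO ()
main = do
  isBottom (id (undefined :: Int)) >>= print                -- id is strict
  isBottom (const (1 :: Int) (undefined :: Int)) >>= print  -- const ignores its second argument
  isBottom (strictConst1 (undefined :: Int)) >>= print      -- seq makes it strict
```

The first and third checks report bottom while the second does not, matching the point that `id` counts as strict even though it does no work of its own.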

4
votes

This started out as an update but it started to get long.

Laziness / Call-by-need is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for subsequent uses. In a "pure" (effect-free) setting, this produces the same results as call-by-name; when the function argument is used two or more times, call-by-need is almost always faster.
Imperative Example - Apparently this is possible. There is an interesting article on Lazy Imperative Languages. It says there are two methods: one requires closures, and the second uses graph reduction. Since C does not support closures, you would need to explicitly pass an argument to your iterator. You could wrap a map structure: if the value does not exist, calculate and store it; otherwise return the stored value.
Note: Haskell implements this by "pointers to code which are replaced with a value the first time they are executed" - luqui.
This is non-strict call-by-name but with sharing/memoization of the results.

Call-By-Name - In call-by-name evaluation, the arguments to a function are not evaluated before the function is called — rather, they are substituted directly into the function body (using capture-avoiding substitution) and then left to be evaluated whenever they appear in the function. If an argument is not used in the function body, the argument is never evaluated; if it is used several times, it is re-evaluated each time it appears.
Imperative Example: callbacks
Note: This is non-strict as it avoids evaluation if not used.
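The difference between call-by-name and call-by-need can be modelled explicitly. The sketch below is only a model, not how GHC implements laziness: an argument is represented as an `IO` action that is re-run at each use (call-by-name), and a hypothetical `memoize` wrapper caches the first result (call-by-need). An `IORef` counter records how often the argument is actually evaluated. All names are invented for this illustration.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)

-- Uses its argument twice. Passing a raw action models call-by-name:
-- the argument is re-evaluated at each use.
twiceByName :: IO Int -> IO Int
twiceByName arg = do
  a <- arg
  b <- arg
  return (a + b)

-- Models call-by-need: run the action at most once and cache its result.
memoize :: IO a -> IO (IO a)
memoize action = do
  cache <- newIORef Nothing
  return $ do
    cached <- readIORef cache
    case cached of
      Just v  -> return v
      Nothing -> do
        v <- action
        writeIORef cache (Just v)
        return v

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  let expensive = modifyIORef counter (+ 1) >> return 21 :: IO Int
  _ <- twiceByName expensive
  readIORef counter >>= print   -- evaluations under call-by-name
  writeIORef counter 0
  shared <- memoize expensive
  _ <- twiceByName shared
  readIORef counter >>= print   -- evaluations under call-by-need
```

Both calls return the same sum; only the evaluation count differs (two versus one), which is the "same results, less work" relationship described above for the effect-free case.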

Non-Strict - In non-strict evaluation, arguments to a function are not evaluated unless they are actually used in the evaluation of the function body.
Imperative Example: short-circuiting
Note: _|_ appears to be a way to test if a function is non-strict
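Short-circuiting can be written as an ordinary non-strict function, and passing bottom is indeed one way to check it. A minimal sketch, where `myOr` is a hypothetical hand-written version of the Prelude's `(||)`:

```haskell
-- Non-strict in its second argument: the first equation never inspects it.
myOr :: Bool -> Bool -> Bool
myOr True  _ = True
myOr False b = b

main :: IO ()
main = do
  print (myOr True undefined)  -- the second argument is never forced
  print (True || undefined)    -- the Prelude's (||) behaves the same way
```

Both lines print `True`. By contrast, `myOr False undefined` would hit bottom, so the function is strict in its first argument and non-strict in its second.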

So a function can be non-strict but not lazy. A function that is lazy is always non-strict. Call-By-Need is partly defined by Call-By-Name, which is partly defined by Non-Strict.

An Excerpt from "Lazy Imperative Languages"

2.1. NON-STRICT SEMANTICS VS. LAZY EVALUATION

We must first clarify the distinction between "non-strict semantics" and "lazy evaluation". Non-strict semantics are those which specify that an expression is not evaluated until it is needed by a primitive operation. There may be various types of non-strict semantics. For instance, non-strict procedure calls do not evaluate the arguments until their values are required. Data constructors may have non-strict semantics, in which compound data are assembled out of unevaluated pieces.

Lazy evaluation, also called delayed evaluation, is the technique normally used to implement non-strict semantics. In section 4, the two methods commonly used to implement lazy evaluation are very briefly summarized.

CALL BY VALUE, CALL BY LAZY, AND CALL BY NAME

"Call by value" is the general name used for procedure calls with strict semantics. In call by value languages, each argument to a procedure call is evaluated before the procedure call is made; the value is then passed to the procedure or enclosing expression. Another name for call by value is "eager" evaluation. Call by value is also known as "applicative order" evaluation, because all arguments are evaluated before the function is applied to them.

"Call by lazy" (using William Clinger's terminology in [8]) is the name given to procedure calls which use non-strict semantics. In languages with call by lazy procedure calls, the arguments are not evaluated before being substituted into the procedure body. Call by lazy evaluation is also known as "normal order" evaluation, because of the order (outermost to innermost, left to right) of evaluation of an expression.

"Call by name" is a particular implementation of call by lazy, used in Algol-60 [18]. The designers of Algol-60 intended that call-by-name parameters be physically substituted into the procedure body, enclosed by parentheses and with suitable name changes to avoid conflicts, before the body was evaluated.

CALL BY LAZY VS. CALL BY NEED

Call by need is an extension of call by lazy, prompted by the observation that a lazy evaluation could be optimized by remembering the value of a given delayed expression, once forced, so that the value need not be recalculated if it is needed again. Call by need evaluation, therefore, extends call by lazy methods by using memoization to avoid the need for repeated evaluation. Friedman and Wise were among the earliest advocates of call by need evaluation, proposing "suicidal suspensions" which self-destructed when they were first evaluated, replacing themselves with their values.

0
votes

The way I understand it, "non-strict" means trying to reduce workload by reaching completion with less work.

Whereas "lazy evaluation" and similar try to reduce overall workload by avoiding full completion (hopefully forever).

from your examples...

f1() || f2()

...short circuiting from this expression would not possibly result in triggering 'future work', and there's neither a speculative/amortized factor to the reasoning, nor any computational complexity debt accruing.

Whereas in the C# example, "lazy" saves a function call in the overall view but, in exchange, comes with the sorts of difficulty mentioned above (at least from the point of call until possible full completion... in this code that's a negligible-distance pathway, but imagine those functions had some high-contention locks to put up with).

int f1() { return 1;}
int f2() { return 2;}

int lazy(int (*cb1)(), int (*cb2)() , int x) {
    if (x == 0)
        return cb1();
    else
        return cb2();
}

int eager(int e1, int e2, int x) {
    if (x == 0)
         return e1;
    else
         return e2;
}

lazy(f1, f2, x);
eager(f1(), f2(), x);

-3
votes

If we're talking general computer science jargon, then "lazy" and "non-strict" are generally synonyms -- they stand for the same overall idea, which expresses itself in different ways for different situations.

However, in a given particular specialized context, they may have acquired differing technical meanings. I don't think you can say anything accurate and universal about what the difference between "lazy" and "non-strict" is likely to be in the situation where there is a difference to make.