Advantages of strict fields in data types

Question

This may be a bit fuzzy, but I've been wondering about it for a while. To my knowledge, ! makes sure that an argument to a data constructor is evaluated before the value is constructed:

data Foo = Bar !Int !Float

I have often thought that laziness is a great thing. Now, when I go through sources, I see strict fields more often than the !-less variant.

What is the advantage of this and why shouldn't I leave it lazy as it is?

ehird · Accepted Answer · 2011-12-20T14:30:54

Unless you're storing a large computation in the Int and Float fields, significant overhead can come from lots of trivial computations accumulating in thunks. For instance, if you repeatedly add 1 to a lazy Float field in a data type, it will use more and more memory until you actually force the field and the sum is finally calculated.
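The thunk build-up described above can be sketched roughly as follows. The types LazyCounter and StrictCounter and the helper functions are hypothetical names made up for this illustration, and whether the lazy version really accumulates thunks in practice depends on optimisation settings, so treat it as a sketch rather than a benchmark.

```haskell
import Data.List (foldl')

-- Hypothetical types for illustration; neither appears in the question.
data LazyCounter   = LazyCounter Float     -- lazy field: may hold a thunk
data StrictCounter = StrictCounter !Float  -- strict field: always evaluated

-- Each call wraps the previous field in another unevaluated (x + 1).
bumpLazy :: LazyCounter -> LazyCounter
bumpLazy (LazyCounter x) = LazyCounter (x + 1)

-- The strict field forces (x + 1) as soon as the constructor is evaluated.
bumpStrict :: StrictCounter -> StrictCounter
bumpStrict (StrictCounter x) = StrictCounter (x + 1)

-- foldl' forces only the constructor at each step, not a lazy field,
-- so the lazy version can still carry a chain of n pending additions.
lazyTotal :: Int -> Float
lazyTotal n = case foldl' (\c _ -> bumpLazy c) (LazyCounter 0) [1 .. n] of
  LazyCounter x -> x

-- Here every intermediate field is already a plain number.
strictTotal :: Int -> Float
strictTotal n = case foldl' (\c _ -> bumpStrict c) (StrictCounter 0) [1 .. n] of
  StrictCounter x -> x
```

Both functions return the same number; the difference is only in how much memory is held before the final field is demanded.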

Often, you do want to store an expensive computation in a field. But if you know ahead of time that you won't be doing anything like that, you can mark the field strict and avoid having to add seq manually everywhere to get the efficiency you want.
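For comparison, this is roughly what the manual seq approach looks like when the field is left lazy. Counter, bumpSeq and value are hypothetical names for this sketch; the point is that every function that updates the field has to remember to do this, whereas a strict field does it once in the type declaration.

```haskell
-- Hypothetical type with a lazy field, for illustration only.
data Counter = Counter Float

-- Force the new value before storing it, so no thunk is kept in the field.
bumpSeq :: Counter -> Counter
bumpSeq (Counter x) =
  let y = x + 1
  in y `seq` Counter y

-- Read the field back out.
value :: Counter -> Float
value (Counter x) = x
```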

As an additional bonus, when given the flag -funbox-strict-fields, GHC will unpack strict fields¹ of data types directly into the data type itself. This is possible because it knows they will always be evaluated, so no thunk ever has to be allocated for them. In this case, a Bar value would contain the machine words comprising the Int and Float directly inside the Bar value in memory, rather than two pointers to separate heap objects that contain the data.
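GHC also has an UNPACK pragma that requests the same unpacking for individual fields rather than for a whole module via the flag. A minimal sketch using the Bar constructor from the question (sumBar is a hypothetical helper added only so there is something to call):

```haskell
-- Same shape as the type in the question, with a per-field unpacking request.
-- The pragma only makes sense on strict fields, hence the bangs.
data Foo = Bar {-# UNPACK #-} !Int {-# UNPACK #-} !Float

-- Hypothetical helper: reads both fields back out of the constructor.
sumBar :: Foo -> Float
sumBar (Bar i f) = fromIntegral i + f
```

The pragma changes only the memory layout, not the meaning of the program, so code that pattern matches on Bar is written exactly as before.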

Laziness is a very useful thing, but some of the time, it just gets in the way and impedes computation, especially for small fields that are always looked at (and thus forced), or that are modified often but never with very expensive computations. Strict fields help overcome these issues without having to modify all uses of the data type.

Whether it's more common than lazy fields or not depends on the type of code you're reading; you aren't likely to see any functional tree structures use strict fields extensively, for instance, because they benefit greatly from laziness.

Let's say you have an AST with a constructor for infix operations:

data Exp = Infix Op Exp Exp
         | ...

data Op = Add | Subtract | Multiply | Divide

You wouldn't want to make the Exp fields strict, as applying a policy like that would mean that the entire AST is evaluated whenever you look at the top-level node, which is clearly not what you want if you hope to benefit from laziness. However, the Op field is never going to contain an expensive computation that you want to defer until later, and the overhead of a thunk per infix operator might get expensive if you have really deeply nested parse trees. So for the Infix constructor, you'd want to make the Op field strict, but leave the two Exp fields lazy.
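Here is a small sketch of that mixed declaration and of how the difference shows up. The Lit constructor is a hypothetical stand-in for the constructors elided above, and forcesOk, lazyChild and strictOp are names invented for this example.

```haskell
import Control.Exception (SomeException, evaluate, try)

data Op = Add | Subtract | Multiply | Divide

-- Strict Op field, lazy Exp fields.
-- Lit is a hypothetical stand-in for the elided constructors.
data Exp = Infix !Op Exp Exp
         | Lit Int

-- True if forcing the expression to weak head normal form succeeds.
forcesOk :: Exp -> IO Bool
forcesOk e = do
  r <- try (evaluate e) :: IO (Either SomeException Exp)
  pure (either (const False) (const True) r)

-- The undefined subexpression sits in a lazy field and is never touched.
lazyChild :: Exp
lazyChild = Infix Add (Lit 1) undefined

-- The undefined operator sits in the strict field and is forced with the node.
strictOp :: Exp
strictOp = Infix undefined (Lit 1) (Lit 2)
```

In GHCi, forcesOk lazyChild gives True, while forcesOk strictOp gives False, because building the node forces the operator but not the subtrees.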

¹ Only single-constructor types can be unpacked.
