27
votes

In Haskell, the term spine strictness is often mentioned in relation to lazy evaluation. Though I have a vague understanding of that it means, it would be nice to have a more concrete explanation about:

  • What is the spine of a data structure
  • What does spine strictness mean?
  • What are the benefits when comparing spine strict data structures with lazy ones?
1

1 Answers

33
votes

Here's an example

> length (undefined : 3 : 4 : undefined : [])
4
> length (2 : 3 : 4 : 5 : undefined)
<<loop>>

The first list contains bottoms as elements, but the "shape" of the list is fully defined. Roughly speaking, every list cell has a clearly defined "pointer" to its next element. This "shape" is called the spine.

The second list, by comparison, has completely defined elements, yet its spine is not defined. This is because it does not end with the empty list [], but with a non-terminating expression undefined. In this case the spine is not defined.

The function length cares about the spine, not the elements. So it is able to work in the first case (thanks to laziness), but not the second. We say that length is strict in the spine, but not in the elements of the list.

Similarly, in tree data structures, the spine is the "shape" of the tree. Some functions such as tree height can be written without inspecting elements, but only the spine. Such functions are strict in the spine.

While some functions have to be spine-strict (e.g. length), others can be written both in a spine-strict or spine-lazy fashion. For instance map on lists is spine-lazy: it will return the first element of the output before accessing all the spine of its input. A stricter variant can be obtained by

map' :: (a->b) -> [a] -> [b]
map' _ []     = []
map' f (x:xs) = (f x :) $! map' f xs

Whether this is beneficial depends on the context. Consider

-- apply n times function f
iter n f = foldr (.) id $ replicate n f

list1 = iter 1000 (map  succ) [1..10]
list2 = iter 1000 (map' succ) [1..10]

If I demand head list1 I will force the application of the 1000 maps only at the first element of the list. This means that after that there will be 1000 allocated thunks in memory holding up space.

Instead, head list2 will force the application of the 1000 maps on the whole list. So, all the 1000 thunks can be immediately garbage collected, reclaiming memory.