8
votes

I'm trying to learn Haskell and I stumbled upon the following:

myAdd (x:xs) = x + myAdd xs
myAdd null = 0

f = let n = 10000000 in myAdd [1 .. n]

main = do
 putStrLn (show f)

When compiling with GHC, this yields a stack overflow. As a C/C++ programmer, I would have expected the compiler to do tail call optimization.

I don't like that I would have to "help" the compiler in simple cases like these, but what options are there? I think it is reasonable to require that the calculation given above be done without using O(n) memory, and without deferring to specialized functions.

If I cannot state my problem naturally (even on a toy problem such as this), and expect reasonable performance in terms of time & space, much of the appeal of Haskell would be lost.

Just for the record: That function is not tail-recursive. – sepp2k
You'd have to actually make a tail-recursive function to expect optimization – try adding an accumulator argument to your myAdd to make an easily understandable version. – Sarah
"As a C/C++ programmer" – sir, to learn Haskell you first must unlearn some things. If you expect Haskell to behave like C, you are in for a world of hurt. The way function calls work is completely different, for one thing. If you instead try to learn Haskell from scratch, I think you'll find it quite intriguing and intuitive. – Dan Burton
Legitimate question; I'm not sure why some people express their questions as criticisms of the language they are using. Is there nothing to be gained from learning a language, faults and all? – luqui

4 Answers

20
votes

Firstly, make sure you're compiling with -O2. It makes a lot of performance problems just go away :)

The first problem I can see is that null is just a variable name there. You want []. It's equivalent here because the only options are x:xs and [], but it won't always be.
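
For reference, here is the question's definition with only the pattern fixed (it still overflows the stack, for the reasons below):

myAdd :: Num a => [a] -> a
myAdd (x:xs) = x + myAdd xs
myAdd []     = 0   -- [] matches only the empty list; 'null' matched anything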

The issue here is simple: when you call myAdd [1,2,3,4], it looks like this:

1 + (2 + (3 + (4 + 0)))

without ever reducing any of these additions to a number, because of Haskell's non-strict semantics. The solution is simple:

myAdd = myAdd' 0
  where myAdd' !total [] = total
        myAdd' !total (x:xs) = myAdd' (total + x) xs

(You'll need {-# LANGUAGE BangPatterns #-} at the top of your source file to compile this.)

This accumulates the addition in another parameter, and is actually tail recursive (yours isn't; + is in tail position rather than myAdd). But in fact, it's not quite tail recursion we care about in Haskell; that distinction is mainly relevant in strict languages. The secret here is the bang pattern on total: it forces it to be evaluated every time myAdd' is called, so no unevaluated additions build up, and it runs in constant space. In this case, GHC can actually figure this out with -O2 thanks to its strictness analysis, but I think it's usually best to be explicit about what you want strict and what you don't.
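
If you'd rather not enable an extension, the bang pattern desugars to seq; a sketch of the same function written that way:

myAddSeq :: Num a => [a] -> a
myAddSeq = go 0
  where go total []     = total
        go total (x:xs) = total `seq` go (total + x) xs  -- force the accumulator before recursing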

Note that if addition were lazy, your myAdd definition would work fine; the problem is that you're doing a lazy traversal of the list with a strict operation, which ends up causing the stack overflow. This mostly comes up with arithmetic, which is strict for the standard numeric types (Int, Integer, Float, Double, and so on).
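
To see the contrast, here is a hypothetical example with an operator that is lazy in its second argument; the very same recursion pattern then runs in constant space:

myAll :: [Bool] -> Bool
myAll []     = True
myAll (x:xs) = x && myAll xs

-- (&&) either short-circuits on False or returns its second argument,
-- so no chain of pending operations builds up, even on a very long list.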

That accumulator-and-bang pattern is quite ugly, and it would be a pain to write something like it every time we want a strict fold. Thankfully, Haskell has an abstraction ready for this!

myAdd = foldl' (+) 0

(You'll need to add import Data.List to compile this.)

foldl' (+) 0 [a, b, c, d] is just like (((0 + a) + b) + c) + d, except that at each application of (+) (which is how we refer to the binary operator + as a function value), the value is forced to be evaluated. The resulting code is cleaner, faster, and easier to read: once you know how the list folds work, you can understand any definition written in terms of them more easily than a recursive definition.
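
Once you know the folds, many recursive definitions collapse into one-liners; a couple of hypothetical examples in the same style:

import Data.List (foldl')

myProduct :: Num a => [a] -> a
myProduct = foldl' (*) 1             -- strict left fold, just like myAdd

myLength :: [a] -> Int
myLength = foldl' (\n _ -> n + 1) 0  -- counts elements in constant space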

Basically, the problem here is not that the compiler can't figure out how to make your program efficient — it's that making it as efficient as you like could change its semantics, which an optimisation should never do. Haskell's non-strict semantics certainly pose a learning curve to programmers in more "traditional" languages like C, but it gets easier over time, and once you see the power and abstraction that Haskell's non-strictness offers, you'll never want to go back :)

9
votes

Expanding the example ehird hinted at in the comments:

data Peano = Z | S Peano
  deriving (Eq, Show)

instance Ord Peano where
    compare (S a) (S b) = compare a b  -- needs only one constructor from each side per step
    compare Z Z = EQ
    compare Z _ = LT
    compare _ _ = GT

instance Num Peano where
    Z + n = n
    (S a) + n = S (a + n)  -- lazy: produces an S before looking at the rest
    -- omit others
    fromInteger 0 = Z
    fromInteger n
        | n < 0 = error "Peano: fromInteger requires non-negative argument"
        | otherwise = S (fromInteger (n-1))

instance Enum Peano where
    succ = S
    pred (S a) = a
    pred _ = error "Peano: no predecessor"
    toEnum n
        | n < 0 = error "toEnum: invalid argument"
        | otherwise = fromInteger (toInteger n)
    fromEnum Z = 0
    fromEnum (S a) = 1 + fromEnum a
    enumFrom = iterate S
    enumFromTo a b = takeWhile (<= b) $ enumFrom a
    -- omit others

infinity :: Peano
infinity = S infinity

result :: Bool
result = 3 < myAdd [1 .. infinity]  -- myAdd as defined in the question

result is True by the definition of myAdd, but if the compiler transformed it into a tail-recursive loop, it wouldn't terminate. So that transformation changes not only efficiency but also semantics, hence a compiler must not perform it.
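
For contrast, a strict, tail-recursive version (a hypothetical myAddStrict, sketched below) never terminates on this input, because it can only return after consuming the entire, infinite list:

myAddStrict :: Num a => [a] -> a
myAddStrict = go 0
  where go acc []     = acc             -- never reached for an infinite list
        go acc (x:xs) = go (acc + x) xs

-- 3 < myAddStrict [1 .. infinity]   -- diverges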

7
votes

A somewhat funny example regarding "The issue is why the compiler is unable to optimize something that appears to be rather trivial to optimize."

Let's say I'm coming from Haskell to C++. In Haskell, I used to write foldr, because foldr is usually more efficient than foldl thanks to laziness and list fusion.

So I try to write a foldr for a singly-linked list in C and complain about how grossly inefficient it is:

typedef struct node {
    int value;
    struct node *next;
} node;

int foldr(int (*f)(int, int), int base, node *list)
{
    /* recurse to the end of the list, then combine on the way back up */
    return list == NULL
        ? base
        : f(list->value, foldr(f, base, list->next));
}

It is inefficient not because the C compiler is an unrealistic toy tool developed by ivory-tower theorists for their own satisfaction, but because the code is grossly non-idiomatic for C.

It is not that you cannot write an efficient foldr in C: you just need a doubly-linked list. Similarly, in Haskell you can write an efficient foldl; you just need strictness annotations. The standard library provides both foldl (without annotations) and foldl' (with annotations).
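
In Haskell terms, that contrast looks like this (both folds are standard; the names here are mine):

import Data.List (foldl')

lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0  -- builds a chain of thunks; may overflow the stack
strictSum = foldl' (+) 0  -- forces the accumulator; runs in constant space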

The idea of left folding a list in Haskell is the same kind of perversion as the desire to iterate a singly-linked list backwards using recursion in C. The compiler is there to help normal people, not perverts.

Just as your C++ projects probably contain no code that iterates a singly-linked list backwards, my HNC project contains only one foldl, which I wrote incorrectly before I had mastered Haskell well enough. You hardly ever need foldl in Haskell.

You must unlearn the idea that forward iteration is the natural and fastest way, and learn that backward iteration is. Forward iteration (left folding) does not do what you intend until you annotate it: it makes three passes (list creation, thunk-chain buildup, and thunk evaluation) instead of two (list creation and list traversal). Note that in an immutable world, lists can only be built efficiently back to front: a : b is O(1), while a ++ [b] is O(n).

And backward iteration doesn't do what you intend either: it makes one pass, not the two you might expect from your C background. It doesn't create a list, traverse it to the end, and then walk back up (two passes); it consumes the list as it is being created, in a single pass. With optimizations on, it is just a loop, and no actual list cells are created. With optimizations off, it is still an O(1)-space operation with a bigger constant overhead, though the explanation is a bit longer.
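
As an illustration of that single pass, foldr with an operator that is lazy in its second argument can even stop early on an infinite list (a hypothetical example):

anyEven :: [Int] -> Bool
anyEven = foldr (\x acc -> even x || acc) False

-- anyEven [1 ..] is True: it stops at 2, and the rest of the
-- infinite list is never even created.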

1
votes

There are two things I will address about your problem: first the performance problem, and second the expressiveness problem, that of having to help the compiler with something that seems trivial to optimize.

The performance

The thing is that your program is in fact not tail-recursive; that is, the result of the recursive call is not returned directly but is fed into +, so the call cannot simply be replaced by a jump. Let's have a look at what happens when we expand myAdd [1..3]:

myAdd [1,2,3]
1 + myAdd [2,3]
1 + (2 + myAdd [3])
1 + (2 + (3 + myAdd []))
1 + (2 + (3 + 0))

As you can see, at every step the recursive call sits inside a pending addition, so it cannot simply replace the whole expression. We cannot reduce any of the additions early either, because each one is waiting for the result of the recursion. That is why this is not tail recursion.

So here is a version that is tail recursive:

myAdd2 = go 0
  where go a [] = a
        go a (x:xs) = go (a + x) xs

Let's have a look at how go 0 [1,2,3] is evaluated:

go 0 [1,2,3]
go (0 + 1) [2,3]
go ((0 + 1) + 2) [3]
go (((0 + 1) + 2) + 3) []
((0 + 1) + 2) + 3

As you can see, at every step we only need to keep track of a single function call, and as long as the first parameter is evaluated strictly, we do not get a space blow-up from a growing chain of unevaluated additions. In fact, if you compile with optimization (-O1 or -O2), GHC is smart enough to figure that out on its own.
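
If you'd rather not depend on GHC's strictness analysis, you can state the strictness yourself; a sketch using BangPatterns, in the same spirit as the other answers (the name myAdd3 is mine):

{-# LANGUAGE BangPatterns #-}

myAdd3 :: Num a => [a] -> a
myAdd3 = go 0
  where go !a []     = a
        go !a (x:xs) = go (a + x) xs  -- !a keeps the running sum evaluated at every step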

Expressiveness

Alright, so it is a bit harder to reason about performance in Haskell, but most of the time you don't have to. The thing is that you can use combinators that ensure efficiency. This particular pattern is captured by foldl (and its strict cousin foldl'), so myAdd can be written as:

myAdd = foldl (+) 0

and if you compile that with optimization, it will not give you a space blowup!