3
votes

I'm reading Learn you a Haskell, specifically the chapter about pattern matching. Here's the code presented in the tutorial for computing the length of a list:

length' :: (Num b) => [a] -> b  
length' [] = 0  
length' (_:xs) = 1 + length' xs

My question is, would inverting the order of recursion (by placing the base case bellow) show any significant performance increase?

length' :: (Num b) => [a] -> b
length' (_:xs) = 1 + length' xs
length' [] = 0
2

2 Answers

7
votes

Nope, this offers no performance increase whatsoever. In both cases the compiler has to evaluate the argument to it's WHNF to check whether it's the empty list or not.

Actually, it is pretty likely that this function will be rewritten during compilation an generate code entirely different from what you wrote (assuming you compiled with optimizations).

Looking at the generated core (compiled without optimizations):

(letrec {
   f1_aLn [Occ=LoopBreaker] :: [Integer] -> Integer
   [LclId, Arity=1, Str=DmdType]
   f1_aLn =
     \ (ds_d1gX :: [Integer]) ->
       case ds_d1gX of _ [Occ=Dead] {
         [] -> fromInteger @ Integer GHC.Num.$fNumInteger 0;
         : ds1_d1h4 xs_at0 ->
           + @ Integer
             GHC.Num.$fNumInteger
             (fromInteger @ Integer GHC.Num.$fNumInteger 1)
             (f1_aLn xs_at0)
       }; } in
f1_aLn (enumFromTo @ Integer GHC.Enum.$fEnumInteger 1 100))

(letrec {
   f2_aBk [Occ=LoopBreaker] :: [Integer] -> Integer
   [LclId, Arity=1, Str=DmdType]
   f2_aBk =
     \ (ds_d1gP :: [Integer]) ->
       case ds_d1gP of _ [Occ=Dead] {
         [] -> fromInteger @ Integer GHC.Num.$fNumInteger 0;
         : ds1_d1gW xs_aBh ->
           + @ Integer
             GHC.Num.$fNumInteger
             (fromInteger @ Integer GHC.Num.$fNumInteger 1)
             (f2_aBk xs_aBh)
       }; } in
f2_aBk (enumFromTo @ Integer GHC.Enum.$fEnumInteger 1 100))

We can see that the compiler generates equivalent statements. Just fyi, this was the code:

main = do
         print $ f1 [1..100]
         print $ f2 [1..100]

f1 [] = 0
f1 (_:xs) = 1 + f1 xs
f2 (_:xs) = 1 + f2 xs
f2 [] = 0

compiled with ghc -ddump-simpl file.hs

3
votes

With only two cases, the order doesn't matter all. With three cases, the order won't affect performance, but it may simplify the code. For example, say you have a function that does roughly the same thing for an empty or singleton list, but recurses on a list that has 2 or more items. You could write your patterns simplest to most complex:

foo [] = []
foo [x] = [x]
foo (x:y:rest) = x+y : foo (y:rest)

or, you could reduce this to two cases by handling the more complex one first:

foo (x:y:rest) = x+y : foo (y:rest)
foo short = short

since foo = id for both of the short cases.