8
votes

I'm working through the online LYAH book (the link will take you directly to the section that my question concerns).

The author defines a binary tree data type, and shows how it can be made an instance of the type Foldable (defined in Data.Foldable) by implementing the foldMap function:

import Data.Monoid
import qualified Data.Foldable as F

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)

instance F.Foldable Tree where  
  foldMap f Empty = mempty  
  foldMap f (Node x l r) = F.foldMap f l `mappend`  
                           f x           `mappend`  
                           F.foldMap f r 

The type declaration of foldMap is as follows:

F.foldMap :: (Monoid m, F.Foldable t) => (a -> m) -> t a -> m

so it takes a function that takes an instance of type "a" and returns a monoid.

Now as an example, the author creates a Tree instance

    testTree = Node 5  
                 (Node 3  
                    (Node 1 Empty Empty)  
                    (Node 6 Empty Empty)  
                 )  
                 (Node 9  
                    (Node 8 Empty Empty)  
                    (Node 10 Empty Empty)  
                 )  

and performs the following fold (defined for Foldable types):

F.foldl (+) 0 testTree -- the answer is 42 (sum of the Node Integers)

My question is, how does Haskell figure out that addition over the Integer type - querying Haskell for the type of testTree gives Tree [Integer] - can be viewed as a monoid operation (if my terminology is correct)?

(My own attempt at the answer: The author a few paragraphs before this section describes how the Num type can be interpreted as a Monoid type in two different ways; by wrapping them into the Sum and Product type with (+) and (*) as the mappend functions and 0 and 1 as the mempty element, respectively. Is the type of "a" in (Tree a) somehow being inferred as belonging to the Sum type (the way Haskell variously interprets numerical values according to the context) or is it something else entirely? ]

2

2 Answers

13
votes

My question is, how does Haskell figure out that addition over the Integer type - querying Haskell for the type of testTree gives Tree [Integer] - can be viewed as a monoid operation (if my terminology is correct)?

It can't! In fact, there is no Monoid instance for Integer.

Now, don't get me wrong--integers are a monoid under addition. They're also a monoid under multiplication, however, and Haskell has no way to know which to use, hence the newtype wrappers.

But... none of that is what's going on here. Moving on...

(My own attempt at the answer: The author a few paragraphs before this section describes how the Num type can be interpreted as a Monoid type in two different ways; by wrapping them into the Sum and Product type with (+) and (*) as the mappend functions and 0 and 1 as the mempty element, respectively. Is the type of "a" in (Tree a) somehow being inferred as belonging to the Sum type (the way Haskell variously interprets numerical values according to the context) or is it something else entirely? ]

Not a bad guess, but that sort of inference (finding the instance using Sum based on the arguments you gave) is beyond what Haskell can do for you.

There's two key points here--first of all, the Monoid constraint is only used for certain functions, not folds in general. In particular, foldl doesn't actually need a Monoid instance at all, because you provide both the binary operation and initial value for it to use.

The second point is what I suspect you're really after--how does it create a generic foldl that doesn't need a Monoid, when all you defined is foldMap, which does? To answer that, we can simply look at the default implementation of foldl:

foldl :: (a -> b -> a) -> a -> t b -> a
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z

Here, Endo is another newtype wrapper, specifically for functions a -> a giving the Monoid of composition, with id as the identity, while Dual is a wrapper that reverses the direction of a Monoid. So the Monoid it's actually using here is so it can glue uses of (+) together with function composition, then apply the result to the seed value.

5
votes

The monoid isn't actually used here. The last line is using F.foldl which has signature F.Foldable t => (a -> b -> a) -> a -> t b -> a. Basically you're 'manually' using a monoid by supplying (+) and 0.

If you want to use a monoid 'implicitly', you can use F.fold (which has signature (F.Foldable t, Monoid m) -> t m -> m). In this case, if you try it, you'll get:

*Main> F.fold testTree

<interactive>:1:1:
    No instance for (Monoid Integer)
      arising from a use of `F.fold'
    Possible fix: add an instance declaration for (Monoid Integer)
    In the expression: F.fold testTree
    In an equation for `it': it = F.fold testTree
*Main> :t F.foldl

Now, GHCI is complaining that there is no Monoid instance for Integer, as it should. You have to select either Sum or Product by wrapping the Integer up. For this we can use F.foldMap (signature (F.Foldable t, Monoid m) => (a -> m) -> t a -> m):

*Main> F.foldMap Sum testTree
Sum {getSum = 42}