3
votes

I'm working through the exercises in the book "Beginning Haskell." Exercise 4-8 is to make a binary search tree an instance of Functor and define fmap. This is what the tree looks like:

data BinaryTree a = Node a (BinaryTree a) (BinaryTree a) 
                  | Leaf
                    deriving Show

Because it is a search tree, all operations on the tree must maintain the invariant that all values in the left subtree are < the node's value and all values in the right subtree are > the node's value. This means that all values in the tree must be ordinal (Ord a => BinaryTree a).

Two questions:

  1. Since fmap :: (a -> b) -> BinaryTree a -> BinaryTree b, how do I enforce that b is also ordinal? If it didn't have to be a Functor, I could simply do fmapOrd :: (Ord a, Ord b) => (a -> b) -> BinaryTree a -> BinaryTree b, but the Functor typeclass doesn't enforce the Ord contraints.
  2. What does an efficient implementation look like? My first thought was to fold over the tree, and build up a new tree out of the mapped values. Unfortunately, I didn't get this far because of (1).
5
It would be nice if instances of TypeClasses could put stricter constraints on function arguments than the original declaration. It seems like this should be possible....Tim Perry

5 Answers

4
votes

The point of Functors and fmap is that it works for all a and b that can be stored in your data structure, just like Monad has to work for all types a as well. Your Functor instance should look like

instance Functor BinaryTree where
    fmap f Leaf = Leaf
    fmap f (Node a l r) = Node (f a) (fmap f l) (fmap f r)

But if you then want to ensure that mapping over a binary tree keeps it balanced, then you need a function

balanceTree :: Ord a => BinaryTree a -> BinaryTree a

You should be able to implement this function fairly easily with some googling, then you can define a specialized mapping function

binMap :: (Ord a, Ord b) => (a -> b) -> BinaryTree a -> BinaryTree b
binMap f = balanceTree . fmap f

And then you should ensure that you and the users of your library never use fmap (unless necessary) and instead use binMap.

5
votes

If you want to enforce ordering, then your binary tree as it is cannot be made into a functor, because - as you pointed out - the types don't match. However, while the tree can't be a functor over the keys, it can be a functor over the values, provided that there are separate type parameters for each. The standard Data.Map (also implemented as a search tree) works this way.

-- Now the "v" parameter can be mapped over without any care for tree invariants
data Tree k v = Node k v (Tree k v) (Tree k v) | Leaf 

As to the implementation of fmap, your first thought is right. There is also a lazier way though, namely letting GHC derive the instance:

{-# LANGUAGE DeriveFunctor #-}

data Tree k v = Node k v (Tree k v) (Tree k v) | Leaf deriving (Functor)

It pretty much always matches your intents, just remember to let the last type parameter be the one you intend to map over.

4
votes

I'm not going to say I recommend what follows, but for completeness it actually is possible to define such a Functor.

The Functor typeclass demands that you can fmap any function into your Functor. This means, normally, that it's difficult to ensure invariants which demand typeclass instances. However, we can contort this situation quite a bit and actually end up getting away with a Functor instance. In practice, what it means is that we can use the type system to ensure that we we defer our rebalancing to times that are more convenient.

First, we'll introduce a demand on the above type. In particular, we'll give it a Monoid instance which maintains balance. This works out fine since Monoid doesn't require our container be polymorphic.

instance Ord a => Monoid (BalancedTree a) where
  mempty = Leaf
  mappend Leaf Leaf = Leaf
  mappend Leaf b    = b
  mappend b    Leaf = b
  mappend (Node a l1 r1) (Node b l2 r2) = ... -- merge and rebalance here

Now, using this instance we can write functions which correspond almost to a Monad instance for BinaryTree. In particular, we need it in order to combine our new trees as build using bindBin, the almost version of (>>=) on binary search trees.

returnBin :: a -> BinaryTree a
returnBin a = Node a Leaf Leaf

bindBin :: Ord b => BinaryTree a -> (a -> BinaryTree b) -> BinaryTree b
bindBin Leaf _ = Leaf
bindBin (Node a l r) f = bindBin l f <> f a <> bindBin r f

Then we introduce this very strange type (which requires the RankNTypes extension)

newtype FBinaryTree a = 
  FBinaryTree (forall r . Ord r => (a -> BinaryTree r) -> BinaryTree r)

There are many ways of thinking about this, but we'll just note that there's an isomorphism between FBinaryTree a and BinaryTree a witnessed, basically, by returnBin and bindBin.

toF :: BinaryTree a -> FBinaryTree a
toF bt = FBinaryTree (bindBin bt)

fromF :: Ord a => FBinaryTree a -> BinaryTree a
fromF (FBinaryTree k) = k returnBin

and finally, as FBinaryTree inherits some of the properties of the Cont monad or the Yoneda lemma type, we can define a Functor instance for FBinaryTree!

instance Functor FBinaryTree where
  fmap f (FBinaryTree c) = FBinaryTree (\k -> c (k . f))

So that now, all we must do is convert our BinaryTrees into FBinaryTrees, perform our Functor operations there, and then drop back down to BinaryTree as needed. Smooth sailing, right?

Well, almost. It turns out that there's a huge efficiency price we pay for this. In particular, it's easy to have exponential blowups occur when using types like FBinaryTree. We can avoid these by sending FBinaryTree through BinaryTree from time to time by using

optimize :: Ord a => FBinaryTree a -> FBinaryTree a
optimize = toF . fromF

which, as the type shows, requires us to have an Ord instance right there. In fact, the code will use the Ord instance there to perform the needed rebalancing.

0
votes

To answer your first question, you do not need to make the constraint that either type is a member of Ord. Functions like add, search, and remove will only work with members of Ord, but, for fmap, it is not necessary to compare. There is nothing wrong with allowing users to convert trees from one incomparable to another. It is just that he will not be able to call add, remove, or search on the resulting type.

As for your second question, my recommendation would be to use recursion. The function would take a tree of type a and a function, and return a new tree with the function and return a tree with the function applied to the value and fmap applied to its children. Here is a simple implementation:

fmap::(BinaryTree a,BinaryTree b)=>BinaryTree a->(a->b)->BinaryTree b
fmap (Node value left right) fun=Node (fun value) (fmap left) (fmap right)
fmap Leaf _ _=Leaf

I am not sure if my syntax is right, but you get the idea.

0
votes

One more option is to make Ord a constraint part of the type using GADTs:

data BinaryTree a where
  Leaf :: BinaryTree a
  Node :: Ord a => a -> BinaryTree a -> BinaryTree a -> BinaryTree a
    deriving Show

Now when pattern-matching on Node you can use the constraint.

fmap _ Leaf = Leaf
fmap f (Node value left right) = insert (f value) (merge (fmap f left) (fmap f right))
  -- assumes you defined insert and merge functions for search trees