2
votes

I'm trying to learn Haskell, but find it really difficult, and there aren't a lot of online resources. I seem to have some major lack of understanding of how the recursive calls are supposed to look, and would appreciate being pointed in the right direction. I'm trying to take in a tree and return every leaf node with the symbol stored there, as well as the path taken to get there. (So the input (Fork (Leaf x) (Leaf y)) would have the output [(x,[False]) ,(y,[True])] ). My code looks like this:

data htree a = Leaf a | Fork (htree a) (htree a) deriving (Show, Eq)

encode :: htree a -> [(a, [Bool])]
encode (Leaf a) = [(a, ????)]

I understand this isn't much to go off of. I've identified the base case, which is whenever you reach a leaf, you return the symbol stored at the leaf, and the path you've taken to get there. Left is false, right is true. I'm not sure how to put all this information together to get moving on my code. I would appreciate any kind of guidance here.

2
You've chosen a rather nontrivial example to understand how recursion works. Try traversing lists for a start.Fyodor Soikin
I actually know how to traverse lists pretty well, I think, and have written a lot of functions with lists. I get the basics of this kind of thinking, I just can't apply it to anything, which is frustrating.napqueen6078
there aren't a lot of online resources Is there a particular type of resource you find most helpful? A lot of people recommend "Learn you a Haskell". It's free online. Here is the chapter on recursion: learnyouahaskell.com/recursionGarrison

2 Answers

7
votes

Consider the Fork. It has two subtrees, and each of those subtrees has some encoding.

Let's say that the left subtree encoding is:

[(x, pathToX), (y, pathToY)]

And let's say that the right subtree encoding is:

[(a, pathToA), (b, pathToB)]

Now, can you see what the encoding of the whole fork should be? It should be this:

[(a, True : pathToA), (b, True : pathToB), (x, False : pathToX), (y, False : pathToY)]

Do you agree with that? If not, give it a think. Maybe work through some small examples. Until you agree that this is the case.

See what I did there? I prepended a False to each path in the left subtree, and then I prepended True to each path of the right subtree.

Let's write that down in Haskell syntax:

encode (Fork left right) = prependToEach False (encode left) ++ prependToEach True (encode right)

Now you may have noticed that I cheated here: I'm using a function prependToEach that doesn't exist. Well, no matter, let's define it!

prependToEach x list = map (prepend x) list

See? Prepending a thing to each element of the list is just mapping a single-element prepending function over the list.

But of course, I cheated again: there is no such function as prepend yet. So let there be one!

prepend x (a, path) = (a, x : path)

And there you go! Now all that's left is to define the base case: what should the path of a Leaf be? Well, according to the example you gave, each Leaf would have an empty path, reflecting the fact that you don't need to take any turns to get from that leaf to that same leaf:

encode (Leaf a) = [(a, [])]

And now, putting it all together:

encode :: HTree a -> [(a, [Bool])]
encode (Leaf a) = [(a, [])]
encode (Fork left right) = prependToEach False (encode left) ++ prependToEach True (encode right)
    where
    prependToEach x list = map (prepend x) list
    prepend x (a, path) = (a, x : path)

And now that we understand how it was constructed and why, we may shorten it a little by making use of list comprehensions (though I consider this step very much optional):

encode :: HTree a -> [(a, [Bool])]
encode (Leaf a) = [(a, [])]
encode (Fork left right) = [(x, False : p) | (x, p) <- encode left] ++ [(x, True : p) | (x, p) <- encode right]

P.S. note that a type cannot be named htree, because type names in Haskell must be capitalized. You may notice that I renamed it to HTree in my final snippet.

0
votes

Fyodor's answer is a good one, but it's worth being wary of using (++). Often if you're not cafeful it can cause your code to run really slowly for particular inputs, in this case an unbalanced tree.

The reason is that (list of N elements) ++ (list of 1 element) has to construct an entirely new list with N+1 elements. So adding only a few elements at a time this way can be slow.

A way to avoid this is to have your intermediate functions, instead of returning lists, returning functions that when passed a list return a list. That way you can just combine functions (which is fast) and construct the list at the end, which will now be done left to right without recreating elements.

Here's an example encode function using this method:

data HTree a = Leaf a | Fork (HTree a) (HTree a) deriving (Show, Eq)

encode :: HTree a -> [(a, [Bool])]
encode tree = go [] tree [] where
  go :: [Bool] -> HTree a -> [(a, [Bool])] -> [(a, [Bool])]
  go path (Leaf leaf) = ((leaf, path):)
  go path (Fork left right) = go (False:path) left . go (True:path) right

Note you don't really need the type signatures, I've just included them for clarity (and it's probably good practice) but removed it's only 3 lines:

encode tree = go [] tree [] where
  go path (Leaf leaf) = ((leaf, path):)
  go path (Fork left right) = go (False:path) left . go (True:path) right

Note this returns the paths from leaf to root, if you want them root to leaf you can either just reverse them at the end or use my return a function trick again.