5
votes

I've been learning some Haskell, and doing project Euler problems as I go. I'm not really bothered about the answer to the Euler Problem (which I can happily brute force, or do it in some other language), but the method.

I'm stuck on doing Problem 15 in reasonable time. It asks for the number of non-backtracking routes from top-left to the bottom-right of a 20x20 grid. Link here

I'd say it's reasonably obvious that the idea is to memoize (sp?) the function, but I'm not reall sure how to do it. I googled, and came across this on Haskell Cafe that I naively tried to adapt, but ended up with:

memRoute :: (Int,Int) -> Int
memRoute (x,y) = fromJust $ lookup (x,y) $ map (\t -> (t,route t)) [(x,y) | x <- [0..20], y <- [0..20]]

route (0,0) = 0
route (_,0) = 1
route (0,_) = 1
route (x,y) = memRoute (x-1,y) + memRoute (x,y-1)

Which looks bad, and performs horribly (a lot slower than a naive version). The problem is that I don't really understand how Haskell memoization works.

Using my code as an example, would anyone care to explain a) why mine is so slow
b) how it should be done (without using mutables, which was a solution I came across)
c) how the memoization works in this case?

3
I haven't read your program yet, but I wanted to let you know there's a clever O(1) solution. See haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_15Wei Hu
More generally, this is a problem of enumerative combinatorics. The intended solution isn't actually constant time--arithmetic isn't free, especially when factorials are involved--but it is certainly far more efficient than actually iterating through each route.C. A. McCann
That problem is horrible... I still remember it... and the attempts to brute force it. Honestly who can possibly think about Pascal's triangle when they look at that problem :)Jerome

3 Answers

5
votes

Toss in a couple of trace points

memRoute :: (Int,Int) -> Int
memRoute (x,y) = trace ("mem: " ++ show (x,y))
                 fromJust $
                 lookup (x,y) $
                 map (\t -> (t,route t))
                 [(x,y) | x <- [0..20], y <- [0..20]]

route (0,0) = 0
route (_,0) = 1
route (0,_) = 1
route (x,y) = trace ("route: " ++ show (x,y))
              memRoute (x-1,y) + memRoute (x,y-1)

to see that you haven't memoized at all.

ghci> memRoute (2,2)
mem: (2,2)
route: (2,2)
mem: (1,2)
route: (1,2)
mem: (0,2)
mem: (1,1)
route: (1,1)
mem: (0,1)
mem: (1,0)
mem: (2,1)
route: (2,1)
mem: (1,1)
route: (1,1)
mem: (0,1)
mem: (1,0)
mem: (2,0)
6

Reusing subcomputations is dynamic programming.

import Data.Array

routes :: (Int,Int) -> Int
routes = (rte !)
  where rte = array ((1,1), (n,n))
                    [ ((x,y), route (x,y)) | x <- [1 .. n],
                                             y <- [1 .. n] ]
        route (1,1) = 0
        route (_,1) = 1
        route (1,_) = 1
        route (x,y) = rte ! (x-1,y) + rte ! (x,y-1)
        n = 20

Note that the algorithm is incorrect, but at least it's easy to get a bad answer quickly!

10
votes

In my opinion, "memoization" is highly overrated. There is no one-size-fits-all memoization technique (in any programming language) that turns every single-case calculation into a general calculation. You have to understand each problem, and organize things to control the amount of memory you need to use.

In this case, to get the number of paths for an n x m rectangle, you don't need to remember the totals for all smaller rectangles, just for the rectangles that are one step smaller in either direction. So you can build up row by row, forgetting all of the totals for smaller rectangles as you go.

In Haskell, laziness is perfect for this kind of situation. It relieves you of all the work of keeping of track of exactly what to remember and what to forget - just compute an infinite grid of the totals for all possible rectangles, and let Haskell do the rest of the work.

For zero rows, you have only the bottom line. No matter how long it is, there is only a single path to the end, so the numbers of paths are repeat 1.

To compute a row in the grid from the previous row, you start with 1 (only one path straight down, no matter how high you are), then at each step you add together the corresponding entry in the previous row and the previous step in the current row. Altogether, we have:

iterate (scanl (+) 1 . tail) (repeat 1) !! 20 !! 20

That computes the answer instantaneously for me at the GHCi prompt.

0
votes

What happens is that your memoization table is recomputed on every call to memRoute. Not in its entirity (fortunately!) but at least the work done in one call is not shared with the rest. The simplest rewrite I could come up with was:

memRoute = let table = map (\t -> (t,route t)) [(x,y) | x <- [0..20], y <- [0..20]]
           in  \(x,y) -> fromJust $ lookup (x,y) $ table

That stays really close to your expressed intent, but I think there are better ways of doing memoization, by using a Data.Map and letting the actual call pattern determine what values are memoized.