In my quest to understand and harness GHC automatic memoization, I've hit a wall: when pure functions are called with fixed values like fib 42
, they are sometimes fast and sometimes slow when called again. It varies if they're called plainly like fib 42
or implicitly through some math, e.g. (\x -> fib (x - 1)) 43
. The cases have no seeming rhyme or reason, so I'll present them with the intention of asking what the logic is behind the behavior.
Consider a slow Fibonacci implementation, which makes it obvious when the memoization is working:
slow_fib :: Int -> Integer
slow_fib n = if n < 2 then 1 else (slow_fib (n - 1)) + (slow_fib (n - 2))
I tested three basic questions to see if GHC (version 8.2.2) will memoize calls with fixed args:
- Can
slow_fib
access previous top-level calls toslow_fib
? - Are previous results memoized for later non-trivial (e.g. math) top-level expressions?
- Are previous results memoized for later identical top-level expressions?
The answers seem to be:
- No
- No
- Yes [??]
The fact that the last case works is very confusing to me: if I can reprint the result for example, then I should expect to be able to add them. Here's the code that shows this:
main = do
-- 1. all three of these are slow, even though `slow_fib 37` is
-- just the sum of the other two results. Definitely no memoization.
putStrLn $ show $ slow_fib 35
putStrLn $ show $ slow_fib 36
putStrLn $ show $ slow_fib 37
-- 2. also slow, definitely no memoization as well.
putStrLn $ show $ (slow_fib 35) + (slow_fib 36) + (slow_fib 37)
putStrLn $ show $ (slow_fib 35) + 1
-- 3. all three of these are instant. Huh?
putStrLn $ show $ slow_fib 35
putStrLn $ show $ slow_fib 36
putStrLn $ show $ slow_fib 37
Yet stranger, doing math on the results worked when it's embedded in a recursive function: this fibonacci variant that starts at Fib(40):
let fib_plus_40 n = if n <= 0
then slow_fib 40
else (fib_plus_40 (n - 1)) + (fib_plus_40 (n - 2))
Shown by the following:
main = do
-- slow as expected
putStrLn $ show $ fib_plus_40 0
-- instant. Why?!
putStrLn $ show $ fib_plus_40 1
I can't find any reasoning for this in any explanations for GHC memoization, which typically incriminate explicit variables (e.g. here, here, and here). This is why I expected fib_plus_40
to fail to memoize.
-ddump-simpl
, if I recall – luquislow_fib 35
as an expression to share between two calls toputStrLn
; but rather thatputStrLn $ show $ slow_fib 35
itself has been pulled out for sharing. This explains whyputStrLn $ show $ slow_fib35 + 1
is still slow: it is a separate (perhaps memoized!) action that still must do computation. Indeed, you can verify that if you put a secondputStrLn $ show $ slow_fib35 + 1
, it too executes instantly, so this is not a "trivial vs. nontrivial" distinction as you propose. – Daniel Wagner