Memoizing function in Haskell

The Wiki Version 1
The Wiki Version 2
Memoize in All Generality
And the Answer Is
Conclusion

Henry Laxen April 3, 2008

I have struggled many a time with memoizing something or other in Haskell, and I have to admit that the brief section found on the haskell.org wiki site was usefull, but certainly not transparent. Here is their *easy* way to memoize the fibonacci sequence

The Wiki Version 1 [top]

> slow_fib :: Int -> Integer
> slow_fib 0 = 0
> slow_fib 1 = 1
> slow_fib n = slow_fib (n-2) + slow_fib (n-1)
> 
> memoized_fib :: Int -> Integer
> memoized_fib =
>    let fib 0 = 0
>        fib 1 = 1
>        fib n = memoized_fib (n-2) + memoized_fib (n-1)
>    in  (map fib [0 ..] !!)

This works great, and it is very simple. The idea is to look up the previous value of the fibonacci function in a list, rather than recompute it. The problem is that you have to know in advance what the arguments are that the fibonacci function will be called with. In the case of fibonacci that is pretty easy, but what if you don't know the arguments in advance? This simple technique isn't going to work.

The wiki goes on to mention something about fixed points and recursion, which I have to admit I didn't understand.

The Wiki Version 2 [top]

> fib :: (Int -> Integer) -> Int -> Integer
> fib f 0 = 1
> fib f 1 = 1
> fib f n = f (n-1) + f (n-2)
>  
> fibonacci :: Int -> Integer
> fibonacci = memoFix fib
> 
> memoFix :: ((a -> b) -> (a -> b)) -> a -> b
> memoFix f =
>    let mf = memoize (f mf) in mf

I suppose the `memoize` function is left as an exercise for the reader, but this reader didn't quite get it. Also the role of `f` as an argument in fib didn't make sense to me.

It wasn't until I found this post by Dick Thierbach discussing the memoization of the Ackerman function that things finally began to make sense. I hope the explanation that follows will spare you, gentle reader, the hours of frustration I endured to finally get to the bottom of this memoization thing in haskell.

Memoize in All Generality [top]

> import Debug.Trace
> import Data.Map as Map
> import Control.Monad.State.Lazy as State
> 
> 
> tfib m n = trace ("fibM called with " ++ show m ++ " returning " ++ show n) n
> 
> fibM :: Monad m =>  (Integer -> m Integer) -> Integer -> m Integer
> fibM f' 0 = return $ tfib 0 1
> fibM f' 1 = return $ tfib 1 1
> fibM f' n = do
>   a <- f' (n-1) 
>   b <- f' (n-2)
>   return $ tfib n (a+b)
> 
> type StateMap a b = State (Map a b) b
> 
> memoizeM :: (Show a, Show b, Ord a) => 
>             ((a -> StateMap a b) -> (a -> StateMap a b)) -> (a -> b)
> memoizeM t x = evalState (f x) Map.empty where
>   g x = do
>     y <- t f x  
>     m <- get
>     put $ Map.insert x y m
>     newM <- get
>     return $ trace ("Map now contains\n" ++ Map.showTree newM) y
>   f x = get >>= \m -> maybe (g x) return (Map.lookup x m)
> 
> 
> fib n = memoizeM fibM n
>

Let's look at this, a little at a time. The interesting function is `memoizeM`, so we'll start there. `evalState` lives in the Control.Mondad.State.Lazy library, and make it easy to keep an internal state within its "sphere." The internal state we are keeping, in this instance, is a map of all of the previous calls to the `fibM` function, as well as the results it returns. Let's take a closer look at how this works. We start off with: `evalState (f x) Map.empty` where f is defined as: `f x = get >>= \m -> maybe (g x) return (Map.lookup x m)`. `f` gets the internal state, which is the current map, (initially empty of course.) It then tries to lookup `x` in this map. If `Map.lookup` succeeds, returning a `Just n` value, we are done, and strip the `Just` off with the maybe function. Life is good. If `Map.lookup` returns `Nothing`, the maybe function returns is *default value*, which in this case is `(g x)`. So next we must explore what `g` is doing.

`g` is actually very simple, though there is one subtle feature that I was too dense to appreciate without some mental gymnastics. The subtlety appears right off the bat in line 46, namely `y <- t f x`. `t` is the function that we were originally called with, in this case it is `fibM`. `t` is called with the function we just looked at, namely `f` as its first argument, and x, the value we are looking for, ie. the *x'th* Fibonacci number, as the second argument. So, lets look at `fibM` and bathe in the glory of enlightenment. The first argument to `fibM` is a function, which in this case is our old friend `f`. But `f` role in life is to lookup a previously computed result, and return it. Since `f` is being called with `(n-1)` and `(n-2)`, this result will exist and be returned immediately. So now we finally understand why `fibM`'s first argument is a function, so we can *thread* the memoization function through it, grab the arguments and the result, and **memoize it!**. The rest of `g` is just gravy now. In line 47 we retrieve our map, an in line 48 we insert the argument and the result into the map. The beauty of this approach is that we do not have to know the arguments in advance, as we did in the first version of the memoized fibonacci function. Furthermore, this appoach will work with any function we want to memoize. The scheme is to rewrite the function so that it takes exactly two arguments. The first is a fucntion which will do the memoization, and the second will be a tuple of all of the required arguments for the function. That way, the tuple will be the key in the map, and whatever the function returns will be the value. We **do not need to know the arguments in advance.** The `memoizeM` function takes care of keeping track of those for us.

And the Answer Is [top]

Here is the output of the above using the wonderful `Debug.Trace` module to allow us to watch the whole thing in action. Notice that the map is built up piece by piece, as the fibonacci function is called, not all at once for all possible arguments. This may not seem like a big deal for fibonacci, but for Ackerman, arguments get big, very big, and using a list or an array to hold the lookup table just isn't feasible. Life is beautiful, once again.


*Main        fib 5  
Loading package array-0.1.0.0 ... linking ... done.  
Loading package containers-0.1.0.1 ... linking ... done.  
Loading package mtl-1.1.0.0 ... linking ... done.  
fibM called with 1 returning 1  
fibM called with 0 returning 1  
Map now contains  
1:=1  
  
Map now contains  
1:=1  
+--0:=1  
+--|  
  
fibM called with 2 returning 2  
Map now contains  
1:=1  
+--0:=1  
+--2:=2  
  
fibM called with 3 returning 3  
Map now contains  
1:=1  
+--0:=1  
+--2:=2  
   +--|  
   +--3:=3  p
  
fibM called with 4 returning 5  
Map now contains  
1:=1  
+--0:=1  
+--3:=3  
   +--2:=2  
   +--4:=5  
  
fibM called with 5 returning 8  
Map now contains  
1:=1  
+--0:=1  
+--3:=3  
   +--2:=2  
   +--4:=5  
      +--|  
      +--5:=8  
  
8  
*Main

Conclusion [top]

Well, gentle reader, there you have it. At this point I hope it is obvious how to memoize just about any function you can dream up with Haskell. If there is something in this article that you feel needs further explanation, please drop me a line at *nadine.and.henry@pobox.com* and I'll do my best to clarify things. I will leave you with a thought that should be emblazoned deep within the brains of just about every programmer I know. **Better is the enemy of good.** The rest is silence.

This file is also available as an lhs file if you want to play with it.

Quote of the day:
Physics is like sex; it may give you some practical results, but that's not why we do it.
Richard Feynman

Sitemap
Go up to Haskell Go up to Home Page of Nadine Loves Henry
Continue with Solving Project Euler Problem 191