Sunday, June 5, 2011

Memoization Is Your Friend

 

I just came across some gems regarding memoization here, here, here and here. As a side note, I’d like to go on record and say that Wes Dyer’s derivation of fixed-point combinators is second to none! Even I could understand what he was saying! :) Another great discussion of fixed-point combinators is here. For more theoretical underpinnings, this paper discusses memoization of fixed-point functions.

We’re being plagued by performance issues in the project I’m currently working on, and it would be really nice to have a quick way to to automatically memoize some functions to gain the benefits of laziness and good book-keeping.

So, here is my implementation of a Memoize() function, which uses the fixed-point combinator concepts explained in Wes’ blog.

public static Func<TArgument, TResult> MemoizeFix<TArgument, TResult>(Func<Func<TArgument, TResult>, Func<TArgument, TResult>> func)
{
// ReSharper disable AccessToModifiedClosure
Func<TArgument, TResult> funcMemoized = null;

funcMemoized = func(_ => funcMemoized(_));
funcMemoized = funcMemoized.Memoize();

return funcMemoized;
// ReSharper restore AccessToModifiedClosure
}

private static Func<TArgument, TResult> Memoize<TArgument, TResult>(this Func<TArgument, TResult> func)
{
var _map = new ConcurrentDictionary<TArgument, TResult>();
return _ => _map.GetOrAdd(_, func);
}

In MemoizeFix(), we create the funcMemoized variable as null so that we get around the definite assignment problem. Because we defer the assignment to after the application of the memoization, we actually want to have access to the modified closure.

The other nice thing with .NET 4.0 is the ability to use the ConcurrentDictionary as a thread-safe result cache, with the added bonus of a very terse invocation to GetOrAdd().


We can use the memoization routines this way:

private static readonly Func<int, long> FastFibonacci = Functional.MemoizeFix<int, long>(
fib => n => (n <= 0 ? 0 : (n <= 2 ? 1 : fib(n - 1) + fib(n - 2)))
);

private static readonly Func<int, long> FastLucas = Functional.MemoizeFix<int, long>(
luc => (n => FastFibonacci(n) + FastFibonacci(n + 2))
);

private static readonly Func<int, double> FastFactorial = Functional.MemoizeFix<int, double>(
fact => n => n < 2 ? 1.0d : n*fact(n - 1)
);

For functions with more than one argument, I’d create a Tuple<> with all the arguments as the cache key as the simplest extension. Thusly:

private static Func<TArg1, TArg2, TResult> Memoize<TArg1, TArg2, TResult>(this Func<TArg1, TArg2, TResult> func)
{
var _map = new ConcurrentDictionary<Tuple<TArg1, TArg2>, TResult>();
return (_a1, _a2) => _map.GetOrAdd(new Tuple<TArg1, TArg2>(_a1, _a2), _ => func(_.Item1, _.Item2));
}

public static Func<TArg1, TArg2, TResult> MemoizeFix<TArg1, TArg2, TResult>(Func<Func<TArg1, TArg2, TResult>, Func<TArg1, TArg2, TResult>> func)
{
// ReSharper disable AccessToModifiedClosure
Func<TArg1, TArg2, TResult> funcMemoized = null;

funcMemoized = func((_a1, _a2) => funcMemoized(_a1, _a2));
funcMemoized = funcMemoized.Memoize();

return funcMemoized;
// ReSharper restore AccessToModifiedClosure
}


Pretty slick :)