Monday, June 27, 2011

A (Fast) Object Copier

 

A colleague of mine asked me today to help with the onerous task of “twisting-off” the data in an entity into a DTO. Something about serialization, he said…

He said that he didn’t mind if the copying was slow – just that he didn’t want to put in hundreds of lines of code each saying

destination.Id = source.Id;
destination.Name = source.Name;
destination.DateAndTimeOfBirth = source.DateAndTimeOfBirth;
destination.BirthWeightInKilograms = source.BirthWeightInKilograms;
destination.IsMultipleBirth = source.IsMultipleBirth;
destination.BirthOrder = source.BirthOrder;
...


So I sat down and knocked off the traditional reflection-based code:

internal static T Map<T>(T source, T destination) where T : class
{
    foreach (var pi in typeof (T).GetProperties())
    {
        pi.SetValue(destination,
                    pi.GetValue(source, BindingFlags.GetProperty | BindingFlags.Instance, null, null, null),
                    BindingFlags.SetProperty | BindingFlags.Instance,
                    null,
                    null,
                    null);
    }
    return destination;
}


(I’ve removed the error checking just to keep things simple for this post)



Neat and all, and does the job.
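
For illustration, here is a minimal sketch of how a reflection-based mapper like this might be called. The Baby class is hypothetical (modelled on the property names in the listing above), SlowCopier is just a name picked for the example, and the simpler two-argument GetValue/SetValue overloads are used. It also skips read-only and indexed properties, which is the kind of error checking left out of the listing.

```csharp
using System;
using System.Reflection;

// Hypothetical entity, modelled on the property names used earlier in the post.
public class Baby
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime DateAndTimeOfBirth { get; set; }
    public bool IsMultipleBirth { get; set; }
}

public static class SlowCopier
{
    public static T Map<T>(T source, T destination) where T : class
    {
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");

        foreach (PropertyInfo pi in typeof(T).GetProperties())
        {
            // Skip indexers and anything that cannot be both read and written.
            if (!pi.CanRead || !pi.CanWrite || pi.GetIndexParameters().Length != 0)
                continue;

            pi.SetValue(destination, pi.GetValue(source, null), null);
        }
        return destination;
    }
}

public static class MapDemo
{
    public static void Main()
    {
        var source = new Baby
        {
            Id = 7,
            Name = "Sam",
            DateAndTimeOfBirth = new DateTime(2011, 6, 27),
            IsMultipleBirth = true
        };

        var copy = SlowCopier.Map(source, new Baby());
        Console.WriteLine("{0} {1} {2}", copy.Id, copy.Name, copy.IsMultipleBirth); // 7 Sam True
    }
}
```

Every call walks the property list and goes through reflection for each get and set, which is where the time goes.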



But I couldn’t help wondering if I could do better, especially since I knew that the performance would potentially be too terrible to bear…



Express(ion) yourself!



It would be nice to use the tricks we learned with expressions: build, once for a given type, a compiled assignment expression for each of its properties, and then simply apply those expressions one by one to the source and destination. That way the cost of reflection is borne only once.



Fair enough. I got home, and after dinner, knocked off this piece of code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class FastCopier
{
    public static T Copy<T>(T source, T destination) where T : class
    {
        return FastCopierInternal<T>.Copy(source, destination);
    }

    private static class FastCopierInternal<T>
    {
        // Built once per closed type T, the first time FastCopierInternal<T> is used.
        // Note: assumes every public property of T is readable, writable and not an indexer.
        private static readonly IDictionary<PropertyInfo, Action<T, T>> _copyFunctions =
            typeof (T).GetProperties().ToDictionary(_pi => _pi, GetCopyFunc);

        private static Action<T, T> GetCopyFunc(PropertyInfo propertyInfo)
        {
            Trace.WriteLine(String.Format("Creating mapping func for {0}.{1}", propertyInfo.DeclaringType.Name, propertyInfo.Name));
            var sourceParameterExpression = Expression.Parameter(typeof (T), "_source");
            var destinationParameterExpression = Expression.Parameter(typeof (T), "_destination");

            var sourceAccessExpression = Expression.Property(sourceParameterExpression, propertyInfo);
            var destinationAccessExpression = Expression.Property(destinationParameterExpression, propertyInfo);

            // _destination.Property = _source.Property
            var sourceAssignmentExpression = Expression.Assign(destinationAccessExpression, sourceAccessExpression);

            var lambdaExpression = Expression.Lambda<Action<T, T>>(sourceAssignmentExpression,
                                                                   sourceParameterExpression,
                                                                   destinationParameterExpression);

            // Compile the expression tree into a delegate; this is the one-off cost.
            return lambdaExpression.Compile();
        }

        public static T Copy(T source, T destination)
        {
            foreach (var func in _copyFunctions.Values)
            {
                func(source, destination);
            }

            return destination;
        }
    }
}


Here we have a static class with a single public method to do the Copy for a given type T.



Internally, we have a static generic copier class whose static field is initialized only once for each type T, and which builds up an associative array of the properties of the type and the function required to copy each property from source to destination.
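
The once-per-type behaviour comes from how the CLR treats static fields in generic classes: each closed type gets its own copy, initialized a single time. Here is a small sketch that makes this visible; PerType<T> and InitCounter are hypothetical names invented for the demonstration and are not part of the copier.

```csharp
using System;

public static class InitCounter
{
    // Incremented each time a PerType<T> static field initializer runs.
    public static int Count;
}

public static class PerType<T>
{
    // Runs once per closed type (PerType<int>, PerType<string>, ...),
    // no matter how many times Name is read afterwards.
    private static readonly string _name = Init();

    private static string Init()
    {
        InitCounter.Count++;
        return typeof(T).Name;
    }

    public static string Name
    {
        get { return _name; }
    }
}

public static class InitDemo
{
    public static void Main()
    {
        for (int i = 0; i < 3; i++)
        {
            Console.WriteLine("{0} {1}", PerType<int>.Name, PerType<string>.Name);
        }

        // Two closed types were touched, so the initializer ran twice in total.
        Console.WriteLine(InitCounter.Count); // 2
    }
}
```

The copier relies on the same mechanism: the dictionary of compiled delegates is the static field, so it is built on first use of a given T and simply reused after that.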



Here’s what the lambda expression looks like as it is built:



[Image: assignment_lambda]
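
In text form, the tree for a single property amounts to (_source, _destination) => _destination.Name = _source.Name. The sketch below builds that lambda for one property, prints it, and runs it. Person and LambdaDemo are hypothetical names used only for this example, and the tree is fixed to Person rather than a generic T to keep it short.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical type with a single copyable property.
public class Person
{
    public string Name { get; set; }
}

public static class LambdaDemo
{
    public static Expression<Action<Person, Person>> BuildAssignment(PropertyInfo property)
    {
        var source = Expression.Parameter(typeof(Person), "_source");
        var destination = Expression.Parameter(typeof(Person), "_destination");

        // _destination.<property> = _source.<property>
        var assign = Expression.Assign(
            Expression.Property(destination, property),
            Expression.Property(source, property));

        // Parameter order here must match the order used when the delegate is called.
        return Expression.Lambda<Action<Person, Person>>(assign, source, destination);
    }

    public static void Main()
    {
        var lambda = BuildAssignment(typeof(Person).GetProperty("Name"));

        // Writes the textual form of the expression tree.
        Console.WriteLine(lambda);

        Action<Person, Person> copy = lambda.Compile();
        var target = new Person();
        copy(new Person { Name = "Ada" }, target);
        Console.WriteLine(target.Name); // Ada
    }
}
```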



When the public Copy method on the outer class is called, it calls the Copy method on the (implicitly initialized) inner static class, which applies each of the copy functions in turn to copy the source to the destination one property at a time.



Pretty cool. So how much faster are we?

Explicit copy of object 1000000 times took 204 ms
Fast copy of object 1000000 times took 790 ms
Slow copy of object 1000000 times took 23801 ms


Running over a million objects, the fastest, as expected, is when we explicitly code the DTO initialization, and let the compiler optimize.



The lambda version is nearly 4x slower (790 ms against 204 ms), but still pretty darn fast compared with the basic, slow, all-reflection version, which is about 30x slower again and two orders of magnitude slower than the explicit copy!
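
The timing harness itself isn’t shown in this post, so the following is only a sketch of how figures like these might be gathered with Stopwatch. Sample, CopyBenchmark and Time are hypothetical names, the warm-up call is an assumption about good practice rather than a description of the original test, and only the explicit and all-reflection variants are timed to keep the block self-contained.

```csharp
using System;
using System.Diagnostics;

// Hypothetical type to copy; the real test presumably used a larger entity.
public class Sample
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CopyBenchmark
{
    // Calls the action once to warm up (JIT, static initializers), then times the loop.
    public static long Time(string label, int iterations, Action action)
    {
        action();

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        Console.WriteLine("{0} of object {1} times took {2} ms",
                          label, iterations, stopwatch.ElapsedMilliseconds);
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var source = new Sample { Id = 1, Name = "x" };
        var destination = new Sample();

        Time("Explicit copy", 1000000, () =>
        {
            destination.Id = source.Id;
            destination.Name = source.Name;
        });

        Time("Slow copy", 1000000, () =>
        {
            foreach (var pi in typeof(Sample).GetProperties())
            {
                pi.SetValue(destination, pi.GetValue(source, null), null);
            }
        });
    }
}
```

Absolute numbers will vary with the machine, the runtime and the number of properties on the type, so the ratios are the interesting part.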



Not a bad option to have, when the development and maintenance cost of creating the DTOs and keeping them up to date is factored in!