Showing posts with label Feber. Show all posts

Sunday, July 10, 2011

Functional Object Manipulation

 

In the last couple of posts, we’ve looked at ways to copy scalar properties from one object to another, with the high performance of explicit assignments and the low maintenance of reflection.

The approach of creating a lambda to operate on the object instance as a whole shows a lot of promise. For example, we may want to compare two object instances property-by-property, and the code for that would look quite similar.

public static class Comparer
{
public static bool Compare<T>(this T source, T destination) { return CompareClosure<T>.Compare(source, destination); }

private static class CompareClosure<T>
{
private static Func<T, T, bool> BuildComparerLambda(Func<PropertyInfo, bool> propertyFilter = null)
{
propertyFilter = propertyFilter ?? (_ => true);

var sourceParameterExpression = Expression.Parameter(typeof(T), "_left");
var destinationParameterExpression = Expression.Parameter(typeof(T), "_right");

var properties = typeof(T).GetProperties().Where(propertyFilter);
var expressions = properties.Select(_pi => Expression.Equal(Expression.Property(destinationParameterExpression, _pi), Expression.Property(sourceParameterExpression, _pi)));
var conjoinedExpression = expressions.Aggregate<Expression, Expression>(Expression.Constant(true), Expression.AndAlso);

var lambdaExpression = Expression.Lambda<Func<T, T, bool>>(conjoinedExpression, sourceParameterExpression, destinationParameterExpression);
return lambdaExpression.Compile();
}

private static readonly Func<T, T, bool> _compare = BuildComparerLambda();
public static bool Compare(T source, T destination) { return _compare(source, destination); }
}
}

which gives us:


[Image: comparer]


What we are doing in the code is mapping each property to a boolean value indicating whether the two instances have the same value for the property, and then folding that set of boolean values to a single boolean value by aggregating it with the && operator (and a seed value of ‘true’).
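To make the map-then-fold shape concrete, here is a minimal sketch of the same idea written with plain reflection and LINQ instead of a compiled expression. The Person type and the CompareByReflection name are made up for illustration, and object.Equals stands in for the == that Expression.Equal emits, so the two can differ for types that overload one but not the other.

```csharp
using System;
using System.Linq;

// Hypothetical sample type, not from the original post.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class MapFoldDemo
{
    // Slow, reflection-only analogue of the compiled comparer:
    // map each property to an equality flag, then fold the flags with && (seed: true).
    public static bool CompareByReflection<T>(T left, T right)
    {
        return typeof(T).GetProperties()
            .Select(pi => object.Equals(pi.GetValue(left, null), pi.GetValue(right, null)))
            .Aggregate(true, (all, same) => all && same);
    }

    public static void Main()
    {
        var a = new Person { Id = 1, Name = "Ada" };
        var b = new Person { Id = 1, Name = "Ada" };
        var c = new Person { Id = 2, Name = "Ada" };

        Console.WriteLine(CompareByReflection(a, b)); // True
        Console.WriteLine(CompareByReflection(a, c)); // False
    }
}
```

The compiled version does the same mapping and folding once, at expression-building time, so nothing is left to iterate over when the delegate is called.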


We can readily see a pattern emerging here:


There are two types of operations possible for a given arity. In the case of Compare and Copy where the operation arity is 2, we can create either an Action<T, T> or a Func<T, T, TResult>.


The Action<> case is a lambda that performs an operation on each property with reference to the given argument instances.


We can create a general operation builder by lifting a parameter which specifies the desired operation on a property, with reference to the lambda arguments.

public static Action<T, T> BuildOperationAction<T>(
Func<PropertyInfo, ParameterExpression, ParameterExpression, Expression> operationExpressionFactory)
{
var leftParameterExpression = Expression.Parameter(typeof (T), "_left");
var rightParameterExpression = Expression.Parameter(typeof (T), "_right");

var properties = typeof (T).GetProperties();
var expressions = properties.Select(_ => operationExpressionFactory(_, leftParameterExpression, rightParameterExpression));

var blockExpression = Expression.Block(expressions);
var lambdaExpression = Expression.Lambda<Action<T, T>>(blockExpression, leftParameterExpression, rightParameterExpression);

return lambdaExpression.Compile();
}

The Func<> case is a lambda that maps the set of properties to a set of values with reference to the argument instances, and then aggregates this set of values to a single value using an aggregator.


We can create a general operation builder by lifting:



  1. a parameter which specifies the desired operation on a property, with reference to the lambda arguments
  2. a parameter which specifies the aggregate operation to fold the set of computed values to a single value, and
  3. a parameter specifying the initial value for the aggregator.

Thusly:

public static Func<T, T, TResult> BuildOperationFunc<T, TResult>(
Func<PropertyInfo, ParameterExpression, ParameterExpression, Expression> operationExpressionFactory,
Func<Expression, Expression, Expression> conjunction, Expression seed)
{
var leftParameterExpression = Expression.Parameter(typeof (T), "_left");
var rightParameterExpression = Expression.Parameter(typeof (T), "_right");

var properties = typeof (T).GetProperties();
var expressions = properties.Select(_ => operationExpressionFactory(_, leftParameterExpression, rightParameterExpression));

var joinedExpression = expressions.Aggregate(seed, conjunction);
var lambdaExpression = Expression.Lambda<Func<T, T, TResult>>(joinedExpression, leftParameterExpression, rightParameterExpression);

return lambdaExpression.Compile();
}
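The aggregator does not have to be &&. As a rough sketch of a different fold, the builder can be asked to count how many properties differ: each property maps to 0 or 1, the fold is +, and the seed is 0. The Point type and the BuildDifferenceCounter name are assumptions for this example; the builder is repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sample type, not from the original post.
public class Point
{
    public int X { get; set; }
    public int Y { get; set; }
    public int Z { get; set; }
}

public static class DifferenceCounter
{
    // Same shape as BuildOperationFunc in the post, repeated here to keep the sketch self-contained.
    public static Func<T, T, TResult> BuildOperationFunc<T, TResult>(
        Func<PropertyInfo, ParameterExpression, ParameterExpression, Expression> operationExpressionFactory,
        Func<Expression, Expression, Expression> conjunction, Expression seed)
    {
        var left = Expression.Parameter(typeof(T), "_left");
        var right = Expression.Parameter(typeof(T), "_right");

        var expressions = typeof(T).GetProperties().Select(pi => operationExpressionFactory(pi, left, right));
        var joined = expressions.Aggregate(seed, conjunction);

        return Expression.Lambda<Func<T, T, TResult>>(joined, left, right).Compile();
    }

    // Hypothetical operation: map each property to 0 (equal) or 1 (different), then fold with +.
    public static Func<T, T, int> BuildDifferenceCounter<T>()
    {
        return BuildOperationFunc<T, int>(
            (pi, l, r) => Expression.Condition(
                Expression.Equal(Expression.Property(l, pi), Expression.Property(r, pi)),
                Expression.Constant(0),
                Expression.Constant(1)),
            Expression.Add,
            Expression.Constant(0));
    }

    public static void Main()
    {
        var countDifferences = BuildDifferenceCounter<Point>();
        var p = new Point { X = 1, Y = 2, Z = 3 };
        var q = new Point { X = 1, Y = 5, Z = 9 };

        Console.WriteLine(countDifferences(p, q)); // 2
    }
}
```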

This allows us to implement Copy() and Compare() as follows:

public static bool CompareWith<T>(this T source, T destination) { return CompareClosure<T>.Compare(source, destination); }
private static class CompareClosure<T>
{
private static readonly Func<T, T, bool> _compare =
BuildOperationFunc<T, bool>(
(_pi, _src, _dest) => Expression.Equal(Expression.Property(_dest, _pi), Expression.Property(_src, _pi)),
Expression.AndAlso,
Expression.Constant(true));

public static bool Compare(T source, T destination) { return _compare(source, destination); }
}

public static void CopyTo<T>(this T source, T destination) { CopyClosure<T>.Copy(source, destination); }
private static class CopyClosure<T>
{
private static readonly Action<T, T> _copy =
BuildOperationAction<T>(
(_pi, _src, _dest) => Expression.Assign(Expression.Property(_dest, _pi), Expression.Property(_src, _pi)));

public static void Copy(T source, T destination) { _copy(source, destination); }
}

Very neat.


We could develop more operations than Copy and Compare, building fast, automatically maintained mappers for data access (think ORMs) and UI field access, by simply specifying the equivalent expression. The development effort is simply in the development of the expression.


For example, here is a mapper which maps properties defined on type T, from a dynamic source object to backing fields on a target object of type T:

private static readonly Action<T, T> _copyDynamicToStaticBackingFields = 
BuildOperationAction<T>(
(_pi, _src, _dest) =>
{
var sourceBinder = RuntimeBinder.GetMember(CSharpBinderFlags.InvokeSpecialName, _pi.Name, typeof (T), new [ ] { ThisArgument });
var sourceGetExpression = Expression.Convert(Expression.Dynamic(sourceBinder, typeof (object), _src), _pi.PropertyType);

var fieldName = String.Format("_{0}{1}", _pi.Name.Substring(0, 1).ToLower(), _pi.Name.Substring(1));

var fieldInfo = typeof (T).GetField(fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
if (fieldInfo == null) return Expression.Default(typeof (void));

var destinationAccessExpression = Expression.Field(_dest, fieldInfo);
return Expression.Assign(destinationAccessExpression, sourceGetExpression);
});

We could further develop this approach by developing the BuildOperationXXX functions for Action<>s and Func<>s of arity 1, which would allow us to quickly develop fast, low-maintenance object printers and serializers.
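As a sketch of what an arity-1 builder might look like, here is a hypothetical BuildUnaryOperationFunc used to generate a simple object printer. None of these names exist in the original code, and the "Name=value; " output format is just an assumption for the example. Each property maps to a formatted string, and the fold is string.Concat with an empty-string seed.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sample type, not from the original post.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class UnaryOperations
{
    // Hypothetical arity-1 analogue of BuildOperationFunc: one instance parameter instead of two.
    public static Func<T, TResult> BuildUnaryOperationFunc<T, TResult>(
        Func<PropertyInfo, ParameterExpression, Expression> operationExpressionFactory,
        Func<Expression, Expression, Expression> conjunction, Expression seed)
    {
        var instance = Expression.Parameter(typeof(T), "_instance");
        var expressions = typeof(T).GetProperties().Select(pi => operationExpressionFactory(pi, instance));
        var joined = expressions.Aggregate(seed, conjunction);
        return Expression.Lambda<Func<T, TResult>>(joined, instance).Compile();
    }

    private static readonly MethodInfo Concat =
        typeof(string).GetMethod("Concat", new[] { typeof(string), typeof(string) });
    private static readonly MethodInfo Format =
        typeof(string).GetMethod("Format", new[] { typeof(string), typeof(object), typeof(object) });

    // Object printer: map each property to "Name=value; ", then fold with string.Concat.
    public static Func<T, string> BuildPrinter<T>()
    {
        return BuildUnaryOperationFunc<T, string>(
            (pi, x) => Expression.Call(Format,
                Expression.Constant("{0}={1}; "),
                Expression.Constant(pi.Name, typeof(object)),
                Expression.Convert(Expression.Property(x, pi), typeof(object))),
            (acc, next) => Expression.Call(Concat, acc, next),
            Expression.Constant(string.Empty));
    }

    public static void Main()
    {
        var print = BuildPrinter<Person>();
        Console.WriteLine(print(new Person { Id = 7, Name = "Ada" }));
    }
}
```

The order of the printed properties follows whatever order GetProperties() returns, which is not guaranteed, so a real serializer would probably want to sort them.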


I believe that this set of generating functions is likely to be more generally useful, so I’ve created a CodePlex project for this. Please visit it and let me know how you used it!

The Object Copier, revisited.

 

We saw in the previous post how we could improve the maintainability of reflection-based code by building lambdas using reflection, and then gain the performance benefits by applying the lambdas at execution time.

Taking the idea a step further, we can easily see that what we are really doing is generating a function to do the copy operation on the entire object and applying the copy function.

(T _source, T _destination) =>
{
_destination.Id = _source.Id;
_destination.Name = _source.Name;
...
}

Based on the approach we took last time, this is easily achievable by wrapping the individual assignments into a block expression, sharing one pair of parameter expressions between the lambda and every assignment, and then compiling the whole thing into a single copy delegate. Thusly:

public static class Copier
{
public static void CopyTo<T>(this T source, T destination) { CopyClosure<T>.Copy(source, destination); }

private static class CopyClosure<T>
{
private static Action<T, T> BuildCopierLambda()
{
var sourceParameterExpression = Expression.Parameter(typeof(T), "_source");
var destinationParameterExpression = Expression.Parameter(typeof(T), "_destination");

var properties = typeof(T).GetProperties();
var expressions = properties.Select(_pi => Expression.Assign(Expression.Property(destinationParameterExpression, _pi), Expression.Property(sourceParameterExpression, _pi)));

var blockExpression = Expression.Block(expressions);
var lambdaExpression = Expression.Lambda<Action<T, T>>(blockExpression, sourceParameterExpression, destinationParameterExpression);

return lambdaExpression.Compile();
}

private static readonly Action<T, T> _copy = BuildCopierLambda();
public static void Copy(T source, T destination) { _copy(source, destination); }
}
}


There are a few things to note about the code:


The outer, public, static Copier class is non-generic to allow the CopyTo operation to be defined as an extension method on generic type T.


The runtime will create a separate closed CopyClosure<T> type for each type T. This ensures that the _copy Action is unique for each type T. Also, the _copy Action is initialized in a thread-safe manner, because the CLR guarantees that static field initializers run exactly once per type, even under concurrent access.


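This approach also improves performance over the per-property version from the previous post, because each copy is a single delegate call with no run-time iteration over the properties!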
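The once-per-type initialization noted above is easy to observe with a small sketch. Registry, PerType<T> and BuildCount are hypothetical names standing in for Copier, CopyClosure<T> and the delegate-building step; the static initializer runs once for each closed generic type, no matter how many times the type is used afterwards.

```csharp
using System;

public static class Registry
{
    // Incremented each time a per-type initializer runs (illustration only).
    public static int BuildCount;

    // Hypothetical stand-in for CopyClosure<T>.
    private static class PerType<T>
    {
        public static readonly string Built = Build();

        private static string Build()
        {
            BuildCount++;
            Console.WriteLine("Building for " + typeof(T).Name);
            return typeof(T).Name;
        }
    }

    public static string Get<T>() { return PerType<T>.Built; }

    public static void Main()
    {
        Get<int>();
        Get<int>();    // no second build for int
        Get<string>(); // a separate closed type, so a separate build

        Console.WriteLine(BuildCount); // 2
    }
}
```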


[Image: copy_block_expression]


Stay tuned for more improvements…

Monday, June 27, 2011

A (Fast) Object Copier

 

A colleague of mine asked me today to help with the onerous task of “twisting-off” the data in an entity into a DTO. Something about serialization, he said…

He said that he didn’t mind if the copying was slow – just that he didn’t want to put in hundreds of lines of code each saying

destination.Id = source.Id;
destination.Name = source.Name;
destination.DateAndTimeOfBirth = source.DateAndTimeOfBirth;
destination.BirthWeightInKilograms = source.BirthWeightInKilograms;
destination.IsMultipleBirth = source.IsMultipleBirth;
destination.BirthOrder = source.BirthOrder;
...


So I sat down and knocked off the traditional reflection-based code:

internal static T Map<T>(T source, T destination) where T : class
{
foreach (var pi in typeof (T).GetProperties())
{
pi.SetValue(destination,
pi.GetValue(source, BindingFlags.GetProperty | BindingFlags.Instance, null, null, null),
BindingFlags.SetProperty | BindingFlags.Instance,
null,
null,
null);
}
return destination;
}


(I’ve removed the error checking just to keep things simple for this post)
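For a sense of what such checks might involve, here is a hedged sketch (not the code I removed) that guards against null arguments and skips properties that cannot be copied this way: read-only, write-only, and indexed properties. The Account type and the SafeMapper name are assumptions for the example.

```csharp
using System;
using System.Reflection;

// Hypothetical sample type, not from the original post.
public class Account
{
    public int Id { get; set; }
    public string Owner { get; set; }
    public string Display { get { return Id + ":" + Owner; } } // read-only, so it must be skipped
}

public static class SafeMapper
{
    public static T Map<T>(T source, T destination) where T : class
    {
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");

        foreach (var pi in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip anything that cannot be read and written back: read-only, write-only, or indexers.
            if (!pi.CanRead || !pi.CanWrite || pi.GetIndexParameters().Length > 0) continue;

            pi.SetValue(destination, pi.GetValue(source, null), null);
        }
        return destination;
    }

    public static void Main()
    {
        var copy = Map(new Account { Id = 3, Owner = "Ada" }, new Account());
        Console.WriteLine(copy.Display); // 3:Ada
    }
}
```

Without the filter, calling SetValue on the read-only Display property would throw an ArgumentException.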



Neat and all, and does the job.



But I couldn’t help wondering if I could do better – especially since I knew that the performance would potentially be too terrible to bear…



Express(ion) yourself!



It would be nice to use the tricks we learned with expressions to create, for a given type, an assignment expression for each of the properties, and then simply apply the compiled expressions one by one to the source and destination. That way, the cost of reflection is borne only once.



Fair enough. I got home, and after dinner, knocked off this piece of code:

public static class FastCopier
{
public static T Copy<T>(T source, T destination) where T : class
{
return FastCopierInternal<T>.Copy(source, destination);
}

private static class FastCopierInternal<T>
{
private static readonly IDictionary<PropertyInfo, Action<T, T>> _copyFunctions = typeof (T).GetProperties().ToDictionary(_pi => _pi, GetCopyFunc);

private static Action<T, T> GetCopyFunc(PropertyInfo propertyInfo)
{
Trace.WriteLine(String.Format("Creating mapping func for {0}.{1}", propertyInfo.DeclaringType.Name, propertyInfo.Name));
var sourceParameterExpression = Expression.Parameter(typeof (T), "_source");
var destinationParameterExpression = Expression.Parameter(typeof (T), "_destination");

var sourceAccessExpression = Expression.Property(sourceParameterExpression, propertyInfo);
var destinationAccessExpression = Expression.Property(destinationParameterExpression, propertyInfo);

var sourceAssignmentExpression = Expression.Assign(destinationAccessExpression, sourceAccessExpression);

var lambdaExpression = Expression.Lambda<Action<T, T>>(sourceAssignmentExpression,
sourceParameterExpression,
destinationParameterExpression);

return lambdaExpression.Compile();
}

public static T Copy(T source, T destination)
{
foreach (var func in _copyFunctions.Values)
{
func(source, destination);
}

return destination;
}
}
}


Here we have a static class with a single public method to do the Copy for a given type T.



Internally, we have a static generic copier class which is initialized only once for each type T, and which builds up a dictionary mapping each property of the type to the function required to copy that property from source to destination.



Here’s what the lambda expression looks like as it is built:



[Image: assignment_lambda]



When the public Copy method on the outer class is called, it calls the Copy method on the (implicitly initialized) inner static class, which applies each of the copy functions in turn to copy the source to the destination one property at a time.



Pretty cool. So how much faster are we?

Explicit copy of object 1000000 times took 204 ms
Fast copy of object 1000000 times took 790 ms
Slow copy of object 1000000 times took 23801 ms


Over a million copies, the fastest, as expected, is when we explicitly code the DTO initialization, and let the compiler optimize.



The lambda version is around 3.9x slower than the explicit copy, but still pretty darn fast compared with the basic, slow, all-reflection version, which is about 30x slower than the lambda version and two orders of magnitude slower than the explicit copy!
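For anyone who wants to try a similar comparison, here is a minimal timing sketch using Stopwatch. It is not the harness that produced the numbers above: the Baby type, the method names, and the warm-up call are all assumptions, and absolute timings will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical DTO; the property names are borrowed from the example assignments in the post.
public class Baby
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime DateAndTimeOfBirth { get; set; }
}

public static class CopyBenchmark
{
    // Explicit, hand-written assignments.
    public static void ExplicitCopy(Baby s, Baby d)
    {
        d.Id = s.Id;
        d.Name = s.Name;
        d.DateAndTimeOfBirth = s.DateAndTimeOfBirth;
    }

    // All-reflection copy: pays the reflection cost on every call.
    public static void ReflectionCopy<T>(T s, T d)
    {
        foreach (var pi in typeof(T).GetProperties())
            pi.SetValue(d, pi.GetValue(s, null), null);
    }

    // One compiled delegate per property: reflection runs only while building.
    public static Action<T, T>[] BuildPropertyCopiers<T>()
    {
        return typeof(T).GetProperties().Select(pi =>
        {
            var s = Expression.Parameter(typeof(T), "_source");
            var d = Expression.Parameter(typeof(T), "_destination");
            var assign = Expression.Assign(Expression.Property(d, pi), Expression.Property(s, pi));
            return Expression.Lambda<Action<T, T>>(assign, s, d).Compile();
        }).ToArray();
    }

    public static long Time(int iterations, Action action)
    {
        action(); // warm-up call so JIT compilation is not measured
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++) action();
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        var source = new Baby { Id = 1, Name = "Ada", DateAndTimeOfBirth = DateTime.Now };
        var destination = new Baby();
        var copiers = BuildPropertyCopiers<Baby>();

        Console.WriteLine("Explicit copy of object {0} times took {1} ms", iterations,
            Time(iterations, () => ExplicitCopy(source, destination)));
        Console.WriteLine("Fast copy of object {0} times took {1} ms", iterations,
            Time(iterations, () => { foreach (var copy in copiers) copy(source, destination); }));
        Console.WriteLine("Slow copy of object {0} times took {1} ms", iterations,
            Time(iterations, () => ReflectionCopy(source, destination)));
    }
}
```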



Not a bad option to have, when the development and maintenance cost of creating the DTOs and keeping them up to date is factored in!