Wednesday, April 20, 2011

C# Abuse

 

So I just watched this video presentation by Jon Skeet about some of the edges of our understanding of the C# language.

In this blog post, he threw open the challenge to define three static methods such that the following code compiles:

static void Main()
{
    Foo<int>();
    Foo<string>();
    Foo<int?>();
}

While watching the video and playing around on my own, I came up with:

class Program
{
    static void Foo<T>() { Console.WriteLine(typeof(T).Name); }

    class Inner : Program
    {
        struct R<T> where T : class { }

        static void Foo<T>(params R<T>?[] _) where T : class { Console.WriteLine(typeof(T).Name); }
        static void Foo<T>(params T?[] _) where T : struct { Console.WriteLine(typeof(T).Name); }

        static void Main()
        {
            Foo<int>();
            Foo<string>();
            Foo<int?>();
        }
    }
}

which prints out the correct result. The analysis of how this works is given brilliantly over at Jon’s blog!


The key is the inner class, which forces the compiler to first check signature suitability for the two methods defined therein, before allowing the base class to catch-all with the most generalized signature.


Figure 1


The same caveat that Jon specified in his blog applies – such code should *never* be written!

Sunday, April 17, 2011

Lambda Wrangling For Fun : Part 4 – Sorting Multiple Columns

 

We can approach the problem of sorting a dataset by multiple properties using the same techniques as for the filter case.

A single-property sorting lambda looks like:

….OrderBy(_ => _.property) or ….OrderByDescending(_ => _.property)

If we wanted to sort by more than one property, we would do something like:

….OrderBy(_ => _.property1).OrderByDescending(_ => _.property2)

It’s important to notice two points:



  • The lambdas for each sort clause are self-contained – unlike the composite filter lambda where the same parameter is threaded through each predicate.
  • The direction of the sort is specified by the use of either the OrderBy or OrderByDescending extension method. This means that the composite sort criteria will be represented by a chain of OrderBy and/or OrderByDescending methods.

Bearing these points in mind, we can develop a general Sort action specification:

public interface ISort<TSource>
{
    string PropertyName { get; }
    SortDirection SortDirection { get; }

    Expression<Func<TSource, object>> GetSelectorLambda();
}

The properties are as we expected, and we have encountered a variant of GetSelectorLambda before.


Applying a composite sort on an enumerable can be achieved by:

public static IOrderedEnumerable<T> Sort<T>(this IEnumerable<T> _this, IEnumerable<ISort<T>> sortOperations)
{
    var parameterExpression = Expression.Parameter(typeof (IEnumerable<T>), "_");

    Expression items = parameterExpression;
    foreach (var sortOperation in sortOperations)
    {
        // selector : __ => __.{sortOperation.PropertyName}
        var selector = sortOperation.GetSelectorLambda();

        // items.OrderBy<T, object>(__ => __.Property)
        // is actually
        // Enumerable.OrderBy<T, object>(items, __ => __.Property)
        items = Expression.Call(typeof (Enumerable),
            sortOperation.SortDirection == SortDirection.Ascending ? "OrderBy" : "OrderByDescending",
            new[] { typeof (T), typeof (object) },
            items,
            selector);
    }

    // _ => _.OrderBy[Descending](__ => __.Property1)...OrderBy[Descending](__ => __.PropertyN)
    var lambda = Expression.Lambda<Func<IEnumerable<T>, IOrderedEnumerable<T>>>(items, parameterExpression);

    // compile the lambda...
    var func = lambda.Compile();

    // ...call the chain of extension methods on (_this)
    return func(_this);
}

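The function generates calls to either the ‘OrderBy’ or the ‘OrderByDescending’ method on the Enumerable type, using the previous result as the first argument, thus generating the chain. Note that each OrderBy call re-sorts the whole sequence and the sort is stable, so the last sort operation in the list ends up as the primary key and the earlier ones only break ties. The ThenBy and ThenByDescending methods are the conventional way to keep the first key primary.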


Figure 1

Finally, the function compiles the chain of methods and executes it on the actual enumerable instance passed in.


Here is a typical use:

var sorts = new List<ISort<IPerson>> {
    new Sort<IPerson>("Timestamp", SortDirection.Descending),
    new Sort<IPerson>("Name", SortDirection.Ascending),
};

var sortedPersons = persons.Sort(sorts);
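
For comparison, here is a minimal sketch of the same idea built on OrderBy followed by ThenBy, so that the first key in the list stays the primary one. It uses plain delegates rather than expression trees to stay short. The names SortSketch, SortBy and Direction are made up for this illustration and are not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Direction { Ascending, Descending }

public static class SortSketch
{
    // Hypothetical helper: the first (selector, direction) pair is the primary key,
    // every later pair only breaks ties left by the earlier ones.
    public static IEnumerable<T> SortBy<T>(
        this IEnumerable<T> source,
        IEnumerable<Tuple<Func<T, object>, Direction>> keys)
    {
        IOrderedEnumerable<T> ordered = null;
        foreach (var key in keys)
        {
            var selector = key.Item1;
            var ascending = key.Item2 == Direction.Ascending;
            if (ordered == null)
                ordered = ascending ? source.OrderBy(selector) : source.OrderByDescending(selector);
            else
                ordered = ascending ? ordered.ThenBy(selector) : ordered.ThenByDescending(selector);
        }

        // With no keys there is nothing to order, so hand back the source unchanged.
        if (ordered == null) return source;
        return ordered;
    }
}

public static class Program
{
    public static string[] SortedNames()
    {
        var people = new[]
        {
            Tuple.Create("Bob", 2), Tuple.Create("Alice", 2), Tuple.Create("Carol", 1)
        };
        var keys = new[]
        {
            Tuple.Create<Func<Tuple<string, int>, object>, Direction>(p => p.Item2, Direction.Descending),
            Tuple.Create<Func<Tuple<string, int>, object>, Direction>(p => p.Item1, Direction.Ascending),
        };
        return people.SortBy(keys).Select(p => p.Item1).ToArray();
    }

    public static void Main()
    {
        // Number descending first, then name ascending within equal numbers.
        Console.WriteLine(string.Join(", ", SortedNames())); // prints "Alice, Bob, Carol"
    }
}
```

The same OrderBy-then-ThenBy switch could be made inside the expression-building loop by choosing the method name based on whether a sort call has already been emitted.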

Saturday, April 16, 2011

Lambda Wrangling For Fun : Part 3 – Filtering Multiple Columns

 

We now know how to generate a dynamic predicate to filter a collection of items by a single property value.

It is often useful to be able to apply multiple filters to a collection to support more sophisticated searching and filtering operations.

Let’s define a general-purpose filter interface to encapsulate the property, the target value and the comparison function.

public interface IFilter
{
    string PropertyName { get; }
    dynamic TargetValue { get; }
    Expression<Func<dynamic, dynamic, bool>> Comparison { get; }
    Expression<Func<TSource, bool>> GetPredicateLambda<TSource>();
    Expression BuildComparisonExpression<TSource>(Expression parameterExpression = null);
}

The three properties are what we expect to see, and we’ve already seen a variant of the GetPredicateLambda function before.


Before we see what the other method is needed for, let’s analyse what it means to filter multiple columns first:


One Fish, Two Fish…


A filter with a single predicate comparing against a value looks thus:

….Where(_ => predicate0(_.property0, value0))

and a filter with more than one predicate would look like:

….Where(_ => predicate0(_.property0, value0) 
&& predicate1(_.property1, value1))

Notice that what we need to do is to create just the comparison part of the lambda for each filter, with the same parameter injected into each predicate call.


This is exactly what the BuildComparisonExpression function does. Here is the implementation:

public virtual Expression BuildComparisonExpression<TSource>(Expression parameterExpression = null)
{
    // This gives us * _ *
    parameterExpression = parameterExpression ?? Expression.Parameter(typeof (TSource), "_");

    var propertyInfo = typeof (TSource).GetProperty(PropertyName, true);
    if (propertyInfo == null)
    {
        throw new MemberAccessException(String.Format("No such property exists: {0}", PropertyName));
    }

    // This gives us * _.property *
    var accessorExpression = Expression.Property(parameterExpression, propertyInfo);

    // This gives us * value *
    var valueExpression = Expression.Constant(TargetValue);

    // This gives us * comparison(_.property, value) *
    return Expression.Invoke(Comparison, accessorExpression, valueExpression);
}

The key to this method is the last line. Expression.Invoke generates an expression which invokes the Comparison expression with the accessorExpression and valueExpression as its arguments.
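
As a small standalone illustration of Expression.Invoke, the sketch below applies an existing comparison lambda to a property accessor and a constant. BuildLengthCheck and the use of string.Length are assumptions chosen only to keep the example self-contained; they do not appear in the filter code above.

```csharp
using System;
using System.Linq.Expressions;

public static class Program
{
    // Hypothetical helper: builds * _ => comparison(_.Length, limit) * for strings.
    public static Func<string, bool> BuildLengthCheck(int limit)
    {
        // An existing comparison, available as an expression tree.
        Expression<Func<int, int, bool>> comparison = (l, r) => l > r;

        var parameter = Expression.Parameter(typeof(string), "_"); // * _ *
        var accessor = Expression.Property(parameter, "Length");   // * _.Length *
        var value = Expression.Constant(limit);                    // * limit *

        // Invoke applies the comparison lambda to the two argument expressions.
        var body = Expression.Invoke(comparison, accessor, value);

        return Expression.Lambda<Func<string, bool>>(body, parameter).Compile();
    }

    public static void Main()
    {
        var longerThanFour = BuildLengthCheck(4);
        Console.WriteLine(longerThanFour("cheese")); // prints "True"
        Console.WriteLine(longerThanFour("ham"));    // prints "False"
    }
}
```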


Also, we do not generate the Lambda with this expression, as we may have more than one Comparison to chain together.


We do that as follows:

public static Expression<Func<TSource, bool>> GetFilterExpression<TSource>(this IEnumerable<IFilter> filterCollection)
{
    if (filterCollection == null) return _ => true;

    var parameterExpression = Expression.Parameter(typeof (TSource), "_");

    var aggregateFilterExpression = filterCollection.Aggregate(
        (Expression) Expression.Constant(true),
        (current, filter) => Expression.AndAlso(current,
            filter.BuildComparisonExpression<TSource>(parameterExpression)));

    return Expression.Lambda<Func<TSource, bool>>(aggregateFilterExpression, parameterExpression);
}

This function aggregates all the filter expressions, threading through the same parameterExpression, and builds the lambda function to call.


This is what a complex filter expression looks like:


Figure 1


Clean and straightforward lambdas.


We can now extend this approach to use conjunctions other than AndAlso, and perhaps special-case some comparison operations.
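
One possible shape for the OrElse case is sketched below. The seed changes from true to false, because false is the neutral element for “or”. OrElseSketch and AnyOf are hypothetical names, and the comparison builders are passed in as plain functions rather than IFilter instances so the example stands on its own.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class OrElseSketch
{
    // Hypothetical helper: builds * _ => c1(_) || c2(_) || ... * where every
    // comparison is built against the same shared parameter.
    public static Expression<Func<T, bool>> AnyOf<T>(
        params Func<ParameterExpression, Expression>[] comparisonBuilders)
    {
        var parameter = Expression.Parameter(typeof(T), "_");

        // false is the neutral seed for OrElse, just as true is for AndAlso.
        Expression body = Expression.Constant(false);
        foreach (var build in comparisonBuilders)
        {
            body = Expression.OrElse(body, build(parameter));
        }

        return Expression.Lambda<Func<T, bool>>(body, parameter);
    }
}

public static class Program
{
    public static string[] Matches()
    {
        var filter = OrElseSketch.AnyOf<string>(
            p => Expression.Equal(Expression.Property(p, "Length"), Expression.Constant(3)),
            p => Expression.Equal(p, Expression.Constant("cheese")));

        return new[] { "ham", "cheese", "bread" }.Where(filter.Compile()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Matches())); // prints "ham, cheese"
    }
}
```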

Tuesday, April 12, 2011

Lambda Wrangling For Fun : Part 2 – Filtering Single Columns

 

In the previous post, we developed a way of dynamically generating a lambda based on a property name to pass through to the OrderBy or OrderByDescending LINQ function in order to sort a collection of items by the specified property.

We can do the same thing to dynamically generate a filter predicate to use in several LINQ extension methods that accept predicates.

Anatomy of a Filter

A filter predicate has three parts:

  • A constant value or expression to compare against
  • A comparison operator
  • A target property on the object which should be compared with the constant value

We already know how to express property selectors – we used them in the previous post.

Let’s approach the rest of the problem in stages:

Ceteris paribus…

Let’s fix the comparison operator to be the equality operator, “==”.

Then we can write a naive lambda generator thus:

public static Func<TSource, bool> GetPropertyFilterLambda<TSource>(this TSource _this, string propertyName, object filterValue)
{
    // Given that propertyName = "Foo" and filterValue = blah
    // we want the lambda expression * _ => _.Foo == blah *

    // This gives us the * _ => * part
    var parameterExpression = Expression.Parameter(typeof (TSource), "_");

    // This gives us the * .Foo * part - see the interfaces and inheritance blog post
    var propertyInfo = typeof(TSource).GetProperty(propertyName, true);

    // This gives us the * _.Foo * part
    var accessorExpression = Expression.Property(parameterExpression, propertyInfo);

    // This gives us the * blah * part
    var valueExpression = Expression.Convert(Expression.Constant(filterValue), propertyInfo.PropertyType);

    // This gives us the * _.Foo == blah * part
    var equalityComparisonExpression = Expression.Equal(accessorExpression, valueExpression);

    // This gives us the * _ => _.Foo == blah * part
    return Expression.Lambda<Func<TSource, bool>>(equalityComparisonExpression, parameterExpression).Compile();
}

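We wrap the constant in a conversion to the property type so that both operands of the equality have matching types. Note that Expression.Convert only performs cast-style conversions (numeric, boxing and unboxing, reference and user-defined conversions); it will not parse a string into an integer. A value that arrives as a string (passed in from an ASP.NET MVC Controller, for example) must therefore be parsed before it is turned into a constant if the property is an integer property.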
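
A minimal sketch of one way to handle string input is to convert the raw value with Convert.ChangeType before building the constant. The Ingredient class and the EqualsFilter name are invented for this example, and the standard Type.GetProperty(string) overload is used instead of the extension from the interfaces and inheritance post. Convert.ChangeType does not cover every target type (nullable and enum properties need extra handling), so treat this as an illustration rather than a complete solution.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

// Hypothetical type, used only for this example.
public class Ingredient
{
    public string Name { get; set; }
    public int Quantity { get; set; }
}

public static class Program
{
    // Hypothetical helper: builds * _ => _.Property == value * after converting
    // the raw value (often a string) to the property's type.
    public static Func<T, bool> EqualsFilter<T>(string propertyName, object rawValue)
    {
        var parameter = Expression.Parameter(typeof(T), "_");

        var property = typeof(T).GetProperty(propertyName);
        if (property == null)
        {
            throw new ArgumentException("No such property exists: " + propertyName);
        }

        // Parse or convert up front; Expression.Convert would only attempt a cast.
        var typedValue = Convert.ChangeType(rawValue, property.PropertyType, CultureInfo.InvariantCulture);

        var body = Expression.Equal(
            Expression.Property(parameter, property),
            Expression.Constant(typedValue, property.PropertyType));

        return Expression.Lambda<Func<T, bool>>(body, parameter).Compile();
    }

    public static void Main()
    {
        var isThree = EqualsFilter<Ingredient>("Quantity", "3");
        Console.WriteLine(isThree(new Ingredient { Name = "Cheese", Quantity = 3 })); // prints "True"
        Console.WriteLine(isThree(new Ingredient { Name = "Ham", Quantity = 5 }));    // prints "False"
    }
}
```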


You’re not my Type!



By happy coincidence, the C# language does the right thing for the equality operator on strings and compares values even though strings are a reference type. But other reference types will pose a problem since the equality operator only compares references.


So let’s strongly type the constant value:

public static Func<TSource, bool> GetPropertyFilterLambda<TSource, TValue>(this TSource _this, string propertyName, TValue filterValue)
{
    // Given that propertyName = "Foo" and filterValue = blah
    // we want the lambda expression * _ => _.Foo == blah *

    // This gives us the * _ => * part
    var parameterExpression = Expression.Parameter(typeof (TSource), "_");

    // This gives us the * .Foo * part - see the interfaces and inheritance blog post
    var propertyInfo = typeof (TSource).GetProperty(propertyName, true);

    // ensure a value of type TValue can be assigned to the property
    if (!propertyInfo.PropertyType.IsAssignableFrom(typeof (TValue)))
    {
        throw new MemberAccessException(String.Format("Property {0}.{1} is not of type {2}",
            typeof (TSource).Name,
            propertyInfo.Name,
            typeof (TValue).Name));
    }

    // This gives us the * _.Foo * part
    var accessorExpression = Expression.Property(parameterExpression, propertyInfo);

    // This gives us the * blah * part
    var valueExpression = Expression.Convert(Expression.Constant(filterValue), propertyInfo.PropertyType);

    // This gives us the * _.Foo == blah * part
    var equalityComparisonExpression = Expression.Equal(accessorExpression, valueExpression);

    // This gives us the * _ => _.Foo == blah * part
    return Expression.Lambda<Func<TSource, bool>>(equalityComparisonExpression, parameterExpression).Compile();
}

So far so good.


sed ceteris non paribus…



But there’s no way to specify a different operation than simple equality comparison with this. There is not even a way to do a deep equality comparison on non-string reference objects.


What we really need is a way to provide our own comparison functions, which will give us the flexibility to compare two objects and decide if they satisfy the predicate.


Here is such a function:

public static Expression<Func<TSource, bool>> GetPropertyFilterLambda<TSource, TValue>(this TSource _this,
    string propertyName,
    TValue filterValue,
    Func<TValue, TValue, bool> compareFunc)
{
    // Given that propertyName = "Foo" and filterValue = blah
    // we want the lambda expression * _ => compareFunc(_.Foo, blah) *

    // This gives us the * _ => * part
    var parameterExpression = Expression.Parameter(typeof (TSource), "_");

    // This gives us the * .Foo * part - see the interfaces and inheritance blog post
    var propertyInfo = typeof (TSource).GetProperty(propertyName, true);

    // ensure a value of type TValue can be assigned to the property
    if (!propertyInfo.PropertyType.IsAssignableFrom(typeof (TValue)))
    {
        throw new MemberAccessException(String.Format("Property {0}.{1} is not of type {2}",
            typeof (TSource).Name,
            propertyInfo.Name,
            typeof (TValue).Name));
    }

    // This gives us the * _.Foo * part
    var accessorExpression = Expression.Property(parameterExpression, propertyInfo);

    // This gives us the * blah * part
    var valueExpression = Expression.Convert(Expression.Constant(filterValue), propertyInfo.PropertyType);

    // This gives us the * compareFunc(_.Foo, blah) * part.
    // We invoke the delegate itself: Expression.Call(null, compareFunc.Method, ...) only works
    // for static methods, and a lambda is not guaranteed to compile to a static method.
    var compareExpression = Expression.Invoke(Expression.Constant(compareFunc), accessorExpression, valueExpression);

    // This gives us the * _ => compareFunc(_.Foo, blah) * part
    return Expression.Lambda<Func<TSource, bool>>(compareExpression, parameterExpression);
}


which can be invoked thus:

<…>.Where(default(IIngredient)
    .GetPropertyFilterLambda(
        "Name",
        "Cheese",
        (_l, _r) => _l.Contains(_r)));


so we aren’t constrained to an equality operation any more!


We raise the stakes one more level and curry the constant into the comparison lambda thus:

public static Expression<Func<TSource, bool>> GetPropertyFilterLambda<TSource, TValue>(this TSource _this,
    string propertyName,
    Func<TValue, bool> predicate)
{
    // Given that propertyName = "Foo"
    // we want the lambda expression * _ => predicate(_.Foo) *

    // This gives us the * _ => * part
    var parameterExpression = Expression.Parameter(typeof(TSource), "_");

    // This gives us the * .Foo * part - see the interfaces and inheritance blog post
    var propertyInfo = typeof(TSource).GetProperty(propertyName, true);

    // ensure a value of type TValue can be assigned to the property
    if (!propertyInfo.PropertyType.IsAssignableFrom(typeof(TValue)))
    {
        throw new MemberAccessException(String.Format("Property {0}.{1} is not of type {2}",
            typeof(TSource).Name,
            propertyInfo.Name,
            typeof(TValue).Name));
    }

    // This gives us the * _.Foo * part
    var accessorExpression = Expression.Property(parameterExpression, propertyInfo);

    // This gives us the * predicate(_.Foo) * part.
    // We invoke the delegate itself: Expression.Call(null, predicate.Method, ...) only works
    // for static methods, and a lambda is not guaranteed to compile to a static method.
    var compareExpression = Expression.Invoke(Expression.Constant(predicate), accessorExpression);

    // This gives us the * _ => predicate(_.Foo) * part
    return Expression.Lambda<Func<TSource, bool>>(compareExpression, parameterExpression);
}


which can be invoked like this:

<…>.Where(default(IIngredient)
    .GetPropertyFilterLambda(
        "Name",
        (string _l) => _l.Contains("Cheese")));


Neat-o!

Sunday, April 10, 2011

Lambda Wrangling For Fun : Part 1 – Sorting Single Columns

 

No doubt that a lot of us remember the good old days when we wrote web applications with grids that had sortable columns. For those crazy old timers like myself who were writing this sort of thing before the naff ASP.NET grids came out, we remember the process to go something along these lines:

  • Obtain the name of the column to sort the data set on.
  • Obtain the direction of the sort
  • Construct the ORDER BY clause in the SELECT query to be executed to return the data sorted in the correct way.
  • Execute said query and render the data set appropriately

Setting aside all the discussion about the crudeness of the solution, one thing can be said in favour of this approach – it was simple.

With a more formal service-oriented approach, one constructs a method on the service for each column to be sorted on and implements the appropriate LINQ query statically.

(Yes, I’m aware of Dynamic LINQ, but pretend it doesn’t exist for now.)

The object of this post is to discuss an approach with the flexibility of run-time column selection and the type-safety and inherent power of LINQ.

Dissecting a Lambda

System.Linq.Enumerable has a few extension methods which provide ordering semantics. Let’s consider the most commonly used one – ‘OrderBy’. In its simplest form, it looks like this:

public static System.Linq.IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source,
    System.Func<TSource, TKey> keySelector)

Member of System.Linq.Enumerable

Summary:
Sorts the elements of a sequence in ascending order according to a key.

Type Parameters:
TSource: The type of the elements of source.
TKey: The type of the key returned by keySelector.

Parameters:
source: A sequence of values to order.
keySelector: A function to extract a key from an element.

Returns:
An System.Linq.IOrderedEnumerable<TElement> whose elements are sorted according to a key.

So this OrderBy function takes a collection of items to be sorted, and a lambda which selects the property of each item to sort by.


The lambda itself specifies the type of the item (which is known because we have a strongly typed collection) and the type of the sorting property. This can be inferred if the lambda is present at compile time.


Creating a Lambda on the fly


Given a property name string, we want to somehow concoct a lambda which can be given to the OrderBy function.


Here is a function which does that – analysis to follow:

public static Func<TSource, object> GetPropertySelectorLambda<TSource>(this TSource _this, string propertyName)
{
    // Given that propertyName = "Foo", we want the lambda expression * _ => _.Foo *

    // This gives us the * _ => * part
    var parameterExpression = Expression.Parameter(typeof(TSource), "_");

    // This gives us the * .Foo * part - see the interfaces and inheritance blog post
    var propertyInfo = typeof(TSource).GetProperty(propertyName, true);

    // This gives us the * _.Foo * part
    var unboxedAccessorExpression = Expression.Property(parameterExpression, propertyInfo);

    // The lambda returns object, which is a reference type, so a value-typed
    // property (Guid, int, ...) cannot be used as the body directly.
    // Expression.Convert to object boxes the value for us.
    var accessorExpression = (unboxedAccessorExpression.Type.IsValueType
        ? (Expression)Expression.Convert(unboxedAccessorExpression, typeof(object))
        : unboxedAccessorExpression);

    // Build the lambda to the strong Func type and compile it down to a Func
    return Expression.Lambda<Func<TSource, object>>(accessorExpression, parameterExpression).Compile();
}

The Expression wrangling above is typical of functions which generate lambdas on the fly – it’s not as elegant as a truly functional language, but it works even though it’s a bit unwieldy.


The NullReferenceException that wasn’t!


The eagle-eyed reader will notice that the only use of ‘_this’ is to make the function an extension method – so that it becomes accessible in a reasonably fluent manner. 


What we’re really interested in is the type of ‘_this’. The value of ‘_this’ is never used. This means that ‘_this’ can be null – or even more mysteriously, it can refer to something that can never exist!


So typically, as in an MVC controller, this method may be used thus:

// GET: /T/
public virtual ActionResult Index(string sort_column = "Timestamp")
{
    var orders = Service.Get(null, default(TOrder).GetPropertySelectorLambda(sort_column));
    return View(orders);
}

TOrder could be an interface, so we use default(TOrder) instead of new TOrder() as the anchor for the extension method. Since TOrder is a reference type, default(TOrder) can never be anything but null, but the extension method still works very nicely!
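
The reason this works can be seen in a tiny standalone sketch: an extension method call is just a static method call with the receiver passed as the first argument, so nothing is dereferenced. DescribeType and TypeAnchorExtensions are hypothetical names for this illustration only.

```csharp
using System;

public static class TypeAnchorExtensions
{
    // Hypothetical extension: only the static type T of the receiver matters,
    // so the receiver value is never dereferenced.
    public static string DescribeType<T>(this T _this)
    {
        return typeof(T).Name;
    }
}

public static class Program
{
    public static void Main()
    {
        // default(IDisposable) is null, yet the call is compiled as the static call
        // TypeAnchorExtensions.DescribeType<IDisposable>(null), so no exception is thrown.
        Console.WriteLine(default(IDisposable).DescribeType()); // prints "IDisposable"
    }
}
```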


Cool!

Saturday, April 9, 2011

Interfaces and inheritance

 

Quite often, we would like to reflectively use properties defined on types.

Basic Reflection

The Type class has a GetProperties() method which returns all the properties on it.

var properties = typeof(Foo).GetProperties();


Consider the following inheritance hierarchy:

internal class Base
{
    public string BaseProperty { get; set; }
}

internal class Derived : Base
{
    public string DerivedProperty { get; set; }
}

var properties = typeof(Derived).GetProperties();

We expect the properties collection to have two items, corresponding to BaseProperty and DerivedProperty respectively.


Figure 1


The Trouble with Interfaces


Now consider the following inheritance hierarchy:

internal interface IBase
{
    string BaseProperty { get; set; }
}

internal interface IDerived : IBase
{
    string DerivedProperty { get; set; }
}

var properties = typeof(IDerived).GetProperties();

Do we expect anything different? Running this gives us:


Figure 2


Now, there is only one property: the DerivedProperty on the IDerived interface – why?


Legalese…


It turns out that this behaviour is by design, and it hinges on the way in which inheritance is interpreted by the language designers.


This is made clear in section 8.10 of the ECMA CLI specifications where it says (emphasis mine):

8.10 Member inheritance

Only object types can inherit implementations, hence only object types can inherit members (see §8.9.8). While interface types can be derived from other interface types, they only “inherit” the requirement to implement method contracts, never fields or method implementations.

Reading through the legalese, what this means is that derived interfaces do not contain their base interface properties like derived classes do – they only specify the requirement that any implementer of the interface must implement both derived and base properties.
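
The behaviour is easy to reproduce in a short sketch. The interfaces are the ones from above, made public so the example compiles on its own; the Names helper is an assumption added only for printing.

```csharp
using System;
using System.Linq;

public interface IBase
{
    string BaseProperty { get; set; }
}

public interface IDerived : IBase
{
    string DerivedProperty { get; set; }
}

public static class Program
{
    // Hypothetical helper: the names of the properties reflection reports for a type.
    public static string[] Names(Type type)
    {
        return type.GetProperties().Select(p => p.Name).ToArray();
    }

    public static void Main()
    {
        // Only the property declared on IDerived itself is reported.
        Console.WriteLine(string.Join(", ", Names(typeof(IDerived)))); // prints "DerivedProperty"

        // The base interface is still reachable through GetInterfaces().
        var inherited = typeof(IDerived).GetInterfaces().SelectMany(i => Names(i));
        Console.WriteLine(string.Join(", ", inherited)); // prints "BaseProperty"
    }
}
```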


It’s a very fine line, but I’m not immediately aware of why such a line exists!


The way out


We can quite easily write a little extension function to iterate over all the properties of a class, recursively including any base classes and interfaces.


There are a few key points to notice which will be pointed out, but first, the code:

public static IEnumerable<PropertyInfo> GetAllProperties(this Type _this,
    IList<Type> processedInterfaces = null,
    BindingFlags flags =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.DeclaredOnly)
{
    processedInterfaces = processedInterfaces ?? new List<Type>();

    if (!processedInterfaces.Contains(_this))
    {
        foreach (var _pi in _this.GetProperties(flags))
        {
            yield return _pi;
        }

        if (_this.BaseType != null)
        {
            foreach (var _pi in _this.BaseType.GetAllProperties(processedInterfaces, flags))
            {
                yield return _pi;
            }
        }

        if (_this.IsInterface)
        {
            foreach (var _pi in _this.GetInterfaces().SelectMany(_ => _.GetAllProperties(processedInterfaces, flags)))
            {
                yield return _pi;
            }
        }

        processedInterfaces.Add(_this);
    }

    yield break;
}

The first thing to note is that we treat a single class atomically. That is to say, we consider a class to potentially have a base class and some interfaces. We recursively treat the base class the same way.


However, to prevent double-counting of the base class properties, we limit the properties considered for each class to be just those specified in the class itself.


Furthermore, multiple classes and interfaces can implement the same interface, so we keep a list of interface types that we have processed, and only process them once. This allows for complex interface hierarchies to be expressed in their simplest form. We use a default argument for this to keep the function signature as minimal as possible without sacrificing functionality.


The last bit of elegance is to use yield return so that the enumeration is only processed at iteration time. As a bonus, this keeps the code very readable!


Enjoy!