Sunday, November 3, 2013

Classes and Memoization - A Neat C# Trick

 

This post talks about a set of utilities provided in the BrightSword SwissKnife library (nuget, codeplex).

When programming in the real world (as opposed to programming for an academic exercise), we can (and should) combine concepts so that they reinforce one another and improve our solution.

We have visited Memoization as a technique, and have seen the use of functional programming to provide more succinct, maintainable code in discussing the Reflection Utility.

In this post, we will combine those two concepts (and derive a programming rule-of-thumb) to provide a more efficient utility for Reflection.

You will find this utility here, and can access it as BrightSword.SwissKnife.TypeMemberDiscoverer<T>

The Simple Approach

The Reflection Utility extension method surface is simple to use and highly maintainable, because there is minimal code-duplication.

To re-cap, our extension method surface for properties looks like:

public static IEnumerable<PropertyInfo> GetAllProperties(this Type _this, BindingFlags bindingFlags = DefaultBindingFlags)
{
return _this.GetAllMembers((_type, _bindingFlags) => _type.GetProperties(_bindingFlags), bindingFlags);
}
which allows for usage like this:

var readonlyProperties = typeof (T).GetAllProperties().Where(_ => !_.CanWrite || _.GetSetMethod() == null);
One of the immediately evident drawbacks is that each invocation of the extension method reflects over the type’s members again, so we pay the full cost of reflection on every call.

One approach might be to consider Memoization to store and reuse the result of the reflection utilities, and pay the price for reflection just once. This post discusses the approach SwissKnife takes in order to do this.

The real problem with simple memoization is the BindingFlags parameter, which may select different sets of members each time. In order to optimally reduce the required computation, we could restrict the BindingFlags value to some reasonable default (such as ‘all public members’) and memoize the result-set of members by Type.

This line of thought leads to something like this:

public static class TypeMemberDiscoverer
{
private const BindingFlags DefaultBindingFlags = BindingFlags.Default | BindingFlags.Instance | BindingFlags.Public;

private static readonly ConcurrentDictionary<Type, IEnumerable<PropertyInfo>> _propertySetCache =
new ConcurrentDictionary<Type, IEnumerable<PropertyInfo>>();

public static IEnumerable<PropertyInfo> GetAllProperties(this Type type)
{
return _propertySetCache.GetOrAdd(type,
_ => _.GetAllMembers((_type,
_bindingFlags) => _type.GetProperties(_bindingFlags),
DefaultBindingFlags));
}
}
This approach still provides an extension method, and now the default property set is cached and the GetAllMembers call is only performed once.

However, we can use a feature of the C# language to get a slightly more elegant solution, and simultaneously derive a programming rule-of-thumb.

 

Static Fields in Generic Types



Because C# generic classes are true classes, a static field defined in an open generic class will actually be defined as a separate static field in each corresponding closed generic class.

Consider the open generic type:

public class Foo<T>
{
private static int _bar;
}


Consider now, a Foo<int> and a Foo<string>.

Both of these closed generic types will have a static field of type int called _bar.

As expected, all instances of Foo<int> will share a static _bar field, and all instances of Foo<string> will share a static _bar field.

What you might not expect, however, is that the _bar field of Foo<int> will not be the same field as the _bar field of Foo<string>. In fact, ReSharper and Microsoft Code Analysis both have warnings alerting you that this behaviour might be unexpected – you might expect that the field is shared across all types that close the open generic type, but it won’t be!
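
A minimal sketch of this behaviour follows. Holder<T> and StaticFieldDemo are hypothetical names used only for illustration; they are not part of SwissKnife.

```csharp
using System;

// Hypothetical demo type: a static field declared on an open generic class.
public static class Holder<T>
{
    // Each closed type (Holder<int>, Holder<string>, ...) gets its own copy.
    public static int Value;
}

public static class StaticFieldDemo
{
    public static bool FieldsAreIndependent()
    {
        Holder<int>.Value = 1;
        Holder<string>.Value = 2; // does not overwrite Holder<int>.Value

        return Holder<int>.Value == 1 && Holder<string>.Value == 2;
    }
}
```

Calling StaticFieldDemo.FieldsAreIndependent() returns true, because the second assignment writes to a different field from the first.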

However, in our case, this is exactly the behaviour we want! Because static fields in generic classes are actually only shared by instances of the corresponding closed generic type, we can use a generic class itself as a kind of memoization container, where we can cache something by Type!

So the general rule-of-thumb is: If you find yourself trying to cache against a type for whatever reason, consider using a generic class as a cache container. The runtime guarantees that the static initializers of a class run once, in a thread-safe manner, so you can write more readable code and let the language provide the cache!

So we can write a generic type with one type parameter, as follows:

public static class TypeMemberDiscoverer<T>
{
private const BindingFlags DefaultBindingFlags = BindingFlags.Default | BindingFlags.Instance | BindingFlags.Public;

// ReSharper disable StaticFieldInGenericType
private static readonly IEnumerable<PropertyInfo> _properties =
typeof(T).GetAllMembers((_type,
_bindingFlags) => _type.GetProperties(_bindingFlags), DefaultBindingFlags);
// ReSharper restore StaticFieldInGenericType

public static IEnumerable<PropertyInfo> GetAllProperties()
{
return _properties;
}
}
The usage is:
var readonlyProperties = TypeMemberDiscoverer<int>.GetAllProperties()
.Where(_ => !_.CanWrite || _.GetSetMethod() == null);


In the class above, we pay the price to discover the properties exactly once – when the TypeMemberDiscoverer<int> type is initialized. We can use the Lazy<T> pattern to further defer the evaluation to when the GetAllProperties() call is first invoked.

public static class TypeMemberDiscoverer<T>
{
private const BindingFlags DefaultBindingFlags = BindingFlags.Default | BindingFlags.Instance | BindingFlags.Public;

// ReSharper disable StaticFieldInGenericType
private static readonly Lazy<IEnumerable<PropertyInfo>> _propertiesL =
new Lazy<IEnumerable<PropertyInfo>>(
() => typeof(T).GetAllMembers((_type,
_bindingFlags) => _type.GetProperties(_bindingFlags),
DefaultBindingFlags));
// ReSharper restore StaticFieldInGenericType

public static IEnumerable<PropertyInfo> GetAllProperties()
{
return _propertiesL.Value;
}
}


This is both efficient and easier-to-read, and it doesn't need a ConcurrentDictionary to work!
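
To see the "compute once per type, and only when first used" behaviour in isolation, here is a small sketch. TypeNameCache<T> and its Computations counter are hypothetical; the counter exists only to make the number of evaluations observable and would not appear in a real utility.

```csharp
using System;
using System.Threading;

// Hypothetical cache keyed by the type parameter T.
public static class TypeNameCache<T>
{
    // Counts how often the "expensive" computation ran for this closed type.
    private static int _computations;

    // The factory does not run until Value is first read.
    private static readonly Lazy<string> _name = new Lazy<string>(() =>
    {
        Interlocked.Increment(ref _computations);
        return typeof(T).FullName;
    });

    public static int Computations
    {
        get { return _computations; }
    }

    public static string Name
    {
        get { return _name.Value; }
    }
}
```

Reading TypeNameCache<int>.Name any number of times runs the factory once, and TypeNameCache<string> keeps a separate counter and a separate cached value.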

 

Summary





  • We can use the functional programming principle of memoization to cache the result of an expensive operation (recursive reflection). This eliminates repeated computation at the expense of memory.

  • We can do this elegantly by using a feature of the C# language which allows static fields in a generic type to act as a cache keyed by the type parameters.

  • We can further reduce wasted effort by using the Lazy pattern provided by the runtime, and only compute (and cache) results when first used.

Functions as Function Results

 

This post talks about a set of utilities provided in the BrightSword SwissKnife library (nuget, codeplex).

Learning to write performance-sensitive code is a journey. There are many programming tools and techniques which contribute to different aspects of improving code performance.

The most elegant of these are the fundamentally timeless classics – mathematical methods and proofs that are agnostic to technologies. However, understanding these techniques generally requires a rigorous understanding and application of logic and mathematical technique.

Memoizing is a commonly used mathematical technique to improve the performance of calculations – historically, great care was taken to compute, verify and publish tables of values for commonly used formulae, such as logarithms and arithmetic and trigonometric functions.

You will find a general-purpose Memoize function here, but its implementation is more advanced than is discussed by this post. We will cover the advanced issues in a future post which addresses recursion.

 

Introduction

We’ve seen that we can abstract away boilerplate code and parameterize logic using functions as function arguments.

We also had a stab at memoizing a function parameterized only by a single generic type parameter. We’re going to spend a little more time talking about memoizing functions now.

Consider the following snippet using the Math.Sqrt function, which computes the square root of a given non-negative real number.


var y = Math.Sqrt (Math.Sqrt (a)) + 2 * Math.Sqrt (b) * Math.Sqrt (a);


A function, such as Math.Sqrt, which always returns the same result for the same arguments without changing any other state is called a “pure” function. Such functions have a property known as referential transparency, which means that the result value of the function can replace its invocation in an expression. Because of this property, we do not need to evaluate Math.Sqrt(a) twice in the expression above. We could simply store the result against the argument somewhere and return it whenever the function is invoked with the same argument. Such a technique is called memoization.
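
Before reaching for a cache, the simplest use of referential transparency is to substitute the value by hand. The sketch below (SqrtExample is a hypothetical name) evaluates the same expression twice: once as written above, and once with Math.Sqrt(a) computed a single time and reused.

```csharp
using System;

public static class SqrtExample
{
    // The expression as written: Math.Sqrt(a) is evaluated twice.
    public static double Direct(double a, double b)
    {
        return Math.Sqrt(Math.Sqrt(a)) + 2 * Math.Sqrt(b) * Math.Sqrt(a);
    }

    // Because Math.Sqrt is pure, its result can stand in for the call.
    public static double Reused(double a, double b)
    {
        var sqrtA = Math.Sqrt(a);
        return Math.Sqrt(sqrtA) + 2 * Math.Sqrt(b) * sqrtA;
    }
}
```

Both methods return the same value for the same arguments; memoization automates this substitution across call sites.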

 

Memoizing



We could take several approaches to memoize a function: the most naïve approach might be to use code such as:

public static ConcurrentDictionary<double, double> GlobalCacheForSqrt = new ConcurrentDictionary<double, double>();
public static double sqrt_memoized(double x)
{
return GlobalCacheForSqrt.GetOrAdd(x,
Math.Sqrt);
}

...

var y = sqrt_memoized(sqrt_memoized(a)) + 2 * sqrt_memoized(b) * sqrt_memoized(a);


This would work, but what price have we paid to make this happen?



  • We have created a global value cache to store the results of the Math.Sqrt function.

  • We have created a wrapper function to return the cached value and only invoke the Math.Sqrt function if we haven’t done it once before.


If we wanted to memoize another function, we would have to do the same two steps all over again, which is somewhat tedious. But somewhat more seriously, we are memoizing the results in mutable global state – making the system impure.

What we really want is to keep the cache somewhere local and safe, and also remove the tedium of writing the wrapping function.

Something that looks like this:


public class MemoizedSqrt
{
private readonly ConcurrentDictionary<double, double> _cache = new ConcurrentDictionary<double, double>();

public double Sqrt(double x)
{
return _cache.GetOrAdd(x,
Math.Sqrt);
}
}

...

var memoized = new MemoizedSqrt();
var y = memoized.Sqrt(memoized.Sqrt(a)) + 2 * memoized.Sqrt(b) * memoized.Sqrt(a);


 

This is much nicer – it has now hidden away the cache inside of a class, but we still have to write a different class for each function we want to memoize.

So let’s look at a different way of doing this – imagine we had a magic box which took in a function like Math.Sqrt, and returned the sqrt_memoized for us, automatically encapsulating a class with the cache in it.

Imagine no further – we can build such a box. It’s simply a higher-order function which takes a function as an argument and returns another function.


public static Func<TDomain, TRange> Memoize<TDomain, TRange>(Func<TDomain, TRange> f)
{
private class M
{
private readonly ConcurrentDictionary<TDomain, TRange> _cache = new ConcurrentDictionary<TDomain, TRange>();
private readonly Func<TDomain, TRange> _f;

public M(Func<TDomain, TRange> f)
{
_f = f;
}

public TRange MemoizedF(TDomain x)
{
return _cache.GetOrAdd(x,
_f);
}
}

return new M(f).MemoizedF;
}

...

var sqrt_memoized = Memoize<double, double>(Math.Sqrt);
var y = sqrt_memoized(sqrt_memoized(a)) + 2 * sqrt_memoized(b) * sqrt_memoized(a);


And there you have it – one single function to memoize any single-argument function, all neatly packaged into a static method.

All done!

What? Howls of protest? This couldn’t work?

Of course it won’t work! There’s no way to embed the class M inside the Memoize function. But we have a nice, elegant solution and we don’t want to mess it up by making the class M public…what do we do?

Getting some Closure



The trick to getting that inner embedded class inside the function is to let the compiler put it there instead of writing the class ourselves.

Remember that we are writing a higher-order function, so we could just build and return the memoizing function in place.

Consider the following code:

public static Func<TDomain, TRange> Memoize<TDomain, TRange>(Func<TDomain, TRange> f)
{
var _cache = new ConcurrentDictionary<TDomain, TRange>();
return _ => _cache.GetOrAdd(_, f);
}


Let’s break this down – ignoring the type arguments:



  • Memoize is a function which takes in a function f and returns an anonymous function.

  • The function has a local variable _cache to cache appropriately typed results against arguments.

  • The result is a function which references the cache. The salient point to notice is that the _cache variable is outside the scope of the result function. We say that the result function closes over the _cache variable.

  • The function returned requires a proper reference to the _cache variable even after the Memoize function exits.

  • In order to keep the reference alive, the compiler creates an inner, anonymous class storing the closed variable and exposing a method representing the resultant function.

  • The compiler then instantiates the class and returns a reference to the method, which now references the closed variable legally.


In fact, the anonymous class looks and behaves almost exactly like the class we tried to sneak in.
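
As an illustration of the steps above, here is a hand-written approximation of what the compiler produces. The names ClosureSketch and DisplayClass are made up for this sketch; the real generated class has a compiler-chosen name and differs in detail.

```csharp
using System;
using System.Collections.Concurrent;

public static class ClosureSketch
{
    // Stand-in for the compiler-generated class: the captured variables
    // become fields, and the lambda body becomes a method.
    private sealed class DisplayClass<TDomain, TRange>
    {
        public ConcurrentDictionary<TDomain, TRange> Cache;
        public Func<TDomain, TRange> F;

        public TRange Invoke(TDomain x)
        {
            return Cache.GetOrAdd(x, F);
        }
    }

    public static Func<TDomain, TRange> Memoize<TDomain, TRange>(Func<TDomain, TRange> f)
    {
        var closure = new DisplayClass<TDomain, TRange>();
        closure.Cache = new ConcurrentDictionary<TDomain, TRange>();
        closure.F = f;

        // The returned delegate keeps the closure object (and its cache) alive.
        return closure.Invoke;
    }
}
```

The nested class can be declared at class level and kept private, which is exactly what we could not do inside the method body ourselves.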

This construct is called a closure, and is an excellent way to encapsulate state without creating classes and objects.

In fact, the last code snippet is a general purpose memoizing function for functions of one argument, and you will agree is much more elegant than any of the other alternatives!
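
To check that the memoized function really evaluates once per argument, we can wrap Math.Sqrt in a counting function. This is a sketch: MemoizeDemo and CountEvaluations are hypothetical names, and Memoize is repeated here only so the example is self-contained.

```csharp
using System;
using System.Collections.Concurrent;

public static class MemoizeDemo
{
    public static Func<TDomain, TRange> Memoize<TDomain, TRange>(Func<TDomain, TRange> f)
    {
        var cache = new ConcurrentDictionary<TDomain, TRange>();
        return x => cache.GetOrAdd(x, f);
    }

    // Returns how many times the underlying function was actually evaluated.
    public static int CountEvaluations()
    {
        var calls = 0;
        Func<double, double> countingSqrt = x =>
        {
            calls++;
            return Math.Sqrt(x);
        };

        var sqrtMemoized = Memoize(countingSqrt);

        sqrtMemoized(16.0); // evaluated
        sqrtMemoized(16.0); // served from the cache
        sqrtMemoized(9.0);  // evaluated

        return calls;
    }
}
```

On a single thread this returns 2. Under concurrent access, ConcurrentDictionary.GetOrAdd may invoke the value factory more than once for the same key, which is harmless for pure functions but worth knowing if the function is expensive.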

 

Summary





  • SwissKnife provides a Memoize function to elegantly provide memoized versions of pure single-argument functions.

  • We can use the functional programming principle of higher-order functions to memoize pure functions to leverage Referential Transparency. Memoizing functions ensure that functions are evaluated only once per argument value and the cached value of the result can be used subsequently.

  • We use closures to encapsulate the state associated with the cache.

Saturday, November 2, 2013

Functions as Function Arguments

 

This post talks about a set of utilities provided in the BrightSword SwissKnife library (nuget, codeplex).

Reflection is a commonly used meta-programming technique where run-time inspection of types allows for more elegant solutions in some cases. The .NET framework provides powerful primitives to use reflection, which have a nice orthogonal and consistent interface for the various type elements.

A necessary, but less desirable, consequence of an orthogonal interface is the ‘boilerplate’ nature of the code which uses different elements. There is typically a great deal of symmetry and similarity in the code, which generally leads to a cut-copy-paste kind of code-proliferation – and results in unmaintainable code.

In this post, we’ll talk about the software techniques we can use to mitigate this form of code-proliferation, and actively develop a lean and maintainable utility without sacrificing flexibility.

You will find these utilities here, where there are a set of extension methods on the Type type. 

 

Reflection Utilities

One of the fundamental principles of functional programming is the concept of a First-Class Function.

A language with first-class functions is one that allows functions to be treated exactly the same way as other first-class entities such as variables and constants. This generally means being able to pass a function as an argument to another function (just like one can pass in a string as an argument), assign a function to a variable, or return a function as the result of a function.

This is a very powerful concept, as it allows us to parameterize logic just like we traditionally parameterize values.
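
A short sketch of all three capabilities in C#, using Func<int, int> as the function type. FirstClassFunctions, ApplyTwice and Adder are hypothetical names chosen for the example.

```csharp
using System;

public static class FirstClassFunctions
{
    // A function accepted as an argument.
    public static int ApplyTwice(Func<int, int> f, int x)
    {
        return f(f(x));
    }

    // A function returned as a result.
    public static Func<int, int> Adder(int n)
    {
        return x => x + n;
    }

    public static int Demo()
    {
        // A function assigned to a variable.
        Func<int, int> addThree = Adder(3);

        return ApplyTwice(addThree, 10);
    }
}
```

Demo() adds 3 to 10 twice and returns 16; ApplyTwice itself knows nothing about addition, because the logic arrives as a parameter.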

GetAllProperties|GetAllMethods|GetAllEvents

Consider the problem of reflecting over a type and enumerating over its properties:

A naïve implementation of a function to enumerate a type’s properties might look like this:

public static IEnumerable<PropertyInfo> GetProperties(Type t) 
{
foreach (var item in t.GetProperties())
{
yield return item;
}
}



An analogous method to enumerate methods would then be:

public static IEnumerable<MethodInfo> GetMethods(Type t) 
{
foreach (var item in t.GetMethods())
{
yield return item;
}
}



The reason this is a naïve implementation is that interfaces do not inherit, in the traditional sense of the term, the members of their base interfaces – rather they inherit the requirement that the base interface is implemented as well. See my earlier post for the background to this problem.
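
The problem is easy to reproduce. In the sketch below, IBase and IDerived are hypothetical interfaces; reflecting over IDerived with Type.GetProperties() reports only the property declared directly on it.

```csharp
using System;
using System.Linq;

public interface IBase
{
    int A { get; }
}

public interface IDerived : IBase
{
    int B { get; }
}

public static class InterfaceReflectionDemo
{
    // Names of the properties reported for IDerived itself.
    public static string[] DeclaredPropertyNames()
    {
        return typeof(IDerived).GetProperties().Select(p => p.Name).ToArray();
    }

    // Names of the interfaces IDerived requires implementers to support.
    public static string[] BaseInterfaceNames()
    {
        return typeof(IDerived).GetInterfaces().Select(t => t.Name).ToArray();
    }
}
```

DeclaredPropertyNames() contains only "B"; the property A has to be found by also visiting the types returned from GetInterfaces().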


The complete solution involves recursively walking up the inheritance chain for interfaces, whilst keeping track of interfaces seen before, and enumerating the members of each interface encountered.


The code to do this is somewhat more involved:


public static IEnumerable<PropertyInfo> GetInterfaceProperties(this Type type,
ISet<Type> processedInterfaces = null)
{
Debug.Assert(type.IsInterface);

processedInterfaces = processedInterfaces ?? new HashSet<Type>();

if (processedInterfaces.Contains(type))
{
yield break;
}

foreach (var _pi in type.GetProperties())
{
yield return _pi;
}

foreach (var _pi in type.GetInterfaces()
.SelectMany(_ => _.GetInterfaceProperties(processedInterfaces)))
{
yield return _pi;
}

processedInterfaces.Add(type);
}
After taking a single glance at this code, one can imagine what the analogous code for Methods and Events might be – a lot of similar code with a few method calls and types changed.


This is an ideal situation to use the functional programming support provided by C#.


We can wrangle the boilerplate code out of this to get:

public static IEnumerable<TMemberInfo> GetInterfaceMembers<TMemberInfo>(this Type type,
Func<Type, IEnumerable<TMemberInfo>> accessor,
ISet<Type> processedInterfaces = null)
where TMemberInfo : MemberInfo
{
Debug.Assert(type.IsInterface);

processedInterfaces = processedInterfaces ?? new HashSet<Type>();

if (processedInterfaces.Contains(type))
{
yield break;
}

foreach (var _pi in accessor(type))
{
yield return _pi;
}

foreach (var _pi in type.GetInterfaces()
.SelectMany(_ => _.GetInterfaceMembers(accessor, processedInterfaces)))
{
yield return _pi;
}

processedInterfaces.Add(type);
}



So by introducing the <TMemberInfo> type parameter to represent a MemberInfo generically, and providing a function argument accessor which provides a collection of members of the specified type, we can abstract out the boilerplate code and parameterize the accessor logic.
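
Here is a self-contained sketch of the same idea in use. It is a simplified, non-recursive variant and not the SwissKnife implementation: it relies on Type.GetInterfaces() already returning the flattened set of inherited interfaces. IShape, INamedShape, InterfaceWalker and InterfaceWalkerDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public interface IShape
{
    double Area { get; }
    void Draw();
}

public interface INamedShape : IShape
{
    string Name { get; }
}

public static class InterfaceWalker
{
    // One walker for every kind of member: the accessor decides what to collect.
    public static IEnumerable<TMemberInfo> GetInterfaceMembers<TMemberInfo>(
        this Type type,
        Func<Type, IEnumerable<TMemberInfo>> accessor)
        where TMemberInfo : MemberInfo
    {
        // The type itself, followed by every interface it inherits.
        return new[] { type }
            .Concat(type.GetInterfaces())
            .Distinct()
            .SelectMany(accessor);
    }
}

public static class InterfaceWalkerDemo
{
    public static string[] PropertyNames()
    {
        return typeof(INamedShape)
            .GetInterfaceMembers<PropertyInfo>(t => t.GetProperties())
            .Select(p => p.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
    }

    public static string[] MethodNames()
    {
        return typeof(INamedShape)
            .GetInterfaceMembers<MethodInfo>(t => t.GetMethods())
            .Select(m => m.Name)
            .ToArray();
    }
}
```

PropertyNames() finds both Area and Name, and MethodNames() includes Draw (alongside the property accessor methods), with only the accessor lambda changing between the two calls.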


SwissKnife uses this approach to provide a set of extension methods on the Type type, which simply parameterize a common function:


public static IEnumerable<PropertyInfo> GetAllProperties(this Type _this, BindingFlags bindingFlags = DefaultBindingFlags) 
{
return _this.GetAllMembers((_type, _bindingFlags) => _type.GetProperties(_bindingFlags), bindingFlags);
}

public static IEnumerable<MethodInfo> GetAllMethods(this Type _this, BindingFlags bindingFlags = DefaultBindingFlags)
{
return _this.GetAllMembers((_type, _bindingFlags) => _type.GetMethods(_bindingFlags), bindingFlags);
}

public static IEnumerable<EventInfo> GetAllEvents(this Type _this, BindingFlags bindingFlags = DefaultBindingFlags)
{
return _this.GetAllMembers((_type, _bindingFlags) => _type.GetEvents(_bindingFlags), bindingFlags);
}
which allows for usage like this:
var readonlyProperties = typeof (T).GetAllProperties()
.Where(_ => !_.CanWrite || _.GetSetMethod() == null);



A functional approach to coding up these functions has reduced coding noise and resulted in simpler, easier-to-use, easier-to-understand and easier-to-maintain code.


Summary







  • SwissKnife provides a utility mechanism to simplify reflection over a type to enumerate different types of members. This is especially useful in the case of interface hierarchies.


  • We can use the functional programming principle of first-class functions to extract boilerplate code and parameterize logic in addition to values. This makes code easier to maintain.

Thursday, October 31, 2013

Putting the Fun in Functional Programming

 

Introduction

Like everyone who has been programming for some time, I have a set of tools and utility functions which I have developed over time, and reuse often.

My library of utilities has been available for some time both as open-sourced code and a free-to-use NuGet package.

The purpose of this series of posts is to talk about observations in my evolution as a functional programmer, and how the implementation and approaches I tend to take these days increasingly use functional concepts.

As you browse the code, you’ll come across more utilities than the ones discussed in this blog series. This series is going to try and focus on those utilities which demonstrate some facet of functional programming.

 

What is Functional Programming? Why do I care?

Functional Programming is a programming paradigm that treats programs as the evaluation of mathematical functions and expressions.

One can conceive of a kind of spectrum of programming abstraction:

  • Assembly Language which abstracts away parameterized opcodes
  • FORTRAN, COBOL and BASIC introduce control flow statements and subroutines
  • Pascal and C which allow for rich user-defined types, abstracting data and largely obviating the GOTO construct
  • Modula-2 and C++ which introduce concepts of objects maintaining state, encapsulating both data and related functions
  • Java and C# 1.0 which introduce concepts of managed memory and a virtual runtime environment, allowing for simpler creation of objects and popularizing Exceptions
  • Modern C# which introduces concepts of true type parameterization, lambdas and attributes. LINQ is a very powerful programming extension which is available here.
  • Scala introduces type classes and F# provides full type inference. These languages also use pattern-matching and recursion constructs to obviate some familiar imperative programming patterns.
  • LISP, a proto-functional language, is actually a multi-paradigm language, and as such, generally provides a great introduction to programming. LISP relies heavily on recursion, obviates some of the familiar programming constructs such as for loops and variables, represents code and data in the same form (homoiconicity) which makes it trivial to generate and immediately execute a function, and provides powerful meta-programming support.
  • Haskell, which emphasizes purity to the extent of introducing the concept of IO Monads to limit the impact of side-effects on a pure functional language.

Other authors have spoken cogently about the experience of a programmer skilled in a language somewhere in the spectrum (say a Java expert) looking up towards a more abstract language and not being able to comprehend why a construct or concept from that language would be useful or even necessary.

Therefore, it isn’t unreasonable to ask why we would need a functional programming perspective when we can do almost all we need with our daily programming language. Hopefully, as we go through this blog series, we’ll get a flavour of the kinds of things that functional programming makes easier, or possible!

 

What’s coming down the track?

 

Acknowledgements

My involvement with functional programming goes all the way back to college, where teachers like Dr. Takis Metaxis introduced me to symbolic computing with LISP.

I highly recommend reading Real-World Functional Programming: With Examples in F# and C# by Tomas Petricek and Jon Skeet.

If you get a chance to listen to the “Programming with Purity” talk by Joseph Albahari, do so – he has one of the most lucid explanations of Monads around!

I have stood on the shoulders of giants by reading their blogs: The relevant posts by Eric Lippert, Wes Dyer and Stephen Toub are especially enlightening. Mike Hadlow’s blog is also good reading.

For a truly mind-blowing experience, see the 13-part series on Functional Programming by the one and only Erik Meijer, who, of course, has the last word on monads and Category Theory!

Do stick around, and I’ll try to explain how I synthesize some of these concepts into tools that I use frequently.

Tuesday, October 29, 2013

It’s been a while…but a lot has happened!

Can’t believe it’s been a whole year since I blogged anything!

And what a year it’s been!

This year has turned out to be one of the more interesting ones in my career – lots of very cool stuff that’s been taking up all my time…now that it’s coming to a close, I’m going to write up some blog posts to talk about what has happened!

Daily work stuff

In my day job, I work for a software development company that builds accounting software which runs in the cloud. The flagship product – the one I work on – is a .NET rewrite of a very successful C++/ISAM desktop app. The rewrite began as a desktop application replacement so it was designed as such, but with the emergence of cloud technologies, we decided to make the app a hybrid and offer a rich client interface to a cloud service.

The basic premise was a good one – the architecture was well layered so we were able to stick the internet between the server and the client and get the thing to run. However, the devil in the details was performance – the desktop application, quite reasonably, expected to have a fast, fat connection to a local database, but that changed once we separated the UI from the server.

Over the last couple of years, we tried all the quick wins and got the app out, but performance was still sub-optimal and we needed to come up with some fundamental changes to break the problem.

At the start of this year, a couple of us were given the opportunity to go in with a sledgehammer and try to break the problem out. In six weeks we had a pattern for some drastic improvements and we’ve spent the better part of this year rolling that pattern out through the most performance-sensitive areas.

Big win!

I’ll blog about some of the technical approaches we have taken to get these dramatic improvements, but it’s safe to say that they are not for the faint-hearted!

We actually rounded off the last year with 3 months of really creative work building the first cut of the MYOB API. Working with @H_Desai and building stuff with ASP.NET Web API was a pretty satisfying experience!

Open Source Mania

In my copious free time (ha ha) I decided to clean up and open source a whole bunch of stuff.

Amongst the weaponry that is now public are:

BrightSword SwissKnife

A (mostly functional-programming based) set of utilities to help with everyday programming.

I’ve got a blog series on practical functional programming which centres around SwissKnife coming.

Warning: Contains Monads.

BrightSword Squid 

Rhino Mocks is a nice way to create anonymous types for interfaces but it is slow as all heck and a bit unwieldy, so I decided to write a fast library to generate DTOs with behaviours for general purpose use. MVVM patterns with Property Change Notifications work straight from the tin! And it’s 3 orders of magnitude faster than Rhino!

Contains lots of nasty nasty IL and Reflection.Emit!

BrightSword Pegasus

A reference library for CQRS and (Dynamic) Queue Centric Workflows based on Windows Azure. This thing is seriously cool and makes short work of writing extensible, scalable cloud applications.

Most of the stuff is pretty well tested – and I even set up my own automated build-pipeline to monitor source control, compile, test and upload to nuget. So all these libraries can be referenced and used through nuget.

Use the Nuget Packages. Fork the repos. Submit Pull requests.

The Cloud Architecture Talk

@MaheshKrishnan and I have been doing talks together for a few years now, and we spend a fair bit of our time idly gossiping about technology and trends. We observed that the best-practice software patterns for the cloud aren’t well articulated and well understood by the community at large, so we decided to build a talk around the most common patterns.

We built an application to demonstrate the patterns as well, and drew up some diagrams. Unlike most of our other talks, we demonstrate very little code in the talk itself, but the code is all open source.

We are both humbled and pleasantly surprised by its reception!

So far, we’ve given it at:

  • The DDD! Melbourne Conference
  • MYOB technical brown bag series
  • Melbourne Azure Meetup & VIC.NET
  • Singapore .NET Dev SIG
  • TechEd Australia 2013

And we have been invited to give it at NDC London!

Yes. That’s right! We’re speaking at NDC London!

TechEd Australia 2013

This year’s TechEd was pretty special!

Went to the Gold Coast with @simonraikallen, @couchcoder and @shaun_a_wilde.

Hanging with @shaun_a_wilde and @couch_coder

With @MaheshKrishnan at the Microsoft stand

@MaheshKrishnan and I gave a slightly modified version of the Cloud Architecture Patterns talk focusing on the implementation aspects. It was pretty well received, and we got a fair bit of good feedback from the audience. Lively discussion and lots of questions are always a good sign that people paid attention to the material and that it was relevant.

Got to meet @linqpad – would’ve loved to have an hour pairing with him on some stuff to do with monads, but perhaps another day.

Met and hung out with the usual suspects…

Hanging with @RockyH

@drmcghee and @frankarr. Didn’t know at the time that this was the last TechEd Frank Arrigo was going to be at – so I’m glad we got this shot!

I won a 3D printer at the raffle, but wasn’t around to collect it so I lost it! I still get guys poking fun at me for this!

Our company had been selected as a case study for our use of Azure, and it was a pretty special privilege to represent them at the last, conference-wide Lock Note session.

It was made all the more special because the Microsoft CVP responsible for Azure – Scott Guthrie – was presenting the Lock Note, and I got a chance to get on stage with him and talk for five minutes about how we used Azure.

So the reason I wasn’t around to claim my printer was that I was hanging out with @ScottGu, rehearsing for the Lock Note, and I got to spend about half an hour just hanging with the Gu, who is, for the record, a very friendly, down-to-earth guy.

Pretty awesome, all told!

Thursday, September 27, 2012

Casablanca: C++ on Azure

 

We (John Azariah and Mahesh Krishnan) gave a talk at TechEd Australia this year titled Casablanca: C++ on Azure. The talk itself was slotted in at 8:15 am on the last day of TechEd after a long party the night before. The crowd was small, and although we were initially disappointed by the turnout, we took heart in the fact that this was the most viewed online video at TechEd this year – lots of five star ratings, Facebook likes and tweets. This post gives you an introduction to Casablanca and highlights the things we talked about in the TechEd presentation.

So, what is Casablanca? Casablanca is an incubation effort from Microsoft with the aim of providing an option for people to run C++ on Windows Azure. Until now, if you were a C++ programmer, the easiest option for you to use C++ would be to create a library and then P/Invoke it from C# or VB.NET code. Casablanca gives you an option to do away with things like that.

If you are a C++ developer and want to move your code to Azure right away, all we can say is “Hold your horses!” It is, like we said, an incubation effort and not production-ready yet. But you can download it from the Devlabs site, play with it and provide valuable feedback to the product team.

You are also probably thinking, “Why use C++?” The answer to that question is really “Why not?” Microsoft has been providing developers the option to use various other languages and platforms such as Java and Node.js to write for Azure, and now they are giving the same option to C++ programmers – use the language of their choice to write applications in Azure. Although there has been a bit of a resurgence in C++ in the last couple of years, we are not really trying to sell C++ to you. If we are writing a Web App that talks to a database, then our first choice would probably still be ASP.NET MVC using C#, and maybe Entity Framework to talk to the DB. What we are trying to say is that you still need to use the language and framework that works best for you, and if C# is the language you are comfortable with, then why change?

On the other hand if you are using C++, then you probably already know why you want to continue using it. You may be using it for cross-platform compatibility or better performance or maybe you have lots of existing legacy code that you can’t be bothered porting across. Whatever the reason, Casablanca gives you an option to bring your C++ code to Azure without having to use another language to talk to its libraries.

 

The Node influence

When you first start to look at Casablanca code, you will notice how some of the code has some resemblance to Node.js. A simple Hello World example in node will look like this -

var http = require('http');

http.createServer(function (request, response) {

response.writeHead(200,
{'Content-Type': 'text/plain'});
response.end('Hello World!');

}).listen(8080, '127.0.0.1');

The equivalent Hello World in C++ would look something like this -

using namespace http;

http_listener::create("http://127.0.0.1:8080/",
[](http_request request)
{
return request.reply(status_codes::OK,
"text/plain", "Hello World!");
}).listen();

Notice the similarity? This isn’t by accident. The Casablanca team has been influenced a fair bit by Node and the simplicity by which you can code in node.

 

Other inclusions


The proliferation of HTML, Web servers, web pages and the various languages to write web applications based on HTML happened in the 90s. C++ may have been around a lot longer than that, but surprisingly, it didn’t ride the HTML wave. Web servers were probably written in C++, but the applications themselves were written using much simpler languages like PHP. Of course, we did have CGI, which you could write using C++, and there were scores of web applications written in C++ but somehow, it really wasn’t the language of choice for writing them. (It didn’t help that scores of C++ developers moved on to things like Java, C#, and Ruby). What C++ needed was a good library or SDK to work with HTTP requests, and process them.

In addition to this, RESTful applications are becoming commonplace, and REST is increasingly becoming the preferred way to write services. So, the ability to easily process GET, PUT, POST and DELETE requests in C++ was also needed.

When we talk about RESTful apps, we also need to talk about the format in which the data is sent to/from the server. JSON seems to be the format of choice these days due to the ease with which it works with JavaScript.

The Casablanca team took these things into consideration and added classes into Casablanca to work with the HTTP protocol, easily create RESTful apps and work with JSON.

To process the different HTTP actions and write a simple REST application to do CRUD operations, the code will look something like this:

auto listener = http_listener::create(L"http://localhost:8082/books/");

listener.support(http::methods::GET, [=](http_request request)
{
//Read records from DB and send data back
});

listener.support(http::methods::POST, [=](http_request request)
{
//Create record from data sent in Request body
});

listener.support(http::methods::PUT, [=](http_request request)
{
//Update record based on data sent in Request body
});

listener.support(http::methods::DEL, [=](http_request request)
{
//Delete
});

/* Prevent Listen() from returning until user hits 'Enter' */
listener.listen([]() { fgetc(stdin); }).wait();
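To make the routing idea concrete without the Casablanca headers, here is a minimal sketch in standard C++ of a listener that keeps one handler per HTTP method. The names Request, Response and MiniListener are hypothetical stand-ins we made up for illustration; they are not Casablanca types, and the 405 fallback is our own assumption about what an unregistered method should return.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; not Casablanca types.
struct Request  { std::string method; std::string body; };
struct Response { int status; std::string body; };

class MiniListener {
public:
    using Handler = std::function<Response(const Request&)>;

    // Register a handler for one HTTP method, in the spirit of listener.support(...).
    void support(const std::string& method, Handler handler) {
        handlers_[method] = std::move(handler);
    }

    // Route a request to the handler registered for its method.
    // Assumption: methods without a handler get a 405 response.
    Response dispatch(const Request& request) const {
        auto it = handlers_.find(request.method);
        if (it == handlers_.end()) {
            return Response{405, "Method Not Allowed"};
        }
        return it->second(request);
    }

private:
    std::map<std::string, Handler> handlers_;
};
```

Registering a lambda per method and calling dispatch with a Request is all there is to it. The Casablanca listener shown earlier follows the same one-handler-per-method shape, with the HTTP plumbing included.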


Notice how easy it is to process the individual HTTP actions? So, how does conversion from and to JSON objects work? To convert a C++ object to a JSON object and send it back as a response, the code will look something like this:

using namespace http::json;
...

value jsonObj = value::object();
jsonObj[L"Isbn"] = value::string(isbn);
jsonObj[L"Title"] = value::string(title);
...

request.reply(http::status_codes::OK, jsonObj);

To read JSON data from the request, the code will look something like this:

using namespace http::json;

...

value jsonValue = request.extract_json().get();

isbn = jsonValue[L"Isbn"].as_string();

You have a collection? No problem. The following code snippet shows how you can create a JSON array:

...
auto elements = http::json::value::element_vector();
for (auto i = mymap.begin(); i != mymap.end(); ++i)
{
T t = *i;

auto jsonOfT = ...; // Convert t to http::json::value
elements.insert(elements.end(), jsonOfT);
}
return http::json::value::array(elements);
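If you want to see what ends up on the wire, here is a minimal sketch in standard C++ that turns a collection of books into JSON array text by hand. It does not use the Casablanca json::value class at all. The Book struct and the helper names escape_json, to_json and to_json_array are hypothetical, and the escaping is deliberately simplified: it only handles double quotes and backslashes, which is an assumption that would not hold for arbitrary input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type for illustration.
struct Book { std::string isbn; std::string title; };

// Simplified escaping: only the double quote and the backslash are handled.
// Control characters are NOT escaped in this sketch.
std::string escape_json(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Serialize one book as a JSON object with "Isbn" and "Title" members.
std::string to_json(const Book& book) {
    return "{\"Isbn\":\"" + escape_json(book.isbn) +
           "\",\"Title\":\"" + escape_json(book.title) + "\"}";
}

// Serialize the whole collection as a JSON array, comma-separating the objects.
std::string to_json_array(const std::vector<Book>& books) {
    std::string out = "[";
    for (std::size_t i = 0; i < books.size(); ++i) {
        if (i > 0) out += ',';
        out += to_json(books[i]);
    }
    out += ']';
    return out;
}
```

In real code you would let the library build and serialize the value for you, as in the snippet above; the sketch is only meant to show the shape of the output.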


 


Azure Storage


If you are running your application in Windows Azure, then chances are you may also want to use Azure storage. Casablanca provides you with the libraries to do this. The usage, again, is quite simple. To create the various clients for blobs, queues and tables, the code is as follows:

storage_credentials creds(storageName, storageKey);

cloud_table_client table_client(tableUrl, creds);
cloud_blob_client blob_client(blobUrl, creds);
cloud_queue_client queue_client(queueUrl, creds);


Notice the consistent way of creating the various client objects. Once you have initialized them, then their usage is quite simple too. The following code snippet shows you how to read data from Table storage:

cloud_table table(table_client, tableName);
query_params params;
...
auto results = table.query_entities(params)
.get().results();

for (auto i = results.begin();
i != results.end(); ++i)
{
cloud_table_entity entity = *i;
entity.match_property(L"ISBN", isbn);
...
}


Writing to Table storage is not difficult either, as seen in this code snippet:

cloud_table table(table_client, table_name);
cloud_table_entity entity(partitionKey, rowKey);

entity.set(L"ISBN", isbn, cloud_table_entity::String);
...

table.insert_or_replace_entity(entity);

Writing to blobs and queues follows a similar pattern of usage.

 

Async…


Another one of the main inclusions in Casablanca is the ability to do things in an asynchronous fashion. If you’ve looked at the way things are done in Windows Store applications or used the Parallel Patterns Library (PPL), then you would already be familiar with the “promise” syntax. In the previous code snippets, we resisted the urge to use it, as we hadn’t introduced it yet.

 

… and Client-Side Libraries


Also, we have been talking mainly about the server side use of Casablanca, but another thing to highlight is the fact that it can also be used to do client side programming. The following code shows the client side use of Casablanca and how promises can be used:

http::client::http_client client(L"http://someurl/");
client.request(methods::GET, L"/foo.html")
.then(
[=](pplx::task<http_response> task)
{
http_response response = task.get();
//Do something with response
...
});

If you need to find out more about PPL and promises, then you should read the article Asynchronous Programming in C++ written by Artur Laksberg in the MSDN magazine.
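If you don't have Casablanca installed, you can still get a feel for task-based continuations with the standard library. The sketch below is not PPL and not the Casablanca API: std::future has no .then() member, so we emulate a single continuation with a hypothetical free function named then, built on std::async. As in the client snippet above, the continuation receives the finished task and calls get() on it.

```cpp
#include <future>
#include <string>
#include <utility>

// Hypothetical helper: runs `continuation` on another thread, handing it the
// previous future so it can call get() on it (which also rethrows any
// exception stored in that future).
template <typename T, typename F>
auto then(std::future<T> previous, F continuation)
    -> std::future<decltype(continuation(std::move(previous)))>
{
    return std::async(std::launch::async,
        [](std::future<T> prev, F next) { return next(std::move(prev)); },
        std::move(previous), std::move(continuation));
}

// Stand-in for an asynchronous request; a real client would do network I/O here.
std::future<std::string> fake_request() {
    return std::async(std::launch::async, [] { return std::string("Hello World!"); });
}
```

Chaining then looks like `auto length = then(fake_request(), [](std::future<std::string> task) { return task.get().size(); });`, after which `length.get()` waits for the whole chain to finish. One design difference worth keeping in mind: this sketch ties up a thread waiting inside get(), whereas a proper task library schedules the continuation only once the previous task has completed.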

 

Wait, there is more…but first let’s get started


Casablanca has also been influenced by Erlang, and the concept of Actors, but let’s talk about that in another post. To get started with Casablanca, download it from the DevLabs site. It is available for both VS 2010 and 2012.

 



Tuesday, September 25, 2012

I’m back!

 

It’s been a crazy year since TechEd Australia 2011, and I’ve been strangely quiet…

But I’m back now…