Pipes and Filters

Pipes and filters (or just pipeline) is another common pattern. Oren Eini and Jeremy Likness both have very interesting posts about it on their respective blogs.

While Jeremy’s post aims at the slick creation of pipelines, Oren talks more about the pattern itself and how to implement it in a specific manner (using input and output values of type IEnumerable<T>).

An interesting argument came up in the comments on Oren’s post: “Why not use LINQ instead of your custom pipeline (framework)?”

I won’t repeat all the pros and cons here. Read the posts yourself; they are worth your time!

My opinion on the topic: LINQ is a great tool. But I believe it’s neither the only solution for chains of queries and transformations, nor is it the best in all cases.

I will pick up one point from the “pro LINQ” side: “With LINQ you can have different input and output types.”

Sure you can. But who says you can’t do that with pipes and filters just as easily?

We define an abstract base class Filter<TIn, TOut>

public abstract class Filter<TIn, TOut>
{
  public abstract IEnumerable<TOut> Process(IEnumerable<TIn> input);
  public Filter<TIn, TNext> Pipe<TNext>(Filter<TOut, TNext> next)
  {
    return new Pipe<TIn, TOut, TNext>(this, next);
  }
}

and a derived class Pipe<TIn, T, TOut>

public class Pipe<TIn, T, TOut> : Filter<TIn, TOut>
{
  private readonly Filter<TIn, T> source;
  private readonly Filter<T, TOut> destination;
  public Pipe(Filter<TIn, T> source, Filter<T, TOut> destination)
  {
    this.source = source;
    this.destination = destination;
  }
  public override sealed IEnumerable<TOut> Process(IEnumerable<TIn> input)
  {
    var x = this.source.Process(input);
    var result = this.destination.Process(x);
    return result;
  }
}

With these two as a base we can easily chain filters with different input and output types.

We can also fine-tune the ends of the pipeline a bit so that the code reads more nicely. With a small extension method and an equally small dummy class we can use any enumerable as the starting point for defining a pipeline.

public static class FilterExtensions
{
  public static Filter<TIn, TOut> Pipe<TIn, TOut>(this IEnumerable<TIn> enumerable, Filter<TIn, TOut> filter)
  {
    return new PipelineStartDummy<TIn, TOut>(enumerable, filter);
  }

  private class PipelineStartDummy<TIn, TOut> : Filter<TIn, TOut>
  {
    private readonly IEnumerable<TIn> enumerable;
    private readonly Filter<TIn, TOut> filter;
    public PipelineStartDummy(IEnumerable<TIn> enumerable, Filter<TIn, TOut> filter)
    {
      this.enumerable = enumerable;
      this.filter = filter;
    }
    // Ignores the incoming input and feeds the captured enumerable into the filter.
    public override IEnumerable<TOut> Process(IEnumerable<TIn> input)
    {
      return this.filter.Process(this.enumerable);
    }
  }
}

If we don’t care what comes out of the pipeline and just want to start processing values from the source we can use another extension method that encapsulates Oren’s enumerator magic.

public static void Start<TIn, TOut>(this Filter<TIn, TOut> filter)
{
  var enumerable = filter.Process(null);
  var enumerator = enumerable.GetEnumerator();
  while (enumerator.MoveNext()) { }
}

And now we put it all together and get the following:

new Numbers(3).Pipe(new Square()).Pipe(new Printer()).Start();

Numbers just returns integer values between 1 and the constructor parameter. Square squares all input values and the Printer writes the input to the console. The call to Start() starts the processing.
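The post doesn’t show the three filters themselves. A minimal sketch of what Numbers, Square and Printer might look like (only the described behavior is from the text; the bodies are my assumption and build on the Filter<TIn, TOut> base class defined above):

```csharp
using System;
using System.Collections.Generic;

// Source filter: ignores its input and yields 1..max.
public class Numbers : Filter<object, int>
{
  private readonly int max;

  public Numbers(int max)
  {
    this.max = max;
  }

  public override IEnumerable<int> Process(IEnumerable<object> input)
  {
    for (int i = 1; i <= this.max; i++)
    {
      yield return i;
    }
  }
}

// Transformation filter: squares every input value.
public class Square : Filter<int, int>
{
  public override IEnumerable<int> Process(IEnumerable<int> input)
  {
    foreach (int i in input)
    {
      yield return i * i;
    }
  }
}

// Sink filter: writes every value to the console and passes it on.
public class Printer : Filter<int, int>
{
  public override IEnumerable<int> Process(IEnumerable<int> input)
  {
    foreach (int i in input)
    {
      Console.WriteLine(i);
      yield return i;
    }
  }
}
```

Because Numbers ignores its input, the null passed in by Start() is harmless.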

While this is not the most impressive example implementation of the pipes and filters pattern, I believe it demonstrates how powerful and flexible the pattern can be. The code is still readable and very explicit about what you are doing. With just a few lines of code you get a composable, easy-to-understand solution where you can recombine filters in different orders to change the behavior of the pipeline. And you can do just about anything inside those filters (think validation, or enriching the values traversing the pipeline with data from services or persistent storage). You can even change the type of the values you are processing between steps. This is yet another case of “Like it a lot!”

Testing and databases

I prefer unit tests over integration tests any time. But at some point you just can’t avoid the latter. You need your components to hit a database to verify that your queries are correct and that you update the right records.

Some databases (like RavenDB for example, see Ayende’s answer) support running completely in-memory specifically for testing scenarios. If you are in the convenient situation of using one of them, you can spin up a database instance, fill it with data, run your tests and throw the instance away. Otherwise you have to find another way to prepare your materialized database to offer a well-defined set of records.

There are at least two ways to do that: You can have a designated test database, run each test inside a transaction and roll back after the test; that returns the database to the state it was in before the test. Or you can drop the database after each test and recreate it from scratch with the data needed for the next test.

I want to talk about the second approach. I wrote a small helper class for a task I’m currently working on. It allows you to define some setup steps (like picking a connection string or selecting the name of the database to create for the test) and a build-up sequence (drop existing db, create empty db, create empty tables etc.). And to make using it nice and easy it automatically drops the created database at the end of the test run.

[Fact]
public void Should_CreateAndDropDatabase()
{
  using (var result = Database.Build(
    x =>
      {
        x.WithConnectionStringNamed("MyConnectionString");
        x.WithDatabaseName("FooDatabase");
        x.BuildSequence(
          db =>
            {
              db.DropExistingDatabase();
              db.CreateEmptyDatabase();
              db.CreateTables();
              db.CreateTestData();
            });
      }))
  {
    Assert.Null(result.Error);

    // ... run your actual test code
  }
}

Running this test writes some basic information about what’s going on to the console. But you can easily use some other means for tracing.

Retrieving ConnectionString named ‘MyConnectionString’.
Using database name ‘FooDatabase’.
Dropping database ‘FooDatabase’.
Creating empty database ‘FooDatabase’.
Creating tables.
Creating test data.
Dropping database ‘FooDatabase’.

How is it working? We start from a static entry point (similar to TopShelf’s HostFactory):

public static class Database
{
  public static DatabaseBuildResult Build(Action<DatabaseBuildConfiguration> action)
  {
    Guard.AssertNotNull(action, "action");
    var config = new DatabaseBuildConfiguration();
    action(config);
    return config.Execute();
  }
}

From there we make calls to DatabaseBuildConfiguration

public class DatabaseBuildConfiguration
{
  private readonly AppendOnlyCollection<Action> actions;
  private bool dontDropDatabaseOnDispose;
  public DatabaseBuildConfiguration()
  {
    this.actions = new AppendOnlyCollection<Action>();
  }
  public IAppendOnlyCollection<Action> Actions { get { return this.actions; } }
  public string ConnectionString { get; private set; }
  public string Database { get; private set; }
  
  public DatabaseBuildResult Execute()
  {
    Action onDispose = this.dontDropDatabaseOnDispose ? () => { } : this.GetDropDatabaseAction();
    var result = new DatabaseBuildResult(onDispose);
    try
    {
      foreach (Action action in this.Actions)
      {
        action();
      }
    }
    catch (Exception ex)
    {
      result.Error = ex;
    }
    return result;
  }
  
  // ... more methods
  
  public void WithConnectionStringNamed(string connectionStringName)
  {
    Guard.AssertNotEmpty(connectionStringName, "connectionStringName");
    Action action = () =>
      {
        Console.WriteLine("Retrieving ConnectionString named '{0}'.", connectionStringName);
        this.ConnectionString =
          ConfigurationManager.ConnectionStrings[connectionStringName].ConnectionString;
      };
    this.Actions.Add(action);
  }
  public void BuildSequence(Action<DatabaseBuildSequenceConfiguration> action)
  {
    var config = new DatabaseBuildSequenceConfiguration(this);
    action(config);
  }
}

This class is basically a container for a list of actions. The methods on the class place actions in the list. Execute() adds some error handling and runs these actions.
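The AppendOnlyCollection used for that list isn’t shown in the post. Given its use above, a minimal sketch might look like this (the interface and type names are from the code above, the body is my assumption): a list that only exposes Add and enumeration, so configured actions cannot be removed or reordered later.

```csharp
using System.Collections;
using System.Collections.Generic;

public interface IAppendOnlyCollection<T> : IEnumerable<T>
{
  void Add(T item);
}

public class AppendOnlyCollection<T> : IAppendOnlyCollection<T>
{
  private readonly List<T> items = new List<T>();

  // Items can only be appended, never removed or replaced.
  public void Add(T item)
  {
    this.items.Add(item);
  }

  public IEnumerator<T> GetEnumerator()
  {
    return this.items.GetEnumerator();
  }

  IEnumerator IEnumerable.GetEnumerator()
  {
    return this.GetEnumerator();
  }
}
```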

DatabaseBuildSequenceConfiguration defines the actions used to manipulate the database. I won’t show its code here as it works in the same way as DatabaseBuildConfiguration.

In the sample at the beginning of this post you can see a using-block surrounding the actual test code. The DatabaseBuildResult implements IDisposable. By default disposing the result will also drop the entire database. But you can turn off that behavior by calling DontDropDatabaseOnDispose().
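The post doesn’t show DatabaseBuildResult itself. Based on its use in Execute() and in the using-block, a minimal sketch could look like this (the body is my assumption; the DontDropDatabaseOnDispose interplay happens in Execute(), which passes a no-op action instead of the drop action):

```csharp
using System;

// Captures any error from the build sequence and runs a cleanup
// action (by default: drop the created database) when disposed.
public class DatabaseBuildResult : IDisposable
{
  private readonly Action onDispose;

  public DatabaseBuildResult(Action onDispose)
  {
    this.onDispose = onDispose ?? (() => { });
  }

  // Set by Execute() if any build action throws.
  public Exception Error { get; set; }

  public void Dispose()
  {
    this.onDispose();
  }
}
```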

You can add more methods, e.g. for running SQL scripts to generate or manipulate data, or for hooking up ready-to-load database files. The model is quite easy to extend.

It’s surely not as good as having a fully functional (and functionally equivalent!) in-memory database, but for a reasonably complex database it makes my life a lot easier!

Specification Pattern

The Specification Pattern. Yet another one I find useful in daily business. But a few enhancements can make it even more useful and a lot nicer to handle.

1, 2, many!

The CompositeSpecification is most often implemented with two fields for spec1 and spec2 (or left and right). The AndSpecification and OrSpecification evaluate these fields with their respective operator.

public class Or<T> : CompositeSpecification<T>
{
  // ...

  public override bool IsSatisfiedBy(T candidate)
  {
    return spec1.IsSatisfiedBy(candidate) || spec2.IsSatisfiedBy(candidate);
  }
}

This can result in a degenerate tree structure when you link many specifications with the same operator (e.g. a.Or(b).Or(c)...Or(n)).

To avoid this I decided to change the behavior of the composite. Instead of two fields it uses a list of children. When two specifications are linked, the code checks whether either of them is of the same type as the new composite and, if so, adds that composite’s children to the list instead of the composite itself.

Sounds more complicated than it is. Let’s see some code.

public abstract class CompositeSpecification<T> : Specification<T>
{
  private readonly List<Specification<T>> children;
  protected CompositeSpecification()
  {
    this.children = new List<Specification<T>>();
  }
  public IEnumerable<Specification<T>> Children
  {
    get { return children; }
  }
  public int Count
  {
    get { return this.children.Count; }
  }
  protected void Add(Specification<T> specification)
  {
    this.children.Add(specification);
  }
  protected void AddRange(IEnumerable<Specification<T>> specifications)
  {
    this.children.AddRange(specifications);
  }
}

public class Or<T> : CompositeSpecification<T>
{
  public Or(Specification<T> specification, Specification<T> other)
  {
    this.Include(specification);
    this.Include(other);
  }
  public override string Description
  {
    get { return " || "; }
  }
  public override bool IsSatisfiedBy(T candidate)
  {
    foreach (Specification<T> specification in this.Children)
    {
      if (specification.IsSatisfiedBy(candidate))
      {
        return true;
      }
    }
    return false;
  }
  private void Include(Specification<T> specification)
  {
    Or<T> or = specification as Or<T>;
    if (or != null)
    {
      this.AddRange(or.Children);
    }
    else
    {
      this.Add(specification);
    }
  }
}

And this is how two specifications can be linked with an Or operator.

public abstract class Specification<T>
{
  // ...
  
  public abstract bool IsSatisfiedBy(T candidate);
  public Specification<T> Or(Specification<T> other)
  {
    return new Or<T>(this, other);
  }
}

In the end this puts all successive Ors into a single list, which makes finding the right specification in the tree a lot easier.
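A quick test makes the flattening visible. AlwaysTrue and AlwaysFalse are trivial specifications (the names also appear in the pretty-printer example further down; the bodies here are my assumption and build on the Specification<T> base class above):

```csharp
// Hypothetical trivial specifications for demonstration purposes.
public class AlwaysTrue : Specification<int>
{
  public override string Description
  {
    get { return "true"; }
  }

  public override bool IsSatisfiedBy(int candidate)
  {
    return true;
  }
}

public class AlwaysFalse : Specification<int>
{
  public override string Description
  {
    get { return "false"; }
  }

  public override bool IsSatisfiedBy(int candidate)
  {
    return false;
  }
}
```

Linking a.Or(b).Or(c) with these now yields a single Or<int> with three children instead of a nested Or(Or(a, b), c).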

What do we have?

Generating a human readable representation of the specification tree can be tedious but is often beneficial if you need to see “what you have”. The easiest way to traverse a tree structure is a Visitor. The same pattern is used by Microsoft in their expression trees.

We add an Accept method and a property for the description of the specification to the base classes

public abstract class Specification<T>
{
  // ...
  public abstract string Description { get; }
  public virtual void Accept(SpecificationVisitor<T> visitor)
  {
    visitor.Visit(this);
  }
}

public abstract class CompositeSpecification<T> : Specification<T>
{
  // ...
  public override void Accept(SpecificationVisitor<T> visitor)
  {
    visitor.Visit(this);
  }
}

public class Or<T> : CompositeSpecification<T>
{
  // ...
  public override string Description
  {
    get { return " || "; }
  }
}

Define a base class for the SpecificationVisitor

public abstract class SpecificationVisitor<T>
{
  public abstract void Visit(Specification<T> specification);
  public abstract void Visit(CompositeSpecification<T> composite);
}

And the implementation of a PrettyPrinter becomes as simple as this:

public class PrettyPrinter<T> : SpecificationVisitor<T>
{
  private readonly StringBuilder sb;
  public PrettyPrinter()
  {
    this.sb = new StringBuilder(250);
  }
  public override void Visit(Specification<T> specification)
  {
    this.sb.Append(specification.Description);
  }
  public override void Visit(CompositeSpecification<T> composite)
  {
    this.sb.Append("(");
    foreach (Specification<T> child in composite.Children)
    {
      child.Accept(this);
      this.sb.Append(composite.Description);
    }

    // remove the separator appended after the last child
    int l = composite.Description.Length;
    this.sb.Remove(this.sb.Length - l, l);
    this.sb.Append(")");
  }
  public override string ToString()
  {
    return this.sb.ToString();
  }
}

And this gives you a nice and friendly printout of your graph

var spec = new AlwaysFalse().Or(new AlwaysTrue().And(new Odd())).Or(new AlwaysTrue());
var printer = new PrettyPrinter<int>();
spec.Accept(printer);
string friendly = printer.ToString(); // (false || (true && Odd) || true)

This was the first time I tried to attach some zipped source code, and I found out that WordPress won’t let me… Get the source code here (project TecX.Common, folder Specifications, and the test suite that puts them to use in TecX.Common.Test).

(Time-)Tracing with Unity Interception, Decorators and Reflection

We use tracing to find out what takes us so long when processing X or computing Y. Tracing is a cross-cutting concern; it can be used anywhere in our applications. As such there are some well-known ways to implement it while keeping our business code clean, such as using hand-crafted decorators or an interception framework of our preference.

This is a non-scientific comparison of three different approaches. One uses Unity’s Interception Extension; two use decorators, where one figures out what method it is currently tracing via reflection and the other needs magic strings for the same purpose.

There are many ways to measure the time it takes to perform a certain operation. Below you see the easiest and most convenient way I found over time.

public class TraceTimer : IDisposable
{
  private readonly string operationName;
  private readonly DateTime start = DateTime.Now;
  public TraceTimer()
  {
    StackTrace trace = new StackTrace();
    MethodBase method = trace.GetFrame(1).GetMethod();
    this.operationName = MethodSignatureHelper.GetMethodSignature(method);
  }
  public TraceTimer(string operationName)
  {
    this.operationName = operationName;
  }
  public void Dispose()
  {
    TimeSpan elapsed = DateTime.Now - this.start;
    string msg = string.Format("{0} took {1:F3}s", this.operationName, elapsed.TotalSeconds);
    Debug.WriteLine(msg);
  }
}

And this is how it is used:

using (new TraceTimer("Foo()"))
{
  Foo();
}

Which writes a message like ‘Foo() took 1.038s’ to the debug console.

Using System.Diagnostics.Debug for writing your measurements is most often not the way you want to go in a real application. You can use a logging framework of your choice instead. But Debug does the job for our purpose here.

The TraceTimer has two constructors. One takes the name of the operation it measures as a parameter. The other extracts the method signature by using a StackFrame.
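MethodSignatureHelper itself is not shown in this post. A possible sketch (my assumption, not the actual implementation) that builds a signature like “DoIt(Int32)” from a MethodBase:

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class MethodSignatureHelper
{
  // Builds "MethodName(ParamType1, ParamType2, ...)" via reflection.
  public static string GetMethodSignature(MethodBase method)
  {
    var parameters = method.GetParameters()
      .Select(p => p.ParameterType.Name);

    return string.Format("{0}({1})", method.Name, string.Join(", ", parameters));
  }
}
```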

Unity Interception

Unity uses classes that implement IInterceptionBehavior to intercept calls to a certain target. For this test the behavior looks like this:

public class DummyTracingBehavior : IInterceptionBehavior
{
  public IMethodReturn Invoke(IMethodInvocation input, GetNextInterceptionBehaviorDelegate getNext)
  {
    var methodSignature = MethodSignatureHelper.GetMethodSignature(input.MethodBase);
    using (new TraceTimer(methodSignature))
    {
      return getNext()(input, getNext);
    }
  }
  public IEnumerable<Type> GetRequiredInterfaces() { return Type.EmptyTypes; }
  public bool WillExecute { get { return true; } }
}
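The IDummy/Dummy pair used in all three test setups is not shown in the post; assume something minimal along these lines (the bodies are made up):

```csharp
public interface IDummy
{
  string DoIt(int i);
}

// A target with just enough behavior to have something to trace.
public class Dummy : IDummy
{
  public string DoIt(int i)
  {
    return i.ToString();
  }
}
```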

And this is the test setup:

var container = new UnityContainer();
container.AddNewExtension<Interception>();
container.RegisterType<IDummy, Dummy>(
  new Interceptor<InterfaceInterceptor>(),
  new InterceptionBehavior<DummyTracingBehavior>());
using (new TraceTimer("WithInterception"))
{
  for (int i = 0; i < NumberOfRunsPerCycle; i++)
  {
    var dummy = container.Resolve<IDummy>();
    dummy.DoIt(i);
  }
}

Hand-crafted Decorator

The decorator implements the same interface as the target and delegates all calls to that target.

public class DummyDecorator : IDummy
{
  private readonly IDummy inner;
  public DummyDecorator(IDummy inner)
  {
    this.inner = inner;
  }
  public string DoIt(int i)
  {
    using (new TraceTimer("DoIt(int)"))
    {
      return this.inner.DoIt(i);
    }
  }
}

It won’t get much simpler than that. And again the test setup:

var container = new UnityContainer();
container.RegisterType<IDummy, Dummy>("Dummy");
container.RegisterType<IDummy, DummyDecorator>(
  new InjectionConstructor(new ResolvedParameter<IDummy>("Dummy")));
using (new TraceTimer("WithDecorator"))
{
  for (int i = 0; i < NumberOfRunsPerCycle; i++)
  {
    var dummy = container.Resolve<IDummy>();
    dummy.DoIt(i);
  }
}

Decorator with Reflection

The second decorator does not rely on magic strings to identify the method name and parameters but uses reflection to figure those details out on its own.

public class DummyDecoratorWithReflection : IDummy
{
  private readonly IDummy inner;
  public DummyDecoratorWithReflection(IDummy inner)
  {
    this.inner = inner;
  }
  public string DoIt(int i)
  {
    using (new TraceTimer())
    {
      return this.inner.DoIt(i);
    }
  }
}

The only thing that differs between the two decorators is the missing string parameter in the constructor call of the TraceTimer. The test setup is identical. Just replace the type DummyDecorator with DummyDecoratorWithReflection.

Prerequisites

For each of the three approaches I set the NumberOfRunsPerCycle to 10,000 and ran 10 cycles for each test.

Each test uses Unity to create a new test object for each run. That is by intention. To work with interception you need Unity anyway. The decorators could be new’ed up directly, which saves you something between 2 and 5% per run, but you would lose Unity’s support for auto-wiring your targets. I prefer convenience and configurability over the last ounce of performance as long as there is no business need for that performance. So whenever possible I use a container to create my objects.

Results

I added up the times measured for each cycle and calculated the average time per cycle. The ranking is as follows:

  1. Hand-crafted Decorator with 3.1015s
  2. Unity Interception with 3.4265s
  3. Decorator with Reflection with 3.5391s

The total time taken is not really important as it depends on the machine that runs the tests, the logging framework you use to write your measurements, whether you run the tests as a console application or a unit test, etc.

But it allows you to compare how the solutions perform relative to each other. Not surprisingly, the hand-crafted decorator is the fastest solution. But it is also the least convenient for your developers: for each target (interface) you want to trace you have to implement a decorator, and you have to provide the method signature on your own. If that signature changes, you have to change your decorator’s code as well.

Unity’s interception mechanism is about 10% slower than hand-crafted code. But its convenience factor is way higher. You write one implementation, and all you have to do then is change your configuration to enable it for arbitrary targets.

Looking at the numbers, decorators that use reflection to figure out what is happening seem more like an academic exercise. They are slower than interception but provide less convenience: you have to write a decorator for each target, and you don’t get the performance of the hand-crafted solution. On the other hand, you don’t have to care about changes to the method’s signature. If you can’t use Unity for one reason or another but want to avoid the overhead of maintaining your decorators, they might still be a choice. Otherwise I think they are a suboptimal solution.