Building a Producer Consumer Queue with TPL Dataflow

I had never used the TPL Dataflow library until this summer and I was very pleasantly surprised at how easy and effective it was. 

In my last post I introduced the new “Async Daemon” feature in Marten that allows you to continuously update projected views over the event store as new events are captured in the system. In essence, the async daemon has to do two things:

  1. Fetch event data from the underlying Postgresql database and put it into the form that the projections and event processors expect
  2. Run the event data previously fetched through each projection or event processor and commit any projected document views back to the database.

Looking at it that way, the async daemon looks like a good fit for a producer/consumer queue. In this case, the event fetching “produces” batches of events for the projection “consumer” to process downstream. The goal of this approach is to improve overall throughput by allowing the fetching and processing to happen in parallel.

I had originally assumed that I would use Reactive Extensions for the async daemon, but after way too much research and dithering back and forth on my part, I decided that the TPL Dataflow library was a better fit in this particular case.

The producer/consumer queue inside of the async daemon consists of a couple main players:

  • The Fetcher class is the “producer” that continuously polls the database for the new events. It’s smart enough to pause the polling if there are no new events in the database, but otherwise it’s pretty dumb.
  • An instance of the IProjection interface that does the actual work of processing events or updating projected documents from the events.
  • The ProjectionTrack class acts as a logical controller to both Fetcher and IProjection.
  • A pair of ActionBlock objects from the TPL Dataflow library: one used as the consumer queue for processing events and a second used to coordinate the activities within ProjectionTrack.

In the pure happy path workflow, the async daemon functions like the sequence diagram below:

[Sequence diagram: AsyncDaemonSequence]

The Fetcher object runs continuously, fetching a new “page” of events and queueing each page so that ProjectionTrack can consume it in its ExecutePage() method on a different thread.

The usage of the ActionBlock objects to connect the workflow together turned out to be pretty simple. In the following code taken from the ProjectionTrack class, I’m setting up the ActionBlock for the execution queue with a lambda to call the ExecutePage() method. One thing to notice is that I had to configure a couple options to ensure that each item enqueued to that ActionBlock is executed serially in the same order that it was received.

_executionTrack 
    = new ActionBlock<EventPage>(page => ExecutePage(page, _cancellation.Token),
	new ExecutionDataflowBlockOptions
	{
		MaxDegreeOfParallelism = 1,
		EnsureOrdered = true
	});

The value of using ActionBlock is that it does all the heavy lifting for me with regard to the threading. The ActionBlock will trigger the ExecutePage() method on a different thread and ensure that every page is executed sequentially.
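
If you haven’t used TPL Dataflow before, here’s a minimal, self-contained sketch of that same pattern outside of Marten. The WorkItem type and ProcessAsync() method are just stand-ins for EventPage and ExecutePage():

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public class WorkItem
{
    public int Number { get; set; }
}

public static class Program
{
    public static async Task Main()
    {
        // Same options as above: one item at a time, in the order received
        var queue = new ActionBlock<WorkItem>(item => ProcessAsync(item),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = 1,
                EnsureOrdered = true
            });

        // The producer posts items without blocking...
        for (var i = 0; i < 5; i++)
        {
            queue.Post(new WorkItem { Number = i });
        }

        // ...and the consumer drains them sequentially on another thread
        queue.Complete();
        await queue.Completion;
    }

    private static Task ProcessAsync(WorkItem item)
    {
        Console.WriteLine("Processing item " + item.Number);
        return Task.CompletedTask;
    }
}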

Incorporating Backpressure

I also wanted to incorporate the idea of “back pressure” so that if the event fetching producer gets too far ahead of the event processing consumer, the async daemon will stop fetching new events. That prevents spikes in memory usage and possibly leaves more system resources available to the consumer until it can catch up.

To do that, there’s a little bit of logic in ProjectionTrack that checks how many events are queued up in the execution track shown above and pauses the Fetcher if the configured threshold is exceeded:

public async Task CachePage(EventPage page)
{
	// Accumulator is just a little helper used to
	// track how many events are in flight
	Accumulator.Store(page);

	// If the consumer is backed up, stop fetching
	if (Accumulator.CachedEventCount > _projection.AsyncOptions.MaximumStagedEventCount)
	{
		_logger.ProjectionBackedUp(this, Accumulator.CachedEventCount, page);
		await _fetcher.Pause().ConfigureAwait(false);
	}


	_executionTrack?.Post(page);
}

When the consumer works through enough of the staged events, ProjectionTrack knows to restart the Fetcher to begin producing new pages of events:

// This method is called after every EventPage is successfully
// executed
public Task StoreProgress(Type viewType, EventPage page)
{
	Accumulator.Prune(page.To);

	if (shouldRestartFetcher())
	{
		_fetcher.Start(this, Lifecycle);
	}

	return Task.CompletedTask;
}

The actual “cooldown” logic inside of ProjectionTrack is implemented in this method:

private bool shouldRestartFetcher()
{
	if (_fetcher.State == FetcherState.Active) return false;

	if (Lifecycle == DaemonLifecycle.StopAtEndOfEventData && _atEndOfEventLog) return false;

	if (Accumulator.CachedEventCount <= _projection.AsyncOptions.CooldownStagedEventCount &&
		_fetcher.State == FetcherState.Paused)
	{
		return true;
	}

	return false;
}

To make this more concrete, by default Marten will pause a Fetcher if the consuming queue has over 1,000 events and won’t restart the Fetcher until the queue goes below 500. Both thresholds are configurable.
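
Stripped of the Marten-specific types, the pause/restart behavior is just a high/low water mark check. Here’s a simplified, self-contained sketch of that idea using the default thresholds mentioned above (the BackpressureGate class is purely illustrative, not a Marten type):

public class BackpressureGate
{
    private readonly int _pauseThreshold;
    private readonly int _resumeThreshold;

    public BackpressureGate(int pauseThreshold = 1000, int resumeThreshold = 500)
    {
        _pauseThreshold = pauseThreshold;
        _resumeThreshold = resumeThreshold;
    }

    public bool IsPaused { get; private set; }

    // Called both as new pages are staged and as pages finish processing
    public void Update(int stagedEventCount)
    {
        if (!IsPaused && stagedEventCount > _pauseThreshold)
        {
            // The consumer is backed up, so the producer should stop polling
            IsPaused = true;
        }
        else if (IsPaused && stagedEventCount <= _resumeThreshold)
        {
            // The consumer has caught up enough to restart the producer
            IsPaused = false;
        }
    }
}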

As I said in my last post, I thought that the async daemon overall was very challenging, but I felt that the usage of TPL Dataflow went very smoothly.

Doing it the Old Way with BlockingCollection

In the past, I’ve used the BlockingCollection to build producer/consumer queues in .Net. In the Storyteller project, I used producer/consumer queues to parallelize executing batches of specifications by dividing the work into stages that each do some kind of work on a “SpecExecutionRequest” object (read in the specification file, do some preparation work to build a “plan”, and finally actually execute the specification). At the heart of that is the ConsumingQueue class that allows you to queue up tasks for one of these SpecExecutionRequest stages:

    public class ConsumingQueue : IDisposable, IConsumingQueue
    {
        private readonly BlockingCollection<SpecExecutionRequest> _collection =
            new BlockingCollection<SpecExecutionRequest>(new ConcurrentBag<SpecExecutionRequest>());

        private Task _readingTask;
        private readonly Action<SpecExecutionRequest> _handler;

        public ConsumingQueue(Action<SpecExecutionRequest> handler)
        {
            _handler = handler;
        }

        public void Dispose()
        {
            _collection.CompleteAdding();
            _collection.Dispose();
        }

        // This does not block the caller
        public void Enqueue(SpecExecutionRequest plan)
        {
            _collection.Add(plan);
        }

        private void runSpecs()
        {
            // This loop runs continuously and calls _handler() for
            // each plan added to the queue in the method above
            foreach (var request in _collection.GetConsumingEnumerable())
            {
                if (request.IsCancelled) continue;

                _handler(request);
            }
        }

        public void Start()
        {
            _readingTask = Task.Factory.StartNew(runSpecs);
        }
    }

For more context, you can see how these ConsumingQueue objects are assembled and used in the SpecificationEngine class in the Storyteller codebase.
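
To give a rough idea of how the stages chain together, here’s a hypothetical wiring sketch in the spirit of SpecificationEngine. The three Action<SpecExecutionRequest> handlers are placeholders supplied by the caller, and each stage hands the request off to the next queue when its own work is done:

    public static IConsumingQueue BuildPipeline(
        Action<SpecExecutionRequest> readSpec,
        Action<SpecExecutionRequest> createPlan,
        Action<SpecExecutionRequest> executeSpec)
    {
        // The last stage just executes the specification
        var execution = new ConsumingQueue(executeSpec);

        // The middle stage builds the plan, then hands off to execution
        var planning = new ConsumingQueue(request =>
        {
            createPlan(request);
            execution.Enqueue(request);
        });

        // The first stage reads the spec file, then hands off to planning
        var reading = new ConsumingQueue(request =>
        {
            readSpec(request);
            planning.Enqueue(request);
        });

        execution.Start();
        planning.Start();
        reading.Start();

        // The producer only ever enqueues to the first stage
        return reading;
    }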

After doing it both ways, I think I prefer the TPL Dataflow approach over the older BlockingCollection mechanism.

Marten is Ready for Early Adopters

I’ve been using RavenDb for development over the past several years and I’m firmly convinced that there’s a pretty significant productivity advantage to using document databases over relational databases for many systems. For as much as I love many of the concepts and usability of RavenDb, it isn’t running very successfully at work and it’s time to move our applications to something more robust. Fortunately, we’ve been able to dedicate some time toward using Postgresql as a document database. We’ve been able to do this work as a new OSS project called Marten. Our hope with Marten has been to retain the development time benefits of document databases (along with an easy migration path away from RavenDb) with a robust technological foundation — and even I’ll admit that it will occasionally be useful to fall back to using Postgresql as a relational database where that is still advantageous.

I feel like Marten is at a point where it’s usable and what we really need most is some early adopters who will kick the tires on it, give some feedback about how well it works, what’s missing that would make it easier to use, and how it’s performing in their systems. Fortunately, as of today, Marten now has (drum roll please):

And of course, the Marten Gitter room is always open for business.

An Example Quickstart

To get started with Marten, you need two things:

  1. A Postgresql database schema (either v9.4 or v9.5)
  2. The Marten nuget installed into your application

After that, the quickest way to get up and running is shown below with some sample usage:

var store = DocumentStore.For("your connection string");

Now you need a document type that will be persisted by Marten:

    public class User
    {
        public Guid Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public bool Internal { get; set; }
        public string UserName { get; set; }
    }

As long as a type can be serialized and deserialized by the JSON serializer of your choice and has a public field or property called “Id” or “id”, Marten can persist and load it back later.

To persist and load documents, you use the IDocumentSession interface:

    using (var session = store.LightweightSession())
    {
        var user = new User {FirstName = "Han", LastName = "Solo"};
        session.Store(user);

        session.SaveChanges();
    }
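
Reading documents back out later is just as terse. Here’s a quick sketch of the two main read operations, loading by Id and querying with Linq, assuming you held on to the Id that Marten assigned to the user document above:

    using (var session = store.QuerySession())
    {
        // Load a single document by its Id
        var solo = session.Load<User>(user.Id);

        // Or query with Linq
        var users = session.Query<User>()
            .Where(x => x.LastName == "Solo")
            .ToList();
    }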

Building an EventStore with User Defined Projections on top of Postgresql and Node.js

EDIT 3/28/2016: Since this blog post still gets plenty of reads, here’s an update. The work described here never got used in a real project (#sadtrombone), but fear not, the projection support is going to live again in a new project called Marten that seeks to turn Postgresql into a very functional Document Db and Event Store with projections.

I did an internal talk on the tooling and concepts in this post at our Salt Lake City office a couple months ago. The recording, for what it’s worth, is here. I’m assuming that you’re at least somewhat familiar with the concepts of Event Sourcing and CQRS, but if you’re not, there are links to descriptive explanations of these concepts in the body of the post.

For most of the 2000’s my go-to strategy for application persistence was to use some sort of object relational mapping to persist and read the object structures that I wanted to work with in my code. Sometimes I used hand rolled code to do the mapping, and other times my teams used NHibernate. In the past couple years I’ve been on projects that used the RavenDb document database with mixed success. I’ve also worked on a couple codebases that used an event sourcing strategy to persist meaningful business events, sometimes with RavenDb as the underlying storage engine, and on another project that uses an older version of NEventStore with Sql Server as the storage mechanism.

For various reasons, we’ve chosen to use a Node.js based stack to rewrite an old WPF application that is a suitable candidate for event sourcing on the backend (Corey Kaylor explained his take on this decision in a blog post). Since we already wanted to replace Sql Server (and probably RavenDb) with Postgresql in the long run, at Corey’s suggestion I have been working on and off to try leveraging Postgresql to create a new event store suitable for Node.js development that supports user-defined projections. Lacking all originality, I’m calling this new library “pg-events.”  You can find pg-events hosted under my GitHub account (my very first foray back into OSS post-FubuMVC).

 

Feature Set

  • Support the basic event sourcing pattern by appending the raw business events as JSON to the event store
  • Track events by a “stream” of related events that probably relates directly to some kind of business concept or workflow
  • Support user-defined projections of the raw event data to create “read side” views for clients
  • Support aggregated views of a stream (really just another projection). Use a basic snapshotting strategy of the aggregate state for efficiency
  • Build time tooling to initialize a postgresql database with the custom schema objects and import javascript libraries to postgresql
  • A crude, partial implementation of CommonJS that runs within postgresql

 

Conceptual Architecture

The first thing to know is that we’re making a very large bet on the portability of Javascript code and the ability to run at least a subset of this new event store code hosted in Postgresql, Node.js, embedded in other programming environments, or even potentially in a browser. The user-defined projections could potentially be executed in any of the pieces below, and we think that flexibility will pay off down the road for both performance and scalability tuning.

So far, I think the end state is going to consist of these four pieces:

 

pg-events

  1. Postgresql Database
    1. Custom schema objects largely based on Greg Young’s Building an Event Store paper to store events and stream metadata.
    2. Tables to persist projected views. Most projection views will be persisted to separate tables instead of one giant “pge_views” table for better query performance
    3. Stored procedures to update and query data in the event storage tables, mostly using postgresql’s Javascript support
    4. Executes and updates aggregate snapshots and synchronous projections. More on that in the following section.
  2. A Node.js Client
    1. Exposes methods to append events to the store
    2. Exposes methods to query for projected views, aggregate snapshots, and raw event stream data
    3. See https://github.com/jeremydmiller/pg-events/wiki/Client for more information
  3. Admin CLI Tool
    1. Builds the necessary schema objects into a Postgresql database
    2. Loads the user defined projections and other pg-events libraries into the database
    3. Can reset the event storage and projected view tables to an empty state for testing
    4. Eventually, this tool will also support “recapitulation” to rebuild projection data from the raw events when the definition of a projection changes
  4. Background Projection Runner
    1. Executes and updates projected views in a background process. This is my very next coding adventure. I’m going to build it out first with Node.js, then try my hand at implementing it again with a standalone Golang executable that uses an embedded V8 engine to execute the projections. Expect my twitter feed to be entertaining when I’m able to start that work. I’ll blog about this later when I know what it’s going to actually look like;)

User Defined Projections 

We looked at EventStore at first and I definitely liked their first class support for user defined projections. Our implementation of projections is very obviously influenced by EventStore’s.

I think that the event sourcing efforts I’ve been a part of have been successful overall, but “projecting” the raw event stream into a persisted read side or view model has been challenging. For pg-events, we’re expressing the projections with simple transformation functions that will take in the initial state and the raw event data and simply return the new state (it’s a logical fold left operation for the projections that work across multiple events).

For a sample event sourcing domain, I’ve been using the idea of a quest from the way too many fantasy books I’ve read over my lifetime. During a quest, our heroes might record events like “QuestStarted”, “MembersJoined”, “MembersDeparted”, or “TownReached.” To know or understand the exact composition of a quest party at any time, we need to replay some of the change events (Gandalf stayed behind to fight the Balrog, Boromir was killed, Frodo and Sam ran off, Gollum joined up, etc.) for the quest.

Say we write a projection for a new view across the events in a single quest called “Party” just to understand the membership. From the unit tests, that projection looks like:

require("../../lib/projections")
	.projectStream({
		name: 'Party',
		stream: 'Quest', 
		mode: 'sync',

		$init: function(){
			return {
				active: true,
				traveled: 0,
				location: null,
				members: []
			}
		},

		QuestStarted: function(state, evt){
			state.active = true;
			state.location = evt.location;
			state.members = evt.members.slice(0);

			state.members.sort();
		},

		TownReached: function(state, evt){
			state.location = evt.location;
			state.traveled += evt.traveled;
		},

		EndOfDay: function(state, evt){
			state.traveled += evt.traveled;
		},

		QuestEnded: function(state, evt){
			state.active = false;
			state.location = evt.location;
		},

		MembersJoined: function(state, evt){
			state.members = state.members.concat(evt.members);
			state.members.sort();
		},

		MembersDeparted: function(state, evt){
			state.location = evt.location;

			for (var i = 0; i < evt.members.length; i++){
				var index = state.members.indexOf(evt.members[i]);
				state.members.splice(index, 1);
			}

			state.members.sort();
		}


	});

You’ll notice that there’s a field called “mode” with a value of “sync.” Using the portability of Javascript, we’re planning for these modes:

  1. sync – A ‘sync’ projection will be executed synchronously inside postgresql within the same transaction as the event capture
  2. async – In progress. An ‘async’ projection will be calculated in a background process instead of at event capture time (Eventual Consistency).
  3. live – Forthcoming. These projections will only be calculated upon demand. I’m not yet sure if we’ll do the actual projection transformations within the database or the Node.js client. I guess we could allow two different “live” modes if there’s value in doing that.

So, is eventual consistency killing you in your current event sourcing efforts because you hit errors from querying off of stale data? Opt for synchronous projections. Have lots of writes, but relatively few reads? Use asynchronous or even live projections that are only calculated on demand. Have lots of reads but very few writes? I think I would again opt for synchronous projections.

I worked on a system a couple years ago in a failed startup that ran projections in a browser to do historical point in time simulations. I don’t see any reason why we couldn’t do something similar in pg-events if that is ever valuable.

Believe it or not, I have a decent start on documenting the projection support at https://github.com/jeremydmiller/pg-events/wiki/Projections.

 

Why Postgresql? 

Postgresql is the people pleaser of database engines. Want all the normal RDBMS capabilities? Would you be more productive using Postgresql as a document database? Want to write stored procedures with a language that closely resembles Oracle’s PL/SQL (no, a thousand times no, never again)?  Would you even want to use Javascript inside the database itself? Regardless of how you answered any of those questions, Postgresql is trying really hard to be what you want. In our case, I like that we can use postgresql as an event store, a document database for things that don’t fit into the event sourcing model, and a classic RDBMS if that’s what we want in some circumstances.

Mostly though, we like that Postgresql has a proven track record and we suspect that the DevOps support tools will be more effective than we’ve experienced with other OSS database tools.

Of course, the only reason why pg-events is viable in the first place is that postgresql has outstanding JSON support and the ability to author stored procedures with Javascript using Google’s V8 engine. With our project timeline, it’s also safe to assume that Postgresql 9.4 with its significant improvements to the JSON storage will be available before we go live to production.

 

Why CQRS isn’t crazy

I’ll freely admit that the first time I saw Greg Young talking about the Command Query Responsibility Segregation (CQRS) style of architecture in 2008 I thought it was nuts. Specifically, I was afraid that doing the transformations between the “write side” model and the “read side” model consumed by the clients would lead to far too much repetitive “left hand, right hand” code. The reality, of course, is that I was already doing a lot of work to map database tables to object graphs, transforming domain model objects to DTO’s to send over the wire, and crafting database views to transform our raw data into something more conducive to reporting requirements. In a way, CQRS just explicitly calls out a large part of software development effort that is often overlooked. If we simply accept the idea that different consumers and producers of the persisted state in our system naturally have different needs as far as how the same information is written, structured, and consumed, CQRS isn’t really “crazy talk” or extra work. One of the biggest differences is that with event sourcing + CQRS you probably try to pre-build and persist the read side views instead of trying to create views or DTO’s on the fly from the “one true database model.”

 

Some thoughts on Relational Databases

I’m very much in the camp that says that the database is strictly for persistence and your business logic and/or user interface should never be tightly coupled to whatever the database is, so the idea of just consuming the raw database tabular data in business logic code is a non-starter for me — not to mention that a flat database table structure is very rarely the exact structure that you’d want in your business logic code outside of CRUD-centric applications. I’ve been a part of technical arguments with database-centric folks for so long that I’m simply happy to say “agree to disagree” on these issues and let us all go on our way.

There’s a tremendous amount of inertia and investment in tooling in our industry in regards to the usage of relational databases as the de facto standard for just about all persistence needs. Additionally, most developers, testers, and even the business people seem to naturally understand the relational database model. Even so, as alternative models like document or graph databases build up more tooling, acceptance, and developer familiarity, I think that relational databases will eventually be consigned to reporting applications or pure CRUD applications (but even then I prefer document databases).

That being said, I think that the future really is “polyglot persistence” and that our children are going to laugh at us in decades to come when we explain how we built systems against relational databases.

StructureMap 3 is gonna tell you what’s wrong and where it hurts

tl;dr:  StructureMap 3 introduces some cool new diagnostics, improves the old diagnostics, and makes the exception messages a lot better.  If nothing else scroll to the very bottom to see the new “Build Plan” visualization that I’m going to claim is unmatched in any other IoC container.

I’ve had several goals in mind with the work on the shortly forthcoming StructureMap 3.0 release.  Make it run FubuMVC/FubuTransportation applications faster, remove some clumsy limitations, make the registration DSL consistent, and make the StructureMap exception messages and diagnostic tools completely awesome so that users will have a much better experience (and I don’t have to do so much work answering questions in the user list).  To that last end, I’ve invested a lot of energy into improving the diagnostic abilities that StructureMap exposes and adding a lot more explanatory information to exceptions when they do happen.

First, let’s say that we have these simple classes and interfaces that we want to configure in a StructureMap Container:

    public interface IDevice{}
    public class ADevice : IDevice{}
    public class BDevice : IDevice{}
    public class CDevice : IDevice{}
    public class DefaultDevice : IDevice{}

    public class DeviceUser
    {
        // Depends on IDevice
        public DeviceUser(IDevice device)
        {
        }
    }

    public class DeviceUserUser
    {
        // Depends on DeviceUser, which depends
        // on IDevice
        public DeviceUserUser(DeviceUser user)
        {
        }
    }

    public class BadDecorator : IDevice
    {
        public BadDecorator(IDevice inner)
        {
            throw new DivideByZeroException("No can do!");
        }
    }

Contextual Exceptions

Originally, StructureMap used the System.Reflection.Emit classes to create dynamic assemblies on the fly to call constructor functions and setter properties for better performance over reflection alone.  Almost by accident, having those generated classes made for a decently revealing stack trace when things went wrong.  When I switched StructureMap to using dynamically generated Expression‘s, I got a much easier model to work with for me inside of StructureMap code, but the stack trace on runtime exceptions became effectively worthless because it was nothing but a huge series of nonsensical Lambda classes.

As part of the effort for StructureMap 3, we’ve made the Expression building much, much more sophisticated to create a contextual stack trace as part of the StructureMapException message to explain what the container was trying to do when it blew up and how it got there.  The contextual stack can tell you, from inner to outer, steps like:

  1. The signature of any constructor function running
  2. Setter properties being called
  3. Lambda expressions or Func’s getting called (you have to supply the description yourself for the Func, but SM can use an Expression to generate a description)
  4. Decorators
  5. Activation interceptors
  6. Which Instance is being constructed including the description and any explicit name
  7. The lifecycle (scoping like Singleton/Transient/etc.) being used to retrieve a dependency or the root Instance

So now, let’s say that we have this container configuration that experienced StructureMap users know is going to fail when we try to fetch the DeviceUserUser object:

        [Test]
        public void no_configuration_at_all()
        {
            var container = new Container(x => {
                // Notice that there's no default
                // for IDevice
            });

            // Gonna blow up
            container.GetInstance<DeviceUserUser>();
        }

will give you this exception message telling you that there is no configuration at all for IDevice.

One of the things that trips up StructureMap users is that in the case of having multiple registrations for the same plugin type (what you’re asking for), StructureMap has to be explicitly told which one is the default (where other containers will give you the first one and others will give you the last one in).  In this case:

        [Test]
        public void multiple_registrations_but_no_default()
        {
            var container = new Container(x => {
                // Two candidate registrations for IDevice,
                // but neither one is marked as the default
                x.For<IDevice>().Add<ADevice>();
                x.For<IDevice>().Add<BDevice>();
            });

            // Gonna blow up
            container.GetInstance<DeviceUserUser>();
        }

Running the NUnit test will give you an exception with this exception message (in Gist).

One last example, say you get a runtime exception in the constructor function of a decorating type.  That’s well off the obvious path, so let’s see what StructureMap will tell us now.  Running this test:

        [Test]
        public void decorator_blows_up()
        {
            var container = new Container(x => {
                x.For<IDevice>().DecorateAllWith<BadDecorator>();
                x.For<IDevice>().Use<ADevice>();
            });

            // Gonna blow up because the decorator
            // on IDevice blows up
            container.GetInstance<DeviceUserUser>();
        }

generates this exception.

Container.WhatDoIHave()

StructureMap has had a textual report of its configuration for quite a while, but the WhatDoIHave() feature gets some better formatting and the ability to filter the results by plugin type, assembly, or namespace to get you to the configuration that you want when you want it.

This unit test:

        [Test]
        public void what_do_I_have()
        {
            var container = new Container(x => {
                x.For<IDevice>().AddInstances(o => {
                    o.Type<ADevice>().Named("A");
                    o.Type<BDevice>().Named("B").LifecycleIs<ThreadLocalStorageLifecycle>();
                    o.Type<CDevice>().Named("C").Singleton();
                });

                x.For<IDevice>().UseIfNone<DefaultDevice>();
            });

            Debug.WriteLine(container.WhatDoIHave());

            Debug.WriteLine(container.WhatDoIHave(pluginType:typeof(IDevice)));
        }

will generate this output.

The WhatDoIHave() report will list each PluginType matching the filter and all the Instances for that PluginType including a description, the lifecycle, and any explicit name.  This report will also tell you about the “on missing named Instance” and the new “fallback” Instance for a PluginType if one exists.

It’s not shown in this blog post, but all of the information that feeds the WhatDoIHave() report is queryable from the Container.Model property.
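
For example, a rough sketch of walking the model directly might look like the code below. For<T>() and Default show up elsewhere in this post, but the Instances, Name, and Description members here are from memory, so treat them as assumptions rather than a definitive API reference:

    // Poking at Container.Model directly instead of reading the
    // formatted WhatDoIHave() text
    var configuration = container.Model.For<IDevice>();

    Debug.WriteLine(configuration.Default.Description);

    foreach (var instance in configuration.Instances)
    {
        Debug.WriteLine(instance.Name + ": " + instance.Description);
    }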

Container.AssertConfigurationIsValid()

At application startup time, you can verify that your StructureMap container is not missing any required configuration and generally run environment tests with the Container.AssertConfigurationIsValid() method.  If anything is wrong, this method throws an exception with a report of all the problems it found (build exceptions, missing primitive arguments, undeterminable service dependencies, etc.).

For an example, this unit test with a missing IDevice configuration…

        [Test]
        public void assert_container_is_valid()
        {
            var container = new Container(x => {
                x.ForConcreteType<DeviceUserUser>()
                    .Configure.Singleton();
            });

            // Gonna blow up
            container.AssertConfigurationIsValid();
        }

…will blow up with this report.

Show me the Build Plan!

I saved the best for last.  At any time, you can interrogate a StructureMap container to see what the entire “build plan” for an Instance is going to be.  The build plan is going to tell you every single thing that StructureMap is going to do in order to build that particular Instance.  You can generate the build plan as either a shallow representation showing the immediate Instance and any inline dependencies, or a deep representation that recursively shows all of its dependencies and their dependencies.

This unit test:

        [Test]
        public void show_me_the_build_plan()
        {
            var container = new Container(x =>
            {
                x.For<IDevice>().DecorateAllWith<BadDecorator>();
                x.For<IDevice>().Use<ADevice>();

            });

            var shallow = container.Model
                .For<DeviceUserUser>()
                .Default
                .DescribeBuildPlan();

            Debug.WriteLine("This is the shallow representation");
            Debug.WriteLine(shallow);

            var deep = container.Model
                .For<DeviceUserUser>()
                .Default
                .DescribeBuildPlan(3);

            Debug.WriteLine("This is the recursive representation");
            Debug.WriteLine(deep);
        }

generates this representation.

FubuMVC Turns 1.0

The FubuMVC team has published a 1.0 version of the main libraries (FubuMVC.Core, FubuMVC.StructureMap, FubuMVC.AutoFac, FubuMVC.Core.UI, and the view engines) to the public nuget feed.  We’re certainly not “done,” and we’re severely lacking in some areas (*cough* documentation *cough*), but I’m happy with our technical core and feel like we’re able to make that all important, symbolic declaration of “SemVer 1/the major public API signatures are stable.”

It’s been a long journey from the talk Chad Myers and I gave at KaizenConf all the way back in 2008 to CodeMash in 2013, and in this highly collaborative OSS on GitHub world we’ve had a lot of collaborators.  In particular, I’d like to thank Chad Myers and Josh Flanagan for their help at the beginning, Josh Arnold for being my coding partner the past couple years, Corey Kaylor for being the grown up in the room, and Alex Johannessen for his boundless enthusiasm.  I’ve genuinely enjoyed my interactions with the FubuMVC community and I look forward to seeing us grow in the new year.

There’s plenty more to do, but for a week or so, my only priority is rest (and finishing the last couple hundred pages of A Memory of Light) — and that doesn’t have anything to do with HATEOAS or hypermedia.

 

What’s not there yet…

I saw somebody on Twitter last week saying that the “U” in FubuMVC stands for “undocumented,” and that it’s so bad that we had to use two “U’s.”  I’m very painfully aware of this, and I think we’re ready to start addressing the issue permanently.

  • A good “quick start” nuget and guide.  The FubuMVC team made a heroic effort over the past couple months to make the FubuMVC 1.0 release just before our CodeMash workshop this week, and I dropped the ball on updating the old “FubuMVC” nuspec file to be relevant to the streamlined API’s as they are now.  
  • The new “FubuWorld” website with documentation on all of the major and hopefully most of the minor FubuMVC projects (including StructureMap and StoryTeller as well).  We effectively wrote our own FubuMVC-hosted version of readthedocs, but we haven’t yet exploited this capability and gotten a new website with updated documentation online.  I’m deeply scarred by my experiences with documenting StructureMap and how utterly useless it has been.   This time the projects will have strong support for living documentation.
  • Lots of Camtasia videos
  • Lots of google-able blog posts

Continuous Design and Reversibility at Agile Vancouver (video)

In November I got to come out of speaking retirement at Agile Vancouver.  Over a couple days I finally got to meet Mike Stockdale in person, have some fun arguments with Adam Dymitruk, see some beautiful scenery, and generally annoy the crap out of folks who are hoarding way too much relational database cheese in my talk called Continuous Design and Reversibility (video via the link).

I think the quality of reversibility in your architecture is a very big deal, especially if you have the slightest interest in effectively doing continuous design.  Roughly defined, reversibility is your ability to alter or delay elements of your software architecture.  Low reversibility means that you’re more or less forced to get things right upfront and it’s expensive to be wrong — and sorry, but you will be wrong about many things in your architecture on any non-trivial project.  By contrast, using techniques and technologies that have higher reversibility qualities vastly improves my ability to delay technical decisions so that I can focus on one thing at a time like say, building out the user interface for a feature to get vital user feedback quickly without having to first lay down every single bit of my architecture for data access, security or logging first.  In the talk, I gave several concrete examples from my project work including the usage of document databases instead of relational databases.

Last Responsible Moment

I think we can all conceptually agree with the idea of the “Last Responsible Moment,” meaning that the best time to make a decision is as late in the project as possible when you have the most information about your real needs.  How “late” your last responsible moment is for any given architectural decision is largely a matter of reversibility.

For the old timers reading this, consider the move from VB6 with COM to .Net a decade and change ago.  With COM, adding a new public method to an existing class or changing the signature of an existing public method could easily break the binary compatibility, meaning that you’d have to recompile any downstream COM components that used the first COM component.  In that scenario, it behooved you to get the public signatures locked down and stable as fast as possible to avoid the clumsiness and instability with downstream components — and let me tell you youngsters, that’s a brittle situation because you always find reasons to change the API’s when you get deep into your requirements and start stumbling into edge cases that weren’t obvious upfront.  Knowing that you can happily add new public members to .Net classes without breaking downstream compatibility, my last responsible moment for locking down public API’s in upstream components is much later than it was in the VB6 days.

The original abstract:

From a purely technical perspective, you can almost say that Extreme Programming was a rebellion against the traditional concept of “Big Design Upfront.” We spent so much time explaining why BDUF was bad that we might have missed a better conversation on just how to responsibly and reliably design and architect applications and systems in an evolutionary way.

I believe that the key to successful continuous or evolutionary design is architectural “reversibility,” the ability to reverse or change technical decisions in the code. Designing for reversibility helps a team push back the “Last Responsible Moment” to make more informed technical decisions.

I work on a very small technical team building a large system with quite a bit of technical complexity. In this talk I’ll elaborate on how we’ve purposely exploited the concept of reversibility to minimize the complexity we have to deal with at any given time. More importantly, I’ll talk about how reversibility led us to choose technologies like document databases, how we heavily exploit conventions in the user interface, and the testing process that made it all possible. And finally, just to make the talk more interesting, I’ll share the times when delaying technical decisions blew up in our faces.

 

Kicking off StructureMap 3

Actually, I started working hard on StructureMap 3.0 in the summer of 2010 but got badly derailed by other projects and a nasty bout of burnout.  I’m writing this post because I would dearly love to get community input and contributions and I’ve got folks contacting me that are chomping at the bit to start working on this. 

StructureMap was originally written in the summer of 2003 and revamped in the spring of 2004 for its very first release in June of that year.  Over the years it has had some significant rework (the 2.5 and 2.6 releases were both large changes), but at this point I firmly believe that the current 2.6.* internal structure is not worth improving.  Yes Virginia, I am opting to gut some of the internals of StructureMap in order to fix the most egregious problems and limitations of the current architecture and build a container that is good enough to last until we all give up on this silly .Net thing.  I’d also like to tear out any feature that I think is obsolete or just plain ugly to use and make StructureMap much leaner.

Nothing here is set in stone and feedback is very welcome.

 

My thoughts for 3.0:

My personal drivers for doing StructureMap 3 are mostly to kill the nested container problems and get StructureMap ready to better handle multi-tenancy scenarios in a high volume FubuMVC application. I think that better Profiles and/or the child container feature below would make multi-tenancy easier to achieve without killing the server’s memory usage.  Well, and I would like to make StructureMap easier to use for other people. ;)  Making StructureMap the most used container in .Net or competing with the other hundred container tools to do every possible crazy scenario that folks can come up with is not on my agenda.

  • Remove the [Obsolete] methods
  • Better exception messages.  The error messages and the stacktraces really took a step backwards when I replaced the old Reflection.Emit code with dynamically generated Expression’s in the 2.6 release.  At a bare minimum, the stacktrace and exception messages need to be much cleaner and more accurately present what has gone wrong.
  • Better configuration diagnostics.  Completely taking a page out of FubuMVC and Bottles, I would like a StructureMap container to be able to tell you why and how it was configured the way it is.  Why did it select this constructor? Why is this the default? Where did this type come from?
  • Configuration model.  Today there is the configuration model (PluginGraph) and a runtime model (PipelineGraph and InstanceManager).  I would like to eliminate the separate models and make the configuration model much easier to consume by users.  From the lessons we learned with FubuMVC, I think the key to making the convention model far better is a very good semantic model that can be easily altered and read by both conventions and explicit configuration.  I think this is going to be the biggest change in the internals.
  • Far better convention support.  See the above feature.  Think of policies like “set the value of each constructor argument named ‘connectionString’ to this” or “make any Instance the singleton lifecycle where the concrete class name ends with ‘Cache’.”  We can do that kind of thing today with FubuMVC’s BehaviorGraph model.  I’d like to do the same with StructureMap.
  • Profiles.  I think we just flat out redo Profiles from scratch and completely redesign that functionality from all new requirements.
  • Runtime flexibility.  I would like to be able to allow users to register policies that could “teach” StructureMap how to resolve a requested type that it doesn’t know anything about.  I think we’d convert some of the hard coded rules in current StructureMap to this new pluggable strategy.  Think things like “this is a concrete type and I can resolve all of its dependencies, so I’ll just do it” or “this type closes an open generic type that I do know about” or “the name of this class ends in ‘Settings’ so I’ll use FubuCore’s ISettingsProvider to resolve it”
  • Better Lifecycle support.  A longtime limitation in StructureMap is that lifecycle can only be configured by the requested type, i.e., all instances of ISomething have to be the same lifecycle.  I’d like to eliminate that limitation.
  • Better support for modular configuration.  We already have the Registry model and I think it has worked out very well.  Most of the other IoC containers have implemented something similar by this point.  I’d like to extend the model to allow you to specify ordering rules between Registry classes and dependencies (hence, FubuCore’s dependency analysis functionality).  I would also like to add semantics to only add configuration if it is missing or conditional configuration.
  • Pluggable strategies for selecting constructor functions.  I don’t care for this one bit, but at least a couple prominent .Net OSS frameworks need this.
  • Nested containers.  I love this feature and its usability.  FubuMVC depends very heavily on this feature in StructureMap.  Its implementation, however, is horrific and there’s a nasty outstanding bug that I felt was too difficult to fix in 2.6.*.  I think we rewrite the nested container feature so that we have proper separation in scoping between the parent and nested container and avoid the need to do any copying/shuffling of the underlying configuration structure.
  • Child containers.  Not quite the same thing as nested containers.  This would be the ability to quickly clone an existing container and override the parent’s configuration.   
  • Eliminate the Xml configuration.  I have already ripped the Xml configuration support out of the core assembly in StructureMap 3.  I wouldn’t mind coming back and adding a subset of the existing Xml configuration back as an addon assembly and nuget.
  • Eliminate the old attribute configuration.  I had left this in there for years, but I’d never recommend to anyone that they use it.  I would like to consider just using the convention support to work against a subset of the same CLR attributes that MEF uses.
  • Full, living documentation.  I rewrote the documentation for the 2.5 release, but it wasn’t usable enough and quickly got out of date when 2.6 was released.  For 3.0 I’d like to use Sphinx for the documentation generation, host on http://readthedocs.org/, and make the documentation publish with pushes to Git.  Another heavy lesson learned is that we need to strive to make the documentation organized around the tasks that a user would do instead of organized around StructureMap jargon.
  • Recipes.  This is where I really need community help the most.  I’d like to have some examples of integrating StructureMap into common .Net tools and frameworks.  I’m at a disadvantage because I’ve become very disconnected from mainstream .Net.  I have not used Entity Framework, WCF, Silverlight, Workflow Foundation, MEF, and barely used WPF, ASP.Net MVC, Prism, or WebForms.  I just don’t have enough visibility into those tools to help much.
  • Backwards Compatibility.  With a few exceptions, I think the registry DSL in StructureMap has settled into something usable.  I’d like to remove all the [Obsolete] methods and change a few things that seem to be confusing to use, but otherwise make it as easy as possible to upgrade from 2.6.* to 3.0.
  • No Silverlight support.  I have no intention of supporting Silverlight or any other mobile variant of .Net at this time.  I’m open to this happening later and I’m contemplating at least a version of StructureMap that is usable in the client profile.  This is an important decision to make soon-ish because I would like for StructureMap 3 to take a dependency on our FubuCore library and I don’t really want to care about the size of the assembly right now.
  • Fubu project portfolio.  I would like to fold StructureMap under the Fubu project family.  Part of that is branding, but it’s also community, the convenience of the GitHub organization model, and the common infrastructure that’s starting to grow up around documentation, Ripple, and the build support.
  • Use FubuCore.  There is quite a bit of overlap between the FubuCore library and what’s in the current codebase that I’d like to eliminate.  I’d also like to use FubuCore’s dependency graph support, the ObjectConverter, and integrate the SettingsProvider service out of the box for externalized configuration in StructureMap (here’s an explanation of an earlier version that’s still relevant).

 

Things I don’t plan on changing

  • The interception model.  By and large I think it’s been good enough for anything I’ve ever needed to do with it
  • The basic Registry DSL
  • Any of the methods on IContainer
  • Still don’t plan on adding AOP, but I’d like to have addon Nuget libraries for integrating existing AOP solutions with StructureMap someday

So why do this at all?

A couple months back I expressed some admiration for where one of the other IoC containers was going and that I was perfectly willing to forgo trying to compete with that particular tool (don’t even ask which one because I took a harder look at it and changed my mind).  He asked me why I didn’t just contribute the stuff I needed from StructureMap that was missing in that other container.  Fair question, but no, I’m not going to do that.  Why?  Because one of the best learning experiences you can possibly have as a developer is to take a hard problem that you’ve already solved and reflect on how you could solve that same problem in a much better way with all the things you’ve learned.  I’ve worked on and off on StructureMap for 8-9 years. I’ve rewritten some of the same subsystems in StructureMap a couple different times and even got a conference talk out of my experience called The Joys and Pains of a Long Lived Codebase at QCon – but I still think I will learn a great deal by going through with one last version of StructureMap.