Long Lived Codebases: Architecture Gardening, Rewrites, and Automated Tests

This post continues my series about my experiences maintaining and growing the StructureMap codebase over the last decade and change based on my talk this year at CodeMash. This series started with The Challenges, and this post is a direct continuation of the discussion on my last post on A Crime Against Computer Science. You’ll likely want to read the previous post first.

A Timeline

2009 – I introduced the nested container feature into StructureMap 2.6 as a purely tactical feature to fill an immediate need within a project at work. The implementation of nested containers, as I described in my my previous post, was highly problematic.

2010 – I attempted a complete, ground up rewrite of StructureMap largely to address the problems with the nested container. I burned out badly on OSS work for a while before I got very far into it and never picked the work up again.

2014 – StructureMap 3.0 was released with a considerably different architecture that largely fixed the known issues with the nested container feature (I measured a 100X improvement in nested container performance in a bigger application). The 3.0 work was done by applying a series of intermediate transformative changes to the existing 2.6 codebase until the architecture essentially resembled the vision of the earlier attempt to rewrite StructureMap.

Sacrificial Architecture and Continuous Design

I’ve beaten myself up quite a bit over the years for how bad the original implementation of the nested container was, but looking at it from another perspective, I was able to deliver some real value to a project in a hurry. The original implementation and the problems it incurred definitely informed the design decisions that I made in the 3.0 release. I can be a lot easier on myself if I can just treat the initial design as Martin Fowler’s Sacrificial Architecture concept.

My one true regret is just that it took so many years to permanently address the issue. It’s fine and dandy to write some throwaway code you know isn’t that great to satisfy an immediate need if you really do come back and actually throw it away before it does a lot of harm. And as an industry, we’re really good about replacing throwaway code, right? Right?

I’ve been a big believer in the idea of continuous design my entire career — even before the Agile programming movement came around to codify that idea. Simply put, I think that most of the examples of great technical work I’ve been a part of was the direct result of iterating and evolving an idea rather than pure creation. Some of that mental iteration can happen on paper or whiteboards as you refine your ideas. Other times it seems to take some initial stumbles in code before you really learn how to better solve your coding problem.

I’m not the greatest OSS project leader of all time by any means, but the one piece of advice that I can give out with a straight face is to simply avoid being over-extended. The gap in major StructureMap releases, and the continual lack of complete documentation, is a direct result of me having way too many OSS irons in the fire.

Play the Long Game

One of the biggest lessons I’ve learned in working with long lived codebases is that you can never act as if the architecture is set in stone.

In the case of a long lived codebase that changes in functionality and technology way beyond its initial vision, you need to constantly challenge the basic architecture, abstractions, and technical assumptions rather than just try to jam new features into the existing codebase. In the case of StructureMap, there have been several occasions where doing structural changes first has made new functionality easier to implement. In the case of the nested container feature, I tried to get away with building the new feature in even though the current architecture didn’t really lend itself to the new feature and I paid for that.

I think this applies directly to my day job. We don’t really start many new projects and instead continually change years old codebases. In many cases we know we’d like to make some big changes in the application architecture or switch out elements of the technical infrastructure (we’re getting rid of Angular 1.* if it’s the last thing I do), but there’s rarely time to just stop and do that. I think that you have to play the “long game” in those cases. You still need to continuously make and refine architectural plans for what you wish the system could be, then continuously move closer to your ideal as you get the chance in the course of normal project work.

In most of my OSS projects I’ve been able to think ahead to many future features and have a general idea of how I would want to built them or what structural changes would be necessary. In the case of the original nested container implementation, I was caught flat-footed. I don’t know how you completely avoid this, but I highly recommend not trying to make big changes in a hurry without enough time spent thinking through the implications.

Tactical vs. Strategic Thinking

I think the balance between thinking tactically and strategically in regards to software projects is a valuable topic that I don’t see discussed enough. Tilt too far to the tactical side and you get dramatically shitastic code going into production so that you can say that you kinda, sorta made some sort of arbitrary deadline — even though it’s going to cause you heartburn in subsequent releases. Tilt too far to the strategic side of things and you spend all your day going architect astronaut, fiddling with the perfect project automation, and generally getting nothing done.

In my case, I was only looking at the short term gains and not thinking about how this new nested container feature should fit into StructureMap’s architecture. The improved nested container implementation in 3.0 was only possible because I stepped back and reconsidered how the architecture could be changed to better support the existing feature. Since the nested container problems were so severe, that set of architectural improvements was the main driver for me to finally sit down and make the 3.0 release happen.

Whither the Rewrite?

To my credit, I did recognize the problems with the original nested container design early on. I also realized that the StructureMap internals needed to be quite different in order to improve the nested container feature that was suddenly so important in our application architecture. I started to envision a different configuration model that would enable StructureMap to create nested containers much more efficiently and eliminate the problems I had had with lifecycle bugs. I also wanted to streamline some unnecessary duplication between nearly parallel configuration models that had grown over the years in the StructureMap code.

In 2010 I started an entirely new Github repository for StructureMap 3. When I looked at the differences between my architectural vision for StructureMap and its current state in the 2.6 era, I thought that it was going to be far too difficult to make the changes in place so I was opting for a complete, ground up rewrite (don’t bother looking for the repository, I think I deleted it because people were getting confused about where the real StructureMap code was). In this case, the rewrite flopped — mostly because I just wasn’t able to devote enough time in a burst to rebuild all the functionality.

The killer risk for doing any kind of rewrite is that you just can’t derive much value from it until you’re finished. My experience, both on side projects and in enterprise IT, is that rewrite efforts frequently get interrupted or even completely shelved before they advance far enough to be usable.

At this point, I think that unless a project is small or at least has a limited and very well understood set of functionality, you should probably avoid attempts at rewriting. The StructureMap rewrite attempt failed because I was interrupted before it got any momentum. In the end, I think it was a much wiser choice to evolve and transform the existing codebase instead — all while staying within the existing acceptance level tests in the code (more on this below). It can be significantly harder to come up with the intermediate steps such that you can evolve to your end goal than just doing the rewrite, but I think you have a much better chance of delivering value without accidentally dropping functionality.

I’m the primary developer on a couple OSS projects that have gone through, or about to go through, a rewrite or restructure effort:

  • StructureMap 3.0 — As stated before, I gave up on the rewrite and transformed the existing code to a more effective internal model. I’m mostly happy with how this turned out.
  • Storyteller 3.0 — Storyteller is a tool for executable specifications I originally built in 2009. We use it heavily at work with some mixed success, but the WPF based user interface is clumsy and we’re having some severe throughput issues with it on a big project. Everybody wanted a complete rewrite of the WPF client into a web application (React.js with a Flux-like architecture), but I was still left with the rump .Net engine. I identified several structural changes I wanted to make to try to improve usability and performance. I tried to identify intermediate steps to take in the existing code, but in that case I felt like I needed to make too many changes that were interrelated and I was just getting tired of constantly pounding out “git checkout –force” and opted for a rewrite. In this case, I felt like the scope was limited, I could reuse the acceptance tests from the original code to get back to the same functionality, and that a rewrite was the easier approach. So far, so good, but I think this was an exception case.
  • FubuMVC 3 / “Jasper” — I know I said that I was giving up on FubuMVC and I meant it at the time, but for a variety of reasons we want to just move our existing FubuMVC .Net applications to the vNext platform this year and decided that it would be easier to modernize FubuMVC rather than have to rewrite so much of our code to support the new ASP.Net MVC/Web API combo. At the same time we want to transform FubuMVC’s old “Behavior” middleware model to just use the OWIN middleware signature throughout. After a lot of deliberation about a new, ground up framework, we’ve tentatively decided to just transform the existing FubuMVC 2 code and stay within the existing automated test coverage as we make the architectural changes.

So in three cases, I’ve opted against the rewrite twice. I think my advice at this point is to avoid big rewrites. Rewriting a subsystem at a time, sure, but not a complete rewrite unless the scope is limited.

Automated Testing is Good, Except When it Isn’t

Automated testing can most certainly help you create better designs and architectures by providing the all important quality of reversibility — and if you believe in or practice the idea of continuous design, you absolutely have to have reversibility. In other all too common cases, automated tests that are brittle can prevent you from making changes to the codebase because you’re too afraid of breaking the tests.

What I’ve found through the years of StructureMap development is that high level acceptance tests that are largely black box tests that express the desired functionality from a client perspective are highly valuable as regression tests. Finer grained tests that are tightly coupled to the implementation details can be problematic, especially when you want to start restructuring the code. When I was doing the bigger changes for StructureMap 3.0 I relied very heavily on the coarser grained tests. I would even write all new characterization tests to record the existing functionality before making structural changes when I felt like the existing test coverage was too light.

I’m not prepared to forgo the finer grained unit tests that are largely the result of using TDD to drive the low level design. Those tests are still valuable as a way to think though low level design details and keep your debugger turned off. I do think you can take steps to limit the coupling to implementation details.

If fine grained tests are causing me difficulty while making changes, I generally take one of a couple approaches:

  1. Delete them. Obviously use your best judgement on this one;)
  2. Comment the body of the test out and stick some kind of placeholder in instead that makes the test fail with a message to rewrite. I’ve fallen into the habit of using Assert.Fail(“NWO”) just to mean “rewrite to the new world order.”
  3. Write all new code off to the side to replace an entire subsystem. After switching over to using the new subsystem and getting all the acceptance level tests passing, go back and delete the old code plus the now defunct unit tests

Was I lulled to sleep by passing tests?

I had a great question from the audience at CodeMash about the poor initial implementation that caught me a little flat-footed. I was asked if “I had been lulled by passing tests into believing that everything was fine?” Quite possibly, yes. It’s certain that I wasn’t thinking about performance or how to first change the StructureMap internals to better support the new feature. I’m going to blame schedule pressure and overly tactical thinking at that time more than my reliance on TDD however.

There’s an occasional meme floating in the software world that TDD leads to developers just not thinking, and I know at the time of my talk I had just read this post from Ali Kheyrollahi and I was probably defensive about TDD. I’m very clearly in the pro-TDD camp of course and still feel very strongly that TDD can be a key contributor to better software designs. I’m also the veteran of way too many online debates about the effectiveness of TDD (see this one from 2006 for crying out loud). Unlike Ali, my experience is that there is a very high correlation suggesting that developers who use TDD are more effective than developers that don’t — but I think that might be more of a coincidental effect than a root cause.

That all being said, TDD in my experience is more effective for doing design at the small scale down at the class, method, or function level than it is as a technique for more larger scale issues like performance or extensibility. While I certainly wouldn’t rule out the usage of TDD, it’s just a single tool and should never be the sole design tool in your toolbox. TDD advocate or not, I do a lot of software design with pencil and paper using an amalgmation  of “UML as sketch” and CRC cards (I’m still a big fan of Responsibility Driven Design).

At the end of the day, there isn’t really any substitute for just flat out paying attention to what you’re doing. I probably should have done some performance testing on the nested container feature right off the bat or at least thought through the performance implications of all the model copying I was doing in the original implementation (see the previous post for background on that one).

Long Lived Codebases: A Crime Against Computer Science

All code samples are links to the exact lines of code in GitHub. Every thing I’ve ever done to embed code into blog posts has been a PITA, so I’m just punting this time.

This continues the adaptation of my CodeMash 2015 talk about my experiences developing StructureMap over the past decade and change. This series started last week with The Challenges. This post is about the wretchedly poor original implementation of StructureMap’s “nested container” feature and how I re-architected the StructureMap internals in the 3.0 release to greatly improve the performance of this feature. I ran out of steam while writing this, so I ended up breaking this out into two posts.

A little background on nested containers

Sometime around early 2009 my team and I were building a small quasi-service bus for our system that processed messages sent to a queue. In our simple case, the processing of each message would be treated as a single logical transaction. At the time, we were using NHibernate to do all our persistence, so the ISession interface is your de facto unit of work. What we needed was for every object that would participate in the handling of a single message to use the right ISession for that message handling transaction — even if the object was resolved lazily from the StructureMap container.

What we needed was a new kind of container scoping for a logical operation independent of thread or HttpContext or any of the older mechanisms that the IoC containers typically used at that time. Inspired by a similar feature in Windsor* (I think), I conceived of what is now the Nested Container feature in StructureMap that I quickly rolled into the 2.6 release just so we could use it in our homegrown service bus at work.

From a functionality perspective, the nested container feature has been a complete success. It’s used by my own FubuMVC and FubuTransportation frameworks plus other development frameworks like MassTransit, NServiceBus, and even ASP.Net MVC and Web API through OSS adapters. That being said, the original implementation of nested containers in the 2.6.* versions was a mess that suffered from poor performance and usability bugs that took me years to address — so bad that the 3.0 release was largely wrapped around some very significant architectural changes specifically to improve the nested container feature.

* People sometimes get upset by the number of different IoC containers in .Net, but there is some real value in competition between different technical solutions to push and inspire each other. I’ve always thought that development tools in .Net would be significantly better overall if the wider .Net community would be more willing to adopt tools originating outside of Microsoft so that MS would be forced to compete for adoption. 

The Original Implementation

During my talk at CodeMash, I stated that I believe the original implementation of the nested container feature in StructureMap caused other developers more harm than any other thing I’ve ever done. So what was so bad about the original version?

There was a pretty nasty bug related to object scoping in regards to singleton scoped objects that wasn’t really addressed outside of published workarounds until the 3.0 release 3-4 years later. The biggest issues though were performance and thread contention at runtime.

One of the original requirements for the nested container was to enable users to override service registrations in a nested container without having any impact on the original, parent container.* For context, the acceptance test for this behavior demonstrates what I mean. To pull this requirement off, I made the fateful decision to make a complete copy of the parent’s internal configuration model to pass into the nested container.

To illustrate why this turned out to be such an awful approach, see the code for the method PipelineGraph.ToNestedGraph() in the 2.6 branch that’s used to create the isolated configuration model for a new nested container. In particular, see how that code is making deep, programmatic clones of several structures. See also the code lock(this) that creates an exclusive lock around the parent object as it performs the work inside of the code block (I had to create a lock around the dictionary being cloned so that nothing else could alter that dictionary while I was in the process of iterating through it).** Doing the deep clone of the configuration models is expensive, especially when StructureMap was used in bigger applications. Even worse though, the shared lock that I had to do in order to copy the internal configuration structures meant that only one thread in the entire application could be creating a nested container at one time — which is a pretty big problem when you’re talking about a web application under significant load that wants to create an individual nested container for each unique HTTP request.

* FubuMVC exploits this ability to inject services that represent the current HTTP request into a nested container just before building the handlers for that HTTP request so that you can use constructor injection “all the way down.”

** Yes, I do buy that this is an example of where immutability can be valuable in concurrent code. Do also cut me a little bit of slack because this code was written long before .Net 4.0, the TPL, and the newer concurrent collection classes that came with it.

Re-architecting Nested Containers in 3.0 

It took about three years (and another year before a public release), but I was finally able to permanently fix (knock on wood!) the performance, thread contention, and scoping bugs related to the nested container feature in the 3.0 release. In my testing against one of our biggest codebases at work, I measured a two order of magnitude improvement in the time it took to create a new nested container. I was also able to completely eliminate the thread contention issues.

How was I able to do make those improvements? Heres the new version of PipelineGraph.ToNestedGraph() as it exists in the 3.0 code today — note that all it does is create a few new objects and pass in references to some existing objects. No deep cloning, no crazy data shuffling, and certainly no shared lock.

The nested container now has its own configuration model for its overrides and a reference to its parent’s configuration model. In the new world order, the nested container fulfills requests by using a sort of chain of responsibility pattern internally to locate the right action. If you ask a nested container for a service, it will:

  1. Look in its own configuration to see if it has an explicit override for that service. If one exists, build that configuration.
  2. If the nested container has no explicit override, it looks into its parent for the configuration for that service. Assuming one is found, the nested container builds out that configuration

This is over-simplified of course, but that’s the gist of the new structure and design.

I’ve had to revisit several of my OSS infrastructure projects lately (StructureMap’s nested container and shared locking problems, FubuMVC’s startup time, and now StoryTeller‘s throughput speed) to address performance issues. Depending upon my ambition level, I may write a blog post on those experiences.

How and Why?

So what factors might have led me to blunder so badly with the first implementation? What are the signs that I didn’t pick up on at the time that should have told me to go a different way than the flawed original approach? Why did it take me so long to get to a better state? How did I transform the StructureMap code into a very different internal structure to enable the better nested container performance? Why did I chicken out on a complete rewrite of StructureMap a few years back? In the next post I’ll attempt to answer those questions…

Ok, to be perfectly honest, I just ran out of steam and wanna hit “publish.” Till next time, laters.

Long Lived Codebases: The Challenges

I did a talk at CodeMash 2015 called “Lessons Learned from a Long Lived Codebase” that I thought went very well and I promised to turn into a series of blog posts. I’m not exactly sure how many posts it’s going to be yet, but I’m going to try to get them all out by the end of January. This is the first of maybe 4-5 theoretical posts on my experience evolving and supporting the StructureMap codebase over the past 11-12 years.

Some Background

In 2002 the big corporate IT shop I was working in underwent a massive “Dilbert-esque” reorganization that effectively trapped me in a non-coding architect role that I hated. I could claim 3-4 years of development experience and had some significant technical successes under my belt in that short time, but I’d mostly worked with the old COM-based Windows DNA platform (VB6, MTS, ADO, MSXML, ASP) and Oracle technologies right as J2EE and the forthcoming .Net framework seemed certain to dominate enterprise software development for the foreseeable future.

I was afraid that I was in danger of being made obsolete in my new role. I looked for some kind of project I could do out in the open that I could use to both level up on the newer technologies and prove to potential employers that “yes, I can code.” Being a pretty heavy duty relational database kinda guy back then, I decided that I was going to build the greatest ORM tool the world had ever seen on the new .Net platform. I was going to call it “StructureMap” to reflect its purpose of mapping the database to object structures. I read white papers, doodled UML diagrams like crazy, and finally started writing some code — but got bogged down trying to write an over-engineered configuration and modularity layer that would effectively allow you to configure object graphs in Xml. No matter, I managed to land a job with ThoughtWorks (TW) and off I went to be a real developer again.

During the short time that I worked at ThoughtWorks, Martin Fowler published his paper about Dependency Injection and Inversion of Control Containers and other folks at the company built an IoC container in Java called PicoContainer that was getting some buzz on internal message boards. I came to TW in hopes of being one of the cool kids too, so I dusted off the configuration code for my abandoned ORM tool and transformed that it into an IoC library for .Net during my weekly flights between Austin and Chicago. StructureMap was put into a production application in early 2004 and publicly released on SourceForge in June of 2004 as the very first production ready IoC tool on the .Net platform (yes, StructureMap is actually older than Windsor or Spring.Net even though they were much better known for many years).

Flash forward to today and there’s something like two dozen OSS IoC containers for .Net (all claiming to be a special snowflake that’s easier to use than the others while being mostly about the same as the others), at least three (Unity, MEF, and the original ObjectBuilder) from Microsoft itself with yet another brand new one coming in the vNext platform. I’m still working with and on StructureMap all these years later after the very substantial improvements for 3.o last year — but at this point very little remains unchanged from the early code. I’m not going to waste your time trying to sell you on StructureMap, especially since I’m going to spend so much time talking about the mistakes I’ve made during its development. This series is about the journey, not the tool itself.

What’s Changed around Me

Being 11 years old and counting, StructureMap has gone through a lot of churn as the technologies have changed and approaches have gone in and out of favor. If you maintain a big codebase over time, you’re very likely going to have to migrate it to newer versions of your dependencies, use completely different dependencies, or you’ll want to take advantage of newer programming language features. In no particular order:

  • StructureMap was originally written against .Net 1.1, but at the time of this post targets .Net 4.0 with the PCL compliance profile.
    • Newer elements of the .Net runtime like Task and Lazy<T> have simplified the code internals.
    • Lambdas as introduced in .Net 3.5 made a tremendous difference in the coding internals and had a big impact on the usage of the tool itself.
    • As I’ll discuss in a later post, the introduction of generics support into StructureMap 2.0 was like the world’s brightest spotlight shining on all the structural mistakes I made in the initial code structure of early StructureMap, but I’ll still claim that the introduction of generic types has made for huge improvements in StructureMap’s usability — and also one of the main reason why I think that the IoC tools in .Net are generally more usable than those in Java or Scala.
  • The build automation was originally done with NAnt, NUnit, and NMock. As my tolerance for Xml and coding ceremony decreased, StructureMap moved to using Rake and RhinoMocks. For various reasons, I’m looking to change the automation tooling yet again to modernize the StructureMap development experience.
  • StructureMap was originally hosted on SourceForge with Subversion source control. Releases were done in the byzantine fashion that SourceForge required way back then. Today, StructureMap is hosted on GitHub and distributed as Nuget packages. Nuget packages are generated as an artifact of each continuous integration build and manually promoted to Nuget.org whenever it’s time to do a public release. Nuget is an obvious improvement in distribution over manually created zip files. It is my opinion that GitHub is the single best thing to ever happen for Open Source Software development. StructureMap has received vastly more community contribution since moving to GitHub. I’m on record as being critical of the .Net community for being too passive and not being participatory in regards to .Net community tooling. I’m pleasantly surprised with how much help I’ve received from StructureMap users since the 3.0 release last year to fix bugs and fill in usability gaps.
  • The usage patterns and the architectures that folks build using StructureMap. In a later post I’ll do a deep dive on the evolution of the nested container feature.
  • Developer aesthetics and preferences, again, in a later post

 

Other People

Let’s face it, you and I are perfectly fine, but the “other” developers are the problem. In the particular case of a widely used library, you frequently find out that other developers use your tool in ways that you did not expect or anticipate. Frameworks that abstract the IoC container with some sort of adapter library have been some of the worst offenders in this regard.

The feedback I’ve gotten from user problems has led to many changes over the years:

  • All new features. The interception capabilities were originally to support AOP scenarios that I don’t generally use myself.
  • Changing the API to improve usability when verbiage is wrong
  • Lots and lots of work tweaking the internals of StructureMap as users describe architectural strategies that I would never think of, but do turn out to be useful — usually, but not always, involving open generic types in some fashion
  • New conventions and policies to remove repetitive code in the tool usage
  • Additional diagnostics to explain the outcome of the new conventions and policies from above
  • Adding more defensive programming checks to find potential problems faster. My attitude toward defensive programming is much more positive after supporting StructureMap over the years. This might apply more to tools that are configuration intense like say, an IoC tool.
  • A lot of work to improve exception messages (more on this later maybe)

 

One thing that should happen is to publish and maintain best practice recommendations for StructureMap. I have been upset with the developers of a popular .Net OSS tool who did, in my opinion, a wretched job of integrating StructureMap in their adapter library (to the point where I advise users of that framework to adopt a different IoC tool). Until I actually manage to publish the best practice advice to avoid the very problems they caused in their StructureMap usage, those problems are probably on me. Trying to wean users off of using StructureMap as a static service locator and being a little too extreme in applying a certain hexagonal architecture style have been constant problems on the user group over the years.

I’m not sure why this is so, but I’ve learned over the years that the more vitriolic a user is being toward you online when they’re having trouble with your tool, the more likely it is that they themselves are just doing something very stupid that’s not necessarily a poor reflection on your tool. If you ever publish an OSS tool, keep that in mind before you make the mistake of opening a column in your Twitter client just to spot references to your project or a keyword search in StackOverflow. I’ve also learned that users who have uncovered very real problems in StructureMap can be reasonable and even helpful if you engage them as collaborators in fixing the issue instead of being defensive. As I said earlier about the introduction of GitHub, I have routinely gotten much more assistance from StructureMap users in reproducing, diagnosing, and fixing problems in StructureMap over the past year than I ever had before.

 

Pull, not Push for New Features

In early 2008 I was preparing the grand StructureMap 2.5 release as the purported “Python 3000” release that was going to fix all the usability and performance issues in StructureMap once and for all time (Jimmy Bogard dubbed it the Duke Nukem Forever release too, but the 3.0 release took even longer;)). At the same time, Microsoft was gearing up for not one, but two new IoC tools (Unity from P&P and MEF from a different team). I swore that I wasn’t going down without a fight as Microsoft stomped all over my OSS tool, so I kicked into high gear and started stuffing StructureMap with new features and usability improvements. Those new things roughly fell into two piles:

  • Features or usability improvements I made based on my experience with using StructureMap on real projects that I knew would remove some friction from day to day usage. These features introduced in the 2.5 release have largely survived until today and I’d declare that many of them were successful
  • Things that I just thought would be cool, but which I had no immediate usage in my own work. You’ve already called it, much of this work was unsuccessful and later removed because it was either in the way, confusing to use, easily done in other ways, or most especially, a pain in the neck for me to support online because it wasn’t well thought out in the first place.

You have to understand that any feature you introduce is effectively inventory you have to support, document, and keep from breaking in future work. To reaffirm one of the things that the Lean Programming people have told us for years, it’s better to “pull” new features into your tool based on a demonstrated need and usage than it is to “push” a newly conceived feature in the hope that someone might find useful later.

 

Yet to come…

I tend to struggle to complete these kinds of blog series, but I do have the presentation and all of the code samples, so maybe I pull it off. I think that the candidates for following posts are something like:

  • A short discussion on backward compatibility
  • My documentation travails and how I’m trying to fix that
  • “Crimes against Computer Science” — the story of the nested container feature, how it went badly at first, and what I learned while fixing it in 3.0
  • “The Great Refactoring of Aught Eight”
  • API Usage Now and Then
  • Diagnostics and Exceptions

Thoughts from CodeMash 2015

I had a fantastic time at CodeMash last week and I’d like to think all of the organizers for the hard work they do making one of the best development community events happen each year. I was happy with how my talk went and I will get around to what looks like a 3 part blog series adaptation of it later this week after I catch up on other things. I had a much lighter speaking load than I’ve had in the past, leading to many more opportunities and energy to just talk to the other developers there. As best I can recall, here’s a smattering of the basic themes from those discussions:

  • Microservices — I lived through DCOM and all the sheer lunacy of the first wave of SOA euphoria, making me very leery of the whole micro-service concept. I did spoke to a couple people that I respect who were much more enthusiastic about micro services than I am.
  • Roslyn — I feel bad for saying this, but I’m tempering my hopes for Roslyn right now — but that’s largely a matter of me having had outsize expectations and hopes for Roslyn. I’m disappointed by the exclusion of runtime metaprogramming for now and I think the earlier hype about how much faster the Roslyn compiler would be compared to the existing CSC compiler might have been overstated. The improved ability to introspect your code with Roslyn is pretty sweet though.
  • RavenDb — I still love RavenDb conceptually and it’s my favorite database development time experience, but the quality issues and the poor DevOps tooling makes it a borderline liability in production. From my conversations with other RavenDb users at CodeMash, this seems to be a common opinion and experience. I hate to say it, but I’m in favor of phasing out RavenDb at work in the next couple years.
  • Postgresql — Postgres has a bit of buzz right now and I’ve been very happy with my limited usage of it so far. I talked to several people who were interested in using postgres as a pseudo document database. I’m planning to do a lot more side work with postgresql in the coming year to see how easy it would be to use it as more of a document database and add a .Net client to the event store implementation I was building for Node.js.
  • Programming Languages — The trend that I see is a blurring of the line between static and dynamic typing. Static languages are getting better and better type inference and dynamic languages keep getting more optional type declarations. I think this trend can only help developers over time. I do wish I hadn’t overslept the introduction to Rust workshop, but you can’t do everything.
  • OWIN — One of the highlights of CodeMash for me was fellow Texan Ryan Riley‘s history of the OWIN specification set to the Pina Colada song. For as insane as the original version of OWIN was, I think we’ll end up being glad that the OWIN community persevered through all the silliness and drama on their list to deliver something that’s usable today. I do still think Ryan needs to add “mystery meat” into his description of OWIN though.
  • Functional Programming — In the past, over zealous FP advocates have generally annoyed me the same way that I bet I annoyed folks online when Extreme Programming hype back in the day. The last time I was at CodeMash I remember walking out of a talk that was ostensibly about FP because it was nothing but a very long winded straw man argument against OOP done badly. This time around I enjoyed the couple FP talks I took in and appreciated the candor from the FP guys I spoke with. I do wish the FP guys would stop looking at the FP vs. OOP or imperative development comparison as a zero sum game and spend more time talking about the specific areas and problems where FP is valuable and much less time bashing everything else.

 

 

My Talk on Long-Lived Codebases at Codemash

When I go to conferences I see a lot of talks that follow the general theme of I/my team/our product/our process/this tool is wonderful. In contrast to that, I’m giving a talk Friday at Codemash entitled “Lessons Learned from a Long-Lived Codebase” about my experiences with the evolution of the StructureMap codebase in its original production usage in 2004 all the way up to the very different tool that it is today. Many structural weaknesses in your code structure and architecture don’t become apparent or even harmful until you either need to make large functional changes you didn’t foresee or the code gets used in far higher volume systems. I’m going to demonstrate some of the self-inflicted problems I’ve incurred over the years in the StructureMap and what I had to do to ameliorate those issues in the hopes of helping others dodge some of that pain.

Because StructureMap is such an old tool that’s been constantly updated, it’s also a great microcosm of how our general attitudes toward software development, designing API’s, and how we expect our tools to work have changed in the past decade and I’ll talk about some of that.

In specific, I’m covering:

  • Why duplication of even innocuous logic can be such a problem for changing the system later
  • Abstractions should probably be based strictly on roles and variations and created neither too early nor too late
  • Why your API should be expressed in terms of your user’s needs rather than in your own tool’s internals
  • Trying to pick names for concepts in your tool that are not confusing to your users
  • For the usage of any development tool that’s highly configurable there’s a tension between making the user be very explicit about what they want to do versus providing a much more concise or conventional usage that is potentially too “magical” for many users.
  • How the diagnostics and exception messages in StructureMap have evolved in continuous attempts to reduce the cost of supporting StructureMap users;)
  • Using “living documentation” to try to keep your basic documentation reasonably up to date
  • Organizing whatever documentation you do have in regards to what the users are trying to do rather than being a dry restatement of the implementation details
  • The unfortunate tension between backward compatibility and working to improve your tool
  • Automated testing in the context of a long lived codebase. What kinds of tests allow you to improve the internals of your code over time, where I still see value in TDD for fine grained design, and my opinions about how to best write acceptance tests that came out of my StructureMap work over the years
  • Lastly, the conversation about automated testing is a great sequeway to talk about the great “Rewrite vs. Improve In-Place” discussion I faced before the big StructureMap 3.0 release earlier this year.

This is a big update to a talk I did at QCon San Francisco in 2008. Turns out there was plenty more lessons to learn since then;)

I’d be willing to turn this talk into a series of more concrete blog posts working in the before and after code samples if I get at least 5-6 comments here from people that would be interested in that kind of thing.

Some Thoughts on the new .Net

The big announcements on the new ASP.Net runtime were made a couple weeks ago, but I write very slowly these days, so here’s my delayed reaction.

 

Less Friction and a Faster Feedback Cycle Maybe? 

I’ve started to associate .Net “classic” with seemingly constant aggravations like strong naming conflicts, csproj file merge hell, slow compilation, slow nuget restores, and how absurdly heavyweight and bloated that Visual Studio.Net has become over the years. Plenty of it was self-inflicted, but I felt at the end that way too much of our efforts in FubuMVC development was really about working around limitations in the .Net ecosystem or ASP.Net itself (the existing System.Web infrastructure is awful for modularity in my opinion and clearly shows its age and the MSBuild model is just flat out nuts).

In contrast, my brief exposure to Node.js development has been a breathe of fresh air. My preferred workflow is to just start mocha in “watch” mode in a terminal window with Growl notifications on and continuously code in Sublime. The feedback cycle between making a change in code to test results is astonishing fast compared to .Net (admittedly less awesome when using lots of generator functions or using WebPack to bundle up React.js JSX files or running tests in Karma) and the editor is snappy.

ASP.Net vNext is very clearly a reaction to the improved development workflows you can get with Node.js or Golang. In particular, I’m positive or at least hopeful that:

  • The Roslyn compiler is fast enough that auto-test tools in .Net “feel” much snappier
  • The new package resolution functionality is much faster than the current Nuget
  • The new runtime is much more command line friendly. I do particularly like how easy they’ve made it to expose commands from Nuget packages in the project.json declaration. That was a big irritation to me in the existing Nuget that we eventually solved by moving to gems instead for CLI tools.
  • The new project.json model is so much cleaner and simpler than the existing MSBuild schema. I hope that translates to easier tooling for scaffolding and much less merge hell than .Net classic
  • I’m impressed with how much activity is happening around the OmniSharp project, because I’d like to be productive in C# without having to fire up Visual Studio.Net at all — or at least only when I want ReSharper to do a lot of heavy refactoring work.
  • The abomination that was strong naming is gone

 

The Spirit of Openness

I like that the ASP.Net team has posted their code for their next generation work on GitHub. The best part of that for me is that they moved the code over before it was completely baked and you can actually see what they’re up to. Even crazier to me is that the ASP.Net team seems to actually be listening to the feedback (I’ve felt in the past that the ASP.Net team hasn’t been all that receptive to feedback).

Compare that to how the tool that became Nuget was developed. Nuget was largely conceived and built in secret at the same time that a pair of community efforts were trying to build out a new package management strategy for .Net (Seb’s OpenWrap and the Nu project to adapt RubyGems to .Net that I was somewhat involved in). Contrary to the pleasant sounding legend that’s grown up around it, Nuget is not a continuation of a community project. Nuget partially took over the name and asked the guys doing Nu to just stop.

The Nuget episode is irritating to me, not just because of how they de facto killed off the community efforts, but because I think they made some design decisions that caused some severe problems for usage (I described the problems I saw and my recommendations for Nuget earlier this year). Maybe, just maybe, the usage and performance problems could have been solved by more contributions, awareness, and feedback from the larger community if Nuget had been built in the open.

 

Kill the Insider Groups

Most of us didn’t known anything about Nuget or the new ASP.Net vNext platform before it was announced (I did, but not through official channels), but Microsoft has several official “Insider” groups that get much more advance warning and theoretically provide feedback to Microsoft on proposals and early work. I’d like to propose right now that these self-selected “insider” groups be disbanded and much more of these conversations happen in more public forums. My reasoning is pretty simple: there are far more strong developers outside of Redmond and that list than there are on the official list. My fear is that those types of insider, true believer clubs can too easily turn into echo chambers with little exposure to the world outside of .Net — and I feel that my limited involvement with that list frankly confirms that suspicion.

I was not a member of the ASP.Net Insider list until somewhat recently. If I had been aware of the plans for “Project K” earlier, I would have shut down FubuMVC development much earlier than I did because so much of the improvements in ASP.Net vNext make some very large development efforts we did in FubuMVC irrelevant or unnecessary.

 

OSS and the Community Thing

I’m guilty of spreading the “.Net community sucks” meme, but that’s not precisely true, it’s just different than other communities. What makes .Net so much different than many other development communities is how the majority of it revolves around what one particular company is doing.

So much of .Net is open source now and they even take contributions. Awesome, great, but my very first reaction was that it doesn’t matter much because the .Net community as a whole isn’t as participatory as other communities and that would have to change before ASP.Net vNext being OSS matters. It’ll be interesting to me to see if that changes over time.

With few exceptions (NancyFx comes to mind), OSS projects in .Net struggle to get an audience and build a community. I just don’t think there are enough venues for community to coalesce around OSS efforts in .Net, and I think that’s the reason why we have so much duplication of OSS projects without any one alternative getting much involvement beyond the initial authors. I do suspect that the .Net OSS community is much better in Europe now than North America, but that might just me having become such a homebody over the past 5 years.

 

Mono is More Viable and Docker(!)

I thought that trying to support running FubuMVC on Mono was a borderline disaster, to the point where I didn’t want to touch Mono again afterward (it’s a longer conversation, but beyond the normal aggravations like the file system differences between Windows and *nix we hit several Mono bugs). With the recent announcements about open sourcing so much more of the .Net runtime, the PCL standardization, and the better collaboration between Microsoft and Xamarin on display in Miguel de Icaza’s blogpost on the .Net announcements, I think that building applications on server side Mono is going to be much more viable.

Like just about every development shop around, we’re very interested in using Docker for deployments. The ability to eventually deploy our existing .Net applications as Docker containers could be a very big win for us.

 

We wanna code on the Mac

I resisted switching over to Mac’s for years, even when so many of my friends were doing even .Net development on their Macs, just because I was too cheap. After a couple years now of using a Mac, I’d really prefer to stay on that side of things and hopefully give my Windows VM much more time off. Mac OS being a first class citizen for the new .Net and the progress on the OmniSharp tools for Sublime or MacVim is going to make the new ASP.Net vNext runtime a much easier sell in my shop.

 

I think we might just head back to the new .Net

We’ve been working toward eventually moving off of .Net and I have been trying to concentrate my side work efforts on Node.js related things, but the truth of the matter is that we have a huge amount of existing code on .Net and a heavy investment in .Net OSS tools that we just can’t make go away overnight. If the feedback cycle is really much better in the new runtime, we can code on Macs, and use Docker, our thinking is that we’re still better off in the end to just adopt the better .Net instead of rewriting the existing systems on another platform.

 

vNext & Me

I have no intentions of restarting FubuMVC itself, but we’ve put together a concept for a  partial reboot called “Jasper” that may or may not become reality someday. I do want to move our FubuTransportation service bus to vNext next year. At work we *might* try to extend the new version of ASP.Net MVC to act more like FubuMVC to make migrating our existing applications easier if we don’t try to do the Jasper idea. StructureMap 3 is already PCL compliant, so it *should* be easy to get up and going on vNext.

In the meantime, I think I’m going to start a blog series just reviewing bits of the code in the ASP.Net GitHub organizations to get myself more familiar with the new tooling.

 

StructureMap 3 Documentation

Shockingly, my efforts to complete the documentation on StructureMap 3 have taken much, much longer than I had hoped — but there’s some real progress worth talking about. This time around, I’m adopting the idea of “living documentation” where the code samples are taken directly out of the code in the main GitHub repository at publishing time so that the documentation never gets out of sync with the code. For the most part, I’m using unit test code to demonstrate API usage in the documentation with the thinking there that the resulting documentation is much less ambiguous and again, cannot be out of sync with how the code actually works as long as those unit tests are passing.

If you’re curious, I’ve been using our (now abandoned) FubuDocs project with FubuMVC.CodeSnippets to author and publish the documentation. All the documentation is in GitHub in the StructureMap.Docs project (and yes, I certainly take pull requests).

The new documentation has moved to GitHub pages hosting at http://structuremap.github.io. I’m not sure what’s going to happen to the old structuremap.net website yet.

Some highlights of the documentation so far:

 

I’m committed to finishing the documentation, but I’m obviously not sure when it will be complete. I’d like to say it’s “done” before starting any significant new OSS project and I’m using that to force myself to finally finish;)

How We Do Strong Typed Configuration

TL;DR: I’ve used a form of “strong typed configuration” for the past 5-6 years that I think has some big advantages over the traditional approach to configuration in .Net. I still like this approach and I’m seeing something similar showing up in ASP.Net vNext as well. I’m NOT trying to sell you on any particular tool here, just the technique and concepts.

What’s Wrong with Traditional .Net Configuration?

About a decade ago I came late into a project where we were retrofitting some sort of new security framework onto a very large .Net codebase.* The codebase had no significant automated test coverage, so before we tried to monkey around with its infrastructure, I volunteered to create a set of characterization tests**. My first task was to stand up a big chunk of the application architecture on my own box with a local sql server database. My first and biggest challenge was dealing with all the configuration needs of that particular architecture (I’m wanting to say it was >75 different items in the appSettings collection).

As I recall, the specific problems with configuration were:

  1. It was difficult to easily know what configuration items any subset of the code (assemblies, classes, or subsystems) needed in order to run without painstakingly tracing the code
  2. It was somewhat difficult to understand how items in very large configuration files were consumed by the running code. Better naming conventions probably would have helped.
  3. There was no way to define configuration items on the fly in code. The only option was to change entries in the giant web.config Xml file because the code that used the old System.Configuration namespace “pulled” its configuration items when it needed configuration. What I wanted to do was to “push” configuration to the code to change database connections or file paths in test scenarios or really any time I just needed to repurpose the code. In a more general sense, I called this the Pull, Don’t Push rule in my CodeBetter days.

From that moment on, I’ve been a big advocate of approaches that make it much easier to both trace configuration items to the code that consumes it and also to make application code somehow declare what configuration it needs.

 

Strong Typed Configuration with “Settings” Classes 

For the past 5-6 years I’ve used an approach where configuration items like file paths, switches, url’s, or connection information is modeled on simple POCO classes that are suffixed with “Settings.” As an example, in the FubuPersistence package we use to integrate RavenDb with FubuMVC, we have a simple class called RavenDbSettings that holds the basic information you would need to connect to a RavenDb database, partially shown below:

    public class RavenDbSettings
    {
        public string DataDirectory { get; set; }
        public bool RunInMemory { get; set; }
        public string Url { get; set; }
        public bool UseEmbeddedHttpServer { get; set; }


        [ConnectionString]
        public string ConnectionString { get; set; }

        // And some methods to create in memory datastore's
        // or connect to the specified external datastore
    }

Setting aside how that object is built up for the moment, the DocumentStoreBuilder class that needs the configuration above just gets this object through simple constructor injection like so: new DocumentStoreBuilder(new RavenDbSettings{}, ...);. The advantages of this approach for me are:

  • The consumer of the configuration information is no longer coupled to how that information is resolved or stored in any way. As long as it’s giving the RavenDbSettings object in its constructor, DocumentStoreBuilder can happily go about its business.
  • I’m a big fan of using constructor injection as a way to create traceability and insight into the dependencies of a class, and injecting the configuration makes the dependency on configuration from a class much more transparent than the older “pull” forms of configuration.
  • I think it’s easier to trace back and forth between the configuration items and the code that depends on that configuration. I also feel like it makes the code “declare” what configuration items it needs through the signature of the Settings classes

 

Serving up the Settings with an IoC Container

I’m sure you’ve already guessed that we just use StructureMap to inject the Settings objects into constructor functions. I know that many of you are going to have a visceral reaction to the usage of an IoC container, and while I actually do respect that opinion, it’s worked out very well for us in practice. Using StructureMap (I think most of the other IoC containers could do this as well) we get a couple big benefits in regards to default configuration and the ability to swap out configuration at runtime (mostly for testing).

Since the Settings classes are generally concrete classes with no argument constructors, StructureMap can happily build them out for you even if StructureMap has no explicit registration for that type. That means that you can forgo any external configuration or StructureMap configuration and your code can still work as long as the default values of the Settings class is useful.  To use the example of RavenDbSettings from the previous section, calling new RavenDbSettings() creates a configuration that will connect to a new embedded RavenDb database that stores its data to the file system in a folder called /data parallel to the project directory (you can see the code here).

The result of the design above is that a FubuMVC/FubuTransportation application was completely connected to a working RavenDb database by simply installing the FubuMVC.RavenDb nuget with zero additional configuration.

I demoed that several times at conferences last year and the audiences seemed to be very unimpressed and disinterested. Either that’s not nearly as impressive as I thought it was, too much magic,  I’m not a good presenter, or they don’t remember what a PITA it used to be just to install and configure everything you needed just to get a blank development database going. I still thought it was cool.

The other huge advantage to using an IoC container to deliver all the configuration to consumers is how easy that makes it to swap out configuration at runtime. Again going to the RavenDbSettings example, we can build out our entire application and swap out the RavenDb connection at will without digging into Xml or Json files of any kind. The main usage has been in testing to get a clean database per test when we do end to end testing, but it’s also been helpful in other ways.

 

So where does the Setting data come from?

Alright, so the actual data has to come from somewhere outside the codebase at some point (like every developer of my age I have a couple good war stories from development teams hard coding database connection strings directly into compiled code). We generally put the raw data into the normal appSettings key/value pairs with the naming convention “[SettingsClassName].[PropertyName].” The first time a Settings object is needed within StructureMap, we read the raw key/value data from the configuration file and use FubuCore’s model binding support to create the object and do all the type coercion necessary to create the Settings object. An early version of this approach was described by my former colleague Josh Flanagan way back in 2009. The actual mechanics are in the FubuMVC.StructureMap code in the main fubumvc repository.

Some other points:

  • The data source for the model binding in this configuration setup was pluggable, so you could use external files or anything else that could be exposed as key/value pairs. We generally just use the appSettings support now, but on a previous project we were able to centralize the configuration files to a common location for several related processes
  • The model binding is flexible enough to support deep objects, enumerable properties, and just about anything you would need to use in your Settings objects
  • We also supported a model where you could combine key/value information from multiple sources, but layer the data in precedence to enable overrides to the basic configuration. My goal with that support was to avoid making all customer or environment specific configuration be in the form of separate override files without having to duplicate so much boilerplate configuration. In this setup, when the Settings objects were bound to the raw data, profile or environment specific information just got a higher precedence than the default configuration.
  • As of FubuMVC 2.0, if you’re using the StructureMap integration, you no longer have to do anything to setup this settings provider infrastructure. If StructureMap encounters an unknown dependency that’s a concrete type suffixed by “Settings,” it will try to resolve it from the model binding via a custom StructureMap policy.

 

Programmatic Configuration in FubuMVC 

We also used the “Settings” configuration idea to programmatically specify configuration for features inside the application configuration by applying alterations to a Setting object like this code from one of our active projects:

// Sets up a custom authorization rule on any diagnostic 
// pages
AlterSettings<DiagnosticsSettings>(x => {
	x.AuthorizationRights.Add(new AuthorizationCheckPolicy<InternalUserPolicy>());
});

// Directs FubuMVC to only look for client assets
// in the /public folder
AlterSettings<AssetSettings>(x =>
{
	x.Mode = SearchMode.PublicFolderOnly;
});

The DiagnosticSettings and AssetSettings objects used above are injected into the application container as the application bootstraps, but only after the alterations show above are applied. Behind the scenes, FubuMVC will first resolve the objects using any data found in the appSettings key/value pairs, then apply the alternation overrides in the code above. Using the alteration lambdas instead of just injecting the Settings objects directly allowed us to also embed settings overrides in external plugins, but ensure that the main application overrides always “win” in the case of conflicts.

I’m still happy with how this has turned out in real usage and I’ve since noticed that ASP.Net vNext uses a remarkably similar mechanism to configure options in their IApplicationBuilder scheme (think SetupOptions<MvcOptions>(Action<MvcOptions>)). I’m interested to see if the ASP.Net team will try to exploit that capability to provide much better modularity than the older ASP.Net MVC & Web API frameworks.

 

 

Other Links

 

* That short project was probably one of the strongest teams I’ve ever been a part of in terms of talented individuals, but it also spawned several of my favorite development horror stories (massive stored procedures, non-coding architects, the centralized architect team anti-pattern, harmful coding standards, you name it). In my career I’ve strangely seen little correlation between the technical strength of the development team (beyond basic competence anyway) and the resulting success of the project. Environment, the support or lack thereof from the business, and the simple wisdom of doing the project in the first place seem to be much more important.

** As an aside, that effort to create characterization tests as a crude regression test suite did pay off and we it did find some regression errors after we started making the bigger changes with that test suite. I think the Feathers’ playbook for legacy systems, where I got the inspiration for those characterization tests, is still very relevant some 10+ years later.

Transaction Scoping in FubuMVC with RavenDb and StructureMap

TL;DR – We built unit of work and transactional boundary mechanics into our RavenDb integration with FubuMVC (with StructureMap supplying the object scoping) that I’m still happy with and I’d consider using the same conceptual design again in the future.

Why write this now long after giving up on FubuMVC as an OSS project? We still use FubuMVC very heavily at work, I’m committed to supporting StructureMap as long as it’s relevant, and RavenDb is certainly going strong. I’ve seen other people trying to accomplish the same kind of IoC integration and object scoping that I’m showing here on other frameworks and this might be useful to those folks. Mostly though, this post might benefit some of our internal teams and almost every enterprise software project involves some kind of transaction management.

Transaction Boundaries

If you’re working on any kind of software system that persists state, you probably need to be consciously aware of the ACID rules (AtomicityConsistencyIsolationDurability) to determine where your transaction boundaries should be. Roughly speaking, you want to prevent any chance of a system getting into an invalid state because some database changes succeeded while other database changes within the same logical operation fail. The canonical example is transfering money between two different bank accounts. For that use case, you’re probably making at least three different changes to the underlying database:

  1. Subtract the amount of the transfer from the first account
  2. Add the amount of the transfer to the second account
  3. Write some sort of log record for the transfer

All three database operations above need to succeed together, or all three should be rolled back to prevent your system from getting into an invalid state. Failing to manage your transaction boundaries could result in the bank customer either losing some of their funds or the bank magically giving more money to the consumer. Following the ACID rules, we would make sure that all three operations are within the same database transaction so that they all have to succeed or fail together.

Using event sourcing and eventual consistency flips this on its head a little bit. In that case you might just capture the log record, then rebuild the “read side” views for the two accounts asynchronously to reflect the transfer. Needless to say, that can easily generate a different set of problems when there’s any kind of lag in the “eventual.”

The Unit of Work Pattern

Historically, one of the easier ways to collect and organize database operations into transactions has been the Unit of Work pattern that’s now largely built into most every persistence framework. In RavenDb, you use the IDocumentSession service as the logical unit of work. Using RavenDb, we can just collect all the documents that need to be persisted within a single transaction by calling the Delete() or Store() methods. When we’re ready to commit the entire unit of work as a transaction, we just call the SaveChanges() method to execute all our batched up persistence operations.

One of the big advantages to using the unit of work pattern is the ability to easily coordinate transaction boundaries across multiple unrelated classes, services, or functions — as long as each is working against the same unit of work object. The challenge then is to control the object lifecycle of the RavenDb IDocumentSession object in the lifecycle of a logical operation. While you can certainly use some sort of transaction controller (little ‘c’) to manage the lifecyle of an IDocumentSession object and coordinate its usage within the actual workers, we built IDocumentSession lifecycle handling directly into our FubuMVC.RavenDb library so that you can treat an HTTP POST, PUT, or DELETE in FubuMVC or the handling of a single message in the FubuTransportation service bus as a logical unit of work and simply let the framework itself do all your transaction boundary kind of bookkeeping for you.

IDocumentSession Lifecycle Management 

First off, FubuMVC utilizes the StructureMap Nested Container feature to create an isolated object scope per HTTP request — meaning that any type registered with the default lifecyle that is resolved by StructureMap within a request (or message) will be the same object instance. In the case of IDocumentSession, this scoping means that every concrete type built by StructureMap will have the same shared instance of IDocumentSession.

To make it concrete, let’s say that you have these three classes that all take in an IDocumentSession as a constructor argument:

    public class RavenUsingEndpoint
    {
        private readonly IDocumentSession _session;
        private readonly Worker1 _worker1;
        private readonly Lazy<Worker2> _worker2;

        public RavenUsingEndpoint(IDocumentSession session, Worker1 worker1, Lazy<Worker2> worker2)
        {
            _session = session;
            _worker1 = worker1;
            _worker2 = worker2;
        }

        public string post_session_is_the_same()
        {
            _session.ShouldBeTheSameAs(_worker1.Session);
            _session.ShouldBeTheSameAs(_worker2.Value.Session);

            return "All good!";
        }
    }

    public class Worker1
    {
        public IDocumentSession Session { get; set; }

        public Worker1(IDocumentSession session)
        {
            Session = session;
        }
    }

    public class Worker2 : Worker1
    {
        public Worker2(IDocumentSession session) : base(session)
        {
        }
    }

When a FubuMVC request executes that uses the RavenUsingEndpoint.post_session_is_the_same() method, FubuMVC is creating a RavenUsingEndpoint object, a Worker1 object, a Lazy<Worker2> object, and one single IDocumentSession object that is injected both into RavenUsingEndpoint and Worker1. Additionally, if the Lazy<Worker2> is evaluated, a Worker2 object will be created from the active nested container and the exact same IDocumentSession will again be injected into Worker2. With this type of scoping, we can use RavenUsingEndpoint, Worker1, and Worker2 in the same logical transaction without worrying about how we’re passing around IDocumentSession or dealing with any sort of less efficient transaction scoping tied to the thread.

TransactionalBehavior

So step 1, and I’d argue the hardest step, was making sure that the same IDocumentSession object was used by each handler in the request. Step 2 is to execute the transaction.

Utilizing FubuMVC’s Russian Doll model, we insert a new “behavior” into the request handling chain for simple transaction management:

    public class TransactionalBehavior : WrappingBehavior
    {
        private readonly ISessionBoundary _session;

        // ISessionBoundary is a FubuMVC.RavenDb service
        // that just tracks the current IDocumentSession
        public TransactionalBehavior(ISessionBoundary session)
        {
            _session = session;
        }

        protected override void invoke(Action action)
        {
            // The 'action' below is just executing
            // the handlers nested inside this behavior
            action();

            // Saves all changes *if* an IDocumentSession
            // object was created for this request
            _session.SaveChanges();
        }
    }

The sequence of events inside a FubuMVC request using the RavenDb unit of work mechanics would be to:

  1. Determine which chain of behaviors should be executed for the incoming url (routing)
  2. Create a new StructureMap nested container
  3. Using the new nested container, build the entire “chain” of nested behaviors
  4. The TransactionalBehavior starts by delegating to the inner behaviors, one of which would be…
  5. A behavior that invokes the RavenUsingEndpoint.post_session_is_the_same() method
  6. The TransactionBehavior object, assuming there are no exceptions, executes all the queued up persistence actions as a single transaction
  7. At the end of the request, FubuMVC will dispose the nested container for the request, which will also call Dispose() on any IDisposable object created by the nested container during the request, including the IDocumentSession object.

How does the TransactionalBehavior get applied?

In the early days we got slap happy with conventions and the simple inclusion of the FubuMVC.RavenDb library in the applications bin path was enough to add the TransactionBehavior to any route that responded to the PUT, POST, or DELETE http verbs. For FubuMVC 2.0 I made this behavior “opt in” instead of being automatic, so you just had to either include a policy to specify which routes or message handling chains needed the transactional semantics:

    public class NamedEntityRegistry : FubuRegistry
    {
        public NamedEntityRegistry()
        {
            Services(x => x.ReplaceService(new RavenDbSettings { RunInMemory = true}));

            // Applies the TransactionalBehavior to all PUT, POST, DELETE routes
            Policies.Local.Add<TransactionalBehaviorPolicy>();
        }
    }

You could also use attributes for fine grained application if there was a need for that.

Explicit Transaction Boundaries

I fully realize that many people will dislike the API usage with the nested closure below, but it’s worked out well for us in my opinion — especially when it’s rarely used as the exception case rather than the normal every day usage.

All the above stuff is great as long as your within the context of an HTTP request or a FubuTransportation message that represents a single logical transaction, but that leaves plenty of circumstances where you need to explicitly control transaction boundaries. To that end, we’ve used the ITransaction service in FubuPersistence to mark a logical transaction boundary by again executing a logical transaction within the scope of a completely isolated nested container for the operation.

Let’s say you have a simple transaction worker class like this one:

    public class ExplicitWorker
    {
        private readonly IDocumentSession _session;

        public ExplicitWorker(IDocumentSession session)
        {
            _session = session;
        }

        public void DoSomeWork()
        {
            // do something that results in writes
            // to IDocumentSession
        }
    }

To execute the ExplicitWorker.DoSomeWork() method in its own isolated transaction, I would invoke the ITransaction service like this:

// Just spinning up an empty FubuMVC application
// FubuMVC.RavenDb is picked up automatically by the assembly
// being in the bin path
using (var runtime = FubuApplication.DefaultPolicies().StructureMap().Bootstrap())
{
    // Pulling a new instance of the ITransaction service
    // from the main application container
    var transaction = runtime.Factory.Get<ITransaction>();
            
    transaction.Execute<ExplicitWorker>(x => x.DoSomeWork());
}

At runtime, the sequence of actions is:

  1. Create a new StructureMap nested container
  2. Resolve a new instance of ExplicitWorker from that nested container
  3. Execute the Action<ExplicitWorker> passed into the ITransaction against the newly created ExplicitWorker object to execute the logical unit of work
  4. Assuming that the action didn’t throw an exception, submit all changes to RavenDb as a single transaction
  5. Dispose the nested container

Use this strategy when you need to execute transactions outside the scope of FubuMVC HTTP requests or FubuTransportation message handling, or anytime when it’s appropriate to execute multiple transactions for a single HTTP request or FT message.

Building an EventStore with User Defined Projections on top of Postgresql and Node.js

EDIT 3/28/2016: Since this blog post still gets plenty of reads, here’s an update. The work described here never got used in a real project (#sadtrombone), but fear not, the projection support is going to live again in a new project called Marten that seeks to turn Postgresql into a very functional Document Db and Event Store with projections.

I did an internal talk on the tooling and concepts in this post at our Salt Lake City office a couple months ago. The recording, for what it’s worth, is here. I’m assuming that you’re at least somewhat familiar with the concepts of Event Sourcing and CQRS, but if you’re not, there are links to descriptive explanations of these concepts in the body of the post.

For most of the 2000’s my goto strategy for application persistence was to use some sort of object relational mapping to persist and read the object structures that I wanted to work with in my code. Sometimes I used hand rolled code to do the mapping, and other times my teams used NHibernate. In the past couple years I’ve been on projects that used the RavenDb document database with mixed success. I’ve also worked on a couple codebases that used an event sourcing strategy to persist meaningful business events, sometimes with RavenDb as the underlying storage engine and another project that uses an older version of NEventStore with Sql Server as the storage mechanism.

For various reasons, we’ve chosen to use a Node.js based stack to rewrite an old WPF application that is a suitable candidate for event sourcing on the backend (Corey Kaylor explained his take on this decision in a blog post). Since we already wanted to replace Sql Server (and probably RavenDb) with Postgresql in the long run, at Corey’s suggestion I have been working on and off to try leveraging  to create a new event store suitable for Node.js development that supports user-defined projections. Lacking all originality, I’m calling this new library “pg-events.”  You can find pg-events hosted under my GitHub account (my very first foray back into OSS post-FubuMVC).

 

Feature Set

  • Support the basic event sourcing pattern by appending the raw business events as JSON to the event store
  • Track events by a “stream” of related events that probably relates directly to some kind of business concept or workflow
  • Support user-defined projections of the raw event data to create “read side” views for clients
  • Support aggregated views of a stream (really just another projection). Use a basic snapshotting strategy of the aggregate state for efficiency
  • Build time tooling to initialize a postgresql database with the custom schema objects and import javascript libraries to postgresql
  • A crude, partial implementation of CommonJS that runs within postgresql

 

Conceptual Architecture

The first thing to know is that we’re making a very large bet on the portability of Javascript code and the ability to run at least a subset of this new event store code hosted in Postgresql, Node.js, embedded in other programming, or even potentially in a browser. The user-defined projections could potentially be executed in any of the pieces below, and we think that flexibility will pay off down the road for both performance and scalability tuning.

So far, I think the end state is going to consist of these four pieces:

 

pg-events

  1. Postgresql Database
    1. Custom schema objects largely based on Greg Young’s Building an Event Store paper to store events and stream metadata.
    2. Tables to persist projected views. Most projection views will be persisted to separate tables instead of one giant “pge_views” table for better query performance
    3. Stored procedures to update and query data in the event storage tables, mostly using postgresql’s Javascript support
    4. Executes and updates aggregate snapshots and synchronous projections. More on that in the following section.
  2. A Node.js Client
    1. Exposes methods to append events to the store
    2. Exposes methods to query for projected views, aggregate snapshots, and raw event stream data
    3. See https://github.com/jeremydmiller/pg-events/wiki/Client for more information
  3. Admin CLI Tool
    1. Build the necessary schema objects into a Postgresql
    2. Loads the user defined projections and other pg-events libraries into the database
    3. Can reset the event storage and projected view tables to an empty state for testing
    4. Eventually, this tool will also support “recapitulation” to rebuild projection data from the raw events when the definition of a projection changes
  4. Background Projection Runner
    1. Executes and updates projected views in a background process. This is my very next coding adventure. I’m going to build it out first with Node.js, then try my hand at implementing it again with a standalone Golang executable that uses an embedded V8 engine to execute the projections. Expect my twitter feed to be entertaining when I’m able to start that work. I’ll blog about this later when I know what it’s going to actually look like;)

 

 

User Defined Projections 

We looked at EventStore at first and I definitely liked their first class support for user defined projections. Our implementation of projections is very obviously influenced by EventStore’s.

I think that the event sourcing efforts I’ve been a part of have been successful overall, but “projecting” the raw event stream into a persisted read side or view model has been challenging. For pg-events, we’re expressing the projections with simple transformation functions that will take in the initial state and the raw event data and simply return the new state (it’s a logical fold left operation for the projections that work across multiple events).

For a sample event sourcing domain, I’ve been using the idea of a quest from the way too many fantasy books I’ve read over my lifetime. During a quest, our heroes might record events like “QuestStarted”, “MembersJoined”, “MembersDeparted”, or “TownReached.” To know or understand the exact composition of a quest party at any time, we need to replay some of the change events (Gandalf stayed behind to fight the Balrog, Boromir was killed, Frodo and Sam ran off, Gollum joined up, etc.) for the quest.

Say we write a projection for a new view across the events in a single quest called “Party” just to understand the membership. From the unit tests, that projection looks like:

require("../../lib/projections")
	.projectStream({
		name: 'Party',
		stream: 'Quest', 
		mode: 'sync',

		$init: function(){
			return {
				active: true,
				traveled: 0,
				location: null,
				members: []
			}
		},

		QuestStarted: function(state, evt){
			state.active = true;
			state.location = evt.location;
			state.members = evt.members.slice(0);

			state.members.sort();
		},

		TownReached: function(state, evt){
			state.location = evt.location;
			state.traveled += evt.traveled;
		},

		EndOfDay: function(state, evt){
			state.traveled += evt.traveled;
		},

		QuestEnded: function(state, evt){
			state.active = false;
			state.location = evt.location;
		},

		MembersJoined: function(state, evt){
			state.members = state.members.concat(evt.members);
			state.members.sort();
		},

		MembersDeparted: function(state, evt){
			state.location = evt.location;

			for (var i = 0; i < evt.members.length; i++){
				var index = state.members.indexOf(evt.members[i]);
				state.members.splice(index, 1);
			}

			state.members.sort();
		}


	});

You’ll notice that there’s a field called “mode” with a value of “sync.” Using the portability of Javascript, we’re planning for these modes:

  1. sync – A ‘sync’ projection will be executed synchronously inside postgresql within the same transaction as the event capture
  2. async – In progress. An ‘async’ projection will be calculated in a background process instead of at event capture time (Eventual Consistency).
  3. live – Forthcoming. These projections will only be calculated upon demand. I’m not yet sure if we’ll do the actual projection transformations within the database or the Node.js client. I guess we could allow two different “live” modes if there’s value in doing that.

So, eventual consistency killing you in your current event sourcing efforts because you hit errors by querying off of stale data? Opt for synchronous projections. Have lots of writes, but relatively few reads? Use asynchronous or even live projections that are only calculated on demand. Have lots of reads but very few writes? I think I would again opt for synchronous projections.

I worked on a system a couple years ago in a failed startup that ran projections in a browser to do historical point in time simulations. I don’t see any reason why we couldn’t do something similar in pg-events if that is ever valuable.

Believe it or not, I have a decent start on documenting the projection support at https://github.com/jeremydmiller/pg-events/wiki/Projections.

 

Why Postgresql? 

Postgresql is the people pleaser of database engines. Want all the normal RDBMS capabilities? Would you be more productive using Postgresql as a document database? Want to write stored procedures with a language that closely resembles Oracle’s PL/SQL (no, a thousand times no, never again)?  Would you even want to use Javascript inside the database itself? Regardless of how you answered any of those questions, Postgresql is trying really hard to be what you want. In our case, I like that we can use postgresql as an event store, a document database for things that don’t fit into the event sourcing model, and a classic RDBMS if that’s what we want in some circumstances.

Mostly though, we like that Postgresql has a proven track record and we suspect that the DevOps support tools will be more effective than we’ve experienced with other OSS database tools.

Of course, the only reason why pg-events is viable in the first place is that postgresql has outstanding JSON support and the ability to author stored procedures with Javascript using Google’s V8 engine. With our project timeline, it’s also safe to assume that Postgresql 9.4 with its significant improvements to the JSON storage will be available before we go live to production.

 

Why CQRS isn’t crazy

I’ll feely admit that the first time I saw Greg Young talking about the Command Query Responsibility Segregation (CQRS) style of architecture in 2008 I thought it was nuts. Specifically, I was afraid that doing the transformations between the “write side” model and the “read side” model consumed by the clients would lead to far too much repetitive “left hand, right hand” code. The reality is, of course, is that I was already doing a lot of work to map database tables to object graphs, transforming domain model objects to DTO’s to send over the wire, and crafting database views to transform our raw data into something more conducive to reporting requirements. In a way, CQRS just explicitly calls out a large part of software development efforts that is often overlooked. If we simply accept the idea that different consumers and producers of the persisted state in our system naturally have different needs as far as how the same information is written, structured, and consumed, CQRS isn’t really “crazy talk” or extra work. One of the biggest differences is that with event sourcing + CQRS you probably try to pre-build and persist the read side views instead of trying to create views or DTO’s on the fly from the “one true database model.”

 

Some thoughts on Relational Databases

I’m very much in the camp that says that the database is strictly for persistence and your business logic and/or user interface should never be tightly coupled to whatever the database is, so the idea of just consuming the raw database tabular data in business logic code is a non-starter for me — not to mention that a flat database table structure is very rarely the exact structure that you’d want in your business logic code outside of CRUD-centric applications. I’ve been a part of technical arguments with database-centric folks for so long that I’m simply happy to say “agree to disagree” on these issues and let us all go on our way.

There’s a tremendous amount of inertia and investment in tooling in our industry in regards to the usage of relational databases as the de facto standard for just about all persistence needs. Additionally, most developers, testers, and even the business people seem to naturally understand the relational database model. Even so, as alternative models like document or graph databases build up more tooling, acceptance, and developer familiarity, I think that relational databases will eventually be consigned to reporting applications or pure CRUD applications (but even then I prefer document databases).

That being said, I think that the future really is “polyglot persistence” and that our children are going to laugh at us in decades to come when we explain how we built systems against relational databases.