Initial thoughts on some new-fangled things part 1

I’ve been lucky over the past year and change to work with some interesting projects that used some of the newer technologies and architectural concepts like Command Query Responsibility Segregation (CQRS), Event Sourcing, Eventual Consistency, and RavenDb as a document database. I cannot speak to the scalability benefits of these tools because that’s just not an area where I have expertise. Instead, I’m interested in how these tools have reduced coding ceremony, improved testability, and allowed my very small teams to effectively do continuous design by giving us much more architectural reversibility. I ran out of time and energy on this post, but I’ll follow up next week with more on event sourcing, what I like about RavenDb, and how we’ve used all of this in our projects.

Continuous Design is better with a Document Database

I gave a talk earlier this month at Agile Vancouver called “Architectural Reversibility,” largely about how we can create better designs if we are able to do design incrementally throughout the lifetime of a project instead of having to do it all upfront. My point of view on this topic is that we’re far more likely to succeed if we’re able to recover from the inevitable errors in architecture, design, or requirements, or better yet, if we’re able to delay commitment to elements of our technical architecture until we know more later on in the project. Furthermore, I said that you should be cognizant of this when selecting technologies. One of my slides showed this progression of data access/persistence technologies from my own development career that went something like this:

1. Stored procedures (sproc) for every single bit of data access
2. Object Relational Mapper & Relational Database
3. Document Database

Let’s say that I need to add a property to an entity in my existing system. Using the same numbering scheme as above, I would have to:

1. Change the DDL defining the proper table. Update every sproc that returns that field and any that might need to search on that field. Go update all the places in the code that use the data returned by that table.
2. Change the DDL defining the proper table or a data migration. Change the relevant class in the code (even with Ruby ActiveRecord you may still touch the class to add validation rules). Change the ORM mapping to add this field and verify the persistence of the new field all the way to the database.
3. Add a new property to the proper class and make sure that it serializes.

Adding or changing the shape of the data in the 90’s style stored procedure model was tedious. Back then you had to try much harder to get things right on the first try. Using an ORM was much better, especially if you used conventions to drive the ORM mapping or even to generate the database schema from your classes. However, using a document database where you just serialize objects to a JSON structure, with no schema requiring you to effectively do double data entry for the database and object model? That’s the best possible solution for really being able to do continuous design because there’s very minimal friction in changing your object model (at least before you deploy for the first time anyway).
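To make the third option concrete, here is a minimal sketch in Python rather than .Net. The Customer class, its fields, and the to_document/from_document helpers are all hypothetical names made up for illustration, and plain JSON serialization stands in for whatever a real document database does. The point is only that the class is the one place that changes when a property is added.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Customer:
    """Hypothetical entity. The class itself is the only 'schema'."""
    name: str
    email: str
    # The newly added property. Giving it a default means documents
    # that were saved before the change still load without a migration.
    loyalty_tier: str = "standard"


def to_document(entity) -> str:
    """Serialize an entity to the JSON text a document store would keep."""
    return json.dumps(asdict(entity), sort_keys=True)


def from_document(cls, raw: str):
    """Rebuild an entity from stored JSON text."""
    return cls(**json.loads(raw))
```

Loading a document written before the property existed, such as `{"name": "Ann", "email": "a@b.c"}`, simply picks up the default value. The default is a design choice of this sketch, not something a document database does for you, and it is the kind of thing you have to think about once real data has been deployed.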

To summarize, document databases absolutely rock for architectural reversibility and that’s a very, very big deal.

Automated testing

In my strong opinion, doing automated, end to end testing using the database is vastly easier and more effective with a document database than with a relational database. I feel that this advantage is enough by itself to justify the usage of a document database. Why do I think that? Well first, let’s review the two mandatory parts of any repeatable automated test:

Known inputs
Expected outcomes

In order to be really successful with automated testing, I think you need to achieve a couple things:

The tests have to run fast enough to provide timely feedback.
It has to be mechanically cheap for a test author to put the system into the initial state.
You cannot allow state to bleed between tests because that makes them unreliable.
And a Jeremy special: data input for automated tests should be isolated by test, i.e. no shared test data!

Referential integrity has repeatedly been a huge source of friction in test automation. I have found myself frequently adding junk data to a database for automated tests that was not remotely germane to the meaning of the test just to get the database constraints to shut up. Folks, that’s friction that you just won’t have with a document database.

Immediately after adopting RavenDb we quickly adopted the trick of using Raven’s in memory storage for testing, and completely scrapping the full database between tests, virtually guaranteeing that we have our tests isolated from each other. You can certainly do something like this with relational databases, but in my experience doing this is much more work and far slower no matter how you do things. Being able to very quickly drop and rebuild a clean database in code is a killer feature for automated testing.
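The "scrap the whole database between tests" trick can be sketched like this. This is Python with a hypothetical InMemoryDocumentStore standing in for an embedded in-memory database. It is not RavenDb's API, just an illustration of the shape of the test fixture.

```python
import unittest


class InMemoryDocumentStore:
    """Hypothetical stand-in for an embedded, in-memory document database."""

    def __init__(self):
        self._docs = {}

    def store(self, doc_id, document):
        self._docs[doc_id] = dict(document)

    def load(self, doc_id):
        return self._docs.get(doc_id)

    def count(self):
        return len(self._docs)


class OrderTests(unittest.TestCase):
    def setUp(self):
        # A brand new, empty database for every single test,
        # so no state can bleed from one test into another.
        self.store = InMemoryDocumentStore()

    def test_storing_an_order(self):
        # Only the data that matters to this test; no unrelated rows
        # are needed to satisfy foreign key constraints.
        self.store.store("orders/1", {"total": 25})
        self.assertEqual(self.store.load("orders/1")["total"], 25)

    def test_database_starts_empty(self):
        self.assertEqual(self.store.count(), 0)
```

Because the store is rebuilt in setUp, the second test passes no matter what order the tests run in.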

Separating the read and write models

The first time I saw Greg Young present on CQRS in 2008 I thought to myself “that’s interesting, but keeping two separate models for the same thing sounds like a lot of busywork to me.” In practice, I’m finding it to be more helpful than I thought because it has allowed my team to be able to focus on one problem at a time and jump into the work without having to understand everything at once.

We just started a project where we’ll be exchanging messages from our web application to an existing backend. We don’t exactly have the messaging workflow locked down, but our immediate concern is getting feedback on the usability and workflow of the proposed user interface. To that end we created a very simple “read” model that stores only the data that our views need and in a shape that’s easy to consume on the page with little concern for what the real, behavioral “write” side model will look like later on. We’re even able to write end to end automated tests against our user interface by setting up flat “read” documents in the database.

In iteration 2, we’ll be focusing on the events and messages throughout the system and flesh out the “write” model and how it responds and changes with events. In both cases, we are able to tightly focus on only one aspect of the system and test each in isolation. Later on we’ll use either RavenDb’s built in mechanisms or a code based “denormalizer” to keep the write and read models synchronized. I like this path of working because it’s allowing me to focus on a subset of the application at a time without ever having to be overwhelmed with so many variables.
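For readers who haven't seen one, a code based denormalizer can be as small as the following Python sketch. The event names (MessageSent, MessageAcknowledged), the field names, and the use of a plain dict as the read store are all assumptions made for illustration, not the actual design from this project.

```python
from dataclasses import dataclass


@dataclass
class MessageSent:
    """Hypothetical write-side event."""
    message_id: str
    recipient: str
    subject: str


@dataclass
class MessageAcknowledged:
    """Hypothetical write-side event."""
    message_id: str


class InboxDenormalizer:
    """Projects write-side events into flat documents shaped for a view."""

    def __init__(self, read_store: dict):
        self.read_store = read_store

    def handle(self, event):
        if isinstance(event, MessageSent):
            # Create the flat read document the page will render.
            self.read_store[event.message_id] = {
                "to": event.recipient,
                "subject": event.subject,
                "status": "Pending",
            }
        elif isinstance(event, MessageAcknowledged):
            # Update only the piece of the read document that changed.
            self.read_store[event.message_id]["status"] = "Acknowledged"
```

A user interface test can skip the denormalizer entirely and put a flat document straight into the read store, while a test of the write side only has to check which events come out.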

Honestly, I think I’d be a lot more hesitant to try this kind of architecture with a relational database where I’d have to lug around more stuff (DDL scripts, ORM mappings, data migration scripts, etc.) than I do today with a document database where the document json structure just flows out of the existing classes. RavenDb’s index feature does a lot to alleviate the tedious “left hand/right hand” coding that I worried about when I first learned about CQRS.

Eventual Consistency requires some care in testing

Jimmy Bogard recently blogged about the downsides of eventual consistency with a user interface. We had some similar issues on a previous project. Rather than repeat everything Jimmy said, I’ll simply add that you must be cognizant of eventual consistency during testing. A typical testing pattern is going to be something like:

1. Arrange: set up a test scenario
2. Act: do something that is expected to change the state of the system
3. Assert: check that the system is in the state that you expected

Your problem here with eventual consistency is that there’s an asynchronous process between writing data in step 2 and being able to read the new data in step 3. You absolutely have to account for this in both your automated tests and any manual testing. My cheap solution with RavenDb is to swap out our low level RavenDb “persistor” in our IoC container with a testing implementation that just forces any reads to wait for all pending writes to finish first.
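The idea behind that testing persistor can be sketched as follows. This is Python with threads simulating the asynchronous gap. EventuallyConsistentStore and TestingPersistor are hypothetical classes written for this illustration, and the real RavenDb mechanics are different.

```python
import threading
import time


class EventuallyConsistentStore:
    """Hypothetical store whose writes become readable after a short delay."""

    def __init__(self, delay=0.05):
        self._delay = delay
        self._docs = {}
        self._pending = []
        self._lock = threading.Lock()

    def write(self, key, value):
        def apply():
            time.sleep(self._delay)  # simulate the asynchronous processing
            with self._lock:
                self._docs[key] = value

        worker = threading.Thread(target=apply)
        with self._lock:
            worker.start()
            self._pending.append(worker)

    def read(self, key):
        with self._lock:
            return self._docs.get(key)

    def wait_for_pending_writes(self):
        # Take the current list of in-flight writes, then block until
        # every one of them has been applied.
        with self._lock:
            pending, self._pending = self._pending, []
        for worker in pending:
            worker.join()


class TestingPersistor:
    """Test-only wrapper: every read first waits for all pending writes."""

    def __init__(self, store):
        self._store = store

    def write(self, key, value):
        self._store.write(key, value)

    def read(self, key):
        self._store.wait_for_pending_writes()
        return self._store.read(key)
```

Reading straight from the raw store right after a write will usually return nothing yet, which is exactly the kind of flaky Assert step you get without this. Swapping in the wrapper for tests makes the Assert step deterministic, at the cost of hiding the very behavior your users will see, so it is no substitute for making testers aware of the delay.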

More importantly, I’m going to spend quite some time with our testers making sure that they have insight and visibility into this behavior so that everyone gets to keep from pulling out all our hair.

Finally…

I’m not a deep expert on these tools and techniques, but I’m seeing some things that I like so far. At this point, I’d strongly prefer to avoid working on projects involving a relational database ever again. As for RavenDb, it’s made a strong first impression on me and I’m looking forward to seeing where it goes from here. I will commit to fleshing out a quick start recipe for integrating RavenDb with a drop in “Bottle” for FubuMVC as our de facto recommendation for new FubuMVC projects.

Next time…

It’s Friday afternoon, I have to hit publish before the end of the day for an elimination bet, and I haven’t seen the inside of the gym all week, so I’m quitting here. In part 2 I’d like to share why I think persistence is much easier with a document database, how we’re able to just not worry about a database at all early on, and my thoughts on developing with event sourcing. Until next time, adieu.

Jeremy’s Only Rule of Testing

Years ago I wrote a series of blog posts describing “Jeremy’s Laws of Test Driven Development” (1, 2, 3, and 4) describing what I thought were some important coding and design rules to be more successful while using TDD. I still believe in the thinking behind all those silly “laws,” but now I would say that all of that writing is a manifestation of lower level first causes in successful software development, namely the extreme importance of quality feedback in your software efforts.

Consider this thought: every single line of code you write, every thought you have about the user experience, the business rules, the design you intend to use, and the assumptions about the system’s usage you’re making are potentially wrong — but often wrong in subtle, hard to notice ways. My experience is that my projects have gone much better when my team and I are able to work in tight cycle times with solid feedback mechanisms that constantly nudge us towards better results.

With that in mind, I’ve boiled down my old personal rules for using TDD into a single, lower level rule to maximize the effectiveness of the feedback my team gets from testing:

Test with the finest grained mechanism that tells you something important

Since both the quantity and quality of your testing feedback matters, here’s a pair of examples from my new job that illustrate how this rule can guide your approach.

Scenario #1: Use a tighter feedback loop

A couple weeks ago, I watched one of my new colleagues troubleshooting an issue with one of our phone helpdesk applications. The call waiting elevator music wasn’t playing or switching off at the right time, and you know how annoying that can be. My colleague had to work by kicking off the process by first making a call with the world’s lamest looking 90’s era cellphone and then stepping through the code manually until he was able to find the faulty logic in our system. The problem turned out to be in the coordination logic written by my company and not in the 3rd party phone control software.

The fault definitely lies with the design of that code, but my colleague and I were violating my little testing rule because we were forced to use an unnecessarily slow and cumbersome feedback cycle. What if instead the code had been structured in such a way that we could write narrowly focused unit tests against nothing but the logic that decided when to turn the call waiting music on and off? That very narrowly focused, very fast running unit test could have told my colleague something valuable, namely that the if/then coordination logic was all wrong, all without having to look terminally uncool using the cheap 1990’s looking cell phone. Add in the number of times we had to repeat the process to track down the problem and then to verify that the fix was correct and the finer grained tests look even better.
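As a rough illustration of what "structured for a narrow unit test" could look like, here is a Python sketch where the decision is pulled out into a pure function. The rules themselves (music should play exactly when a caller is on hold) and every name here are invented for the example, since the real coordination rules aren't described above.

```python
from enum import Enum


class MusicAction(Enum):
    TURN_ON = "turn on"
    TURN_OFF = "turn off"
    NONE = "none"


def decide_music_action(caller_on_hold: bool, music_playing: bool) -> MusicAction:
    """Pure decision logic: no phone, no 3rd party software, no debugger.

    Assumed rule for this sketch: music plays exactly while a caller is on hold.
    """
    if caller_on_hold and not music_playing:
        return MusicAction.TURN_ON
    if not caller_on_hold and music_playing:
        return MusicAction.TURN_OFF
    return MusicAction.NONE


def test_music_starts_when_caller_goes_on_hold():
    assert decide_music_action(True, False) is MusicAction.TURN_ON


def test_music_stops_when_caller_is_picked_up():
    assert decide_music_action(False, True) is MusicAction.TURN_OFF


def test_nothing_happens_when_state_is_already_right():
    assert decide_music_action(True, True) is MusicAction.NONE
    assert decide_music_action(False, False) is MusicAction.NONE
```

Each of these tests runs in well under a millisecond and covers a branch that would otherwise take a phone call and a debugging session to reach.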

Scenario #2: Sometimes a unit test is useless

I had a conversation the other day with a different colleague asking me if he’d be able to write a unit test in Jasmine for the code we’ll need to write that configures event handling and options in a SlickGrid table embedded in our application. Applying my rule again, this proposed testing mechanism is a very tight feedback loop, but the test just doesn’t tell me anything useful. I can assert all day long that I’m calling methods and setting properties on the SlickGrid JavaScript object, but that doesn’t tell me whether or not the grid behaves the way that we want it to when the application is running. In this case, we have to go to a more coarse grained integration test that works against the user interface.

Making testing more useful

What’s the purpose of testing in your daily job? Is it to certify that the software works exactly the way it’s supposed to? What if instead we shifted our thinking about testing to focus on removing flaws and risk from our software project? That might seem like a subtle restating of the same goal, but it can drastically change how your team or organization approaches software testing.

If your goal is to verify that the system works correctly, you’re probably more likely to focus on black box testing of the system in realistic scenarios and environments because that’s the only real way to know that the system really does work. In that approach you probably have some formal separation between the developers and the testing team — again to guarantee that you have a completely independent appraisal of the code.

On the other hand, if you’re using testing as a way to remove defects and risk, I think you’re much more likely to follow a testing philosophy similar to my rule about tighter feedback loops, which I think inevitably leads to an emphasis on white box testing solutions and fine-grained unit testing backed up with some minimal black box testing. If you’re not familiar with the term “white box testing,” it means taking advantage of a detailed knowledge of the system internals in your testing. I’m sure that it can be done otherwise, but I wouldn’t even begin to try to use white box testing without a very deep synergy and a high degree of collaboration between developers and testers. In this approach, I think you’d be foolish to keep your developers and testers formally separated.

… and lastly, a brief aside about mocking

I once wrote that you shouldn’t mock interfaces outside of your own codebase or chatty interfaces. Taking the two examples above, doing an assertion that a message was sent to “TurnOffCallWaiting()” or “TurnOnCallWaiting()” is useful in my opinion. I certainly have to test the real code behind the “TurnOn/Off()” methods, but I will happily use interaction testing against this kind of goal-oriented interface.

Moving to my second scenario, doing mock object assertions that I fiddled a lot of fine-grained “beforeBeginCellEdit” and “invalidateRow()” methods when I really just care that the data in a row in an html table was updated? Not so much.

If you do need to interact with any kind of chatty, low level API, especially if it’s in a 3rd party library or tool, I think you’re much better off wrapping a gateway interface around that API that’s expressed in the semantics of your goals for that API like “TurnOffCallWaiting().”
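Here is a minimal sketch of that gateway idea in Python using the standard library's unittest.mock. The low level phone calls (select_channel, load_audio, play, stop), the CallWaitingGateway class, and the HelpdeskCoordinator are all hypothetical names; only the goal-oriented TurnOn/TurnOffCallWaiting idea comes from the text above.

```python
from unittest.mock import Mock


class CallWaitingGateway:
    """Goal-oriented wrapper around a chatty (and here, invented) phone API."""

    def __init__(self, phone_api):
        self._phone_api = phone_api

    def turn_on_call_waiting(self, call_id):
        # The fiddly low level calls live in exactly one place.
        self._phone_api.select_channel(call_id)
        self._phone_api.load_audio("hold-music")
        self._phone_api.play()

    def turn_off_call_waiting(self, call_id):
        self._phone_api.select_channel(call_id)
        self._phone_api.stop()


class HelpdeskCoordinator:
    """Our own coordination logic, which only talks to the gateway."""

    def __init__(self, gateway):
        self._gateway = gateway

    def caller_placed_on_hold(self, call_id):
        self._gateway.turn_on_call_waiting(call_id)

    def agent_answered(self, call_id):
        self._gateway.turn_off_call_waiting(call_id)


def test_music_stops_when_agent_answers():
    # Mock our own gateway, never the 3rd party API itself.
    gateway = Mock(spec=CallWaitingGateway)
    HelpdeskCoordinator(gateway).agent_answered("call-42")
    gateway.turn_off_call_waiting.assert_called_once_with("call-42")
    gateway.turn_on_call_waiting.assert_not_called()
```

The interaction assertion reads in terms of the goal ("call waiting was turned off for this call"), and the real gateway still needs its own coarser grained test against the actual phone software.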

Kicking off StructureMap 3

Actually, I started working hard on StructureMap 3.0 in the summer of 2010 but got badly derailed by other projects and a nasty bout of burnout. I’m writing this post because I would dearly love to get community input and contributions and I’ve got folks contacting me that are chomping at the bit to start working on this.

StructureMap was originally written in the summer of 2003 and revamped in the spring of 2004 for its very first release in June of that year. Over the years it has had some significant rework (the 2.5 and 2.6 releases were both large changes), but at this point I firmly believe that the current 2.6.* internal structure is not worth improving. Yes Virginia, I am opting to gut some of the internals of StructureMap in order to fix the most egregious problems and limitations of the current architecture and build a container that is good enough to last until we all give up on this silly .Net thing. I’d also like to tear out any feature that I think is obsolete or just plain ugly to use and make StructureMap much leaner.

Nothing here is set in stone and feedback is very welcome.

My thoughts for 3.0:

My personal drivers for doing StructureMap 3 are mostly to kill the nested container problems and get StructureMap ready to better handle multi-tenancy scenarios in a high volume FubuMVC application. I think that better Profiles and/or the child container feature below would make multi-tenancy easier to achieve without killing the server’s memory usage. Well, and I would like to make StructureMap easier to use for other people ;-) Making StructureMap the most used container in .Net or competing with the other hundred container tools to do every possible crazy scenario that folks can come up with is not on my agenda.

Remove the [Obsolete] methods
Better exception messages. The error messages and the stacktraces really took a step backwards when I replaced the old Reflection.Emit code with dynamically generated Expressions in the 2.6 release. At a bare minimum, the stacktrace and exception messages need to be much cleaner and more accurately present what has gone wrong.
Better configuration diagnostics. Completely taking a page out of FubuMVC and Bottles, I would like a StructureMap container to be able to tell you why and how it was configured the way it is. Why did it select this constructor? Why is this the default? Where did this type come from?
Configuration model. Today there is the configuration model (PluginGraph) and a runtime model (PipelineGraph and InstanceManager). I would like to eliminate the separate models and make the configuration model much easier to consume by users. From the lessons we learned with FubuMVC, I think the key to making the convention model far better is a very good semantic model that can be easily altered and read by both conventions and explicit configuration. I think this is going to be the biggest change in the internals.
Far better convention support. See the above feature. Think of policies like “set the value of each constructor argument named ‘connectionString’ to this” or “make any Instance the singleton lifecycle where the concrete class name ends with ‘Cache’.” We can do that kind of thing today with FubuMVC’s BehaviorGraph model. I’d like to do the same with StructureMap.
Profiles. I think we just flat out redo Profiles from scratch and completely redesign that functionality from all new requirements.
Runtime flexibility. I would like to be able to allow users to register policies that could “teach” StructureMap how to resolve a requested type that it doesn’t know anything about. I think we’d convert some of the hard coded rules in current StructureMap to this new pluggable strategy. Think things like “this is a concrete type and I can resolve all of its dependencies, so I’ll just do it” or “this type closes an open generic type that I do know about” or “the name of this class ends in ‘Settings’ so I’ll use FubuCore’s ISettingsProvider to resolve it”
Better Lifecycle support. A longtime limitation in StructureMap is that lifecycle can only be configured by the requested type, i.e., all instances of ISomething have to be the same lifecycle. I’d like to eliminate that limitation.
Better support for modular configuration. We already have the Registry model and I think it has worked out very well. Most of the other IoC containers have implemented something similar by this point. I’d like to extend the model to allow you to specify ordering rules between Registry classes and dependencies (hence, FubuCore’s dependency analysis functionality). I would also like to add semantics to only add configuration if it is missing or conditional configuration.
Pluggable strategies for selecting constructor functions. I don’t care for this one bit, but at least a couple prominent .Net OSS frameworks need this.
Nested containers. I love this feature and its usability. FubuMVC depends very heavily on this feature in StructureMap. Its implementation, however, is horrific and there’s a nasty outstanding bug that I felt was too difficult to fix in 2.6.*. I think we rewrite the nested container feature so that we have proper separation in scoping between the parent and nested container and avoid the need to do any copying/shuffling of the underlying configuration structure.
Child containers. Not quite the same thing as nested containers. This would be the ability to quickly clone an existing container and override the parent’s configuration.
Eliminate the Xml configuration. I have already ripped the Xml configuration support out of the core assembly in StructureMap 3. I wouldn’t mind coming back and adding a subset of the existing Xml configuration back as an addon assembly and nuget.
Eliminate the old attribute configuration. I had left this in there for years, but I’d never recommend to anyone that they use it. I would like to consider just using the convention support to work against a subset of the same CLR attributes that MEF uses.
Full, living documentation. I rewrote the documentation for the 2.5 release, but it wasn’t usable enough and quickly got out of date when 2.6 was released. For 3.0 I’d like to use Sphinx for the documentation generation and host on http://readthedocs.org/ and make the documentation publish with pushes to Git. Another heavy lesson learned is that we need to strive to make the documentation organized around the tasks that a user would do instead of organized around StructureMap jargon.
Recipes. This is where I really need community help the most. I’d like to have some examples of integrating StructureMap into common .Net tools and frameworks. I’m at a disadvantage because I’ve become very disconnected from mainstream .Net. I have not used Entity Framework, WCF, Silverlight, Workflow Foundation, MEF, and barely used WPF, ASP.Net MVC, Prism, or WebForms. I just don’t have enough visibility into those tools to help much.
Backwards Compatibility. With a few exceptions, I think the registry DSL in StructureMap has settled into something usable. I’d like to remove all the [Obsolete] methods and change a few things that seem to be confusing to use, but otherwise make it as easy as possible to upgrade from 2.6.* to 3.0.
No Silverlight support. I have no intention of supporting Silverlight or any other mobile variant of .Net at this time. I’m open to this happening later and I’m contemplating at least a version of StructureMap that is usable in the client profile. This is an important decision to make soon-ish because I would like for StructureMap 3 to take a dependency on our FubuCore library and I don’t really want to care about the size of the assembly right now.
Fubu project portfolio. I would like to fold StructureMap under the Fubu project family. Part of that is branding, but it’s also community, the convenience of the GitHub organization model, and the common infrastructure that’s starting to grow up around documentation, Ripple, and the build support.
Use FubuCore. There is quite a bit of overlap between the FubuCore library and what’s in the current codebase that I’d like to eliminate. I’d also like to use FubuCore’s dependency graph support, the ObjectConverter, and integrate the SettingsProvider service out of the box for externalized configuration in StructureMap (here’s an explanation of an earlier version that’s still relevant).
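To give a feel for how a few of the ideas in this list could fit together, here is a toy sketch in Python. It covers a readable configuration model, a convention policy like “singleton if the class name ends with ‘Cache’,” diagnostics about why something was configured, and a child container that overrides its parent. Nothing here is StructureMap’s API or its planned design; Container, Instance, the policy function, and the sample classes are all hypothetical.

```python
class Instance:
    """One configured way to build a type, kept in a readable model."""

    def __init__(self, concrete_type):
        self.concrete_type = concrete_type
        self.lifecycle = "transient"
        self.reason = "explicit registration"  # diagnostics: why is it this way?
        self.cached = None


class Container:
    def __init__(self, parent=None):
        self._parent = parent
        self._instances = {}

    def register(self, requested_type, concrete_type):
        self._instances[requested_type] = Instance(concrete_type)

    def apply_policy(self, policy):
        # A policy is any function that reads and alters the model.
        for instance in self._instances.values():
            policy(instance)

    def create_child(self):
        # The child shares the parent's model instead of copying it.
        return Container(parent=self)

    def find(self, requested_type):
        if requested_type in self._instances:
            return self._instances[requested_type]
        if self._parent is not None:
            return self._parent.find(requested_type)
        raise KeyError(f"No configuration for {requested_type.__name__}")

    def get(self, requested_type):
        instance = self.find(requested_type)
        if instance.lifecycle != "singleton":
            return instance.concrete_type()
        if instance.cached is None:
            instance.cached = instance.concrete_type()
        return instance.cached


def singleton_if_name_ends_with_cache(instance):
    """Convention: any concrete class named '...Cache' becomes a singleton."""
    if instance.concrete_type.__name__.endswith("Cache"):
        instance.lifecycle = "singleton"
        instance.reason = "policy: class name ends with 'Cache'"


class Cache: pass
class WidgetCache(Cache): pass
class Repository: pass
class SqlRepository(Repository): pass
class FakeRepository(Repository): pass
```

With this toy, registering Cache to WidgetCache and applying the policy makes every `get(Cache)` return the same object, and `find(Cache).reason` reports which rule made that decision. A child created from that container can register Repository to FakeRepository without touching what the parent resolves.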

Things I don’t plan on changing

The interception model. By and large I think it’s been good enough for anything I’ve ever needed to do with it
The basic Registry DSL
Any of the methods on IContainer
Still don’t plan on adding AOP, but I’d like to have addon Nuget libraries for integrating existing AOP solutions with StructureMap someday

So why do this at all?

A couple months back I expressed some admiration for where one of the other IoC containers was going and that I was perfectly willing to forgo trying to compete with that particular tool (don’t even ask which one because I took a harder look at it and changed my mind). Someone asked me why I didn’t just contribute the stuff I needed from StructureMap that was missing in that other container. Fair question, but no, I’m not going to do that. Why? Because one of the best learning experiences you can possibly have as a developer is to take a hard problem that you’ve already solved and reflect on how you could solve that same problem in a much better way with all the things you’ve learned. I’ve worked on and off on StructureMap for 8-9 years. I’ve rewritten some of the same subsystems in StructureMap a couple different times and even got a conference talk out of my experience called The Joys and Pains of a Long Lived Codebase at QCon – but I still think I will learn a great deal by going through with one last version of StructureMap.

Hello again.

Most of you stumbling into this will know that I’m a longtime blogger on the CodeBetter site, but not so much the past 2-3 years. I used to write an absurd amount of content on software design, Agile practices, and automated testing:

Best of the Shade Tree Developer
Best of the Shade Tree Developer Part II
Build Your Own CAB – I’m sorry guys, but I’m not going to finish the book
My “Patterns in Practice” series of articles from MSDN

I’ve frequently told people that starting to blog was the best thing I’ve ever done for my career and I’ve encouraged other people to blog as well. It was a much more innocent time when I started blogging and I enjoyed it. Then came ALT.NET, endless bile and frustration over whatever Microsoft was doing, and a celebrity programmer saying at a conference that he was such a nice guy and not “one of those CodeBetter guys pounding on the table and telling you how you’re supposed to code.” Add a lot of personal stress and my involvement with several OSS projects and it was pretty natural that I more or less stopped blogging.

Here lately I’ve started to miss writing and I’m even generating ideas for new posts. To that end, and my simple desire to keep things more positive for myself overall, I have three career related New Year’s Resolutions this year:

Avoid “nerd rage” events and don’t ever be this guy again. I’m hoping to keep the word “rant” very tiny in the keyword cloudmap for this new blog.
Eliminate myself as a bottleneck in all the open source projects that I’m involved with – and that means a lot more information sharing and visibility into what’s going on.
Blog more ideas and information, tweet less noise. As I was writing this blog post I was also watching a twitter conversation on mocking dependencies that ended with this.

So, all that being said, welcome to my new little blog. I’ll be writing on a semi-regular basis about whatever software topic I’m interested in at the moment, the OSS projects I’m involved with, and I’d even like to go back and rewrite some of the material from my original blog on topics like mocking and TDD just to see how my own thinking has changed since then.