Would I use RavenDb again?

EDIT on 2/12/2016: This is an almost 3 year old post, but it still gets quite a few reads. For an update, I’m part of a project called Marten that is seeking to use PostgreSQL as a document database that we intend to use as a replacement for RavenDb in our architecture. While I’m still a fan of most of the RavenDb development experience, the reliability, performance, and resource utilization in production have been lacking. At this point, I would not recommend adopting RavenDb for new projects.

I’m mostly finished with a fairly complicated project that used RavenDb and all is not quite well. All too frequently in the past month I’ve had to answer the question “was it a mistake to use RavenDb?” and the even more ego-bruising (for Jeremy) “should we scrap RavenDb and rebuild this on a different architecture?” Long story short, we made it work and I think we’ve got an architecture that can allow us to scale later, but the past month was miserable, and RavenDb and our usage of RavenDb were the main culprits.

Some Context

Our system is a problem resolution system for an automated data exchange between our company and our clients. The data exchange has long suffered from data quality issues, so we were tasked with building an online system to improve the current manual-heavy process for resolving the data issues. We communicate with the upstream system by receiving and sending flat files dropped into a folder (boo!). The files can be very large, and the shape of the data is conceptually different from how our application displays and processes events in our system. As part of processing the data we receive, we have to do a fuzzy comparison to the existing data for each logical document because we don’t have any correlation identifier from the upstream system (this was obviously a severe flaw in the process, but I don’t have much control over this issue). The challenge for us with RavenDb was that we would have to process large bursts of data that involved both heavy reads and writes.

On the read side to support the web UI, the data was very hierarchical and using a document database was a huge advantage in my opinion.

First, some Good Stuff

RavenDb has to be the easiest persistence strategy in all of software development to get up and running on day one. Granted that you’ll have to change settings for production later, but you can spin up a new project using RavenDb as an embedded database and start writing an application with persistence in nothing flat. I’ve told some of my ex-.Net/now Rails friends that I think I can spin up a FubuMVC app that uses RavenDb for persistence faster than they can with Rails and ActiveRecord. The combination of a document database and static typed document classes is dramatically lower friction in my opinion than using static typed domain entities with NHibernate or EF as well.
I love, love, love being able to dump and rebuild a clean database from scratch in automated testing scenarios.
I’m still very high on document databases, especially on the read side of an application. RavenDb might have fallen down for us in terms of writes, but there were several places where storing a hierarchical document is just so much easier than dealing with relational database joins across multiple tables.
No DB migrations necessary.
Being able to drop down to Lucene queries helped us considerably in the UI.
I like the paging support in RavenDb.
RavenDb’s ability to batch up reads was a big advantage when we were optimizing our application. I really like the lazy request feature and the IDocumentSession.Load(array of ids) functions.

Memory Utilization

We had several memory usage problems that we ultimately attributed to RavenDb and its out of the box settings. In the first case, we had to turn off all of the 2nd level caching because it never seemed to release objects, or at least not before our application fell over from OutOfMemoryExceptions. In our case, the 2nd level cache would not have provided much value anyway except for a handful of little entities, so we just turned it off across the board. I think I would recommend that you only use caching with a whitelist of documents.

Also be aware that the implementations of IDocumentSession seem to be very much optimized for short transactions with limited activity at any one time. Unfortunately we were almost a batch driven system and our logical transactions became quite large and potentially involved a lot of reads against contextual information. After examining our application with a memory profiler, we determined that IDocumentSession was hanging on to the data we only read. We solved that issue by explicitly calling Evict() to remove objects from an IDocumentSession’s cache.

Don’t Abstract RavenDb Too Much

To be blunt, I really don’t agree with many of Ayende’s opinions about software development, but with regard to abstractions for RavenDb you have to play by his rules. We have a fubu project named FubuPersistence that adds common persistence capabilities like multi-tenancy and soft deletes on top of RavenDb in an easy to use way. That’s great and all, but we had to throw a lot of that goodness away because you so frequently have to get down to the metal with RavenDb to either tighten up performance or avoid stale data. We were able to happily spin up a database on the fly for testing scenarios, so you might look to do that more often than trying to swap out RavenDb for mocks, stubs, or 100% in memory repositories. Those tests are still slower than what you’d get with mocks or stubs, but you don’t have any choice when you start having to muck with RavenDb’s low level APIs.

Bulk Inserts

I think RavenDb is weak in terms of dealing with large batches of updates or inserts. We tried using the BulkInsert functionality, and while it was a definite improvement in performance, we found it to be buggy and probably just immature (it is a recent feature). We first hit problems with map/reduce operations not finishing after processing a batch. We updated to a later version of RavenDb (2330), then had to retreat back to our original version (2230) with problems using Windows authentication in combination with the BulkInsert feature. We saw the same issues with the edge version of RavenDb as well. We also noticed that BulkInsert did not seem to honor the batch size settings and had several QA bugs under load because of this. We eventually solved the BulkInsert problems by sending batches of 200 documents for processing through our service bus and putting retry semantics around the BulkInsert to get around occasional hiccups.
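As a rough illustration of the batching and retry workaround described above, here is a minimal sketch in plain C#. It deliberately leaves out RavenDb and the service bus: BatchingSketch, InBatchesOf, and WithRetries are hypothetical names invented for this example, and the retry policy (retry on any exception, rethrow the last failure) is an assumption rather than what we actually shipped.

```csharp
using System;
using System.Collections.Generic;

public static class BatchingSketch
{
    // Splits a sequence into consecutive batches of at most batchSize items.
    // Because this is an iterator, the argument check runs on first enumeration.
    public static IEnumerable<List<T>> InBatchesOf<T>(IEnumerable<T> items, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        var batch = new List<T>(batchSize);
        foreach (var item in items)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Emit the final, partially filled batch if there is one.
        if (batch.Count > 0) yield return batch;
    }

    // Runs the action, retrying on any exception up to maxAttempts total attempts.
    // The last failure is rethrown so the caller can see it.
    public static void WithRetries(Action action, int maxAttempts)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return;
            }
            catch (Exception)
            {
                if (attempt >= maxAttempts) throw;
            }
        }
    }
}
```

Each batch of 200 would then be handed to whatever performs the write, with the write wrapped in something like WithRetries(() => insertBatch(batch), 3), where insertBatch is a stand-in for the real BulkInsert call.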

The Eventual Consistency Thing

If you’re not familiar with Eventual Consistency and its implications, you shouldn’t even dream of putting a system based on RavenDb into production. The key with RavenDb is that query/command separation is pretty well built in. Writes are transactional, and reads by the document id will always give you the latest information, but other queries execute against indexes that are built in background threads as a result of writes. What this means to you is a chance of receiving stale results from queries against anything but a document id. There’s a real rationale behind this decision, but it’s still a major complication in your life with RavenDb.

With our lack of correlation identifiers from upstream, we were forced to issue a lot of queries against “natural key” data and we frequently ran into trouble with stale indexes. Depending on the circumstances, we fixed or prevented these issues by:

Introducing a static index instead of relying on dynamic indexes. I think I’d push you to try to use a static index wherever possible.
Judiciously using the WaitForNonStaleResults* family of methods. Be careful with this one though, because it can have negative repercussions as well.
In a few cases we introduced an in-memory cache for certain documents. You *might* be able to utilize the 2nd level cache instead
In another case or two, we switched from using surrogate keys to using natural keys because you always get the latest results when loading by the document id. User and login documents are the examples of this that I remember offhand.

The stale index problem is far more common in automated testing scenarios, so don’t panic when it happens.

Conclusion

I’m still very high on RavenDb’s future potential, but there’s a significant learning curve you need to be aware of. The most important thing to know about RavenDb in my opinion is that you can’t just use it, you’re going to have to spend some energy and time learning how it works and what some of the knobs and levers are because it doesn’t just work. On one hand, RavenDb has several features and capabilities that an RDBMS doesn’t and you’ll want to exploit those abilities. On the other hand, I do not believe that you can get away with using RavenDb with all of its default settings on a project with larger data sets.

Honestly, I think the single biggest problem on this project was in not doing the heavy load testing earlier instead of at the last moment, but everybody involved with the project has already hung their heads in shame over that one and vowed to never do that again. Doing something challenging and doing something challenging right up against a deadline are two very different things. It is my opinion that while we did struggle with RavenDb, we would have had at least some struggle to optimize the performance if we’d built with an RDBMS, and the user interface would have been much more challenging.

Knowing what I know now, I think it’s 50/50 that I would use RavenDb for a similar project again. If they get their story fixed for bigger transactions though, I’m all in.

Big Update on StructureMap 3.0 Progress

I can finally claim some very substantial progress on StructureMap 3.0 today. For a background on the goals and big changes for the 3.0 release, see Kicking off StructureMap 3 from last year and some additions from last month when I started again. As of today, StructureMap 3.0 development is in the master branch in GitHub. If you need to get at StructureMap 2.6 level code, use the TwoSix branch.

What’s been done?

I removed the strong naming.
All the old [Obsolete] API methods have been removed
The registration API has been greatly streamlined and there’s much more consistency internally now
The nested container implementation has been completely redone. It’s much simpler, should be much faster because it’s doing much less on setup, and the old lifecycle confusion problems between the parent and nested containers have been fixed.
The “Profile” functionality has been completely redesigned and rebuilt. It’s also much more capable now than it was before.
The container spinup time *should* be much better because there’s so much less going on and a lot more decision making is done in a lazy way with memoization along the way. Lazy<T> FTW!
There are many more runtime “figure out what I could do” type possibilities now
You can apply lifecycle scoping Instance by Instance instead of only at the PluginType level. That’s been a big gripe for years.
The Xml configuration has been heavily streamlined
The old [PluginFamily] / [Pluggable] attributes have been completely ripped out
Internally, the old PipelineGraph, InstanceFactory, ProfileManager architecture is all gone. The new PipelineGraph implementations just wrap one or more PluginGraph objects, so there’s vastly less data structure shuffling going on internally.

What’s left to do?

I’ve transcribed my own notes about outstanding work (minus the documentation) to the GitHub issues page. There are a few items that are going to need some serious forethought, but I think the biggest architectural changes are already done and that list is starting to be more of a punchlist. I would dearly love any kind of help, design input, additions, or feedback on the outstanding work. If you’re inclined to get involved and tackle some of the issues, I tried to label the issues for the effort level.

If you think of the issues as picking a sword fight, the tags line up like this:

“Easy Fix” – Facing a sheepherder who probably stole that heron mark blade he’s carrying
“Medium Effort” – Fighting a Trolloc
“Architectural Level Change” – Fade. I will likely need to be involved with any of these

Fairly soon, I’ll be making a call for folks to try out a prerelease version of StructureMap 3 in their existing applications. As part of that effort, I’d really like to get some feedback about the observed performance and see if we can beat on it enough to find any memory leak issues.

If you or someone you know is a multi-threading guru, I’d probably be interested in talking through some things with you in the codebase.

Docs? Someday? Maybe?

Hopefully someday soon. The FubuMVC core team will be relaunching a completely new website sometime in the next couple years with our own implementation of a readthedocs style infrastructure. I’m planning on making the new StructureMap documentation part of that website. Documentation will be in git where it’ll be easy to take in pull requests for additions and corrections, and you’ll be able to use either Html or Markdown for the content. We’ve already got a working mechanism to “slurp” code samples live out of a source code tree and put them into the web pages with formatting via pretty print to achieve “living” documentation this time around.

Last Thoughts

I haven’t paid attention to any of the “IoC Container Performance Shootout!” type blog posts in a long time, but StructureMap used to routinely come in well ahead of the other full-featured IoC containers (tools like Funq shouldn’t be considered apples to apples with StructureMap/Windsor/Ninject/Autofac/whatever. If you don’t support auto-wiring, rich lifecycle support, and maybe even interception, I say you don’t count as full-featured) in terms of performance. However, as I’ve torn into the StructureMap codebase with an eye towards better performance for the first time in years, I’ve found a scary amount of performance killing cruft code. My final thought is that as bad as the StructureMap code was (and trust me, it was), if it’s really faster than the other IoC containers, then what does that say about their code internals at that time? 😉

Big Proposed Changes for StructureMap 3

Just trying to round up more feedback as I go, here’s a handful of discussions I’ve started on the big proposed changes for StructureMap 3:

Please feel free to chime in here, twitter, or the list on any of these topics or any other thing you want for StructureMap 3.

Thanks,

Jeremy

A Simple Example of a Table Driven Executable Specification

My shop is starting to go down the path of executable specifications (using Storyteller2 as the tooling, but that’s not what this post is about). As an engineering practice, executable specifications* involves specifying the expected behavior of a user story with concrete examples of exactly how the system should behave before coding. Those examples will hopefully become automated tests that live on as regression tests.

What are we hoping to achieve?

Remove ambiguity from the requirements with concrete examples. Ambiguity and misunderstandings from prose based requirements and analysis have consistently been a huge time waste and source of errors throughout my career.
Faster feedback in development. It’s awfully nice to just run the executable specs in a local branch before pushing anything to the testers
Find flaws in domain logic or screen behavior faster, and this has been the biggest gain for us so far
Creating living documentation about the expected behavior of the system by making the specifications human readable
Building up a suite of regression tests to make later development in the system more efficient and safer

Quick Example

While executable specifications are certainly a very challenging practice from the technical side of things, in the past week or so I’m aware of 3-4 scenarios where the act of writing the specification tests has flushed out problems with our domain logic or screen behavior a lot faster than we could have done otherwise.

Part of our application logic involves fuzzy matching of people in our system against some, ahem, not quite trustworthy data from external partners. Our domain expert explained that the matching logic he wanted was to match a person’s social security number, birth date, first name, and last name, but the name matching should be case insensitive and it’s valid to match on the initial of the first name. Since this logic can be expressed as a set number of inputs and the one output with a great number of permutations, I chose to express this specification as a table with Storyteller (conceptually identical to the old ColumnFixture in FitNesse). The final version of the spec is shown below:

The image above is our final, approved version of this functionality that now lives as both documentation and a regression test. Before that though, I wrote the spec and got our domain expert to look at it, and wouldn’t you know it, I had misunderstood a couple assumptions and he gave me very concrete feedback about exactly what the spec should have been.

To make this just a little bit more concrete, our Storyteller test harness connects the table inputs to the system under test with this little bit of adapter code:

The code behind the executable spec

    public class PersonFixture : Fixture
    {
        public PersonFixture()
        {
            Title = "Person Matching Logic";
        }

        // Each table row describes two people. The [Default] values fill in
        // any column that a row leaves out.
        [ExposeAsTable("Person Matching Examples")]
        [return: AliasAs("Matches")]
        public bool PersonMatches(
            string Description,
            [Default("555-55-5555")]SocialSecurityNumber SSN1,
            [Default("Hank")]string FirstName1,
            [Default("Aaron")]string LastName1,
            [Default("01/01/1974")]DateCandidate BirthDate1,
            [Default("555-55-5555")]SocialSecurityNumber SSN2,
            [Default("Hank")]string FirstName2,
            [Default("Aaron")]string LastName2,
            [Default("01/01/1974")]DateCandidate BirthDate2)
        {
            var person1 = new Person
            {
                SSN = SSN1,
                FirstName = FirstName1,
                LastName = LastName1,
                BirthDate = BirthDate1
            };
            var person2 = new Person
            {
                SSN = SSN2,
                FirstName = FirstName2,
                LastName = LastName2,
                BirthDate = BirthDate2
            };

            // The matching rules themselves live in Person.Equals().
            return person1.Equals(person2);
        }
    }
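
For readers who want to see the matching rule itself, here is a rough sketch of how it could be written. This is not the real implementation behind Person.Equals(): MatchCandidate and PersonMatcher are hypothetical names, plain strings and DateTime stand in for the SocialSecurityNumber and DateCandidate types, and treating a one-letter first name as an initial is my assumption about what “match on the initial” means.

```csharp
using System;

// Hypothetical stand-in for the Person type used by the fixture.
public class MatchCandidate
{
    public string SSN { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime BirthDate { get; set; }
}

public static class PersonMatcher
{
    // SSN and birth date must match exactly; names are compared ignoring case.
    public static bool Matches(MatchCandidate a, MatchCandidate b)
    {
        return a.SSN == b.SSN
            && a.BirthDate.Date == b.BirthDate.Date
            && SameIgnoringCase(a.LastName, b.LastName)
            && FirstNamesMatch(a.FirstName, b.FirstName);
    }

    private static bool SameIgnoringCase(string x, string y)
    {
        return string.Equals(x, y, StringComparison.OrdinalIgnoreCase);
    }

    // Full first names must match ignoring case. Assumption: a one-letter
    // first name is an initial and matches any name starting with that letter.
    private static bool FirstNamesMatch(string x, string y)
    {
        if (string.IsNullOrEmpty(x) || string.IsNullOrEmpty(y)) return false;
        if (SameIgnoringCase(x, y)) return true;

        if (x.Length == 1) return y.StartsWith(x, StringComparison.OrdinalIgnoreCase);
        if (y.Length == 1) return x.StartsWith(y, StringComparison.OrdinalIgnoreCase);
        return false;
    }
}
```

Each row of the table in the spec would correspond to one call to a method like Matches, with the expected result in the “Matches” column.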

* Jeremy, is this really just Behavior Driven Development (BDD)? Or the older idea of Acceptance Test Driven Development (ATDD)? This is some folks’ definition of BDD, but BDD is so overloaded and means so many different things to different people that I hate using the term. ATDD never took off, and “executable specifications” just sounds cooler to me, so that’s what I’m going to call it.

Let’s try this again, StructureMap 3.0 is en route as of now

I’ll be honest, I haven’t worked much on StructureMap since I shelved my original 3.0/rewrite work in the summer of 2010, and yes, the documentation is almost worthless. Now that FubuMVC has reached that magic 1.0 mark I’m turning my attention back to StructureMap for a bit, but I think I want some feedback about what I’m thinking right now.

For background, read:

Kicking off StructureMap 3.0 — I just re-read this, and I’m still thinking all of the same things here and all the feedback is still valid.
Proposed StructureMap 2.7 Release

A month ago my plan was to do a small 2.7 release on the existing codebase to remove all the [Obsolete] API calls and grab some pull requests along the way. Having done that, I would then turn my attention back to the 3.0 codebase where I planned to essentially rewrite the core of StructureMap and retrofit the existing API on top of the new, cleaner core. A week or so into the work for the 2.7 release and I’ve changed my mind. First off, by the rules of semantic versioning, I should bump the major version to 3.0.0 when I make the breaking API changes. Secondly, I’m coming around to the idea of restructuring the existing code in place instead of a full rewrite.

To reiterate the major points, the 3.0 release means:

All [Obsolete] API calls are going away
Removing the strong naming — if you absolutely *have* to have this, maybe we can make separate nuget packages. I suggest we name that “structuremap.masochistic.”
Move to .Net 4.0. I don’t think it’s time to go to 4.5 yet and I don’t really want to mess with that anyway
Taking a dependency on FubuCore — if that causes pushback we’ll ilmerge it
Streamlining the Xml support
Rewrite the “Profile” feature completely
Make nested containers not be a crime against computer science
NOT adding every random brainfart “feature” that Windsor has
Make it faster
Make the diagnostics much better
Removing some obscure, clumsy features I never use and really wish you wouldn’t either

Additionally, we have a new “living documentation” infrastructure baking for the Fubu projects. I know some work already happened to transfer the StructureMap docs to Jekyll, but I’d far prefer to publish on the new fubu world website whenever that happens.

For right now, the 3.0 branch is in the original StructureMap repository at https://github.com/structuremap/structuremap/tree/three.

My Opinions on Data Setup for Functional Tests

I read Jim Holmes’s post Data Driven Testing: What’s a Good Dataset Size? with some interest because it’s very relevant to my work. I’ve been heavily involved in test automation efforts over the past decade, and I’ve developed a few opinions about how best to handle test data input for functional tests (as opposed to load/scalability/performance tests). First though, here’s a couple concerns I have:

Automated tests need to be reliable. Tests that require external, environmental setup can be brittle in unexpected ways. I hate that.
Your tests will fail from time to time with regression bugs. It’s important that your tests are expressed in a way that makes it easy to understand the cause and effect relationship between the “known inputs” and the “expected outcomes.” I can’t tell you how many times I’ve struggled to fix a failing test because I couldn’t even understand what exactly it was supposed to be testing.
My experience says loudly that smaller, more focused automated tests are far easier to diagnose and fix when they fail than very large, multi-step automated tests. Moreover, large tests that drive user interfaces are much more likely to be unstable and unreliable. Regardless of platform, problem domain, and team, I know that I’m far more productive when I’m working with quicker feedback cycles. If any of my colleagues are reading this, now you know why I’m so adamant about having smaller, focused tests rather than large scripted scenarios.
Automated tests should enable your team to change or evolve the internals of your system with more confidence.

Be Very Cautious with Shared Test Data

If I have my druthers, I would not share any test setup data between automated tests except for very fundamental things like the inevitable lookup data and maybe some default user credentials or client information that can safely be considered to be static. Unlike “real” production coding where “Don’t Repeat Yourself” is crucial for maintainability, in testing code I’m much more concerned with making each test as self-explanatory as possible and keeping the tests completely isolated from one another. If I share test setup data between tests, there’s frequently going to be a reason to add a little bit more data for a new test, which ends up breaking the assertions in the existing tests. Besides that, using a shared test data set means that you probably have more data than any single test really needs, which makes the diagnosis of test failures harder. For all of those reasons and more, I strongly prefer that my teams copy and paste bits of test data sets to keep them isolated by test rather than shared.

Self-Contained Tests are Best

I’ve been interested in the idea of executable specifications for a long time. In order to make the tests have some value as living documentation about the desired behavior of the system, I think it needs to be as clear as possible what the relationship is between the germane data inputs and the observed behavior of the system. Plus, automated tests are completely useless if you cannot reliably run them on demand or inside a Continuous Integration build. In the past I’ve also found that understanding or fixing a test is much harder if I have to constantly ALT-TAB between windows or even just swivel my head between a SQL script or some other external file and the body of the rest of a script.

I’ve found that both the comprehensibility and reliability of an automated test are improved by making each automated test self-contained. What I mean by that is that every part of the test is expressed in one readable document including the data setup, exercising the system, and verifying the expected outcome. That way the test can be executed at almost any time because it takes care of its own test data setup rather than being dependent on some sort of external action. To pull that off you need to be able to very concisely describe the initial state of the system for the test, and shared data sets and/or raw SQL scripts, Xml, Json, or raw calls to your system’s API can easily be noisy. Which leads me to say that I think you should…

Decouple Test Input from the Implementation Details

I’m a very large believer in the importance of reversibility to the long term success of a system. With that in mind, we write automated tests to pin down the desired behavior of the system and spend a lot of energy towards designing the structure of our code to more readily accept changes later. All too frequently, I’ve seen systems become harder to change over time specifically because of tight coupling between the automated tests and the implementation details of a system. In this case, the automated test suite will actually retard or flat out prevent changes to the system instead of enabling you to more confidently change the system. Maybe even worse, that tight coupling means that the team will have to eliminate or rewrite the automated tests in order to make a desired change to the system.

With that in mind, I somewhat strongly recommend expressing your test data input in some form of interpreted format rather than as SQL statements or direct API calls. My team uses Storyteller2 where all test input is expressed in logical tables or “sentences” that are not tightly coupled to the structure of our persisted documents. I think that simple textual formats or interpreted Domain Specific Languages are also viable alternatives. Despite the extra work to write and maintain a testing DSL, I think there are some big advantages to doing it this way:

You’re much more able to make additions to the underlying data storage without having to change the tests. With an interpreted data approach, you can simply add fake data defaults for new columns or fields
You can express only the data that is germane to the functionality that your test is targeting. More on this below when I talk about my current project.
You can frequently make the test data setup mechanically much cheaper per test by simply reducing the amount of data the test author has to write, with sensible default values behind the scenes. I think this topic is probably worth a blog post on its own someday.
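
To make the “sensible defaults” idea a little more concrete, here is a minimal sketch of interpreted test input. The PersonRow type, its keys, and its default values are all made up for illustration, and this is not how Storyteller2 works internally. It only shows the shape of the idea: the test states the values it cares about and everything else is filled in behind the scenes.

```csharp
using System.Collections.Generic;

// Hypothetical logical "row" of test input, decoupled from the persisted document shape.
public class PersonRow
{
    // Defaults the test author never has to type. When the underlying storage
    // gains a new field, add a default here and existing tests stay untouched.
    private static readonly Dictionary<string, string> Defaults = new Dictionary<string, string>
    {
        { "FirstName", "Hank" },
        { "LastName", "Aaron" },
        { "SSN", "555-55-5555" },
        { "BirthDate", "01/01/1974" }
    };

    private readonly Dictionary<string, string> _values;

    // The test supplies only the values that are germane to it.
    public PersonRow(IDictionary<string, string> overrides)
    {
        _values = new Dictionary<string, string>(Defaults);
        foreach (var pair in overrides)
        {
            _values[pair.Key] = pair.Value;
        }
    }

    // Returns the overridden value if the test supplied one, else the default.
    public string Get(string key)
    {
        return _values[key];
    }
}
```

A test about first name matching would pass in only a FirstName override, and the translation from a row like this into real documents would live in one place in the test harness instead of in every test.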

This goes far beyond just the test data setup. I think it’s very advantageous in general to express your functional tests in a way that is independent of implementation details of your application — especially if you’re going to drive a user interface in your testing.

Go in through the Front Door

Very closely related to my concerns about decoupling tests from the implementation details is to avoid using “backdoor” ways to set up test scenarios. My opinion is that you should set up test scenarios by using the real services your application uses itself to persist data. While this does risk making the tests run slower by going through extra runtime hoops, I think it has a couple advantages:

It’s easier to keep your test automation code synchronized with your production code as you refactor or evolve the production code and data storage
It should result in writing less code, period
It reduces logical duplication between the testing code and the production code — think database schema changes
When you write raw data to the underlying storage mechanisms you can very easily get the application into an invalid state that doesn’t happen in production

Case in point, I met with another shop a couple years ago that was struggling with their test automation efforts. They were writing a Silverlight client with a .Net backend, but using Ruby scripts with ActiveRecord to set up the initial data sets for automated tests. I know from all of my ex-.Net/now Ruby friends that everything in Ruby is perfect, but in this case, it caused the team a lot of headaches because the tests were very brittle anytime the database schema changed, given all the duplication between their C# production code and the Ruby test automation code.

Topics for later…

It’s Saturday and my son and I need to go to the park while the weather’s nice, so I’m cutting this short. In a later post I’ll try to get more concrete with examples and maybe an additional post that applies all this theory to the project I’m doing at work.

Proposed StructureMap 2.7 Release

So StructureMap 3 hasn’t really gotten going again, but I still have intentions of doing so this year. In the meantime, I’ve got a batch of pull requests stacked up in the StructureMap 2.6 codebase and it’s time for a new intermediate release. At this time, what I’d like to do is rev up to StructureMap 2.7 and do this:

Take in all the outstanding pull requests
Remove all [Obsolete] API members
Mark as [Obsolete] some various parts of the registration API that I know that I will not support in 3.0 (conditional construction comes to mind)
Mark all StructureMap attributes except for [DefaultConstructor] as [Obsolete] as I think we will dump all the circa 2003 attributes that you used to need to use.
Remove the strong naming because it’s death in combination with Nuget. If this is an issue for you, I will happily take a pull request to make a separate nuget package for a signed version of StructureMap
Ideally, I’d like to clean up the more coarse grained unit tests in a new namespace called “Acceptance” in order to get these ready for usage in StructureMap 3 and maybe provide a level of living documentation for later.
MAYBE: take a look at cleaning up the exception stack traces to give you more contextual information about where StructureMap caught an exception. We lost a lot of contextual information when I eliminated the Reflection.Emit usage in favor of compiling Expressions.

Thoughts?

Clean Database per Automated Test Run? Yes, please.

TL;DR We’re able to utilize RavenDb’s support for embedded databases, some IoC trickery, and our FubuMVC.RavenDb library to make automated testing far simpler by quickly spinning up a brand new database for each individual automated test to have complete control over the state of our system. Oh, and taking ASP.Net and relational databases out of the equation makes automated functional testing far easier too.

Known inputs and expected outcomes is the mantra of successful automated testing. This is generally pretty simple with unit tests and more granular integration tests, but sooner or later you’re going to want to exercise your application stack with a persistent database. You cannot sustain your sanity, much less be successful, while doing automated testing if you cannot easily put your system in a known state before you try to exercise the system. Stateful elements of your application architecture include things like queues, the file system, and in memory caches, but for this post I’m only concerned with controlling the state of the application database.

On my last several projects we’ve used some sort of common test setup action to roll back our database to a near empty state before a test adds the exact data to the database that it needs as part of the test execution (the “arrange” part of arrange, act, and assert). You can read more about the ugly stuff I’ve tried in the past at the bottom of this post, but I think we’ve finally arrived at a solution for this problem that is succeeding.

Our Solution

First, we’re using RavenDb as a schema-less document database. We also use StructureMap to compose the services in our system, and RavenDb’s IDocumentStore is built and scoped as a singleton. In functional testing scenarios, we run our entire application (FubuMVC website hosted with an embedded web server, RavenDb, our backend service) in the same AppDomain as our testing harness, so it’s very simple for us to directly alter the state of the application. Before each test, we:

Eject and dispose any preexisting instance of IDocumentStore from our main StructureMap container
Replace the default registration of IDocumentStore with a new, completely empty instance of RavenDb’s EmbeddableDocumentStore
Write a little bit of initial state into the new database (a couple pre-canned logins and tenants).
Continue to the rest of the test that will generally start by adding test specific data using our normal repository classes helpfully composed by StructureMap to use the new embedded database
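
The steps above can be sketched in a few lines of C#. This is only an illustration of the pattern, not our real code: TinyContainer, IStore, InMemoryStore, and TestSetup are hypothetical stand-ins for StructureMap, IDocumentStore, and the embedded RavenDb store, and the seeded document is made up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a document store; this is not RavenDb's IDocumentStore.
public interface IStore : IDisposable
{
    void Save(string id, string document);
    int Count { get; }
}

public sealed class InMemoryStore : IStore
{
    private readonly Dictionary<string, string> _documents = new Dictionary<string, string>();

    public bool Disposed { get; private set; }
    public int Count { get { return _documents.Count; } }

    public void Save(string id, string document) { _documents[id] = document; }
    public void Dispose() { Disposed = true; }
}

// Hypothetical, minimal holder of one singleton, standing in for the IoC container.
public sealed class TinyContainer
{
    private IStore _store = new InMemoryStore();

    public IStore Store { get { return _store; } }

    // Steps 1 and 2: dispose the old singleton, then register the replacement.
    public void Replace(IStore replacement)
    {
        _store.Dispose();
        _store = replacement;
    }
}

public static class TestSetup
{
    public static IStore ResetAndSeed(TinyContainer container)
    {
        container.Replace(new InMemoryStore());

        // Step 3: write the small amount of state that every test needs.
        IStore fresh = container.Store;
        fresh.Save("users/1", "{ \"Username\": \"User1\" }");
        return fresh;
    }
}
```

Calling TestSetup.ResetAndSeed(container) at the start of each test means that nothing written by a previous test can leak into the next one, because the old store is gone rather than cleaned.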

I’m very happy with this solution for a couple different reasons. First, it’s lightning fast compared with other mechanics I’ve used and describe at the bottom of this post. Secondly, using a schema-less database means that we don’t have much maintenance work to do to keep this database cleansing mechanism up to date with new additions to our persistent domain model and event store — and I think this is a significant source of friction when testing against relational databases.

Show me some code!

I won’t get into too much detail, but we use StoryTeller2 as our test harness for functional testing. The “arrange” part of any of our functional tests gets expressed like this taken from one of our tests for our multi-tenancy support:

-----------------------------------------------
|If the system state is                       |
|                                             |
|The users are                                |
|Username |Password |Clients                  |
|User1    |Password1|ClientA, ClientB, ClientC|
|User2    |Password2|ClientA, ClientB         |
-----------------------------------------------

In the test expressed above, the only state in the system is exactly what I put into the “arrange” section of the test itself. The “If the system state is” DSL is implemented by a Fixture class that runs this little bit of code in its setup:

        public override void SetUp(ITestContext context)
        {
            // There’s a bit more than this going on here, but the service below
            // is part of our FubuPersistence library as a testing hook to
            // wipe the slate clean in a running application
            _reset = Retrieve<ICompleteReset>();
            _reset.ResetState();
        }

As long as my team is using our “If the system state is” fixture to set up the testing state, the application database will be set back to a known state before every single test run, which makes the automated tests far more reliable than other mechanisms I’ve used in the past.

The ICompleteReset interface originates from the FubuPersistence project that was designed in no small part to make it simpler to completely wipe out the state of your running application. The ResetState() method looks like this:

        public void ResetState()
        {
            // Shutdown any type of background process in the application
            // that is stateful or polling before resetting the database
            _serviceResets.Each(x => {
                trace("Stopping services with {0}", x.GetType().Name);
                x.Stop();
            });
            // The call to replace the database
            trace("Clearing persisted state");
            _persistence.ClearPersistedState();
            // Load any basic state that has to exist for all tests.
            // I’m thinking that this is nothing but a couple default
            // login credentials and maybe some static lookup list
            // data
            trace("Loading initial data");
            _initialState.Load();
            // Restart any and all background processes to run against the newly
            // created database
            _serviceResets.Each(x => {
                trace("Starting services with {0}", x.GetType().Name);
                x.Start();
            });
        }
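
The order of operations in ResetState() is the important part: stop, clear, load, restart. Here is a self-contained sketch of that ordering with a recording fake. The interface names and shapes below are guesses inferred from the calls in the method above, so treat them as assumptions; the real FubuPersistence types may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collaborator shapes, inferred from the calls in ResetState().
public interface IServiceReset { void Stop(); void Start(); }
public interface IPersistedState { void ClearPersistedState(); }
public interface IInitialState { void Load(); }

public sealed class CompleteReset
{
    private readonly IEnumerable<IServiceReset> _serviceResets;
    private readonly IPersistedState _persistence;
    private readonly IInitialState _initialState;

    public CompleteReset(IEnumerable<IServiceReset> serviceResets,
                         IPersistedState persistence,
                         IInitialState initialState)
    {
        _serviceResets = serviceResets;
        _persistence = persistence;
        _initialState = initialState;
    }

    public void ResetState()
    {
        // Background services must be stopped before the database goes away
        foreach (var service in _serviceResets) service.Stop();

        _persistence.ClearPersistedState();
        _initialState.Load();

        // Only restart once the new, seeded database is in place
        foreach (var service in _serviceResets) service.Start();
    }
}

// A fake that plays all three roles and records the order of the calls.
public sealed class Recorder : IServiceReset, IPersistedState, IInitialState
{
    public readonly List<string> Calls = new List<string>();

    public void Stop() { Calls.Add("stop"); }
    public void Start() { Calls.Add("start"); }
    public void ClearPersistedState() { Calls.Add("clear"); }
    public void Load() { Calls.Add("load"); }
}
```

Passing one Recorder in for all three roles and calling ResetState() records the four calls in the order stop, clear, load, start. If a polling service were still running while the database was swapped out, it could write into the old store or fail against a disposed one.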

The method _persistence.ClearPersistedState() called above to roll back all persisted state is implemented by our RavenDbPersistedState class. That method does this:

        public void ClearPersistedState()
        {
            // _container is the main StructureMap IoC container for the
            // running application.  The line below will
            // eject any existing IDocumentStore from the container
            // and dispose it
            _container.Model.For<IDocumentStore>().Default.EjectObject();
            // RavenDbSettings is another class from FubuPersistence
            // that just controls the very initial creation of a
            // RavenDb IDocumentStore object.  In this case, we’re
            // overriding the normal project configuration from
            // the App.config with instructions to use an
            // EmbeddableDocumentStore running completely
            // in memory.
            _container.Inject(new RavenDbSettings
            {
                RunInMemory = true
            });
        }

The code above doesn’t itself create a new database. Instead, it sets us up to get a brand new embedded, in-memory database the next time something requests a running database from the StructureMap container. I’m not going to show this code for the sake of brevity, but I think it’s important to note that the RavenDb database construction will use your normal mechanisms for bootstrapping and configuring an IDocumentStore, including all the hundred RavenDb switches and pre-canned indices.
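
To make the “nothing is built until somebody asks” behavior concrete, here is a small sketch using Lazy<T>. Everything in it is hypothetical: StoreRegistry, StoreSettings, and FakeStore stand in for the StructureMap container, RavenDbSettings, and the real document store, and the URL is just a placeholder. Only the RunInMemory flag mirrors the code above.

```csharp
using System;

// Hypothetical settings object; only RunInMemory mirrors the post.
public sealed class StoreSettings
{
    public bool RunInMemory { get; set; }
    public string Url { get; set; }
}

// Hypothetical store that just remembers how it was configured.
public sealed class FakeStore
{
    public FakeStore(string description) { Description = description; }
    public string Description { get; private set; }
}

public sealed class StoreRegistry
{
    // Placeholder for whatever the App.config would normally supply
    private StoreSettings _settings = new StoreSettings { Url = "http://localhost:8080" };
    private Lazy<FakeStore> _store;

    public StoreRegistry() { _store = NewLazyStore(); }

    // Counts how many stores have actually been constructed
    public int BuildCount { get; private set; }

    // Swapping the settings discards any cached store but builds nothing yet.
    public void Inject(StoreSettings settings)
    {
        _settings = settings;
        _store = NewLazyStore();
    }

    // The first call after Inject() constructs the store; later calls reuse it.
    public FakeStore GetStore() { return _store.Value; }

    private Lazy<FakeStore> NewLazyStore()
    {
        return new Lazy<FakeStore>(() =>
        {
            BuildCount++;
            return new FakeStore(_settings.RunInMemory ? "in-memory" : _settings.Url);
        });
    }
}
```

After registry.Inject(new StoreSettings { RunInMemory = true }), BuildCount is still zero. The in-memory store only comes into existence on the first GetStore() call, which is the same deferred behavior we rely on from the container.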

All the code shown here is from the FubuPersistence repository on GitHub.

Conclusion

I’m generally happy with this solution. So far, it’s quick in execution and we haven’t required much maintenance as we’ve progressed other than more default data. Hopefully, this solution will be applicable and reusable in future projects out of the box. I would happily recommend a similar approach to other teams.

But, but, but…

If you did read this carefully, I think you’ll find some things to take exception with:

I’m assuming that you really are able to test functionality with bare minimum data sets to keep the setup work to a minimum and the performance at an acceptable level. This technique isn’t going to be useful for anything involving performance or load testing — but are you really all that concerned about functionality testing when you do that type of testing?
We’re not running our application in its deployed configuration when we collapse everything down to the same AppDomain. Why I think this is a good idea, the benefits, and how we do it are a topic for another blog post. Promise.
RavenDb is schema-less and that turns out to make a huge difference in how long it takes to spin up a new database from scratch compared to relational databases. Yes, there may be some pre-canned indices that need to get built up when you spin up the new embedded database, but with an empty database I don’t see that as a show stopper.

Other, less successful ways of controlling state I’ve used in the past

Over the years I’ve done automated testing against persisted databases with varying degrees of frustration. The worst possible thing you can do is to have everybody testing against a shared relational database in the development and testing environments. You either expect the database to be in a certain state at the start of the test, or you run a stored procedure to set up the tables you want to test against. I can’t even begin to tell you how unreliable this turns out to be when more than one person is running tests at the same time and fouling up the test runs. Unfortunately, many shops still try to do this and it’s a significant hurdle to clear when doing automated testing. Yes, you can try to play tricks with transactions to isolate the test data or try to use randomized data, but I’m not a believer in either approach.

Having an isolated relational database per developer, preferably on their own development box, was a marked improvement, but it adds a great deal of overhead to your project automation. Realistically, you need a developer to be able to build out the latest database on the fly from the latest source on their own box. That’s not a deal breaker with modern database migration tools, but it’s still a significant amount of work for your team. The bigger problem to me is how you tear down the existing state in a relational database to put it into a known state before running an automated test. You’ve got a couple choices:

Destroy the schema completely and rebuild it from scratch. Don’t laugh, I’ve seen people do this and the tests were as painfully slow as you can probably imagine. I suppose you could also script the database to roll back to a checkpoint or reattach a backed up copy of the database, but again, I’m never going to recommend that if you have other options.
Execute a set of commands that wipes most if not all of the data in a database before each test. I’ve done this before, and while it definitely helped create a known state in the system, this strategy performed very poorly and it took quite a bit of work to maintain the “clean database” script as the project progressed. As a project grows, the runtime of your automated test runs becomes very important to keep the feedback loop useful. Slow tests hamper the usefulness of automated testing.
Selectively clean out and write data to only the tables affected by a test. This is probably much faster performance-wise, but I think it will require more coding inside of the testing code to write the one-off state setup for each test.

* As an aside, I really suggest keeping the project database data definition language scripts and/or migrations in the same source control system as the code so that it’s very easy to trace which version of the code is running against which version of the database schema. The harsh reality in my experience is that the same software engineering rigor we generally use for our application code (source control, unit testing, TDD, continuous integration) is very often missing with regard to the relational database DDL and environment. If you’re a database guy talking to me at a conference, you better have your stuff together on this front before you dare tell me that “my developers can’t be trusted to access my database.”

FubuMVC Turns 1.0

The FubuMVC team has published a 1.0 version of the main libraries (FubuMVC.Core, FubuMVC.StructureMap, FubuMVC.AutoFac, FubuMVC.Core.UI, and the view engines) to the public NuGet feed. We’re certainly not “done,” and we’re severely lacking in some areas (*cough* documentation *cough*), but I’m happy with our technical core and feel like we’re able to make that all important, symbolic declaration of “SemVer 1/the major public API signatures are stable.”

It’s been a long journey from the talk Chad Myers and I gave at KaizenConf all the way back in 2008 to CodeMash in 2013, and in this highly collaborative OSS on GitHub world, we’ve had a lot of collaborators. In particular, I’d like to thank Chad Myers and Josh Flanagan for their help at the beginning, Josh Arnold for being my coding partner the past couple years, Corey Kaylor for being the grown up in the room, and Alex Johannessen for his boundless enthusiasm. I’ve genuinely enjoyed my interactions with the FubuMVC community and I look forward to seeing us grow in the new year.

There’s plenty more to do, but for a week or so, my only priority is rest (and finishing the last couple hundred pages of A Memory of Light) — and that doesn’t have anything to do with HATEOAS or hypermedia.

What’s not there yet…

I saw somebody on Twitter last week saying that the “U” in FubuMVC stands for “undocumented,” and that it’s so bad that we had to use two “U’s.” I’m very painfully aware of this, and I think we’re ready to start addressing the issue permanently.

A good “quick start” NuGet package and guide. The FubuMVC team made a heroic effort over the past couple months to make the FubuMVC 1.0 release just before our CodeMash workshop this week, and I dropped the ball on updating the old “FubuMVC” nuspec file to be relevant to the streamlined APIs as they are now.
The new “FubuWorld” website with documentation on all of the major and hopefully most of the minor FubuMVC projects (including StructureMap and StoryTeller as well). We effectively wrote our own FubuMVC-hosted version of readthedocs, but we haven’t yet exploited this capability and gotten a new website with updated documentation online. I’m deeply scarred by my experiences with documenting StructureMap and how utterly useless it has been. This time the projects will have strong support for living documentation.
Lots of Camtasia videos
Lots of google-able blog posts

Continuous Design and Reversibility at Agile Vancouver (video)

In November I got to come out of speaking retirement at Agile Vancouver. Over a couple days I finally got to meet Mike Stockdale in person, have some fun arguments with Adam Dymitruk, see some beautiful scenery, and generally annoy the crap out of folks who are hoarding way too much relational database cheese in my talk called Continuous Design and Reversibility (video via the link).

I think the quality of reversibility in your architecture is a very big deal, especially if you have the slightest interest in effectively doing continuous design. Roughly defined, reversibility is your ability to alter or delay elements of your software architecture. Low reversibility means that you’re more or less forced to get things right upfront and it’s expensive to be wrong. And sorry, but you will be wrong about many things in your architecture on any non-trivial project. By contrast, using techniques and technologies that have higher reversibility vastly improves my ability to delay technical decisions so that I can focus on one thing at a time, like, say, building out the user interface for a feature to get vital user feedback quickly without first having to lay down every single bit of my architecture for data access, security, or logging. In the talk, I gave several concrete examples from my project work including the usage of document databases instead of relational databases.

Last Responsible Moment

I think we can all conceptually agree with the idea of the “Last Responsible Moment,” meaning that the best time to make a decision is as late in the project as possible when you have the most information about your real needs. How “late” your last responsible moment is for any given architectural decision is largely a matter of reversibility.

For the old timers reading this, consider the move from VB6 with COM to .Net a decade and change ago. With COM, adding a new public method to an existing class or changing the signature of an existing public method could easily break the binary compatibility, meaning that you’d have to recompile any downstream COM components that used the first COM component. In that scenario, it behooved you to get the public signatures locked down and stable as fast as possible to avoid the clumsiness and instability with downstream components. Let me tell you youngsters, that’s a brittle situation because you always find reasons to change the APIs when you get deep into your requirements and start stumbling into edge cases that weren’t obvious upfront. Knowing that you can happily add new public members to .Net classes without breaking downstream compatibility, my last responsible moment for locking down public APIs in upstream components is much later than it was in the VB6 days.

The original abstract:

From a purely technical perspective, you can almost say that Extreme Programming was a rebellion against the traditional concept of “Big Design Upfront.” We spent so much time explaining why BDUF was bad that we might have missed a better conversation on just how to responsibly and reliably design and architect applications and systems in an evolutionary way.

I believe that the key to successful continuous or evolutionary design is architectural “reversibility,” the ability to reverse or change technical decisions in the code. Designing for reversibility helps a team push back the “Last Responsible Moment” to make more informed technical decisions.

I work on a very small technical team building a large system with quite a bit of technical complexity. In this talk I’ll elaborate on how we’ve purposely exploited the concept of reversibility to minimize the complexity we have to deal with at any given time. More importantly, I’ll talk about how reversibility led us to choose technologies like document databases, how we heavily exploit conventions in the user interface, and the testing process that made it all possible. And finally, just to make the talk more interesting, I’ll share the times when delaying technical decisions blew up in our faces.