Category Archives: Automated Testing

Adventures in Custom Testing Infrastructure

tl;dr: Sometimes the overhead of writing custom testing infrastructure can lead to easier development


Quick Feedback Cycles are Key

It’d be nice if someday I could write all my code perfectly in both structure and function the first time through, but for now I have to rely on feedback mechanisms to tell me when the code isn’t working correctly. That being said, I feel the most productive when I have the tightest feedback cycle between making a change in code and knowing how it’s actually working. By “quick” I mean both the time it takes for me to set up the feedback cycle and how long the feedback cycle itself takes.

While I definitely like using quick-twitch feedback tools like REPLs or auto-reloading/refreshing web tools like our own fubu run or Mimosa.js’s “watch” command, my primary feedback mechanism for code-centric tasks is usually automated tests. It helps when the tests are mechanically easy to write and run quickly enough that you can get into a nice “red/green/refactor” cycle. For whatever reasons, I’ve hit several problem domains in the last couple years where it was laborious to set up the preconditions and testing inputs and also to measure and assert on the expected outcomes.


Maybe Invest in Some Custom Testing Infrastructure?

In some cases I knew right away that testing a feature was going to be a problem, so I started by asking myself “how do I wish I could express the test setup and assertions?” If it seems feasible, I’ll write a custom ObjectMother where that’s enough, or Test Data Builders for the data setup in more complex cases. I’ve occasionally resorted to building little interpreters that read text and create data structures or files (I do this more often for hierarchical data than anything else I think) or perform assertions on the final state.

You can see an example of this in my old Storyteller2 codebase. Storyteller is a tool for automated acceptance tests and includes a tree view pane in the UI with the inevitable hierarchy of tests organized by suites in an n-deep hierarchy like:

Top Level Suite
  - Suite 1
    - Suite 2
    - Suite 3
      - Test 1
      - Test 2

In the course of building the Storyteller client, I needed to write a series of tests on the tree view state that had to start with a known hierarchy of suites and test files as inputs. After performing actions like filtering or receiving state updates within the UI, I needed to assert on the expected display in this test explorer pane (which tests and suites were visible and were they marked as running, failed, successful, or unknown).

First, to deal with the setup of the hierarchical data I created a little custom class that read flat text data and turned that into the desired hierarchy:

            hierarchy =

Then in the “assertion” part of the test I created a custom specification class that could again read its expectations expressed as flat text and assert that the resulting tree view exactly matched the specified state:

        public void the_child_nodes_are_constructed_with_the_empty_suite()
            var spec =
                new TreeNodeSpecification(


As I recall, writing the simple text parsing classes just to simplify the expression of the automated tests made it pretty easy to add new behavior quickly. In this case, the time investment upfront for the custom testing infrastructure paid off.


FubuMVC’s View Engine Support

A couple months ago I finally got to carve off some time to overhaul the view engine support code in FubuMVC. My main goals were to cut the unnecessarily complex internal code down to something more manageable as a precursor to optimizing both runtime performance and FubuMVC’s time to initialize an application. Since I was about to start monkeying around quite a bit with the internals of code that many of our users depend on, it’s a good thing that we had an existing suite of integration tests that acted as acceptance tests (think layouts, partials, HTML helpers, and our conventional attachment of views to routes) so that in theory I could safely make the restructuring changes without breaking existing behavior.

Going in though, I knew that there were some significant drawbacks to using our existing mechanism for testing the view engine support and I wasn’t looking forward to the inevitable test failures or formulating new integration tests.


Problems with the Existing Test Suite

In order to write end to end tests against the view engine support we had been effectively writing little mini FubuMVC applications inside our integration test libraries. Quite naturally, that often meant adding several view files and folders to simulate all the different permutations for layout rendering, using partials, sharing views from external Bottles (a superset of Areas for you ASP.Net MVC folks), and view profiles (mobile vs. desktop for example). In the test fixtures we would spin up a FubuMVC application with Katana, run HTTP requests, and make assertions against the content that should or should not be present in the HTTP response body.

It wasn’t terrible, but it came with some serious drawbacks:

  1. It wasn’t complete and I’d need to add additional tests
  2. It was expensive in mechanical effort to create those little mini FubuMVC applications that had to be spread over so many different files and even folders
  3. Understanding the tests when something went wrong could be difficult because the expression of the test was effectively split over so many files


The New Approach

Before going too far into the code changes against the view engine support, I built a new test harness that would allow me to express in one testing class file:

  1. What all the views and layouts were in the entire system including the content of the views
  2. What the views were in external Bottles loaded into the application
  3. How to configure a complete FubuMVC application, if the defaults weren’t sufficient for the test
  4. What content should and should not be rendered when certain routes were executed

The end result was a base class I called ViewIntegrationContext. Mechanically, I made TestFixture classes deriving from this abstract class. In the constructor function of the test fixture classes I would specify the location, content, and view model of any number of Spark or Razor views. When the test fixture class was first executed, it would:

  1. Create a brand new folder using a guid as the name to host the new “application” to avoid collisions with existing test runs (while the new test harness does try to clean up after itself, I’ve learned not to be very trusting of the file system during automated tests)
  2. Write out the Spark and Razor files based on the data specified in the constructor function to the new application folder
  3. Optionally load content Bottles and FubuMVC configurations inside the test harness (ignore that for now if you would, but it was a huge win for me)
  4. Load a new FubuMVC application in memory with the root directory pointing to our new folder for just this test
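
To make the first two steps more tangible, here is a minimal sketch of a guid-named scratch folder that writes out view files and tries to clean up after itself. ScratchApplicationFolder is a hypothetical helper written for this illustration and is not part of FubuMVC; loading Bottles and bootstrapping the application are left out entirely:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration; not part of FubuMVC
public class ScratchApplicationFolder : IDisposable
{
    public string Root { get; }

    // views maps a relative file path to the content of that view
    public ScratchApplicationFolder(IDictionary<string, string> views)
    {
        // A guid-named folder avoids collisions with earlier test runs
        Root = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());
        Directory.CreateDirectory(Root);

        foreach (var view in views)
        {
            var path = Path.Combine(Root, view.Key);

            // Create any nested folders the view lives in
            Directory.CreateDirectory(Path.GetDirectoryName(path));
            File.WriteAllText(path, view.Value);
        }
    }

    public void Dispose()
    {
        // Best effort only; the file system is not always cooperative
        // during automated tests
        try { Directory.Delete(Root, true); }
        catch (IOException) { }
        catch (UnauthorizedAccessException) { }
    }
}
```

Because every run gets a fresh folder, a failed cleanup from an earlier run can’t poison the next one, which is the whole point of using the guid.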

For each test, the ViewIntegrationContext object uses FubuMVC 2.0’s brand new in memory test harness (somewhat inspired by PlaySpecification from Scala) to execute a “Scenario” where I could declaratively specify what url to render and assert what content should or should not be present in the HTML output.

To make this concrete, the very simplest test to check that FubuMVC really can render a Spark view looks like this:

    public class Simple_rendering : ViewIntegrationContext
        public Simple_rendering()
<p>This is real output</p>

        public void can_render()
            Scenario.Get.Input(new AirInputModel{TakeABreath = true});
            Scenario.ContentShouldContain("<h2>Breathe in!</h2>");

    public class AirEndpoint
    {
        public AirViewModel TakeABreath(AirRequest request)
        {
            return new AirViewModel { Text = "Take a {0} breath?".ToFormat(request.Type) };
        }

        public BreatheViewModel get_breathe_TakeABreath(AirInputModel model)
        {
            var result = model.TakeABreath
                ? new BreatheViewModel { Text = "Breathe in!" }
                : new BreatheViewModel { Text = "Exhale!" };

            return result;
        }
    }

    public class AirRequest
    {
        public AirRequest()
        {
            Type = "deep";
        }

        public string Type { get; set; }
    }

    public class AirInputModel
    {
        public bool TakeABreath { get; set; }
    }

    public class AirViewModel
    {
        public string Text { get; set; }
    }

    public class BreatheViewModel : AirViewModel
    {
    }



So did this pay off? Heck yeah it did, especially for scenarios where I needed to build out multiple views and layouts. The biggest win for me was that the tests were completely self-contained instead of spread out over so many files and folders. Even better yet, the new in memory Scenario support in FubuMVC made the actual tests very declarative with decently descriptive failure messages.


It’s Not All Rainbows and Unicorns

I cherry picked some examples that I felt went well, but there have been some other times when I’ve gone down a rabbit hole of building custom testing infrastructure only to see it be a giant boondoggle. There’s a definite bit of overhead to writing this kind of tooling and you always have to consider whether you’ll save time on the whole compared to writing more crude or repetitive testing code. While I tend to be aggressive about building custom test harnesses, you might accurately call it a speculative exercise and hold off until you feel some pain in your testing.

Moreover, any kind of custom test harness where you decouple the expression of the test (inputs, actions, and assertions) from the actual code that’s being exercised obfuscates your traceability back to the actual code. I’ve seen plenty of cases where the “goodness” of making the expression of the test prettier and more declarative was more than offset by how hard it was to debug test failures because of the extra mental overhead of connecting the meaning of the test to the code that should be implementing it. It’s for that reason that I’ve never been a big fan of most Behavior Driven Development tools for testing that isn’t customer facing.





A Simple Example of a Table Driven Executable Specification

My shop is starting to go down the path of executable specifications (using Storyteller2 as the tooling, but that’s not what this post is about).  As an engineering practice, executable specifications* involve specifying the expected behavior of a user story with concrete examples of exactly how the system should behave before coding.  Those examples will hopefully become automated tests that live on as regression tests.

What are we hoping to achieve?

  • Remove ambiguity from the requirements with concrete examples.  Ambiguity and misunderstandings from prose based requirements and analysis have consistently been a huge time waste and source of errors throughout my career.
  • Get faster feedback in development.  It’s awfully nice to just run the executable specs in a local branch before pushing anything to the testers.
  • Find flaws in domain logic or screen behavior faster, and this has been the biggest gain for us so far.
  • Create living documentation about the expected behavior of the system by making the specifications human readable.
  • Build up a suite of regression tests to make later development in the system more efficient and safer.

Quick Example

While executable specifications are certainly a very challenging practice from the technical side of things, in the past week or so I’m aware of 3-4 scenarios where the act of writing the specification tests has flushed out problems with our domain logic or screen behavior a lot faster than we could have done otherwise.

Part of our application logic involves fuzzy matching of people in our system against some, ahem, not quite trustworthy data from external partners. Our domain expert explained that the matching logic he wanted was to match on a person’s social security number, birth date, first name, and last name, but the name matching should be case insensitive and it’s valid to match on the initial of the first name.  Since this logic can be expressed as a set number of inputs and one output with a great number of permutations, I chose to express this specification as a table with Storyteller (conceptually identical to the old ColumnFixture in FitNesse).  The final version of the spec is shown below:


The image above is our final, approved version of this functionality that now lives as both documentation and a regression test.  Before that though, I wrote the spec and got our domain expert to look at it, and wouldn’t you know it, I had misunderstood a couple assumptions and he gave me very concrete feedback about exactly what the spec should have been.

To make this just a little bit more concrete, our Storyteller test harness connects the table inputs to the system under test with this little bit of adapter code:

The code behind the executable spec

    public class PersonFixture : Fixture
    {
        public PersonFixture()
        {
            Title = "Person Matching Logic";
        }

        [ExposeAsTable("Person Matching Examples")]
        [return: AliasAs("Matches")]
        public bool PersonMatches(
            string Description,
            [Default("555-55-5555")] SocialSecurityNumber SSN1,
            [Default("Hank")] string FirstName1,
            [Default("Aaron")] string LastName1,
            [Default("01/01/1974")] DateCandidate BirthDate1,
            [Default("555-55-5555")] SocialSecurityNumber SSN2,
            [Default("Hank")] string FirstName2,
            [Default("Aaron")] string LastName2,
            [Default("01/01/1974")] DateCandidate BirthDate2)
        {
            var person1 = new Person
            {
                SSN = SSN1,
                FirstName = FirstName1,
                LastName = LastName1,
                BirthDate = BirthDate1
            };

            var person2 = new Person
            {
                SSN = SSN2,
                FirstName = FirstName2,
                LastName = LastName2,
                BirthDate = BirthDate2
            };

            return person1.Equals(person2);
        }
    }
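
The fixture hands the real decision to Person.Equals(), which isn’t shown here. As a rough sketch only, and assuming plain string and DateTime properties in place of SocialSecurityNumber and DateCandidate, the rule described earlier might be coded something like this. PersonCandidate and Matches() are hypothetical names, and treating a single letter (optionally followed by a period) as an initial is an assumption about what “match on the initial” means:

```csharp
using System;

// Hypothetical stand-in for the real domain class. Plain types replace
// SocialSecurityNumber and DateCandidate to keep the sketch self-contained.
public class PersonCandidate
{
    public string SSN { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime BirthDate { get; set; }

    public bool Matches(PersonCandidate other)
    {
        return SSN == other.SSN
            && BirthDate.Date == other.BirthDate.Date
            && string.Equals(LastName, other.LastName, StringComparison.OrdinalIgnoreCase)
            && FirstNamesMatch(FirstName, other.FirstName);
    }

    // First names match when they are equal ignoring case, or when either
    // side is just an initial for the other
    private static bool FirstNamesMatch(string a, string b)
    {
        if (string.IsNullOrEmpty(a) || string.IsNullOrEmpty(b)) return false;
        if (string.Equals(a, b, StringComparison.OrdinalIgnoreCase)) return true;

        return IsInitialOf(a, b) || IsInitialOf(b, a);
    }

    // Assumption: an initial is a single letter, optionally followed by a period
    private static bool IsInitialOf(string initial, string name)
    {
        var letter = initial.TrimEnd('.');
        return letter.Length == 1
            && char.ToUpperInvariant(letter[0]) == char.ToUpperInvariant(name[0]);
    }
}
```

Each row of the table spec then exercises one permutation of this rule, which is exactly the kind of detail (what counts as an initial?) that a concrete example forces the domain expert to confirm or correct.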

* Jeremy, is this really just Behavior Driven Development (BDD)?  Or the older idea of Acceptance Test Driven Development (ATDD)?  This is some folks’ definition of BDD, but BDD is so overloaded and means so many different things to different people that I hate using the term.  ATDD never took off, and “executable specifications” just sounds cooler to me, so that’s what I’m going to call it.

My Opinions on Data Setup for Functional Tests

I read Jim Holmes’s post Data Driven Testing: What’s a Good Dataset Size? with some interest because it’s very relevant to my work.  I’ve been heavily involved in test automation efforts over the past decade, and I’ve developed a few opinions about how best to handle test data input for functional tests (as opposed to load/scalability/performance tests).  First though, here are a couple of concerns I have:

  • Automated tests need to be reliable.  Tests that require external, environmental setup can be brittle in unexpected ways.  I hate that.
  • Your tests will fail from time to time with regression bugs.  It’s important that your tests are expressed in a way that makes it easy to understand the cause and effect relationship between the “known inputs” and the “expected outcomes.”  I can’t tell you how many times I’ve struggled to fix a failing test because I couldn’t even understand what exactly it was supposed to be testing.
  • My experience says loudly that smaller, more focused automated tests are far easier to diagnose and fix when they fail than very large, multi-step automated tests.  Moreover, large tests that drive user interfaces are much more likely to be unstable and unreliable.  Regardless of platform, problem domain, and team, I know that I’m far more productive when I’m working with quicker feedback cycles.  If any of my colleagues are reading this, now you know why I’m so adamant about having smaller, focused tests rather than large scripted scenarios.
  • Automated tests should enable your team to change or evolve the internals of your system with more confidence.

Be Very Cautious with Shared Test Data

If I have my druthers, I would not share any test setup data between automated tests except for very fundamental things like the inevitable lookup data and maybe some default user credentials or client information that can safely be considered to be static.  Unlike “real” production coding where “Don’t Repeat Yourself” is crucial for maintainability, in testing code I’m much more concerned with making each test as self-explanatory as possible and completely isolated from the others.  If you share test setup data between tests, there’s frequently going to be a reason why you’ll want to add a little bit more data for a new test, which ends up breaking the assertions in the existing tests.  Besides that, using a shared test data set means that you probably have more data than any single test really needs, which makes the diagnosis of test failures harder.  For all of those reasons and more, I strongly prefer that my teams copy and paste bits of test data sets to keep them isolated by test rather than shared.

Self-Contained Tests are Best

I’ve been interested in the idea of executable specifications for a long time.  In order to make the tests have some value as living documentation about the desired behavior of the system, I think it needs to be as clear as possible what the relationship is between the germane data inputs and the observed behavior of the system.  Plus, automated tests are completely useless if you cannot reliably run them on demand or inside a Continuous Integration build.  In the past I’ve also found that understanding or fixing a test is much harder if I have to constantly ALT-TAB between windows or even just swivel my head between a SQL script or some other external file and the body of the rest of a script.

I’ve found that both the comprehensibility and reliability of an automated test are improved by making each automated test self-contained.  What I mean by that is that every part of the test is expressed in one readable document including the data setup, exercising the system, and verifying the expected outcome.  That way the test can be executed at almost any time because it takes care of its own test data setup rather than being dependent on some sort of external action.  To pull that off you need to be able to very concisely describe the initial state of the system for the test, and shared data sets and/or raw SQL scripts, Xml, Json, or raw calls to your system’s API can easily be noisy.  Which leads me to say that I think you should…

Decouple Test Input from the Implementation Details

I’m a very large believer in the importance of reversibility to the long term success of a system.  With that in mind, we write automated tests to pin down the desired behavior of the system and spend a lot of energy towards designing the structure of our code to more readily accept changes later.  All too frequently, I’ve seen systems become harder to change over time specifically because of tight coupling between the automated tests and the implementation details of a system.  In this case, the automated test suite will actually retard or flat out prevent changes to the system instead of enabling you to more confidently change the system.  Maybe even worse, that tight coupling means that the team will have to eliminate or rewrite the automated tests in order to make a desired change to the system.

With that in mind, I somewhat strongly recommend expressing your test data input in some form of interpreted format rather than as SQL statements or direct API calls.  My team uses Storyteller2 where all test input is expressed in logical tables or “sentences” that are not tightly coupled to the structure of our persisted documents.  I think that simple textual formats or interpreted Domain Specific Languages are also viable alternatives.  Despite the extra work to write and maintain a testing DSL, I think there are some big advantages to doing it this way:

  • You’re much more able to make additions to the underlying data storage without having to change the tests.  With an interpreted data approach, you can simply add fake data defaults for new columns or fields.
  • You can express only the data that is germane to the functionality that your test is targeting.  More on this below when I talk about my current project.
  • You can frequently make the test data setup much cheaper mechanically by reducing the amount of data the test author has to write per test, with sensible default values supplied behind the scenes.  I think this topic is probably worth a blog post on its own someday.
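
As a small sketch of the “defaults behind the scenes” idea, and not our actual Storyteller fixtures, the interpreter below turns a row holding only the columns a test author chose to write into a full document. UserDocument, UserRows, the column names, and the default values are all hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document type, for illustration only
public class UserDocument
{
    public string Username { get; set; }
    public string Password { get; set; }
    public string TimeZone { get; set; }
    public List<string> Clients { get; set; } = new List<string>();
}

public static class UserRows
{
    // Defaults for every field a given test might not care about
    private static readonly Dictionary<string, string> Defaults = new Dictionary<string, string>
    {
        { "Username", "user1" },
        { "Password", "password" },
        { "TimeZone", "UTC" },
        { "Clients", "" }
    };

    // The row holds only the columns the test author chose to write
    public static UserDocument Build(IDictionary<string, string> row)
    {
        var values = new Dictionary<string, string>(Defaults);
        foreach (var cell in row)
        {
            if (!Defaults.ContainsKey(cell.Key))
                throw new ArgumentException("Unknown column: " + cell.Key);

            values[cell.Key] = cell.Value;
        }

        return new UserDocument
        {
            Username = values["Username"],
            Password = values["Password"],
            TimeZone = values["TimeZone"],
            Clients = values["Clients"]
                .Split(',')
                .Select(x => x.Trim())
                .Where(x => x.Length > 0)
                .ToList()
        };
    }
}
```

With this shape, adding a new field to the document means adding one default in one place, and tests that never mention the new field keep working untouched.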

This goes far beyond just the test data setup.  I think it’s very advantageous in general to express your functional tests in a way that is independent of implementation details of your application — especially if you’re going to drive a user interface in your testing.

Go in through the Front Door

Very closely related to my concerns about decoupling tests from the implementation details is avoiding “backdoor” ways to set up test scenarios.  My opinion is that you should set up test scenarios by using the real services your application uses itself to persist data.  While this does risk making the tests run slower by going through extra runtime hoops, I think it has a couple advantages:

  • It’s easier to keep your test automation code synchronized with your production code as you refactor or evolve the production code and data storage
  • It should result in writing less code period
  • It reduces logical duplication between the testing code and the production code — think database schema changes
  • When you write raw data to the underlying storage mechanisms you can very easily get the application into an invalid state that doesn’t happen in production

Case in point, I met with another shop a couple years ago that was struggling with their test automation efforts.  They were writing a Silverlight client with a .Net backend, but using Ruby scripts with ActiveRecord to set up the initial data sets for automated tests.  I know from all of my ex-.Net/now Ruby friends that everything in Ruby is perfect, but in this case, it caused the team a lot of headaches because the tests were very brittle anytime the database schema changed with all the duplication between their C# production code and the Ruby test automation code.

Topics for later…

It’s Saturday and my son and I need to go to the park while the weather’s nice, so I’m cutting this short.  In a later post I’ll try to get more concrete with examples and maybe an additional post that applies all this theory to the project I’m doing at work.

Clean Database per Automated Test Run? Yes, please.

TL;DR We’re able to utilize RavenDb’s support for embedded databases, some IoC trickery, and our FubuMVC.RavenDb library to make automated testing far simpler by quickly spinning up a brand new database for each individual automated test to have complete control over the state of our system.  Oh, and taking ASP.Net and relational databases out of the equation makes automated functional testing far easier too.

“Known inputs and expected outcomes” is the mantra of successful automated testing.  This is generally pretty simple with unit tests and more granular integration tests, but sooner or later you’re going to want to exercise your application stack with a persistent database.  You cannot sustain your sanity, much less be successful, while doing automated testing if you cannot easily put your system in a known state before you try to exercise the system.  Stateful elements of your application architecture include things like queues, the file system, and in memory caches, but for this post I’m only concerned with controlling the state of the application database.

On my last several projects we’ve used some sort of common test setup action to roll back our database to a near empty state before a test adds the exact data to the database that it needs as part of the test execution (the “arrange” part of arrange, act, and assert). You can read more about the ugly stuff I’ve tried in the past at the bottom of this post, but I think we’ve finally arrived at a solution for this problem that is succeeding.

Our Solution

First, we’re using RavenDb as a schema-less document database.  We also use StructureMap to compose the services in our system, and RavenDb’s IDocumentStore is built and scoped as a singleton.  In functional testing scenarios, we run our entire application (FubuMVC website hosted with an embedded web server, RavenDb, our backend service) in the same AppDomain as our testing harness, so it’s very simple for us to directly alter the state of the application.  Before each test, we:

  1. Eject and dispose any preexisting instance of IDocumentStore from our main StructureMap container
  2. Replace the default registration of IDocumentStore with a new, completely empty instance of RavenDb’s EmbeddedDocumentStore
  3. Write a little bit of initial state into the new database (a couple pre-canned logins and tenants).
  4. Continue to the rest of the test that will generally start by adding test specific data using our normal repository classes helpfully composed by StructureMap to use the new embedded database

I’m very happy with this solution for a couple different reasons.  First, it’s lightning fast compared with other mechanics I’ve used and describe at the bottom of this post.  Secondly, using a schema-less database means that we don’t have much maintenance work to do to keep this database cleansing mechanism up to date with new additions to our persistent domain model and event store — and I think this is a significant source of friction when testing against relational databases.

Show me some code!

I won’t get into too much detail, but we use Storyteller2 as our test harness for functional testing.  The “arrange” part of any of our functional tests gets expressed like this example, taken from one of our tests for our multi-tenancy support:

|If the system state is |

|The users are                                |
|Username |Password |Clients                  |
|User1    |Password1|ClientA, ClientB, ClientC|
|User2    |Password2|ClientA, ClientB         |

In the test expressed above, the only state in the system is exactly what I put into the “arrange” section of the test itself.  The “If the system state is” DSL is implemented by a Fixture class that runs this little bit of code in its setup:

    public override void SetUp(ITestContext context)
    {
        // There’s a bit more than this going on here, but the service below
        // is part of our FubuPersistence library as a testing hook to
        // wipe the slate clean in a running application
        _reset = Retrieve<ICompleteReset>();
        _reset.ResetState();
    }

As long as my team is using our “If the system state is” fixture to set up the testing state, the application database will be set back to a known state before every single test run, which makes the automated tests far more reliable than other mechanisms I’ve used in the past.

The ICompleteReset interface originates from the FubuPersistence project that was designed in no small part to make it simpler to completely wipe out the state of your running application.  The ResetState() method looks like this:

    public void ResetState()
    {
        // Shutdown any type of background process in the application
        // that is stateful or polling before resetting the database
        _serviceResets.Each(x => {
            trace("Stopping services with {0}", x.GetType().Name);
            x.Stop();
        });

        // The call to replace the database
        trace("Clearing persisted state");
        _persistence.ClearPersistedState();

        // Load any basic state that has to exist for all tests.
        // I’m thinking that this is nothing but a couple default
        // login credentials and maybe some static lookup list
        // data
        trace("Loading initial data");
        _initialState.Load();

        // Restart any and all background processes to run against the newly
        // created database
        _serviceResets.Each(x => {
            trace("Starting services with {0}", x.GetType().Name);
            x.Start();
        });
    }

The method _persistence.ClearPersistedState() called above to roll back all persistence is implemented by our RavenDbPersistedState class.  That method does this:

    public void ClearPersistedState()
    {
        // _container is the main StructureMap IoC container for the
        // running application.  The line below will
        // eject any existing IDocumentStore from the container
        // and dispose it
        _container.Model.For<IDocumentStore>().Default.EjectObject();

        // RavenDbSettings is another class from FubuPersistence
        // that just controls the very initial creation of a
        // RavenDb IDocumentStore object.  In this case, we’re
        // overriding the normal project configuration from
        // the App.config with instructions to use an
        // EmbeddedDocumentStore running completely
        // in memory.
        _container.Inject(new RavenDbSettings
        {
            RunInMemory = true
        });
    }

The code above doesn’t necessarily create a new database, but we’ve set ourselves up to use a brand new embedded, in memory database whenever something does request a running database from the StructureMap container.  I’m not going to show this code for the sake of brevity, but I think it’s important to note that the RavenDb database construction will use your normal mechanisms for bootstrapping and configuring an IDocumentStore including all the hundred RavenDb switches and pre-canned indices.

All the code shown here is from the FubuPersistence repository on GitHub.


I’m generally happy with this solution.  So far, it’s quick in execution and we haven’t required much maintenance as we’ve progressed other than more default data.  Hopefully, this solution will be applicable and reusable in future projects out of the box.  I would happily recommend a similar approach to other teams.

But, but, but…

If you did read this carefully, I think you’ll find some things to take exception to:

  1. I’m assuming that you really are able to test functionality with bare minimum data sets to keep the setup work to a minimum and the performance at an acceptable level.  This technique isn’t going to be useful for anything involving performance or load testing — but are you really all that concerned about functionality testing when you do that type of testing?
  2. We’re not running our application in its deployed configuration when we collapse everything down to the same AppDomain.  Why I think this is a good idea, the benefits, and how we do it are a topic for another blog post.  Promise.
  3. RavenDb is schema-less and that turns out to make a huge difference in how long it takes to spin up a new database from scratch compared to relational databases.  Yes, there may be some pre-canned indices that need to get built up when you spin up the new embedded database, but with an empty database I don’t see that as a show stopper.

Other, less successful ways of controlling state I’ve used in the past

Over the years I’ve done automated testing against persisted databases with varying degrees of frustration.  The worst possible thing you can do is to have everybody testing against a shared relational database in the development and testing environments.   You either expect the database to be in a certain state at the start of the test, or you run a stored procedure to set up the tables you want to test against.  I can’t even begin to tell you how unreliable this turns out to be when more than one person is running tests at the same time and fouling up the test runs.  Unfortunately, many shops still try to do this and it’s a significant hurdle to clear when doing automated testing.  Yes, you can try to play tricks with transactions to isolate the test data or try to use randomized data, but I’m not a believer in either approach.

Having an isolated relational database per developer, preferably on their own development box, was a marked improvement, but it adds a great deal of overhead to your project automation.  Realistically, you need a developer to be able to build out the latest database on the fly from the latest source on their own box.  That’s not a deal breaker with modern database migration tools, but it’s still a significant amount of work for your team.  The bigger problem to me is how you tear down the existing state in a relational database to put it into a known state before running an automated test.  You’ve got a couple choices:

  1. Destroy the schema completely and rebuild it from scratch.  Don’t laugh, I’ve seen people do this and the tests were as painfully slow as you can probably imagine.  I suppose you could also script the database to roll back to a checkpoint or reattach a backed up copy of the database, but again, I’m never going to recommend that if you have other options.
  2. Execute a set of commands that wipes most if not all of the data in a database before each test.  I’ve done this before, and while it definitely helped create a known state in the system, this strategy performed very poorly and it took quite a bit of work to maintain the “clean database” script as the project progressed.  As a project grows, the runtime of your automated test runs becomes very important to keep the feedback loop useful.  Slow tests hamper the usefulness of automated testing.
  3. Selectively clean out and write data to only the tables affected by a test.  This is probably much faster performance wise, but I think it will require more coding inside of the testing code to write the one-off state setup for each test.

* As an aside, I really suggest keeping the project database data definition language scripts and/or migrations in the same source control system as the code so that it’s very easy to trace the version of the code running against which version of the database schema.  The harsh reality in my experience is that the same software engineering rigor we generally use for our application code (source control, unit testing, TDD, continuous integration) is very often missing in regards to the relational database DDL and environment. If you’re a database guy talking to me at a conference, you better have your stuff together on this front before you dare tell me that “my developers can’t be trusted to access my database.”