Category Archives: Automated Testing

A Small Case Study in Test Automation (and other things)

I’m trying to walk a line in this post between avoiding specifics about a client project, for obvious reasons, and providing enough detail to make this post worthwhile for that client. One of our client’s development managers is interested in speeding up their testing, and I’m hoping to use this post to lay out some ideas and approaches to improve the testing procedures in this system.

I’ve been part of an integration project for the past couple years that validates, routes, and processes financial transactions coming from an external partner of our client’s all the way to a very large 3rd party application hosted in our client’s environment. We’re in the middle of some significant changes in the integration to that 3rd party application that are going to trigger a round of regression testing of the entire system — and that’s where this post comes in. Testing this application has been very challenging and extremely time consuming. Any opportunity to make regression testing quicker and more effective is going to make everyone’s jobs easier.

It’s not just that the testing itself is slower than desired. Because the testing is slow and not easily repeatable, the development team can’t really do much technical improvement through refactoring as they learn more about the system behavior and how the code structure is working out over time. That’s been a definite negative for code and architectural quality.

Before I get into the details of the existing system, know that what I’m showing and discussing here is a bit of an idealized version of how I wish we had architected the system and what we’ve recommended to the client for the longer term. The real system is a bit messier and significantly harder to test than what I’m presenting here, and there’s a lesson in that: testability should be a first class architectural goal in many cases (and Conway’s Law is legitimately something to work around).

From a 10,000 foot level, here’s the entire system:

TestAutomationScenario-High Level

The workflow is:

  1. A couple times a day, a new flat file containing new transactions will be dropped into a file share
  2. The File Reader console application is executed to find this file, parse it into little transaction messages, and publish those messages to Rabbit MQ. There’s a little bit of database tracking going on for reporting and just general activity tracking.
  3. Rabbit MQ publishes the transaction messages to the subscribing Transaction Processor application (an ASP.Net Core application with an active subscriber for these incoming messages).
  4. The Transaction Processor handles each transaction message by:
    1. Pulling in a helluva lot of information from the 3rd Party Application and other information from a Configuration DB related to the account number in the transaction message
    2. Using the information from the previous step to validate whether the transaction can be processed normally, or has to go into a queue for manual resolution
    3. For the valid transactions, use the information from step #1 to decide how the money in the incoming transaction will be applied (routing to sub-transactions)
    4. Send the routed sub-transactions from the previous step to the 3rd Party Application through its externally facing API.

While there are some unit tests and intermediate level integration tests today on some of the subsystems, the overall official testing effort to date has relied strictly on end to end, manual testing of the entire system. Some of the emphasis on black box, end to end testing is due to our client’s mandatory regulatory auditing requirements and that can’t completely go away. However, there’s worlds of opportunity and new willingness to explore other alternatives like white box testing techniques or new processes for testing as a complement to the formal audit-style testing, so let’s jump into some ideas for making things work more efficiently.

Some Necessary Shifts in Testing Philosophy

First off, there’s an important shift from trying to prove that the system is working perfectly with strictly black box testing to thinking about testing as a feedback mechanism to identify and remove problems in the code so that the code can be deployed to production. If you look at testing as more of a feedback cycle, you can utilize the testing pyramid idea to maximize feedback about how your system functions with more efficient testing techniques.

Secondly, I think you have to have collaboration between testers, developers, and architects to make white box testing more effective. Part of that is increasing the testability of the system architecture, and another part of that is trying to avoid duplication in effort between tests written by developers and other tests performed by the testers. Moreover, if developers are actively engaged in writing tests — and they should be in my world view — it’s very helpful to have the testers involved in the content of those developer-written tests. In other words, I think that having strict separation between testers and development can be very inefficient. I know there are folks who strongly believe that strict independence for the testers from the developers is necessary, but I think that does more harm than good.

If you’ll buy either of the two previous paragraphs, or you’re at least open-minded enough to continue, let’s see how some of this Test Pyramid thinking would play out in our big integration system.

At a high level, I would want the testing strategy to focus on:

  • Some kind of Behavior Driven Development approach for all the business rules for validating and routing the transactions.
  • Mid-level integration tests on all the code that acts as a gateway or service proxy to the 3rd Party Application. This would include both the code that sends commands to the 3rd Party Application and the code that queries or reads the 3rd Party Application.
  • Mid-level integration tests on the File Reader that stub out the outgoing Rabbit MQ publishing and just measure how the File Reader parses the incoming files, writes tracking information, and what messages it publishes (see the sketch after this list).
  • A handful of fully end to end tests through the entire system to prove out all the integration points — but by and large you use finer grained tests to test out business rules and the integration with the 3rd Party Application.
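
Sticking with the File Reader bullet above, the stubbing can be as simple as swapping in a little recording fake for whatever interface publishes to Rabbit MQ. Here’s a minimal sketch in xUnit.Net, with made up type names standing in for the real project’s code:

using System.Collections.Generic;
using System.IO;
using Xunit;

// Made up stand ins for the real File Reader and its Rabbit MQ publishing
public interface IMessagePublisher
{
    void Publish(object message);
}

// Test double that just records what would have gone out to Rabbit MQ
public class RecordingPublisher : IMessagePublisher
{
    public readonly List<object> Published = new List<object>();

    public void Publish(object message) => Published.Add(message);
}

public class FileReader
{
    private readonly IMessagePublisher _publisher;

    public FileReader(IMessagePublisher publisher) => _publisher = publisher;

    public void Process(string path)
    {
        // The real parsing is far messier; one message per line is enough
        // to show the testing seam
        foreach (var line in File.ReadAllLines(path))
        {
            _publisher.Publish(new { RawLine = line });
        }
    }
}

public class FileReaderTests
{
    [Fact]
    public void publishes_one_message_per_transaction_line()
    {
        var path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "TXN|1", "TXN|2", "TXN|3" });

        var publisher = new RecordingPublisher();
        new FileReader(publisher).Process(path);

        // Assert only on what would have been published, never on Rabbit MQ itself
        Assert.Equal(3, publisher.Published.Count);
    }
}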

 

The Transaction Processor

Most of the meat of the bigger transaction processing project is within what I’m calling the Transaction Processor shown below in a little more detail:

TestAutomationScenario-Transaction Processor

There’s a couple big responsibilities here:

  • Querying data from the 3rd Party Application with its heinously unusable, custom XML query language to use inside the business rules
  • Looking up some configuration parameters about accounts from a second Configuration DB
  • Carrying out validation rules against incoming transactions
  • Routing the incoming transactions into sub-transactions based on business rules
  • Posting the sub-transactions to the 3rd Party Application with its, shall we say, interesting XML API.

Channeling some Domain-Driven Design thinking here, let’s go straight into the business rules for validation and routing. The business rules required a lot of input parameters, there were a lot of permutations to build and test, and the developers new to the problem domain had plenty of misunderstandings early on about the desired behavior.

From an architectural standpoint, I think it is extremely important to completely isolate these business rules from the 3rd Party Application, the configuration databases, and even the incoming flat file format because:

  • It was very difficult to set up test scenario inputs in the 3rd Party Application
  • There’s a tremendous number of test cases because of the permutations on account state and transaction parameters involved, so there would be a large benefit to tests being quick to author and execute
  • This logic is key to the business and has already evolved significantly since this project started. It’s imperative that this logic be safe to change over time, and that happens most effectively when it’s cheap to write new tests and quick to execute the existing test coverage.
  • I probably shouldn’t say this too loudly, but I think this client should reconsider coupling their ecosystem to the 3rd Party Application

To that end, the business rules should only depend on a domain model that’s internal to the Transaction Processor. We’ll use the A-Frame Architecture idea from Jim Shore’s Testing Without Mocks paper to isolate the business rule behavior from the infrastructure. The domain model objects that implement all the business logic will have no dependency whatsoever on the external dependencies. Instead, we’ll effectively write our own mapping layer to take the data returned from the 3rd Party Application and the Configuration DB and build all the state the domain model needs, then hand that to the business logic code in the domain model.
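
To make that isolation concrete, here’s a minimal sketch with made up type names. The important part is that nothing in the domain model references the 3rd Party Application, Rabbit MQ, or any database:

// All the state the business rules need, built up by the mapping layer
public class AccountSnapshot
{
    public string AccountNumber { get; set; }
    public bool IsFrozen { get; set; }
    public decimal Balance { get; set; }
}

public class IncomingTransaction
{
    public string AccountNumber { get; set; }
    public decimal Amount { get; set; }
}

public static class TransactionRules
{
    // Pure function: everything it needs comes in as arguments, so tests
    // can exercise every permutation completely in memory
    public static bool CanProcessNormally(AccountSnapshot account, IncomingTransaction transaction)
    {
        return !account.IsFrozen && transaction.Amount > 0;
    }
}

The mapping layer’s only job is to query the 3rd Party Application and the Configuration DB, build up the AccountSnapshot, and hand it to something like TransactionRules, so the rules themselves stay dirt cheap to test.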

From the perspective of testing, there’s a lot of opportunity to get the business rules wrong. Rather than depend solely on design or requirements documents, I strongly recommend using Behavior Driven Development (BDD) techniques here to author executable specifications that are readable and reviewed (if not written) by the business domain experts and testers. What I largely recommend here is that the developers mostly write the test harness code, but the business domain experts and, more likely, the testers own the content and meaning of the BDD specifications. Working in this manner, we should be able to treat the BDD specifications as the official tests for the business rule behavior even though they don’t run the entire process.

So that handles the business rules, now on to the rest of the Transaction Processor. The “controller” code in the diagram is playing a coordination role to mediate between the business rules and the code that interacts directly with the 3rd Party Application’s external API endpoints. I’d mostly use unit tests and maybe even *gasp* interaction testing with mock objects to test out the workflow and error handling of this code.
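
As a rough illustration of that kind of interaction test, here’s a sketch using NSubstitute. The mocking tool and every type name here are my own stand ins, not the project’s actual code:

using NSubstitute;
using Xunit;

public class SubTransaction { }

public interface IThirdPartyGateway
{
    void PostSubTransactions(SubTransaction[] subTransactions);
}

public interface IManualResolutionQueue
{
    void Enqueue(string accountNumber, string reason);
}

public class TransactionController
{
    private readonly IThirdPartyGateway _gateway;
    private readonly IManualResolutionQueue _manualQueue;

    public TransactionController(IThirdPartyGateway gateway, IManualResolutionQueue manualQueue)
    {
        _gateway = gateway;
        _manualQueue = manualQueue;
    }

    // In the real thing the validation verdict would come from the domain model
    public void Handle(string accountNumber, bool isValid, SubTransaction[] routed)
    {
        if (isValid)
            _gateway.PostSubTransactions(routed);
        else
            _manualQueue.Enqueue(accountNumber, "failed validation");
    }
}

public class TransactionControllerTests
{
    [Fact]
    public void invalid_transactions_go_to_the_manual_resolution_queue()
    {
        var gateway = Substitute.For<IThirdPartyGateway>();
        var manualQueue = Substitute.For<IManualResolutionQueue>();
        var controller = new TransactionController(gateway, manualQueue);

        controller.Handle("12345", isValid: false, routed: new SubTransaction[0]);

        // Interaction tests: the controller should not have touched the
        // 3rd Party Application, but should have queued the transaction
        gateway.DidNotReceiveWithAnyArgs().PostSubTransactions(null);
        manualQueue.Received(1).Enqueue("12345", Arg.Any<string>());
    }
}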

The service gateway code that interacts with the 3rd Party Application was extremely problematic in both development and testing. In retrospect, I wish we’d hammered at this code in isolation much more before even bothering trying to run end to end tests. The big issue we never pushed through (yet) was how to establish known system state in the 3rd Party Application so that we could write reliable automated tests around just the service gateway code in the Transaction Processor. I think it would be worthwhile for domain experts and/or testers to be involved in this step as well to verify the expected results are really happening in the 3rd Party Application.

Lastly, I’d opt to do some bigger tests for just the Transaction Processor where you directly enqueue the transaction messages in Rabbit MQ and test the entire Transaction Processor stack all the way down to the external dependencies. The point of these tests is to prove out the integrations and configuration, not to recreate all the business rules functionality already covered by the smaller, faster unit tests.
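
For illustration, directly enqueuing a test message is only a few lines with the RabbitMQ.Client library (this is the pre-7.x client API, and the queue name and message body are made up):

using System.Text;
using RabbitMQ.Client;

public static class TestMessages
{
    public static void EnqueueTransaction(string json)
    {
        var factory = new ConnectionFactory { HostName = "localhost" };

        using (var connection = factory.CreateConnection())
        using (var channel = connection.CreateModel())
        {
            // Publish straight onto the queue the Transaction Processor
            // subscribes to, bypassing the File Reader entirely
            var body = Encoding.UTF8.GetBytes(json);
            channel.BasicPublish(exchange: "", routingKey: "incoming-transactions", basicProperties: null, body: body);
        }
    }
}

The test would then use the same kind of polling trick described a little later in this post to know when the Transaction Processor has finished with the message before making any assertions.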

 

“Some” End to End Tests

There are absolutely some issues that can only be tested through true, end to end tests. Integrations, configuration, environments, and security are examples. We’ll still write and perform some end to end tests, but we won’t try to recreate the business functionality coverage we already have from the finer grained tests.

No matter what though, the tests need to be as easily repeatable as possible so there’s still going to be a level of automation to speed things along. Here’s my thoughts on what that might look like:

  • The flat file format was originally used by mainframe applications, so as you can imagine, it’s not remotely user friendly to edit or read. I’d suggest using some custom code that can transform a much simpler format to the mainframe-friendly format so the testers can write new test cases more efficiently and everyone else can actually read and understand the test inputs
  • The undeniable, cardinal rule of automated testing is that you have to have known inputs and expected outcomes. In this system, that means being able to set up the 3rd Party Application in a known state for each end to end testing scenario. The failure to do that (not a technical impossibility, but it’s a long story) is my single biggest regret from this project. See My Opinions on Data Setup for Functional Tests for more on what I recommend for test input data.
  • Automating the testing of asynchronous workflows like this system can be very challenging. The biggest issue is making an automated test harness understand when the work is really done across multiple systems so it can proceed to the “assert” part of the standard “arrange, act, assert” test workflow. I’ve had some success with this in the past by making the test harness listen to the application logging or watch for some kind of visible side effect, like data being written to a database, to “know” when the work is complete (see the sketch after this list).
  • Tests do fail from time to time, so I’d actually try to have the end to end test harness able to gather up the relevant logs for all the systems active in the test. That’s even more valuable if you can somehow manage to correlate the logging activity with only the active test run.
  • Finally, the big expensive end to end tests my client has to follow for official certification and auditing? Yeah, you have to do that, but my very strong recommendation and where I think they’re starting to head is to use finer-grained and more efficient testing techniques to remove problems first. Then come back and do the laboriously slow audit tests when you can justifiably expect success with few iterations.
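
Here’s the kind of helper I mean for that asynchronous waiting problem, a minimal sketch that just polls for a visible side effect until a timeout expires:

using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Wait
{
    // Poll for some observable side effect (a row in a tracking table, a log
    // marker, a queue going empty) until it shows up or the timeout expires
    public static async Task Until(Func<bool> condition, TimeSpan timeout)
    {
        var stopwatch = Stopwatch.StartNew();
        while (stopwatch.Elapsed < timeout)
        {
            if (condition()) return;
            await Task.Delay(250);
        }

        throw new TimeoutException("The expected side effect never showed up");
    }
}

// Example usage in a test, where "trackingDb" is a hypothetical helper that
// queries the tracking database:
// await Wait.Until(() => trackingDb.ProcessedCount(fileId) == expectedCount, TimeSpan.FromMinutes(2));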

 

 

Summary

There’s a couple big points I wanted to drive home in this post:

  • Embrace the test pyramid idea, and try to get over any aversion to white box testing because of its advantages for efficiency
  • Treat testing as a feedback mechanism more than a certification process
  • Tests of all types need to be repeatable to be effective feedback. Manual testing, and especially manual testing where it’s time consuming to set up the necessary system state first, is not very repeatable
  • I think you need to embrace the Agile idea of blurring the lines between roles. Developers and architects need to be involved in the automated testing for a better chance of success. Testers may need to get their hands dirty directly in the code or at least exploit their knowledge of the coding internals in order to make the testing more efficient
  • Developers, testers, and architects need to collaborate to be truly successful in testing. Waterfall style testing where all testing happens at the end is just not the way to be successful
  • Try to avoid duplicating effort between developer written tests and the tester activity, which might be just yet another way of saying the testers and developers need to be collaborating as the project goes on
  • Feedback cycles of all kinds are valuable for quality software

My integration testing challenges this week

EDIT 3/12: All these tests are fixed, and it wasn’t as bad as I’d thought it was going to be, but the extra logging and the faster cycle of changing code and re-running tests under the debugger definitely helped.

This was meant to be a short post, or at least an easy to write post on my part, but it spilled out from integration testing and into some Storyteller mechanics, semi-advanced xUnit.Net usage, ASP.Net Core logging integration, and even a tepid defense of compositional architectures wrapped around an IoC container.

I’ve been working feverishly the past couple of months to push Jasper to a 1.0 release. As (knock on wood) the last big epic, I’ve been working on a large overhaul and redesign of the message persistence behind Jasper’s durable messaging. The existing code was fairly well covered by integration tests, so I felt confident that I could make the large scale changes and use the big bang integration tests to ensure the intended functionality still worked as designed.

I assume that y’all can guess how this has turned out. After a week of code changes and fixing any and all unit test and intermediate integration test failures, I got to the point where I was ready to run the big bang integration tests and, get this, they didn’t pass on the first attempt! I know, shocking right? Moreover, the tests involve all kinds of background processing and even multiple logical applications (think ASP.Net Core IWebHost objects) being started up and shut down during the tests, so it wasn’t exactly easy to spot the source of my test failures.

I thought it’d be worth talking about how I first stopped and invested in improving my test harness, both to make it faster to launch and to capture a lot more information about what’s going on in all the multithreading/asynchronous madness, but…

First, an aside on Debugger Hell

You probably want to have some kind of debugger in your IDE of choice. You also want to be reasonably good with that tool, know how to use its features, and be conversant in its supported keyboard shortcuts. You also want to try really hard to avoid needing to use your debugger too often because that’s just not an efficient way to get things done. Many folks in the early days of Agile development, including me, described debugger usage as a code or testing “smell.” And while “smell” is mostly used today as a pejorative to put down something that you just don’t like, it was originally just meant as a warning sign that something in your code or approach deserves more attention.

In the case of debugger usage, it might be telling you that your testing approach needs to be more fine grained or that you are missing crucial test coverage at lower levels. In my case, I’ll be looking for places where I’m missing smaller tests on elements of the bigger system and fill in those gaps before getting too worked up trying to solve the big integration test failures.

Storyteller Test Harness

For these big integration tests, I’m using Storyteller as my testing tool (think Cucumber, but much more optimized for integration testing as opposed to being cute). With Storyteller, I’ve created a specification language that lets me script out message failover scenarios like the one shown below (which is currently failing as I write this):

JasperFailoverSpec

In the specification above, I’m starting and stopping Jasper applications to prove out Jasper’s ability to fail over and recover pending work from one running node to another using its Marten message persistence option. At runtime, Jasper has a background “durability agent” constantly running that polls the database to determine whether there is any unclaimed work to do or whether any known application nodes are down (using advisory locks through Postgresql, if anybody would ever be interested in a blog post about just that). Hopefully that’s enough description to make it clear that this isn’t a particularly easy scenario to test or code.
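
As a quick aside, the advisory lock trick itself is simple enough to show. This is just the general Postgresql technique through Npgsql, not Jasper’s actual implementation:

using Npgsql;

public static class AdvisoryLock
{
    // Assumes an already opened NpgsqlConnection
    public static bool TryGet(NpgsqlConnection conn, long lockId)
    {
        using (var cmd = new NpgsqlCommand("select pg_try_advisory_lock(@id)", conn))
        {
            cmd.Parameters.AddWithValue("id", lockId);

            // Returns true only if no other session currently holds this lock,
            // which is how one node can safely "claim" a chunk of pending work
            return (bool)cmd.ExecuteScalar();
        }
    }
}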

In my initial attempts to diagnose the failing Storyteller tests I bumped into a couple problems and sources of friction:

  1. It was slow and awkward mechanically to get the tests running under the debugger (that’s been a long standing problem with Storyteller and I’m finally happy with the approach shown later in this post)
  2. I could tell quickly that exceptions were being thrown and logged in the background processing, but I wasn’t capturing that log output in any kind of usable way

Since I was pretty sure that the tests weren’t going to get resolved quickly and that I’d probably want to write even more of these damn tests, I decided that I first needed to invest in better visibility into what was happening inside the code and a much quicker way to cycle into the debugger. To that end, I took a little detour and worked on some Storyteller improvements that I’ve been meaning to do for quite a while.

Incorporating Logging into Storyteller Results

Jasper uses the ASP.Net Core logging abstractions for its own internal logging, but I didn’t have anything configured except for Debug and Console tracing to capture the logs being generated at runtime. Even with the console output, what I really wanted was all the log information correlated with both the individual test execution and which receiver or sender application the logging was from.

Fortunately, Storyteller has an extensibility model to capture custom logging and instrumentation directly into its test results. It turned out to be very simple to whip together an adapter for ASP.Net Core logging that captured the logging information in a way that can be exposed by Storyteller.

You can see the results in the image below. The table below is just showing all the logging messages received by ILogger within the “Receiver1” application during one test execution. The yellow row is an exception that was logged during the execution that I might not have been able to sense otherwise.

AspNetLoggingInStoryteller

For the implementation, ASP.Net Core exposes the ILoggerProvider service such that you can happily plug in as many logging strategies as you want to an application in a combinatorial way. On the Storyteller side of things, you have the Report interface that lets you plug in custom logging that can expose HTML output into Storyteller’s results.

Implementing that crudely came out as a single class that implements both adapter interfaces (here’s a gist of the whole thing):

public class StorytellerAspNetCoreLogger : Report, ILoggerProvider

The actual logging just tracks all the calls to ILogger.Log() as little model objects in memory:

public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter)
{
    var logRecord = new LogRecord
    {
        Category = _categoryName,
        Level = logLevel.ToString(),
        Message = formatter(state, exception),
        ExceptionText = exception?.ToString()
    };

    // Just keep all the log records in an in memory list
    _parent.Records.Add(logRecord);
}

Fortunately enough, in the Storyteller Fixture code for the test harness I bootstrap the receiver and sender applications per test execution, so it’s really easy to just add the new StorytellerAspNetCoreLogger to both the Jasper applications and the Storyteller test engine:

var registry = new ReceiverApp();
registry.Services.AddSingleton<IMessageLogger>(_messageLogger);

var logger = new StorytellerAspNetCoreLogger(key);

// Tell Storyteller about the new logger so that it'll be
// rendered as part of Storyteller's results
Context.Reporting.Log(logger);

// This is bootstrapping a Jasper application through the 
// normal ASP.Net Core IWebHostBuilder
return JasperHost
    .CreateDefaultBuilder()
    .ConfigureLogging(x =>
    {
        x.SetMinimumLevel(LogLevel.Debug);
        x.AddDebug();
        x.AddConsole();
        
        // Add the logger to the new Jasper app
        // being built up
        x.AddProvider(logger);
    })

    .UseJasper(registry)
    .StartJasper();

And voila, the logging information is now part of the test results in a useful way so I can see a lot more information about what’s happening during the test execution.

It sucks that my code is throwing exceptions instead of just working, but at least I can see what the hell is going wrong now.

Get the debugger going quickly

To be honest, the friction of getting Storyteller tests running under a debugger has always been a drawback to Storyteller — especially compared to how fast that workflow is with tools like xUnit.Net that integrate seamlessly into your IDE. You’ve always been able to just attach your debugger to the running Storyteller process, but I’ve always found that to be clumsy and slow — especially when you’re trying to quickly cycle between attempted fixes and re-running the tests.

I made some attempts in Storyteller 5 to improve the situation (after we gave up on building a dotnet test adapter because that model is bonkers), but that still takes some set up time to make it work and even I always have to run to the documentation to remember how to do it. Sometime this weekend the idea of a quick xUnit.Net execution wrapper around Storyteller popped into my head, and it honestly took about 15 minutes flat to get things working so that I could kick off individual Storyteller specifications from xUnit.Net as shown below:

StorytellerWithinXUnit

Maybe that’s not super exciting, but the end result is that I can rerun a specification after making changes with or without debugging with nothing but a simple keyboard shortcut in the IDE. That’s a dramatically faster feedback cycle than what I had to begin with.

Implementation wise, I just took advantage of xUnit.Net’s [MemberData] feature for parameterized tests and Storyteller’s StorytellerRunner class that was built to allow users to run specifications from their own code. After adding a new xUnit.Net test project and referencing the original Storyteller specification project named “StorytellerSpecs,” I added the code file shown below in its entirety:

// This only exists as a hook to dispose the static
// StorytellerRunner that is hosting the underlying
// system under test at the end of all the spec
// executions
public class StorytellerFixture : IDisposable
{
    public void Dispose()
    {
        Runner.SpecRunner.Dispose();
    }
}

public class Runner : IClassFixture<StorytellerFixture>
{
    internal static readonly StoryTeller.StorytellerRunner SpecRunner;

    static Runner()
    {
        // I'll admit this is ugly, but this establishes where the specification
        // files live in the real StorytellerSpecs project
        var directory = AppContext.BaseDirectory
            .ParentDirectory()
            .ParentDirectory()
            .ParentDirectory()
            .ParentDirectory()
            .AppendPath("StorytellerSpecs")
            .AppendPath("Specs");

        SpecRunner = new StoryTeller.StorytellerRunner(new SpecSystem(), directory);
    }

    // Discover all the known Storyteller specifications
    public static IEnumerable<object[]> GetFiles()
    {
        var specifications = SpecRunner.Hierarchy.Specifications.GetAll();
        return specifications.Select(x => new object[] {x.path}).ToArray();
    }

    // Use a touch of xUnit.Net magic to be able to kick off and
    // run any Storyteller specification through xUnit
    [Theory]
    [MemberData(nameof(GetFiles))]
    public void run_specification(string path)
    {
        var results = SpecRunner.Run(path);
        if (!results.Counts.WasSuccessful())
        {
            SpecRunner.OpenResultsInBrowser();
            throw new Exception(results.Counts.ToString());
        }
    }
}

And that’s that. Something I’ve wanted to have for ages and failed to build, done in 15 minutes because I happened to remember something similar we’d done at work and realized how absurdly easy xUnit.Net made this effort.

Summary

  • Sometimes it’s worthwhile to take a step back from trying to solve a problem through debugging and invest in better instrumentation or write some automation scripts to make the debugging cycles faster rather than just trying to force your way through the solution
  • Inspiration happens at random times
  • Listen to that niggling voice in your head sometimes that’s telling you that you should be doing things differently in your code or tests
  • Using a modular architecture that’s composed by an IoC container the way that ASP.Net Core does can sometimes be advantageous in integration testing scenarios. Case in point is how easy it was for me to toss in an all new logging provider that captured the log information directly into the test results for easier test failure resolution

And when I get around to it, these little Storyteller improvements will end up in Storyteller itself.

Subcutaneous Testing against React + .Net Applications

Everything in this post is from a proof of concept project we did for the technique described here. We have not used this tooling on a real project yet, but we have a team starting a project where this might be useful, so I promised a write up for them.

In my previous post I laid out how I see the testing pyramid and test tool and technique choices against my company’s typical web application technology stack. As a reminder, our recommended stack for new development on web applications or API’s looks like this (plus a backing database):

Slide1

Last week I talked through how we might test the React components and Redux store setup, including the interaction between Redux and React. I also talked about how we could go about testing the .Net backend both at a unit level and through integration tests through to the backing database. Lastly, I said we’d use a modicum of end to end, Selenium-based tests, but said that we should avoid depending on too many of those kinds of tests. That leaves us with a pretty big hole in coverage against the interaction between the Javascript code running in the browser and the .Net code and database interactions running server side.

As a possible solution for this gap, my team at work did a proof of concept for using Storyteller to do subcutaneous testing against the full application stack, but minus the actual React component “view layer.” The general idea is to use Storyteller with its Storyteller.Redux extension to host the ASP.Net Core application so that it can drive test data input through the real data layer of the .Net code and then turn around and use the real system services to verify the state of the application and the backing database as the “assert” stage of the tests. The basic value proposition here is that this mechanism could be far more efficient in terms of developer time than end to end, Selenium based testing. We’re also theorizing that the feedback cycles would be much tighter through faster tests, and definitely more reliable tests, than the equivalent tests against the browser ever could be.

A couple things to note or argue:

  • This technique would be most useful if your React components are generally dumb and only communicate with the external world by dispatching well defined actions to the Redux store (I’m assuming that you’re utilizing Redux middleware like redux-thunk or redux-saga here).
  • Why Storyteller as the driver for this instead of another test runner? I’m obviously biased, but I think Storyteller has the very best story in test automation tooling for declarative set up and verification of system state. Plus, unlike any of the xUnit tools I’m aware of, Storyteller is built specifically with integration testing in mind (think configurable retries, bailing out on runaway tests, better control over the lifecycle of the test harness)
  • Storyteller has support for declarative assertions against a JSON document that should be handy for making assertions against the Redux store state
  • We’re theorizing that it’ll be vastly easier to make assertions against the Redux store state than it would be to hunt down DOM elements with Selenium
  • The Storyteller.Redux extension subscribes to any changes to the store state and exposes that to the Storyteller test engine. The big win here is that it gives you a single mechanism to handle the dreaded “everything is asynchronous so how does the test harness know when it’s time to check the expected outcomes” problem that makes Selenium testing so dad gum hard in the real world.
  • The Storyteller.Redux extension can capture any logged messages to console.log or console.error in the running browser. Add that to any server side logging that you can also pipe into the Storyteller results

The general topology in these tests would look like this:

Slide2

The test harness would consist of:

  1. A Storyteller project that bootstraps the ASP.Net Core application and runs it within the Storyteller test engine. You can use the Storyteller.AspNetCore extension to make that easier (or you could after I update it for ASP.Net Core 2 and its breaking changes).
  2. The Storyteller.Redux extension for Storyteller provides the Websockets glue to communicate between the launched browser with your Redux store and the running Storyteller engine
  3. The Storyteller ISystem in this project has to have some way to launch a web browser to the page that hosts the Javascript bundle. In the proof of concept project, I just built out a static HTML page that included the bundle Javascript and directly launched the browser to the file location, but you could always use Selenium just to open the browser and navigate to the right Url.
  4. Storyteller Fixtures for setting up system state for tests, sending Redux actions directly to the running Redux store to simulate user interactions, asserting on the expected system state on the backend, and checking the expected Redux store state
  5. An alternative Javascript bundle that includes all the reducer and middleware code in your application, along with some “special sauce” code shown in a section down below that enables Storyteller to send messages and retrieve the current state of the running Redux store via Websockets.

The Special Sauce in the Javascript Bundle

Your custom bundle for the subcutaneous testing would need to have this code in its Webpack entry point file (the full file is on GitHub here):

// "store" is your configured Redux store object. 
// "transformState" is just a hook to convert your Redux
// store state to something that Storyteller could consume
function ReduxHarness(store, transformState){
    if (!transformState){
        transformState = s => s;
    }

    function getQueryVariable(variable)
    {
       var query = window.location.search.substring(1);
       var vars = query.split("&");
       for (var i=0;i<vars.length;i++) {
           var pair = vars[i].split("=");
           if(pair[0] == variable){return pair[1];}
       }
       return(false);
    }

    var revision = 1;
    var port = getQueryVariable('StorytellerPort');
    var wsAddress = "ws://127.0.0.1:5250";
    var socket = new WebSocket(wsAddress);

    socket.onclose = function(){
        console.log('The socket closed');
    };

    socket.onerror = function(evt){
        console.error(JSON.stringify(evt));
    }

    socket.onmessage = function(evt){
        if (evt.data == 'REFRESH'){
            window.location.reload();
            return;
        }

        if (evt.data == 'CLOSE'){
            window.close();
            return;
        }

        var message = JSON.parse(evt.data);
        console.log('Got: ' + JSON.stringify(message) + ' with topic ' + message.type);

        store.dispatch(message);
    };

    store.subscribe(() => {
        var state = store.getState();

        revision = revision + 1;
        var message = {
            type: 'redux-state',
            revision: revision,
            state: transformState(state)
        }

		if (socket.readyState == 1){
            var json = JSON.stringify(message);
            console.log('Sending to engine: ' + json);
			socket.send(json);
		}
    });

    // Capturing any kind of client side logging
    // and piping that into the Storyteller test results
    var originalLog = console.log;
    console.log = function(msg){
        originalLog(msg);

        var message = {
            type: 'console.log',
            text: msg
        }

        var json = JSON.stringify(message);
        socket.send(json);
    }

    // Capture any logged errors in the JS code
    // and pipe that into the Storyteller results
    var originalError = console.error;
    console.error = function(e){
        originalError(e);

        var message = {
            type: 'console.error',
            error: e
        }

        var json = JSON.stringify(message);
        socket.send(json);
    }
}


ReduxHarness(store, s => s.toJS())

The Storyteller System

In my proof of concept, I connected Storyteller to the Redux testing bundle like this (the real code is here):

    public class Program
    {
        public static void Main(string[] args)
        {
            StorytellerAgent.Run(args, new ReduxSampleSystem());
        }
    }

    public class ReduxSampleSystem : SimpleSystem
    {
        protected override void configureCellHandling(CellHandling handling)
        {
            // The code below is just to generate the static file I'm 
            // using to host the reducer + websockets code
            var directory = AppContext.BaseDirectory;
            while (Path.GetFileName(directory) != "ReduxSamples")
            {
                directory = directory.ParentDirectory();
            }

            var jsFile = directory.AppendPath("reduxharness.js");
            Console.WriteLine("Copying the reduxharness.js file to " + directory);
            var source = directory.AppendPath("..", "StorytellerRunner", "reduxharness.js");


            File.Copy(source, jsFile, true);

            var harnessPath = directory.AppendPath("harness.htm");
            if (!File.Exists(harnessPath))
            {
                var doc = new HtmlDocument();

                var href = "file://" + jsFile;

                doc.Head.Add("script").Attr("src", href);

                Console.WriteLine("Writing the harness file to " + harnessPath);
                doc.WriteToFile(harnessPath);
            }

            var url = "file://" + harnessPath;

            // Add the ReduxSagaExtension and point it at your view
            handling.Extensions.Add(new ReduxSagaExtension(url));
        }
    }

The static HTML file generation above isn’t mandatory. You *could* instead run the real page from the instance of the application hosted within Storyteller, as long as the ReduxHarness function shown above is applied to your Redux store at some point.

Storyteller Fixtures that Drive or Check the Redux Store

For driving and checking the Redux store, we created a helper class called ReduxFixture that enables you to do simple actions and value checks in a declarative way as shown below:

    public class CalculatorFixture : ReduxFixture
    {
        // There's a little bit of magic here. This would send a JSON action
        // to the Redux store like {"type": "multiply", "operand": "5"}
        [SendJson("multiply")]
        public void Multiply(int operand)
        {

        }

        // Does an assertion against a single value within the current state
        // of the redux store using a JSONPath expression
        public IGrammar CheckValue()
        {
            return CheckJsonValue("$.number", "The current number should be {number}");
        }

    }

You can of course skip the built in helpers and send JSON actions directly to the running browser or write your own assertions against the current state of the Redux store. There’s also some built in functionality in the ReduxFixture class to track Redux store revisions and to wait for any change to the Redux store before performing assertions.

Thoughts on Agile Database Development

I’m flying out to our main office next week and one of the big things on my agenda is talking over our practices around databases in our software projects. This blog post is just me getting my thoughts and talking points together beforehand. There are two general themes here: how I’d do things in a perfect world, and how to make things better within the constraints of the organization and software architecture that we have now.

I’ve been a big proponent of Agile development processes and practices going back to the early days of Extreme Programming (before Scrum came along and ruined everything, much the way that Scrappy ruined Scooby Doo cartoons for me as a child). If I’m working in an Agile way, I want:

  1. Strong project and testing automation as feedback cycles that run against all changes to the system
  2. Some kind of easy traceability from a built or deployed system to exactly the version of the code and its dependencies, preferably automated through your source control processes
  3. Technologies, tools, and frameworks that provide high reversibility to ease the cost of doing evolutionary software design.

From the get go, relational databases have been one of the biggest challenges in the usage of Agile software practices. They’re laborious to use in automated testing, often expensive in time or money to install or deploy, the change management is a bit harder because you can’t just replace the existing database objects the way we can with other code, and I absolutely think they reduce reversibility in your system architecture compared to other options. That being said, there are some practices and processes I think you should adopt so that your Agile development process doesn’t crash and burn when a relational database is involved.

Keep Business Logic out of the Database, Period.

I’m strongly against having any business logic tightly coupled to the underlying database, but not everyone feels the same way. For one thing, stored procedure languages (tSQL, PL/SQL, etc.) are very limited in their constructs and tooling compared to the languages we use in our application code (basically anything else). Mostly though, I avoid coupling business logic to the database because having to test through the database is almost inevitably more expensive both in developer effort and test run times than it would be otherwise.

Some folks will suggest that you might want to change out your database later, but to be honest, the only time I’ve ever done that in real life is when we moved from RavenDb to Marten, and even then it had little impact on the existing structure of the code.

In practice this means that I try to:

  1. Eschew usage of stored procedures. Yes, I think there are still some valid reasons to use sprocs, but I think that they are a “guilty until proven innocent” choice in almost any scenario
  2. Pull business logic away from the database persistence altogether whenever possible. I think I’ll be going back over some of my old designing for testability blog posts from the Codebetter/ALT.Net days to try to explain to our teams that “wrap the database in an interface and mock it” isn’t always the best solution in every case for testability
  3. Favor persistence tools that invert the control between the business logic and the database over tooling like Active Record that creates a tight coupling to the database. What this means is that instead of having business logic code directly reading and writing to the database, something else (Dapper if we can, EF if we absolutely have to) is responsible for loading and persisting application state back and forth between the domain in code and the underlying database. The point is to be able to completely test your business logic in complete isolation from the database.

I would make exceptions for use cases where using the database engine to do set based logic in a stored procedure is a more efficient way to solve the problem, but I haven’t been involved in systems like that for a long time.
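
Here’s a minimal sketch of the inversion of control described in the third point above, with made up types. The domain object knows nothing about the database, and a separate class is responsible for loading and saving it with Dapper:

using System.Data;
using Dapper;

public class Account
{
    public string Number { get; set; }
    public decimal Balance { get; set; }

    // Pure business logic, completely testable in memory
    public void Apply(decimal amount) => Balance += amount;
}

public class AccountPersistence
{
    private readonly IDbConnection _connection;

    public AccountPersistence(IDbConnection connection) => _connection = connection;

    public Account Load(string number) =>
        _connection.QuerySingle<Account>(
            "select number, balance from accounts where number = @number", new { number });

    public void Save(Account account) =>
        _connection.Execute(
            "update accounts set balance = @Balance where number = @Number", account);
}

The Apply method can be covered by plain unit tests with no database in sight, while AccountPersistence gets its own small set of focused integration tests.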

 

Database per Developer/Tester/Environment

My very strong preference and recommendation is to have each developer, tester, and automated testing environment using a completely separate database. The key reason is to isolate each thread of team activity to avoid simultaneous operations or database changes from interfering with each other. Sharing the database makes automated testing much less effective because you often get false negatives or false positives from database activity going on somewhere else at the same time — and yes, this really does happen and I’ve got the scars to prove it.

Additionally, it’s really important for automated testing to be able to tightly control the inputs to a test. While there are some techniques you can use to do this in a shared database (multi-tenancy usage, randomized data), it’s far easier mechanically to just have an isolated database that you can easily control.
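
For example, with a database that belongs to a single developer or build agent, the test setup can just wipe and seed the exact rows it cares about. A sketch with Dapper, Npgsql, and xUnit.Net, where the connection string and table names are made up:

using System;
using Dapper;
using Npgsql;
using Xunit;

public class TransactionQueryTests : IDisposable
{
    private readonly NpgsqlConnection _conn;

    public TransactionQueryTests()
    {
        _conn = new NpgsqlConnection("Host=localhost;Database=txn_tests;Username=dev;Password=dev");
        _conn.Open();

        // Because nothing else touches this database, it's safe to wipe the
        // tables and seed exactly the rows this test cares about
        _conn.Execute("truncate table transactions cascade");
        _conn.Execute("insert into transactions (account_number, amount) values ('12345', 50.00)");
    }

    [Fact]
    public void finds_transactions_by_account_number()
    {
        var count = _conn.ExecuteScalar<long>(
            "select count(*) from transactions where account_number = '12345'");

        Assert.Equal(1L, count);
    }

    public void Dispose() => _conn.Dispose();
}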

Lastly, I really like being able to look through the state of the database after a failed test. That’s certainly possible with a shared database, but it’s much easier in my opinion to look through an isolated database where it’s much more obvious how your code and tests changed the database state.

I should say that I’m concerned here with logical separation between different threads of activity. If you do that with truly separate databases or separate schemas in the same database, it serves the same goal.

“The” Database vs. Application Persistence

There are two basic development paradigms to how we think about databases as part of a software system:

  1. The database is the system and any other code is just a conduit to get data back and forth between the database and its consumers
  2. The database is merely the state persistence subsystem of the application

I strongly prefer and recommend the 2nd way of looking at that, and act accordingly. That’s admittedly a major shift in thinking from traditional software development or database centric teams.

In practice, this generally means that I very strongly favor the concept of an application database that is only accessed by one application and can be considered to be just part of the application. In this case, I would opt to have all of the database DDL scripts and migrations in the source control repository for the application. This has a lot of benefits for development teams:

  1. It makes it dirt simple to correlate the database schema changes to the rest of the application code because they’re all versioned together
  2. Automated testing within continuous integration builds becomes easier because you know exactly what scripts to apply to the database before running the tests
  3. No need for elaborate cascading builds in your continuous integration setup because it’s just all together

In contrast, a shared database that’s accessed by multiple applications brings a lot more potential friction. The version tracking between the two moving parts is harder to understand and it harms your ability to do effective automated testing. Moreover, it’s wretchedly nasty to allow lots of different applications to float on top of the same database in what I call the “pond scum anti-pattern” because it inevitably causes nasty coupling issues that will almost certainly result in regression bugs, since it’s so much harder to understand how changes in the database will ripple out to the applications sharing the database. A much, much younger version of myself walked into a meeting and asked our “operational data store” folks to add a column to a single view and got screamed at for 30 minutes straight on why that was going to be impossible and do you know how much work it’s going to be to test everything that uses that view, young man?

Assuming that you absolutely have to continue to use a shared database like my shop does, I’d at least try to ameliorate that by:

  • Making damn sure that all changes to that shared database schema are captured in source control somewhere so that you have a chance at effective change tracking
  • Having a continuous integration build for the shared database that runs some level of regression tests and then cascades to builds for all of the applications that touch that database, so they’re automatically updated and tested against the latest version of the shared database. I’m expecting some screaming when I recommend that in the office next week;-)
  • At the least, having some mechanism for standing up a local copy of the up to date database schema with any necessary baseline data on demand for isolated testing
  • Having some way to know, when I’m running or testing the dependent applications, exactly what version of the database schema repository I’m currently using. Git submodules? Distribute the DB via Nuget? Finally do something useful with Docker, distribute the DB as a versioned Docker image, and brag about that to any developer we meet?

The key here is that I want automated builds constantly running as feedback mechanisms to know when and what database changes potentially break (or fix too!) one of our applications. Because of some bad experiences in the past, I’m hesitant to use cascading builds between separate repositories, but it’s definitely warranted in this case until we can get the big central database split up.

At the end of the day, I still think that the shared database architecture is a huge anti-pattern that most shops should try to avoid and I’d certainly like to see us start moving away from that model more and more.

 

Document Databases over Relational Databases

I’ve definitely put my money where my mouth is on this (RavenDb early on, and now Marten). In my mind, evolutionary or incremental software design is much easier with document databases for a couple reasons:

  • Far fewer changes in the application code result in database schema changes
  • It’s much less work to keep the application and database in sync because the storage just reflects the application model
  • Less work in the application code to transform the database storage to structures that are more appropriate for the business logic. I.e., relational databases really aren’t great when your domain model is logically hierarchical rather than flat
  • It’s a lot less work to tear down and set up known test input states in document databases. With a relational database you frequently end up having to deal with extraneous data you don’t really care about just to satisfy relational integrity concerns. Likewise, tearing down relational database state takes more care and thought than it does with a document database.

I would still opt to use a relational database for reporting or if there’s a lot of set based logic in your application. For simpler CRUD applications, I think you’re fine with just about any model and I don’t object to relational databases in those cases either.

It sounds trivial, but it does help tremendously if your relational database tables are configured to use cascading deletes when you’re trying to set a database into a known state for tests.
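
To make the “storage just reflects the application model” point concrete, persisting a document with Marten is roughly this (a minimal sketch; check the Marten documentation for the exact API in current versions):

using System;
using Marten;

public class AccountDocument
{
    public Guid Id { get; set; }
    public string Number { get; set; }
    public decimal Balance { get; set; }
}

public static class MartenSample
{
    public static void Run()
    {
        // Marten generates the underlying storage from the document type itself,
        // so adding a new property later doesn't require hand written migrations
        var store = DocumentStore.For("host=localhost;database=app;username=dev;password=dev");

        using (var session = store.OpenSession())
        {
            session.Store(new AccountDocument { Id = Guid.NewGuid(), Number = "12345", Balance = 100m });
            session.SaveChanges();
        }
    }
}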

Team Organization

My strong preference and recommendation is to have a completely self-contained team that has the ability and authority to make any and all changes to their application database, and that’s most definitely been borne out in my experience. Having the database managed and owned separately from the development team is a frequent source of friction and definitely a major hit to your reversibility that forces you to do more potentially wrong, upfront design work. It’s much worse when that separate team does not share your priorities or simply works on a very different release schedule. I think it’s far better for a team to own their database — or at the very worst, have someone who is allowed to touch the database in the team room and the team’s standups.

If I had full control over an organization, I would not have a separate database team. Keeping developers and database folks on separate teams forces your team to spend more time on inter-team coordination, takes away from the team’s flexibility in deciding what they can deliver, and almost inevitably creates a bottleneck constraint for projects. Even worse in my mind is when neither the developers nor the database team really understands how their work impacts the other team.

Even if we say that we have a matrix organization, I want the project teams to have primacy over functional teams. To go farther, I’d opt to make functional teams (developers, testers, DBA’s) virtual teams that exist solely for the purpose of skill acquisition, knowledge sharing, and career growth. My early work experience was being an engineer within large petrochemical project teams, and that project team dominant matrix organization worked a helluva lot better than what I saw at my next job in enterprise IT, which focused more on functional teams.

As an architect now rather than a front line programmer, I constantly worry about not being able to feel the “pain” that my decisions and shared libraries cause developers because that pain is an important feedback mechanism to improve the usability of our shared infrastructure or application architecture. Likewise, I worry that having a separate database team creates a situation where they’re not very aware of the impact of their decisions on developers or vice versa. One of the very important lessons I was taught as an engineer was that it was very important to understand how other engineering disciplines work and what they needed so that we could work better with them.

Now though, I do work in a shop that has historically centralized the control of the database in a centralized database team. To mitigate the problems that naturally arise from this organizational model, we’re trying to have much more bilateral conversations with that team. If we can get away with this, I’d really like to see members of that team spend more time in the project team rooms. I’d also love it if we could steal a page from my original engineering job (Bechtel)  and suggest some temporary rotations between the database and developer teams to better appreciate how the other half of that relationship works and what their needs are.

 

 

 

Testing HTTP Handlers with No Web Server in Sight

FubuMVC 2.0 and 3.0 introduced some tooling I called “Scenarios” that allows users to write mostly declarative integration tests against the entire HTTP pipeline in memory without having to host the application in a web server. I promised a coworker that I would write a blog post about using Scenarios for an internal team that wants to start using it much more in their work. A week of procrastination later and here you go:

NOTE: All samples are using FubuMVC 3.0

Why Integration Tests?

From the very beginning, we tried very hard to make unit testing FubuMVC action methods in isolation as easy as possible. I think we largely succeeded in that goal. However, within the context of handling an HTTP request, FubuMVC, like most web frameworks, will potentially wrap those action methods with various middleware strategies for cross cutting technical concerns like authentication, authorization, logging, transaction management, and content negotiation. At some point, to truly exercise an HTTP endpoint you really do need to write an integration test that exercises the entire chain of HTTP handlers for an HTTP request exactly the way it will be configured inside the running application.

Toward that end, I built a class called EndpointDriver in early versions of FubuMVC that you could use to write integration tests against a FubuMVC application hosted with an embedded Katana server. This early tooling just wrapped WebClient with a FubuMVC specific fluent interface for resolving url’s, setting common options like the content-type and accepts headers, and verifying parts of the HTTP response. Below is a sample from our content negotiation support integration tests in FubuMVC 1.3 (“endpoints” is a reference to the EndpointDriver object for the running application):

[Test]
public void force_to_json_with_querystring()
{
    endpoints.Get("conneg/override/Foo?Format=Json", acceptType: "text/html")
        .ContentTypeShouldBe(MimeType.Json)
        .ReadAsJson<OverriddenResponse>()
        .Name.ShouldEqual("Foo");
}

EndpointDriver was fine at first, but our test library started getting slower as we added more and more tests and the fluent interface just never kept up with everything we needed for HTTP testing (plus I think that WebClient is awkward to use).

Using OWIN for HTTP “Scenarios”

As part of my FubuMVC 2.0 effort last year, I knew that I wanted a much better mechanism than the older EndpointDriver for doing integration testing of HTTP endpoints. Specifically, I wanted:

  • To be able to run HTTP requests and verify the response without having to take the performance hit of a web server
  • To run a FubuMVC application as it would be configured in production
  • To completely configure any part of an HTTP request
  • To be able to declaratively express multiple assertions against the expected response
  • To utilize FubuMVC’s support for “reverse URL resolution” for more traceable tests
  • Access to the raw HTTP request and response for anything unusual you would need to do that didn’t have a specific helper

The end result was a mechanism I called “Scenarios” that exploited FubuMVC’s OWIN support to run HTTP requests in memory using this signature off of the new FubuRuntime object I explained in an earlier blog post:

OwinHttpResponse Scenario(Action<Scenario> configuration)

The Scenario object models the HTTP request and provides a way to specify expectations about the HTTP response for commonly used things like HTTP status codes, header values, and the presence of string values in the HTTP response body. If need be, you also have access to FubuMVC’s abstractions for the entire HTTP request and response (more on this later).

To make this concrete, let’s say that you’re working through a “Hello, World” exercise with FubuMVC with this class and action method that just returns the text “Hello, World” when you issue a GET to the root “/” URL of an application:

public class HomeEndpoint
{
    public string Index()
    {
        return "Hello, World";
    }
}

A scenario test for the action above would look like this code below:

using (var runtime = FubuRuntime.Basic())
{
    // Execute the home route and verify
    // the response
    runtime.Scenario(_ =>
    {
        _.Get.Url("/");

        _.StatusCodeShouldBeOk();
        _.ContentShouldBe("Hello, World");
        _.ContentTypeShouldBe("text/plain");
    });
}

In the scenario above, I’m issuing a GET request to the “/” URL of the application and specifying that the resulting status code should be HTTP 200, that the “content-type” response header should be “text/plain”, and that the exact contents of the response body should be “Hello, World.” When a Scenario is executed, it runs every single assertion instead of quitting on the first failure, and it reports every failed expectation in the specification output. This behavior is valuable when you have to author specifications with slower running scenario setup.
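
To illustrate the “run every assertion” behavior in isolation, here is a minimal, generic sketch of the assertion aggregation idea. This is not FubuMVC’s actual implementation, just the general pattern of collecting every failed expectation before reporting them all at once:

using System;
using System.Collections.Generic;
using System.Linq;

public class ScenarioAssertions
{
    private readonly List<string> _failures = new List<string>();

    public void Check(bool condition, string message)
    {
        // Record the failure, but keep evaluating the remaining expectations
        if (!condition) _failures.Add(message);
    }

    public void AssertAll()
    {
        // Only fail at the very end, reporting every unmet expectation at once
        if (_failures.Any())
        {
            throw new Exception("Scenario failed:" + Environment.NewLine +
                                string.Join(Environment.NewLine, _failures));
        }
    }
}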

Specifying URLs

FubuMVC has a model for reverse URL lookup from any endpoint method or the input model that we exploited in Scenarios for more traceable tests:

host.Scenario(_ =>
{
    // Specify a GET request to the Url that runs an endpoint method:
    _.Get.Action<InMemoryEndpoint>(e => e.get_memory_hello());

    // Or specify a POST to the Url that would handle an input message:
    _.Post

        // This call serializes the input object to Json using the 
        // application's configured JSON serializer and sets
        // the result as the request body
        .Json(new HeaderInput {Key = "Foo", Value1 = "Bar"});

    // Or specify a GET by an input object to get the route parameters
    _.Get.Input(new InMemoryInput { Color = "Red" });
});

I like the reverse URL lookup instead of specifying URLs directly in the scenarios because:

  1. It makes your scenario tests traceable to the actual handling code
  2. It insulates your scenarios from changes to the URL structures later

Checking the Response Body

For the 3.0 work I did a couple months ago, I fleshed out the Scenario support with more mechanisms to analyze the HTTP response body:

host.Scenario(_ =>
{
    // set up a request here

    // Read the response body as text
    var bodyText = _.Response.Body.ReadAsText();

    // Read the response body by deserializing Json
    // into a .net type with the application's
    // configured Json serializer
    var output = _.Response.Body.ReadAsJson<MyResponse>();

    // If you absolutely have to work with Xml...
    var xml = _.Response.Body.ReadAsXml();
});

Some Other Things…

I’ll happily explain the details of this list on request, but here are some other attributes of Scenarios that FubuMVC supports right now:

  • You can specify expected values for HTTP response headers
  • You can assert on status codes and descriptions
  • There are helpers to send Json or Xml serialized data based on an input object message
  • There is a mechanism that allows you to disable all security middleware in the application for a single Scenario, which has frequently been helpful in testing
  • You have access to the underlying IoC container for the running application from the Scenario if you need to resolve and use application services
  • FubuMVC is now StructureMap 4.0-only for its IoC usage, so we’re able to rely on StructureMap’s child container feature to resolve services during a Scenario execution from a unique child container per run. This allows you to replace services in your application with fakes, mocks, and stubs in a way that prevents your fake services from impacting more than one test.

Scenarios in Jasper

If you didn’t see my blog post earlier this year, FubuMVC is getting a complete reboot into a new project called Jasper late this year/early next year. I absolutely plan on bringing the Scenario support forward into Jasper very early, but this time around we’re completely dropping all of FubuMVC’s HTTP abstractions in favor of directly using the OWIN environment dictionary as the single model of HTTP requests and responses. My thought right now is that we’ll invest heavily in extension methods hanging off of IDictionary<string, object> for commonly used operations against that OWIN dictionary.
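
As a rough illustration of that idea, here is what a couple of extension methods over the raw OWIN environment dictionary might look like. The dictionary keys below are the standard ones from the OWIN 1.0 specification; the method names themselves are hypothetical and not anything Jasper has committed to:

using System.Collections.Generic;
using System.IO;
using System.Text;

public static class OwinEnvironmentExtensions
{
    // "owin.RequestPath" and "owin.ResponseStatusCode" are standard OWIN 1.0 keys
    public static string RequestPath(this IDictionary<string, object> env)
    {
        return (string)env["owin.RequestPath"];
    }

    public static void StatusCode(this IDictionary<string, object> env, int statusCode)
    {
        env["owin.ResponseStatusCode"] = statusCode;
    }

    public static void Write(this IDictionary<string, object> env, string content)
    {
        // "owin.ResponseBody" is the response body Stream in the OWIN environment
        var stream = (Stream)env["owin.ResponseBody"];
        var bytes = Encoding.UTF8.GetBytes(content);
        stream.Write(bytes, 0, bytes.Length);
    }
}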

To some extent, we’re hoping as well that there will be a good ecosystem of OWIN helpers from other people and projects that will be usable from within Jasper.

Other Reading

Storyteller 3: Executable Specifications and Living Documentation for .Net

tl;dr: The open source Storyteller 3 is an all-new version of an old tool that my shop (and others) use for customer-facing acceptance tests, large-scale test automation, and “living documentation” generation for code-centric systems.

A week from today I’m giving a talk at .Net Unboxed on Storyteller 3, an open source tool largely built by my colleagues and me for creating, running, and managing Executable Specifications against .Net projects. It is based on what we feel are the best practices for automated testing, drawn from over a decade of working with automated integration testing. As I’ll try to argue in my talk and subsequent blog posts, I think it’s the most complete approach in the .Net ecosystem for reliably and economically writing large scale automated integration tests.

My company and a couple of other early adopters have been using Storyteller for daily work since June, and the feedback has been pleasantly positive so far. Now is as good a time as any to make a public beta release for the express purpose of getting more feedback so we can continue to improve the tool prior to an official 3.0 release in January.

If you’re interested in kicking the tires on Storyteller, the latest beta as of now is 3.0.0.279-alpha available on Nuget.org. For help getting started, see our tutorial and getting started pages.

Some highlights:

I gave a talk at work in March previewing Storyteller 3 that at least discusses the goals and philosophy behind the tool and Storyteller’s approach to acceptance tests and integration tests. The tool has improved a lot since then, but the talk is still a decent overview.

A Brief History

I had a great time at Codemash this year catching up with old friends. While I was there, I was pleasantly surprised to be asked several times about the state of Storyteller, an OSS project originally built in 2008 as a replacement for FitNesse as our primary means of expressing and executing automated, customer-facing acceptance tests. Frankly, I always thought that Storyteller 1 and the incrementally better Storyteller 2 were failures in terms of usability, and I was so burnt out on working with the tool that I had largely given up on it and ignored it for years.

Unfortunately, my shop has a large investment in Storyteller tests, and our largest and most active project was suffering with heinously slow and unreliable Storyteller regression test suites that probably caused more harm than good with their support costs. After a big town hall meeting to decide whether to scrap and replace Storyteller with something else, we instead decided to try to improve Storyteller to avoid having to rewrite all of our tests. The result has been an effective rewrite of Storyteller with an all-new client. While we tried very hard to mostly preserve backward compatibility with the previous version’s public APIs, the .Net engine is also a near rewrite in order to squeeze out as much performance and responsiveness as we could.

Roadmap

The official 3.0 release is going to happen in early January to give us a chance to possibly get more early user feedback and maybe to get some more improvements in place. You can see the currently open issue list on GitHub. The biggest things outstanding on our roadmap are:

  • Modernize the client technology to React.js v0.14 and introduce Redux and possibly RxJS as a precursor to doing any big improvements to the user interface and to improving the performance of the user interface with big specification suites
  • A “step through” mode in the interactive specification running so users can step through a specification like you would in a debugger
  • The big one, allow users to author the actual specification language in the user interface editor with some mechanics to attach that language to actual test support code later

Succeeding with Automated Integration Tests

tl;dr: This post is an attempt to codify my thoughts about how to succeed with end to end integration testing. A toned-down version of this post is part of the Storyteller 3 documentation.

About six months ago the development teams at my shop came together in kind of a town hall to talk about the current state of our automated integration testing approach. We have a pretty deep investment in test automation and I think we can claim some significant success, but we also have had some problems with test instability, brittleness, performance, and the time it takes to author new tests or debug existing tests that have failed.

Some of the problems have since been ameliorated by tightening up on our practices — but that still left quite a bit of technical friction and that’s where this post comes in. Since that meeting, I’ve been essentially rewriting our old Storyteller testing tool in an attempt to address many of the technical issues in our automated testing. As part of the rollout of the new Storyteller 3 to our ecosystem, I thought it was worth a post on how I think teams can be more successful at automated end to end testing.

Test Stability

I’ve worked in far too many environments and codebases where the automated tests were “flaky” or unreliable:

  • Teams that do all of their development against a single, shared development database, such that the data setup is hard to control
  • Web applications with a lot of asynchronous behavior are notoriously hard to test, and the tests can be flaky with timing issues — even with all the “wait for this condition on the page to be true” discipline in the world.
  • Distributed architectures can be difficult to test because you may need to control, coordinate, or observe multiple processes at one time.
  • Deployment issues or technologies that tend to hang on to file locks, tie up ports, or generally lock up resources that your automated tests need to use

To be effective, automated tests have to be reliable and repeatable. Otherwise, you’re either going to spend all your time trying to discern if a test failure is “real” or not, or you’re most likely going to completely ignore your automated tests altogether as you lose faith in them.

I think you have several strategies to try to make your automated, end to end tests more reliable:

  1. Favor white box testing over black box testing (more on this below)
  2. Closely related to #1, replace hard-to-control infrastructure dependencies with stub services, even in functional testing (see the sketch after this list). I know some folks absolutely hate this idea, but my shop is having a lot of success using an IoC tool to swap out dependencies on external databases or web services that are completely out of our control in functional testing.
  3. Isolate infrastructure to the test harness. For example, if your system accesses a relational database, use an isolated schema for the testing that is only used by the test harness. Shared databases can be one of the worst impediments to successful test automation. It’s both important to be able to set up known state in your tests and to not get “false” failures because some other process happened to alter the state of your system while the test is running. Did I mention that I think shared databases are a bad idea yet?*
  4. Completely control system state setup in your tests or whatever build automation you have to deploy the system in testing.
  5. Collapse a distributed application down to a single process for automated functional testing rather than try to run the test harness in a different process than the application. In our functional tests, we will run the test harness, an embedded web server, and even an embedded database in the same process. For distributed applications, we have been using additional .Net AppDomains to load related services and using some infrastructure in our OSS projects to coordinate the setup, teardown, and even activity in these services during testing time.
  6. As a last resort for a test that is vulnerable to timing issues and race conditions, allow the test runner to retry the test
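
To make strategy #2 a little more concrete, here is a small, hypothetical sketch of registering a stub in place of an external dependency with StructureMap, the kind of IoC swap described above. IPartnerGateway and both implementations are invented names for illustration only:

using StructureMap;

public interface IPartnerGateway
{
    string LookupAccountStatus(string accountNumber);
}

// The real implementation would call the partner's web service;
// the stub is deterministic so functional tests are repeatable
public class PartnerGatewayStub : IPartnerGateway
{
    public string LookupAccountStatus(string accountNumber)
    {
        return "Active";
    }
}

public static class TestContainerFactory
{
    public static IContainer Build()
    {
        return new Container(x =>
        {
            // In the functional test harness, the stub replaces the real
            // gateway registration used in production bootstrapping
            x.For<IPartnerGateway>().Use<PartnerGatewayStub>();
        });
    }
}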

Failing all of those things, I definitely think that if a test is so unstable and unreliable that it renders your automated build useless, you should just delete that test. I think a reliable test suite with less coverage is more useful to a team than a more expansive test suite that is not reliable.

You Gotta Have Continuous Integration

This section isn’t meant to be the kind of pound-on-the-table, Uncle Bob-style “you must do this or you’re incompetent” rant that causes the Rob Conerys of the world to have conniptions, but large scale automated testing simply does not work if the automated tests are not running regularly as the system continues to evolve.

Automated tests that are never or seldom executed can even be a burden on a development team that still tries to keep that test code up to date with architectural changes. Even worse, automated tests that are not constantly executed are not trustworthy, because you no longer know if test failures are real or just a result of the application structure changing.

Assuming that your automated tests are legitimately detecting regression problems, you need to determine what recent change introduced the problem — and it’s far easier to do that if you have a smaller list of possible changes and those changes are still fresh in the developer’s mind. If you are only occasionally running those automated tests, diagnosing failing tests can be a lot like finding the proverbial needle in the haystack.

I strongly prefer to have all of the automated tests running as part of a team’s continuous integration (CI) strategy — even the heavier, slower end to end kind of tests. If the test suite gets too slow (we have a suite that’s currently taking 40+ minutes), I like the “fast tests, slow tests” strategy of keeping one main build that executes the quicker tests (usually just unit tests) to give the team reasonable confidence that things are okay. The slower tests would be executed in a cascading build triggered whenever the main build completes successfully. Ideally, you’d like to have all the automated tests running against every push to source control, but even running the slower tests suites in a nightly or weekly scheduled build is better than nothing.

Make the Tests Easy to Run Locally

I think the section title is self-explanatory, but I’ve gotten this very wrong in the past in my own work. Ideally, you would have a task in your build script (I still prefer Rake, but substitute MSBuild, Fake, Make, Gulp, NAnt, whatever you like) that completely sets up the system under test on your machine and runs whatever test harness you use. In a less perfect world, a developer has to jump through hoops to find hidden dependencies and take several poorly described steps in order to run the automated tests. I think this issue is much less problematic than it was earlier in my career, as we’ve adopted much more project build automation and moved to technologies that are easier to automate in deployment. I haven’t gotten to use container technologies like Docker myself yet, but I sure hope that those tools will make the environment setup for automated tests easier in the future.

Whitebox vs. Blackbox Testing

I strongly believe that teams should generally invest much more time and effort into whitebox tests than blackbox tests. Throughout my career, I have found that whitebox tests are frequently more effective in finding problems in your system – especially for functional testing – because they tend to be much more focused in scope and are usually much faster to execute than the corresponding black box test. White box tests can also be much easier to write because there’s simply far less technical stuff (databases, external web services, service buses, you name it) to configure or set up.

I do believe that there is value in having some blackbox tests, but I think that these blackbox tests should be focused on finding problems in technical integrations and infrastructure whereas the whitebox tests should be used to verify the desired functionality.

Especially at the beginning of my career, I frequently worked with software testers and developers who just did not believe that any test was truly useful unless the testing deployment was exactly the same as production. I think that attitude is inefficient. My philosophy is that you write automated tests to find and remove problems from your system, but not to prove that the system is perfect. Adopting that philosophy, favoring white box over black box testing makes much more sense.

Choose the Quickest, Useful Feedback Mechanism

Automating tests against a user interface has to be one of the most difficult and complex undertakings in all of software development. While teams have been successful with test automation using tools like WebDriver, I very strongly recommend that you do not test business logic and rules through your UI if you don’t have to. For that matter, try hard to test business logic without using the database. What does this mean? For example (a small sketch follows this list):

  • Test complex logic by calling into a service layer instead of the UI. That’s a big issue for one of the teams I work with who really needs to replace a subsystem behind http json services without necessarily changing the user interface that consumes those services. Today the only integration testing involving that subsystem is done completely end to end against the full stack. We have plenty of unit test coverage on the internals of that subsystem, but I’m pretty certain that those unit tests are too coupled to the implementation to be useful as regression or characterization tests when that team tries to improve or replace that subsystem. I’m strongly recommending that that team write a new suite of tests against the gateway facade service to that subsystem for faster feedback than the end to end tests could ever possibly be.
  • Use Subcutaneous Tests even to test some UI behavior if your application architecture supports that
  • Make HTTP calls directly against the endpoints in a web application instead of trying to automate the browser if that can be useful to test out the backend.
  • Consider testing user interface behavior with tightly controlled stub services instead of the real backend

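Here is a hypothetical sketch of the first bullet: exercising routing-style business logic directly against a service layer with a stubbed dependency, rather than through the UI. Every type name here is invented for the example:

using NUnit.Framework;

// All of these types are hypothetical stand-ins for a real service layer
public class Transaction
{
    public string AccountNumber { get; set; }
    public decimal Amount { get; set; }
}

public interface IAccountLookup
{
    bool IsActive(string accountNumber);
}

public class StubAccountLookup : IAccountLookup
{
    public bool IsActive(string accountNumber) { return true; }
}

public class TransactionRouter
{
    private readonly IAccountLookup _accounts;
    public TransactionRouter(IAccountLookup accounts) { _accounts = accounts; }

    public bool CanProcess(Transaction transaction)
    {
        return _accounts.IsActive(transaction.AccountNumber) && transaction.Amount > 0;
    }
}

[TestFixture]
public class TransactionRoutingTests
{
    [Test]
    public void a_positive_amount_on_an_active_account_can_be_processed()
    {
        // Exercise the service layer directly: no browser, no HTTP, no database
        var router = new TransactionRouter(new StubAccountLookup());

        var canProcess = router.CanProcess(new Transaction { AccountNumber = "12345", Amount = 100m });

        Assert.IsTrue(canProcess);
    }
}
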
The general rule we encourage in test automation is to use the “quickest feedback cycle that tells you something useful about your code” — and user interface testing can easily be much slower and more brittle than other types of automated testing. Remember too that we’re trying to find problems in our system with our tests instead of trying to prove that the system is perfect.

Setting up State in Automated Tests

I wrote a lot about this topic a couple of years ago in My Opinions on Data Setup for Functional Tests, and I don’t have anything new to say since then. ;) To sum it up:

  • Use self-contained tests that set up all the state that a test needs.
  • Be very cautious using shared test data
  • Use the application services to set up state rather than some kind of “shadow data access” layer (a small sketch of this follows the list)
  • Don’t couple test data setup to implementation details. I.e., I’d really rather not see gobs of SQL statements in my automated test code
  • Try to make the test data setup declarative and as terse as possible
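
As a small, hypothetical sketch of setting up state through application services rather than a “shadow data access” layer, consider something like this (IAccountService, CreateAccountCommand, and the builder are invented names):

// Hypothetical application service used for test data setup
public class CreateAccountCommand
{
    public string AccountNumber { get; set; }
    public string Owner { get; set; }
}

public interface IAccountService
{
    void Create(CreateAccountCommand command);
}

public class AccountTestDataBuilder
{
    private readonly IAccountService _service;
    public AccountTestDataBuilder(IAccountService service) { _service = service; }

    // Terse, declarative setup that goes through the same validation and
    // persistence code the real application uses, instead of "shadow" SQL like:
    //   INSERT INTO Accounts (Number, Owner) VALUES ('12345', 'Hank')
    public void HasAccount(string accountNumber, string owner = "Hank")
    {
        _service.Create(new CreateAccountCommand
        {
            AccountNumber = accountNumber,
            Owner = owner
        });
    }
}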

Test Automation has to be a factor in Architecture

I once had an interview with a company that makes development tools. I knew going in that their product had some serious deficiencies in its support for automated testing. When I told my interviewer that I was confident I could help the company make that automated testing support much better, I was told that testing was just a “process issue.” Last I knew, the product is still weak in its support for automating tests against systems that use it.

Automated testing is not merely a “process issue,” but should be a first class citizen in selecting technologies and shaping your system architecture. I feel like my shop is far above average for our test automation, and that is in no small part because we have purposely architected our applications in such a way as to make functional, automated testing easier. The work I described in the sections above to collapse a distributed system into one process for easier testing, using a compositional architecture effectively composed by an IoC tool, and isolating business rules from the database in our systems has been vital to what success we have had with automated testing. In other places we have purposely added logging infrastructure or hooks in our application code for no other reason than to make it easier for test automation infrastructure to observe or control the application.

Other Stuff for later…

I don’t think that in 10 years of blogging I’ve ever finished a blog series, but I might get around to blogging about how we coordinate multiple services in distributed messaging architectures during automated tests, or how we’re integrating more diagnostics into our automated functional tests to spot and prevent performance problems from creeping into the application.

* There are some strategies to use in testing if you absolutely have no other choice but to use a shared database, but I’m not a fan of them. The one approach that I want to pursue in the future is utilizing multi-tenancy data access designs to create a fake tenant on each test run to keep the data isolated for the test even if the damn database is shared. I’d still rather smack the DBA types around until they get their project automation act together so we can all get isolated databases.

Integration Testing with FubuMVC and OWIN

tl;dr: Having an OWIN host to run FubuMVC applications in process made it much easier to write integration tests and I think OWIN will end up being a very positive thing for the .Net development community.

FubuMVC is still dead-ish as an active OSS project, but the things we did might still be useful to other folks so I’m continuing my series of FubuMVC lessons learned retrospectives — and besides, I’ve needed to walk a couple other developers through writing integration tests against FubuMVC applications in the past week so writing this post is clearly worth my time right now.

One of the primary design goals of FubuMVC was testability of application code, and while I think we succeeded in terms of simpler, cleaner unit tests from the very beginning (especially compared to other web frameworks in .Net), there are so many things that can only be usefully tested and verified by integration tests that execute the entire HTTP stack.

Quite admittedly, FubuMVC was initially very weak in this area — partially because the framework itself only ran on top of ASP.Net hosting. While we had good unit test coverage from day one, our integration testing story had to evolve as we went in roughly these steps:

  1. Haphazardly build sample usages of features in an ASP.Net application hosted in either IIS or IISExpress that had to be exercised manually. Obviously not a great solution for regression testing or for setting up test scenarios either.
  2. A not completely successful attempt to run FubuMVC on the early “Delegate of Doom” version of the OWIN specification and the old Kayak HTTP server. It worked just well enough to get my hopes up but didn’t really fly (there were some scenarios where it would just hang). If you follow me on Twitter and have seen me blast OWIN as a “mystery meat” API, it’s largely because of lingering negativity from our first attempt.
  3. Running FubuMVC on top of Web API’s Self Host libraries. This was a nice win as it enabled us to embed FubuMVC applications in process for the first time, and our integration test coverage improved quickly. The Self Host option was noticeably slow and never worked on Mono* for us.
  4. Just in time for the 1.0 release, we finally made a successful attempt at OWIN hosting with the less insane OWIN 1.0 specification, and we’ve run most of our integration tests and acceptance tests on Katana ever since. Moving to Katana and OWIN was much faster than the Web API Self Host infrastructure and, at least in early versions, worked on Mono (the Katana team has periodically broken Mono support in later versions).
  5. For the forthcoming 2.0 release I built a new “InMemoryHost” model that can execute a single HTTP request at a time using our OWIN support without needing any kind of HTTP server.

I cannot overstate how valuable having an embedded hosting model has been for automated testing. Debugging test failures, using the application services exactly as the real application is configured, and setting up test data in databases or injecting fake services are all much easier with the embedded hosting models.

Here’s a real problem that I hit early this year. FubuMVC’s content negotiation (conneg) logic for selecting readers and writers was originally based only on the HTTP accepts and content-type headers, which is great and all except for how many ill-behaved clients are out there that don’t play nice with the HTTP specification. I finally went into our conneg support earlier this year and added support for overriding the accepts header, with an in-the-box default that looked for a query string on the request like “?format=json” or “format=xml.” While there are some unit tests for the internals of this feature, this is exactly the kind of feature that really has to be tested through an entire HTTP request and response to verify correctness.

If you’re having any issues with the formatting of the code samples, you can find the real code on GitHub at the bottom of this page.


I started by building a simple GET action that returned a simple payload:

    public class OverriddenConnegEndpoint
    {
        public OverriddenResponse get_conneg_override_Name(OverriddenResponse response)
        {
            return response;
        }
    }

    public class OverriddenResponse
    {
        public string Name { get; set; }
    }

Out of the box, the new “conneg/override/{Name}” route up above would respond with either a JSON or XML serialization of the output model based on the value of the accepts header, with “application/json” being the default representation in the case of wild cards for the accepts header. With the new functionality, content negotiation also needs to look out for the new query string rules.

The next step is to define our FubuMVC application with a simple IApplicationSource class in the same assembly:

    public class SampleApplication : IApplicationSource
    {
        public FubuApplication BuildApplication()
        {
            return FubuApplication.DefaultPolicies().StructureMap();
        }
    }

The role of the IApplicationSource class in a FubuMVC application is to bootstrap the IoC container for the application and define any custom policies or configuration of a FubuMVC application. By using an IApplicationSource class, you’re establishing a reusable configuration that can be quickly applied to automated tests to make your tests as close to the production deployment of your application as possible for more realistic testing. This is crucial for FubuMVC subsystems like validation or authorization that mix in behavior to routes and endpoints off of conventions determined by type scanning or configured policies.


Using Katana and EndpointDriver with FubuMVC 1.0+

First up, let’s write a simple NUnit test for overriding the conneg behavior with the new “?format=json” trick, but with Katana and the FubuMVC 1.0 era EndpointDriver object:

        [Test]
        public void with_Katana_and_EndpointDriver()
        {
            using (var server = EmbeddedFubuMvcServer
                .For<SampleApplication>())
            {
                server.Endpoints.Get("conneg/override/Foo?format=json", acceptType: "text/html")
                    .ContentTypeShouldBe(MimeType.Json)
                    .ReadAsJson<OverriddenResponse>()
                    .Name.ShouldEqual("Foo");
            }
        }

Behind the scenes, the EmbeddedFubuMvcServer class spins up the application and a new instance of a Katana server to host it. The “Endpoints” object exposes a fluent interface to define and execute an HTTP request that will be executed with .Net’s built in WebClient object. The EndpointDriver fluent interface was originally built specifically to test the conneg support in the FubuMVC codebase itself, but is usable in a more general way for testing your own application code written on top of FubuMVC.


Using the InMemoryHost and Scenarios in FubuMVC 2.0

EndpointDriver was somewhat limited in its coverage of common HTTP usage, so I hoped to completely replace it in FubuMVC 2.0 with a new “scenario” model somewhat based on the Play Framework’s Play Specification tooling in Scala. I knew that I also wanted a purely in memory hosting model for integration tests to avoid the extra time it takes to spin up an instance of Katana and sidestep potential port contention issues.

The result is the same test as above, but written in the new “Scenario” style concept:

        [Test]
        public void with_in_memory_host()
        {
            // The 'Scenario' testing API was not completed,
            // so I never got around to creating more convenience
            // methods for common things like deserializing JSON
            // into .Net objects from the response body
            using (var host = InMemoryHost.For<SampleApplication>())
            {
                host.Scenario(_ => {
                    _.Get.Url("conneg/override/Foo?format=json");
                    _.Request.Accepts("text/html");

                    _.ContentTypeShouldBe(MimeType.Json);
                    _.Response.Body.ReadAsText()
                        .ShouldEqual("{\"Name\":\"Foo\"}");
                });
            }
        }


The newer Scenario concept was an attempt to make HTTP-centric testing more declarative. C# is not as expressive as Scala, but I was still trying to make the test expression as clean and readable as possible, and the syntax above probably would have evolved after more usage. The newer Scenario concept also has complete access to FubuMVC 2.0’s raw HTTP abstractions so that you’re not limited at all in what kinds of things you can express in the integration tests.

If you want to see more examples of both EmbeddedFubuMvcServer/EndpointDriver and InMemoryHost/Scenario in action, please see the FubuMVC.IntegrationTesting project on GitHub.


Last Thoughts

If you choose to use either of these tools in your own FubuMVC application testing, I’d highly recommend doing something at the testing assembly level to cache the InMemoryHost or EmbeddedFubuMvcServer so that they can be used across test fixtures to avoid the nontrivial cost of repeatedly initializing your application.
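
As a rough sketch of that assembly-level caching with NUnit 2.x-style attributes, something like a [SetUpFixture] can create one InMemoryHost for the whole test assembly and dispose of it at the end (FubuMVC using statements omitted; the TestHost wrapper is an invented name):

using NUnit.Framework;

// Caches one InMemoryHost for the whole test assembly instead of
// re-initializing the FubuMVC application for every test fixture
public static class TestHost
{
    public static InMemoryHost Host { get; private set; }

    public static void Start()
    {
        Host = InMemoryHost.For<SampleApplication>();
    }

    public static void Stop()
    {
        if (Host != null) Host.Dispose();
    }
}

[SetUpFixture]
public class AssemblyLevelSetup
{
    [SetUp]
    public void StartHost()
    {
        TestHost.Start();
    }

    [TearDown]
    public void StopHost()
    {
        TestHost.Stop();
    }
}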

While I’m focusing on HTTP centric testing here, using either tool above also has the advantage of building your application’s IoC container out exactly the way it should be in production for more accurate integration testing of underlying application services.


If I had to do it all over again…

We would have had an embedded hosting model from the very beginning, even if it had been on a fake, “only one HTTP request at a time” model. Moreover, if we were to ever reboot a new FubuMVC-like web framework in the future on the new KVM runtime, I would vote for wrapping any new framework completely around the OWIN signature as our one and only model of an HTTP request. In any theoretical future effort, I’d invest time from the very beginning in something like FubuMVC 2.0’s InMemoryHost model to make integration and acceptance testing easier and faster.

With the recent release of StructureMap 3, I’d also opt for a new child container model such that you can fake out application services on a test-by-test basis and roll back to the previous container state without having to re-initialize the entire application each time, for faster testing.
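
As a rough sketch of that child container idea with StructureMap 3+, a test can create a child container, override a service with a fake, and dispose of the child when it is done, leaving the application's root container untouched. IClock and FakeClock are invented names for illustration:

using System;
using StructureMap;

public interface IClock
{
    DateTime Now { get; }
}

public class FakeClock : IClock
{
    public DateTime Now { get { return new DateTime(2015, 1, 1); } }
}

public class ChildContainerExample
{
    public static void Run(IContainer applicationContainer)
    {
        // The child container inherits the application's registrations...
        using (var child = applicationContainer.CreateChildContainer())
        {
            // ...but this override only exists inside the child, so other
            // tests resolving from the root container are unaffected
            child.Configure(x => x.For<IClock>().Use(new FakeClock()));

            var clock = child.GetInstance<IClock>();
            Console.WriteLine(clock.Now);
        }
        // Disposing the child effectively "rolls back" to the original state
    }
}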


* Mono support was a massive time sink for the FubuMVC project and never really paid off. Maybe having Xamarin involved much earlier in the new KVM .Net runtime and the emphasis on PCL support will make that situation much better in the future.

Adventures in Custom Testing Infrastructure

tl;dr: Sometimes the overhead of writing custom testing infrastructure can lead to easier development


Quick Feedback Cycles are Key

It’d be nice if someday I could write all my code perfectly in both structure and function the first time through, but for now I have to rely on feedback mechanisms to tell me when the code isn’t working correctly. That being the case, I feel the most productive when I have the tightest feedback cycle between making a change in code and knowing how it’s actually working — and by “quick” I mean both the time it takes for me to set up the feedback cycle and how long the feedback cycle itself takes.

While I definitely like using quick twitch feedback tools like REPLs or auto-reloading/refreshing web tools like our own fubu run or Mimosa.js’s “watch” command, my primary feedback mechanism for code-centric tasks is usually automated tests. It helps when the tests are mechanically easy to write and run quickly enough that you can get into a nice “red/green/refactor” cycle. For whatever reasons, I’ve hit several problem domains in the last couple of years where it was laborious to set up the preconditions and testing inputs and also to measure and assert on the expected outcomes.


Maybe Invest in Some Custom Testing Infrastructure?

In some cases I knew right away that testing a feature was going to be a problem, so I started by asking myself, “How do I wish I could express the test setup and assertions?” If it seems feasible, I’ll write a custom ObjectMother if that’s possible, or Test Data Builders for the data setup in more complex cases. I’ve occasionally resorted to building little interpreters that read text and create data structures or files (I do this more often for hierarchical data than anything else, I think) or perform assertions on the final state.
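
For reference, here is a tiny, hypothetical example of the Test Data Builder idea mentioned above; the Invoice type and its defaults are invented purely for illustration:

using System;

public class Invoice
{
    public string Customer { get; set; }
    public decimal Amount { get; set; }
    public DateTime DueDate { get; set; }
}

// A Test Data Builder starts from sensible defaults and lets a test
// state only the values that actually matter to that test
public class InvoiceBuilder
{
    private string _customer = "Acme";
    private decimal _amount = 100m;
    private DateTime _dueDate = new DateTime(2015, 1, 1);

    public InvoiceBuilder ForCustomer(string customer)
    {
        _customer = customer;
        return this;
    }

    public InvoiceBuilder PastDue()
    {
        _dueDate = DateTime.Today.AddDays(-30);
        return this;
    }

    public Invoice Build()
    {
        return new Invoice { Customer = _customer, Amount = _amount, DueDate = _dueDate };
    }
}

// Usage in a test: var invoice = new InvoiceBuilder().ForCustomer("Hank").PastDue().Build();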

You can see an example of this in my old Storyteller2 codebase. Storyteller is a tool for automated acceptance tests and includes a tree view pane in the UI with the inevitable hierarchy of tests organized by suites in an n-deep hierarchy like:

Top Level Suite
  - Suite 1
    - Suite 2
    - Suite 3
      - Test 1
      - Test 2

In the course of building the Storyteller client, I needed to write a series of tests on the tree view state that had to start with a known hierarchy of suites and test files as inputs. After performing actions like filtering or receiving state updates within the UI, I needed to assert on the expected display in this test explorer pane (which tests and suites were visible and were they marked as running, failed, successful, or unknown).

First, to deal with the setup of the hierarchical data I created a little custom class that read flat text data and turned that into the desired hierarchy:

            hierarchy =
                StoryTeller.Testing.DataMother.BuildHierarchy(
                    @"
t1,Success
t2,Failure
t3,Success
s1/t4,Success
s1/t5,Success
s1/t6,Failure
s1/s2/t7,Success
s1/s2/t7,Success
");

Then in the “assertion” part of the test I created a custom specification class that could again read its expectations expressed as flat text and assert that the resulting tree view exactly matched the specified state:

        [Test]
        public void the_child_nodes_are_constructed_with_the_empty_suite()
        {
            var spec =
                new TreeNodeSpecification(
                    @"
suite:Empty
suite:s1
test:s1/t4
test:s1/t5
test:s1/t6
test:t1
test:t2
test:t3
");

            spec.AssertMatch(view.TestNode);
        }

As I recall, writing the simple text parsing classes just to streamline the expression of the automated tests made it pretty easy to add new behavior quickly. In this case, the upfront time investment in the custom testing infrastructure paid off.


FubuMVC’s View Engine Support

A couple of months ago I finally got to carve off some time to overhaul the view engine support code in FubuMVC. My main goals were to cut the unnecessarily complex internal code down to something more manageable as a precursor to optimizing both runtime performance and FubuMVC’s time to initialize an application. Since I was about to start monkeying around quite a bit with the internals of code that many of our users depend on, it’s a good thing that we had an existing suite of integration tests that acted as acceptance tests (think layouts, partials, HTML helpers, and our conventional attachment of views to routes) so that, in theory, I could safely make the restructuring changes without breaking existing behavior.

Going in though, I knew that there were some significant drawbacks to using our existing mechanism for testing the view engine support, and I wasn’t looking forward to the inevitable test failures or formulating new integration tests.


Problems with the Existing Test Suite

In order to write end to end tests against the view engine support, we had effectively been writing little mini FubuMVC applications inside our integration test libraries. Quite naturally, that often meant adding several view files and folders to simulate all the different permutations for layout rendering, using partials, sharing views from external Bottles (a superset of Areas for you ASP.Net MVC folks), and view profiles (mobile vs. desktop, for example). In the test fixtures we would spin up a FubuMVC application with Katana, run HTTP requests, and make assertions against the content that should or should not be present in the HTTP response body.

It wasn’t terrible, but it came with some serious drawbacks:

  1. It wasn’t complete and I’d need to add additional tests
  2. It was expensive in mechanical effort to create those little mini FubuMVC applications that had to be spread over so many different files and even folders
  3. Understanding the tests when something went wrong could be difficult because the expression of the test was effectively split over so many files


The New Approach

Before going too far into the code changes against the view engine support, I built a new test harness that would allow me to express in one testing class file:

  1. What all the views and layouts were in the entire system including the content of the views
  2. What the views were in external Bottles loaded into the application
  3. If necessary, configure a complete FubuMVC application if the defaults weren’t sufficient for the test
  4. Declare what content should and should not be rendered when certain routes were executed

The end result was a base class I called ViewIntegrationContext. Mechanically, I made TestFixture classes deriving from this abstract class. In the constructor function of the test fixture classes I would specify the location, content, and view model of any number of Spark or Razor views. When the test fixture class was first executed, it would:

  1. Create a brand new folder using a guid as the name to host the new “application” to avoid collisions with existing test runs (while the new test harness does try to clean up after itself, I’ve learned not to be very trusting of the file system during automated tests)
  2. Write out the Spark and Razor files based on the data specified in the constructor function to the new application folder
  3. Optionally load content Bottles and FubuMVC configurations inside the test harness (ignore that for now if you would, but it was a huge win for me)
  4. Load a new FubuMVC application in memory with the root directory pointing to our new folder for just this test

For each test, the ViewIntegrationContext object uses FubuMVC 2.0’s brand new in memory test harness (somewhat inspired by PlaySpecification from Scala) to execute a “Scenario” where I could declaratively specify what url to render and assert what content should or should not be present in the HTML output.

To make this concrete, the very simplest test to check that FubuMVC really can render a Spark view looks like this:

    [TestFixture]
    public class Simple_rendering : ViewIntegrationContext
    {
        public Simple_rendering()
        {
            SparkView<BreatheViewModel>("Breathe")
                .Write(@"
<p>This is real output</p>
<h2>${Model.Text}</h2>");
        }

        [Test]
        public void can_render()
        {
            Scenario.Get.Input(new AirInputModel{TakeABreath = true});
            Scenario.ContentShouldContain("<h2>Breathe in!</h2>");
        }
    }

    public class AirEndpoint
    {
        public AirViewModel TakeABreath(AirRequest request)
        {
            return new AirViewModel { Text = "Take a {0} breath?".ToFormat(request.Type) };
        }

        public BreatheViewModel get_breathe_TakeABreath(AirInputModel model)
        {
            var result = model.TakeABreath
                ? new BreatheViewModel { Text = "Breathe in!" }
                : new BreatheViewModel { Text = "Exhale!" };

            return result;
        }
    }

    public class AirRequest
    {
        public AirRequest()
        {
            Type = "deep";
        }

        public string Type { get; set; }
    }

    public class AirInputModel
    {
        public bool TakeABreath { get; set; }
    }

    public class AirViewModel
    {
        public string Text { get; set; }
    }

    public class BreatheViewModel : AirViewModel
    {

    }


So did this pay off? Heck yeah it did, especially for scenarios where I needed to build out multiple views and layouts. The biggest win for me was that the tests were completely self-contained instead of being spread out over so many files and folders. Even better, the new in memory Scenario support in FubuMVC made the actual tests very declarative with decently descriptive failure messages.


It’s Not All Rainbows and Unicorns

I cherry-picked some examples that I felt went well, but there have been other times when I’ve gone down a rabbit hole of building custom testing infrastructure only to see it become a giant boondoggle. There’s a definite bit of overhead to writing this kind of tooling, and you always have to consider whether you’ll save time on the whole compared to writing more crude or repetitive testing code. While I tend to be aggressive about building custom test harnesses, you might accurately call it a speculative exercise and hold off until you feel some pain in your testing.

Moreover, any kind of custom test harness where you decouple the expression of the test (inputs, actions, and assertions) from the actual code that’s being exercised obfuscates your traceability back to the actual code. I’ve seen plenty of cases where the “goodness” of making the expression of the test prettier and more declarative was more than offset by how hard it was to debug test failures because of the extra mental overhead of connecting the meaning of the test to the code that should be implementing it. It’s for that reason that I’ve never been a big fan of most Behavior Driven Development tools for testing that isn’t customer facing.


A Simple Example of a Table Driven Executable Specification

My shop is starting to go down the path of executable specifications (using Storyteller2 as the tooling, but that’s not what this post is about). As an engineering practice, executable specifications* involve specifying the expected behavior of a user story with concrete examples of exactly how the system should behave before coding. Those examples will hopefully become automated tests that live on as regression tests.

What are we hoping to achieve?

  • Remove ambiguity from the requirements with concrete examples. Ambiguity and misunderstandings from prose-based requirements and analysis have consistently been a huge time waste and source of errors throughout my career.
  • Faster feedback in development.  It’s awfully nice to just run the executable specs in a local branch before pushing anything to the testers
  • Find flaws in domain logic or screen behavior faster, and this has been the biggest gain for us so far
  • Creating living documentation about the expected behavior of the system by making the specifications human readable
  • Building up a suite of regression tests to make later development in the system more efficient and safer

Quick Example

While executable specifications are certainly a very challenging practice from the technical side of things, in the past week or so I’m aware of 3-4 scenarios where the act of writing the specification tests has flushed out problems with our domain logic or screen behavior a lot faster than we could have done otherwise.

Part of our application logic involves fuzzy matching of people in our system against some, ahem, not quite trustworthy data from external partners. Our domain expert explained that the matching logic he wanted was to match on a person’s social security number, birth date, first name, and last name — with the name matching being case insensitive, and a match on just the initial of the first name considered valid. Since this logic can be expressed as a fixed set of inputs and one output with a great number of permutations, I chose to express this specification as a table with Storyteller (conceptually identical to the old ColumnFixture in FitNesse). The final version of the spec is shown below (click the image to get a more readable version):

ExecutableSpec

The image above is our final, approved version of this functionality, and it now lives as both documentation and a regression test. Before that though, I wrote the spec and got our domain expert to look at it, and wouldn’t you know it, I had misunderstood a couple of assumptions and he gave me very concrete feedback about exactly what the spec should have been.
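
For illustration only, here is a rough sketch of the matching rules described above in plain code. This is not our actual implementation, just the rules as stated: exact SSN and birth date match, case-insensitive last name match, and a first name match that also accepts a matching first initial:

using System;

public static class PersonMatcher
{
    // A hypothetical sketch of the matching rules, not the production Person.Equals()
    public static bool Matches(
        string ssn1, string firstName1, string lastName1, DateTime birthDate1,
        string ssn2, string firstName2, string lastName2, DateTime birthDate2)
    {
        // SSN and birth date have to match exactly
        if (ssn1 != ssn2 || birthDate1 != birthDate2) return false;

        // Last names match case-insensitively
        if (!string.Equals(lastName1, lastName2, StringComparison.OrdinalIgnoreCase)) return false;

        if (string.IsNullOrEmpty(firstName1) || string.IsNullOrEmpty(firstName2)) return false;

        // First names match case-insensitively, or on the first initial alone
        if (string.Equals(firstName1, firstName2, StringComparison.OrdinalIgnoreCase)) return true;

        return char.ToUpperInvariant(firstName1[0]) == char.ToUpperInvariant(firstName2[0]);
    }
}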

To make this just a little bit more concrete, our Storyteller test harness connects the table inputs to the system under test with this little bit of adapter code:

The code behind the executable spec
    public class PersonFixture : Fixture
    {
        public PersonFixture()
        {
            Title = "Person Matching Logic";
        }

        [ExposeAsTable("Person Matching Examples")]
        [return: AliasAs("Matches")]
        public bool PersonMatches(
            string Description,
            [Default("555-55-5555")]SocialSecurityNumber SSN1,
            [Default("Hank")]string FirstName1,
            [Default("Aaron")]string LastName1,
            [Default("01/01/1974")]DateCandidate BirthDate1,
            [Default("555-55-5555")]SocialSecurityNumber SSN2,
            [Default("Hank")]string FirstName2,
            [Default("Aaron")]string LastName2,
            [Default("01/01/1974")]DateCandidate BirthDate2)
        {
            var person1 = new Person
            {
                SSN = SSN1,
                FirstName = FirstName1,
                LastName = LastName1,
                BirthDate = BirthDate1
            };

            var person2 = new Person
            {
                SSN = SSN2,
                FirstName = FirstName2,
                LastName = LastName2,
                BirthDate = BirthDate2
            };

            return person1.Equals(person2);
        }
    }

* Jeremy, is this really just Behavior Driven Development (BDD)?  Or the older idea of Acceptance Test Driven Development (ATDD)?  This is some folks’ definition of BDD, but BDD is so overloaded and means so many different things to different people that I hate using the term.  ATDD never took off, and “executable specifications” just sounds cooler to me, so that’s what I’m going to call it.