
Table Based Specs and Custom Assertions with Storyteller 3

After over a year of work, I’m finally getting close to making an official 3.0 release of the newly rebuilt Storyteller project for executable specifications (BDD). For more background, there’s a webinar I recorded for JetBrains on YouTube.

As a specification tool, Storyteller shines when the problem domain you’re working in lends itself to table based specifications. We’ve invested heavily in making Storyteller mechanically efficient at expressing test data inputs with tables, and in letting you customize how data is parsed in the specifications.

As an example, I’ve been working on a small OSS project named “Alba” that is meant to be a building block for a future web framework. Part of that work is a new HTTP router based on the Trie algorithm. One of our requirements for the new routing engine was to be able to handle routes with or without parameters (think “document/:id”, where “id” is a routing parameter) and to accurately match routes regardless of the order in which the routes were added (ahem, looking at you, old ASP.Net Routing Module).
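To make the routing parameter idea concrete, here is a naive, illustration-only matcher in JavaScript. Alba’s actual router is a Trie-based .Net implementation; the function below is a hypothetical sketch that only shows what “matching” a pattern like “document/:id” and capturing its arguments means:

```javascript
// Naive route matcher, for illustration only. Alba's real
// router is Trie-based; this just demonstrates what matching
// "document/:id" and capturing the "id" argument means.
function matchRoute(pattern, url) {
  var patternSegments = pattern.split('/');
  var urlSegments = url.split('/');

  // a candidate route has to have the same number of segments
  if (patternSegments.length !== urlSegments.length) return null;

  var args = {};
  for (var i = 0; i < patternSegments.length; i++) {
    var seg = patternSegments[i];
    if (seg.charAt(0) === ':') {
      // a routing parameter like ":id" captures the url segment
      args[seg.substring(1)] = urlSegments[i];
    } else if (seg !== urlSegments[i]) {
      // a literal segment has to match exactly
      return null;
    }
  }

  return args;
}

var args = matchRoute('document/:id', 'document/5');
// args is { id: '5' }
```

A real Trie walks the segments against a prefix tree instead of checking each registered pattern in turn, which is what makes the matching independent of registration order.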

This turns out to be a pretty natural fit for expressing the requirements and sample scenarios with Storyteller. I started by jotting down notes on how I wanted to express the specifications: first set up all the available routes in a new instance of the router, then run a series of scenarios through the router and prove that it chooses the correct route pattern and determines the route arguments for the routes that have parameters. The results of one of the specifications for the routing engine are shown below (cropped for space):


Looking at the spec above, I did a couple of things:

  1. “If the routes are” is a table grammar that just configures a router object with the supplied routes
  2. “The selection and arguments should be” is a second table grammar that takes in a Url pattern as an input, then asserts expected values against the route that was matched in the “Selected” column and uses a custom assertion to match up on the route parameters parsed from the Url (or asserts that there was “NONE”).

To set up the routing table in the first place, the “If the routes are” grammar is this (with the Fixture setup code to add some necessary context):

        // This runs silently as the first step of a 
        // section using this Fixture
        public override void SetUp()
        {
            _tree = new RouteTree();
        }

        [ExposeAsTable("If the routes are")]
        public void RoutesAre(string Route)
        {
            var route = new Route(Route, HttpVerbs.GET, _ => Task.CompletedTask);

The table for verifying the route selection is implemented by a second method:

        [ExposeAsTable("The selection and arguments should be")]
        public void TheSelectionShouldBe(
            string Url, 
            out string Selected, 
            [Default("NONE")]out ArgumentExpectation Arguments)
        {
            var env = new Dictionary<string, object>();
            var leaf = _tree.Select(Url);

            Selected = leaf.Pattern;

            leaf.SetValues(env, RouteTree.ToSegments(Url));

            Arguments = new ArgumentExpectation(env);
        }

The input value is just a single string “Url.” The method above takes that url string, runs it through the RouteTree object we had previously configured (“If the routes are”), finds the selected route, and fills the two out parameters. Storyteller itself will compare the two out values to the expected values defined by the specification. In the case of “Selected”, it just compares two strings. In the case of “ArgumentExpectation”, that’s a custom type I built in the Alba testing library as a custom assertion for this grammar. The key parts of ArgumentExpectation are shown below:

        private readonly string[] _spread;
        private readonly IDictionary<string, object> _args;

        public ArgumentExpectation(string text)
        {
            _spread = new string[0];
            _args = new Dictionary<string, object>();

            if (text == "NONE") return;

            var args = text.Split(';');
            foreach (var arg in args)
            {
                var parts = arg.Trim().Split(':');
                var key = parts[0].Trim();
                var value = parts[1].Trim();

                if (key == "spread")
                {
                    _spread = value == "empty" 
                        ? new string[0] 
                        : value.Split(',')
                            .Select(x => x.Trim()).ToArray();
                }
                else
                {
                    _args.Add(key, value);
                }
            }
        }

        public ArgumentExpectation(Dictionary<string, object> env)
        {
            _spread = env.GetSpreadData();
            _args = env.GetRouteData();
        }

        protected bool Equals(ArgumentExpectation other)
        {
            return _spread.SequenceEqual(other._spread) 
                && _args.SequenceEqual(other._args);
        }

Storyteller provides quite a bit of customization on how the engine can convert a string to the proper .Net type for any particular “Cell.” In the case of ArgumentExpectation, Storyteller has a built in convention to use any constructor function with the signature “ctor(string)” to convert a string to the specified type and I exploit that ability here.
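Since the cell text format (“key: value” pairs separated by semicolons, with the special “spread”, “empty”, and “NONE” tokens) is worth seeing in isolation, here is the same parsing logic transliterated to JavaScript purely for illustration; the function name is made up, and Alba itself only has the C# version shown above:

```javascript
// Transliteration of the ArgumentExpectation string parsing
// above, for illustration only. The function name is invented;
// Alba only has the C# constructor.
function parseExpectation(text) {
  var spread = [];
  var args = {};

  // "NONE" means no route arguments are expected at all
  if (text !== 'NONE') {
    text.split(';').forEach(function(arg) {
      var parts = arg.trim().split(':');
      var key = parts[0].trim();
      var value = parts[1].trim();

      if (key === 'spread') {
        // "spread: empty" or a comma-delimited list like "spread: a, b"
        spread = value === 'empty'
          ? []
          : value.split(',').map(function(x) { return x.trim(); });
      } else {
        // everything else is a named route argument
        args[key] = value;
      }
    });
  }

  return { spread: spread, args: args };
}

// parseExpectation('id: 5; spread: a, b')
//   -> { spread: ['a', 'b'], args: { id: '5' } }
```

The same `ctor(string)` convention Storyteller uses for the C# type is what lets the raw cell text flow straight into this kind of parser.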

You can find all of the code for the RoutingFixture behind the specification above on GitHub. If you want to play around or see all of the parts of the specification, you can run the Storyteller client for Alba by cloning the GitHub repository, then running the “storyteller.cmd” file to compile the code and open the Storyteller client to the Alba project.

Why was this useful?

Some of you are rightfully reading this and saying that many xUnit tools have parameterized tests that can be used to throw lots of test scenarios together quickly. That’s certainly true, but the Storyteller mechanism has some advantages:

  1. The test results are shown clearly and inline with the specification html itself. It’s not shown above (because it is a regression test that’s supposed to be passing at all times;-)), but failures would be shown in red table cells with both the expected and actual values. This can make specification failures easier to understand and diagnose compared to the xUnit equivalents.
  2. Only the test inputs and expected results are expressed in the specification body. This makes it substantially easier for non-technical stakeholders to comprehend and review the specifications. It also clearly separates the intent of the code from the mechanical details of the API. In the case of the Alba routing engine, that is probably important because the implementation today is a little tightly coupled to OWIN hosting, and it’s somewhat likely we’d want to decouple the router from OWIN later as ASP.Net seems to be making OWIN a second class citizen from here on out.
  3. The Storyteller specifications or their results can be embedded into technical documentation generated by Storyteller. You can see an example of that in the Storyteller docs themselves.
  4. You can also add prose in the form of comments to the Storyteller specifications for more descriptions on the desired functionality (not shown here).


Marten Takes a Big Step Forward with v0.6

EDIT: Nuget v0.6.1 is already up with some improvements to the async code in Marten. Hat tip to Daniel Marbach for his pull request on that one.

Marten is a new OSS project that seeks to turn Postgresql into a robust, usable document database (and an event store someday) for .Net development. There’s a recording of an internal talk I gave introducing Marten at work live on YouTube for more background.

Marten v0.6 just went live on nuget this afternoon. This turned into a pretty substantial release that I feel makes Marten much more robust, usable, and generally a lot closer to ready for production usage in bigger, more complicated systems.

This release came with substantial contributions from other developers and incorporates feedback from early adopters. I’d like to thank (in no particular order) Jens Pettersson, Corey Kaylor, Bojan Veljanovski, Jeff Doolittle, Phillip Haydon, and Evgeniy Kulakov for their contributions and feedback in this release.

What’s New:

You can see the complete set of changes from the v0.6 milestone on GitHub.

So, what’s next?

More than anything, I’m hoping to get more early adopters giving us feedback (and pull requests!) on what’s missing, what’s not easy to use, and where it needs to change. I think I’ll get the chance to try converting a large project from RavenDb to Marten soon that should help as well.

Feature wise, I think the next couple things up for a future v0.7 release would be:

  • Batched queries (futures)
  • Read-side projections, but whether that’s going to be via Javascript, .Net transforms, or both is yet to be determined
  • Using saved queries to avoid unnecessarily taking the hit of Linq expression parsing

The Jetbrains Storyteller 3 Webinar is Online

The Storyteller 3 webinar I did last week for Jetbrains has been published to YouTube this morning.

The latest Storyteller nuget addressed all of the UI flaws I managed to hit in the demos ;)

There’ll be much more to come on Storyteller 3 as it gets closer and closer to finally getting an official release. The next step is a bit more UI performance optimization (i.e., Jeremy finally learns some RxJS) and the ability to step through a specification for easier debugging.


How I’m Testing Redux’ified React Components

Some of you are going to be new to all of the tools I’m describing here, and for you, I wanted to show how I think I’ve made authoring automated tests somewhat painless. For those of you who are already familiar with the whole React/Redux stack, feel free to make suggestions on making things better in the comments;)

As part of my big push to finally release Storyteller 3.0, I recently upgraded all of its JavaScript client dependencies (React.js/Babel/Webpack/etc. I might write a full on rant-y blog post about that later). As part of that effort, I’ve known for some time that I wanted to convert the client’s homegrown Flux-lite architecture based around Postal.js to Redux before I started to add any new features to it before making the final 3.0 release. After finishing that conversion to Redux, I can’t say that I’m really thrilled with how much work it took to make that transition, but I’m happy with the final results. In particular, I really like how easy the usage of react-redux has made the Karma specs for many of my React.js components.

Step 1 was to effectively shatter all of my existing Karma specs. Step 2 was to figure out how to most easily connect my components under test to the new Redux architecture. I had an existing testing harness that had been somewhat helpful, and I used it to first sketch out what a new Karma harness should do:

  1. For various reasons, I’m insisting on using a real browser when using Karma on my React.js components instead of something like jsdom, so I wanted the new harness to make it as quick as possible to render a React.js component in the browser
  2. I wanted the harness to take care of spinning up a new Redux store with the correct reducer function
  3. Despite my preference for “self-contained” tests and dislike of shared testing data sets, I opted to have the new harness start up with an existing JSON state of the client recorded from the server output to a JS file.
  4. Give me quick access to the mounted React.js component instance or the actual DOM element.
  5. I do still use Postal.js to broadcast requests from my React.js components to the rest of the application, so for the sake of testing I wanted some test spies that listen for messages sent through Postal.js to verify some of the event handlers of my components.

Those requirements led to a harness class I quite creatively called “ComponentHarness.” Looking at the interesting parts of the constructor function for ComponentHarness, you can see how I set up an isolated test state and element for a React.js component:

        // Make sure you aren't failing because of faulty
        // Postal listeners left behind by previous tests

        // Sets up a new Redux store with the correct
        // Reducer function
        this.store = createStore(Reducer);

        // Establish an initial data set based on 
        // server side data from the .Net tests

        // Create a brand new container div for the 
        // React.js component being tested and add that
        // to the current document
        this.div = document.createElement('div');

        // Sets up some test spies for Postal.js channels
        // that just listen for messages being received
        // during a spec run
        this.engineMessages = new Listener('engine-request');
        this.editorMessages = new Listener('editor');
        this.explorerMessages = new Listener('explorer');

Now, to put this into usage, I have a small React component called “QueueCount” that sits in the header bar of the Storyteller client and displays a Bootstrap “badge” element showing how many specifications are currently queued up for execution and links to another page showing the active queue. In the system’s initial state, there are no specifications queued and this badge element should be completely hidden.

At the top of my specification code for this component, I start up a new ComponentHarness and render the QueueCount component that I want to test against:

describe('QueueCount', function(){
    var component, harness;

    beforeEach(function(){
        component = (<QueueCount />);
        harness = new ComponentHarness();
        harness.render(component);
    });

Inside of ComponentHarness, the render(component) method renders the component you pass into it in the DOM, but nested within the <Provider /> component from react-redux that does the work of wiring the Redux state to a React.js component:

    render(component){
        ReactDOM.render(
            // The Provider component here is from
            // react-redux and acts to "wire up"
            // the given redux store to all the elements
            // nested inside of it
            <Provider store={this.store}>
                {component}
            </Provider>
        , this.div);
    }

Since the ComponentHarness is starting the store at a known state with no specifications currently queued for execution, the QueueCount component should be rendered as an empty <span /> element, and the first specification states this:

it('is just a blank span with no specs queued', function(){
        // element() gives me access to the root DOM element
        // for the rendered React component
        var element = harness.element();
});

Next, I needed to specify that the QueueCount component would render the proper count when there are specifications queued for execution. When running the full application, this information flows in as JSON messages from the .Net server via web sockets, and it can update so quickly that it’s very difficult to verify visually or with automated tests against the whole stack. Fortunately, this “how many specs are queued up” state is very easy to set up in tests by just dispatching the JSON messages to the Redux store and verifying the expected state of the component afterward, as shown in the following Karma spec:

it('displays the updated counts after some specs are queued', function(){
        // Dispatch an 'action' to the underlying
        // Redux store to mutate the state
        harness.store.dispatch({
            type: 'queue-state', 
            queued: ['embeds', 'sentence1', 'sentence3']
        });

        // Check out the DOM element again to see the
        // actual state
        var element = harness.element();
});
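On the other side of that dispatch, a reducer handling the “queue-state” message might look something like the sketch below. The action shape comes straight from the spec above, but the reducer itself and its state shape are simplified stand-ins for the real Storyteller reducer, which isn’t shown in this post:

```javascript
// Simplified stand-in for the Storyteller client's reducer.
// Only the 'queue-state' message from the spec above is
// handled; the initial state shape is an assumption made
// purely for illustration.
function Reducer(state, action) {
  state = state || { queued: [] };

  switch (action.type) {
    case 'queue-state':
      // replace the queued spec list wholesale with the
      // list pushed down from the server
      return Object.assign({}, state, { queued: action.queued });
    default:
      // reducers must return the existing state untouched
      // for any message they don't care about
      return state;
  }
}

var state = Reducer(undefined, { type: '@@INIT' });
state = Reducer(state, {
  type: 'queue-state',
  queued: ['embeds', 'sentence1', 'sentence3']
});
// state.queued now holds the three queued spec names
```

Because the reducer is a pure function of `(state, action)`, the harness can drive a component into any interesting state just by replaying these little JSON messages.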

Digging into the DOM

Call me completely uncool, but I do still use jQuery, especially for reading and querying the DOM during these kinds of tests. For ComponentHarness, I added a couple of helper methods that allow you to quickly query the DOM of the mounted React component with jQuery:

    // Runs a jQuery expression within the mounted component 
    // DOM elements
    $(match){
        return $(match, this.div);
    }

    // Find either the root element of the 
    // mounted component or search via css
    // selectors within the DOM
    element(css){
        if (!css){
            return this.div.firstChild;
        }

        return $(css, this.div).get(0);
    }

These have been handy just because you’re constantly adding components to new <div />’s dynamically added to the running browser page. In usage, these methods look like this (from a different Karma testing file):

    it('does not render when there are no results', () => {
        // this is just a convenience method to mount a particular
        // React.js component that shows up in dozens of tests

        var isRendered = harness.$('#spec-result-header').length > 0;
    });


Postal Test Spies

Another usage for me has been to test event handlers to either prove that they’re successfully updating the state of the Redux store by dispatching actions (I hate the Redux/Flux parlance of ‘actions’ when they really mean messages, but when in Rome…) or to verify that an expected message has been sent to the server by listening in on what messages are broadcast via Postal. In the unit test below, I’m doing just this by looking to see that the “Cancel All Specifications” button in part of the client sends a message to the server to remove all the queued specifications and stop anything that might already be running:

    it('can cancel all the specs', function(){
        // click tries to find the matched element
        // inside the rendered component and click it
        harness.click('#cancel-all-specs');

        var message = harness
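The spy mechanics behind engineMessages can be sketched without any framework at all. The real Listener subscribes to a postal.js channel; the MessageBus below is a stand-in bus invented purely for this illustration, just to show the general shape:

```javascript
// Framework-free sketch of the harness's test spies. The real
// Listener subscribes to a postal.js channel; MessageBus here
// is an invented stand-in for postal.js itself.
function MessageBus() {
  this.subscribers = {};
}

MessageBus.prototype.subscribe = function(channel, callback) {
  (this.subscribers[channel] = this.subscribers[channel] || []).push(callback);
};

MessageBus.prototype.publish = function(channel, message) {
  (this.subscribers[channel] || []).forEach(function(cb) { cb(message); });
};

// A Listener just records every message sent to its channel
// so a spec can make assertions on them afterward
function Listener(bus, channel) {
  this.received = [];
  var self = this;
  bus.subscribe(channel, function(message) {
    self.received.push(message);
  });
}

var bus = new MessageBus();
var spy = new Listener(bus, 'engine-request');
bus.publish('engine-request', { type: 'cancel-all-specs' });
// spy.received[0].type is 'cancel-all-specs'
```

The payoff is that an event handler spec never has to touch the server; it only asserts that the right message showed up on the right channel.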



The ComponentHarness class has been a pretty big win in my opinion. For one thing, it’s made it relatively quick to mount React.js components connected to all the proper state in tests. Maybe more importantly, it’s made it pretty simple to get the system into the proper state to exercise React.js components by just dispatching little JSON actions into the mounted Redux store.

I’m not a fan of pre-canned test data sets, but in this particular case it’s been a huge time saver. The downsides are that many unit tests will likely break if I ever have to update that data set, and sometimes it’s harder to understand a unit test without peering through the big JSON blob of initial data.

In the longer term, as more of our clients at work are transitioned to React.js with Redux (that’s an ongoing process), I think I’m voting to move quite a bit of the testing we do today with Webdriver and fully integrated tests to using something like the Karma/Redux approach I’m using here. While there are some kinds of integration problems you’ll never be able to flush out with purely Karma tests and faked data being pushed into the Redux stores, at least we could probably make the Karma tests be much faster and far more reliable than the equivalent Webdriver tests are today. Food for thought, and we’ll see how that goes.


“Introduction to Marten” Video

I gave an internal talk today at our Salt Lake City office on Marten that we were able to record and post publicly. I discussed why Postgresql, why or when to choose a document database over a relational database, what’s already done in Marten, and where it still needs to go.

And of course, if you just wanna know what Marten is, the website is here.

Any feedback is certainly welcome here or in the Marten Gitter room.

Today I learned that the only thing worse than doing a big, important talk on not enough sleep is doing two talks and a big meeting on technical strategy on the same day.

Webinar on Storyteller 3 and Why It’s Different

JetBrains is graciously letting me do an online webinar on the new Storyteller 3 tool this Thursday (Jan. 21st). Storyteller is a tool for expressing automated software tests in a form that is consumable by non-technical folks and suitable for the idea of “executable specifications” or Behavior Driven Development. While Storyteller fills the same niche as Gherkin-based tools like SpecFlow or Cucumber, it differs sharply in the mechanical approach (Storyteller was originally meant to be a “better” FitNesse and is much more inspired by the original FIT concept than Cucumber).

In this webinar I’m going to show what Storyteller can do, how we believe you make automated testing more successful, and how that thinking has been directly applied to Storyteller. To try to answer the pertinent question of “why should I care about Storyteller?,” I’m going to demonstrate:

  • The basics of crafting a specification language for your application
  • How Storyteller integrates into Continuous Integration servers
  • Why Storyteller is a great tool for crafting deep reaching integration tests and allows teams to address complicated scenarios that might not be feasible in other tools
  • The presentation of specification results in a way that makes diagnosing test failures easier
  • The steps we’ve taken to make test data setup and authoring “self-contained” tests easier
  • The ability to integrate application diagnostics into Storyteller with examples from web applications and distributed messaging systems (I’m showing integration with FubuMVC, but we’re interested in doing the same thing next year with ASP.Net MVC6)
  • The effective usage of table driven testing
  • How to use Storyteller to diagnose performance problems in your application and even apply performance criteria to the specifications
  • If there’s time, I’ll also show Storyteller’s secondary purpose as a tool for crafting living documentation

A Brief History of Storyteller

Just to prove that Storyteller has been around for awhile and there is some significant experience behind it:

  • 2004 – I worked on a project that tried to use the earliest .Net version of FIT to write customer facing acceptance testing. It was, um, interesting.
  • 2005 – On a new project, my team invested very heavily in FitNesse testing with the cooperation of a very solid tester with quite a bit of test automation experience. We found FitNesse to be very difficult to work with and frequently awkward — but still valuable enough to continue using it. In particular, I felt like we were spending too much time troubleshooting syntax issues with how FitNesse parsed the wiki text written by our tester.
  • 2006-2008 – The original incarnation of Storyteller was just a replacement UI shell and command line runner for the FitNesse engine. This version was used on a couple projects with mixed success.
  • 2008-2009 – For reasons that escape me at the moment, I abandoned the FitNesse engine and rewrote Storyteller as its own engine with a new hybrid WPF/HTML client for editing tests. My concern at the time was to retain the strengths of FIT, especially table-driven testing, while eliminating much of the mechanical friction in FIT. The new “Storyteller 1.0” on Github was somewhat successful, but still had a lot of usability problems.
  • 2012 – Storyteller 2 came with some mild improvements on usability when I changed into my current position.
  • End of 2014 – My company had a town hall style meeting to address the poor results we were having with our large Storyteller test suites. Our major concerns were the efficiency of authoring specs, the reliability of the automated specs, and the performance of Storyteller itself. While we considered switching to SpecFlow or even trying to just do integration tests with xUnit tools and giving up on the idea of executable specifications altogether, we decided to revamp Storyteller instead of ditching it.
  • First Half of 2015 – I effectively rewrote the Storyteller test engine with an eye for performance and throughput. I ditched the existing WPF client (and nobody mourned it) and wrote an all new embedded web client based on React.js for editing and interactively running specifications. The primary goals of this new Storyteller 3.0 effort have been to make specification authoring more efficient and to make execution more performant. Quite possibly the biggest success of Storyteller 3 in real project usage has been the extra diagnostics and performance information that it exposes to help teams understand why tests and the underlying systems are behaving the way that they are.
  • July 2015 – now: The alpha versions of Storyteller 3 are being used by several teams at my shop and a handful of early adopter teams. We’ve gotten a couple useful pull requests — including several usability improvements from my colleagues — and some help with understanding what teams really need.

Deleting Code

I’m in the middle of a now weeks-long effort to “modernize” a good sized web client to the latest, greatest React.js stack. I just deleted a good chunk of existing code that I was able to render unnecessary with my tooling, and that seems like a good time for me to take a short break to muse about deleting code.

I’ve had plenty of cause over the past six months to purge quite a bit of code out of some of my ongoing OSS projects. That’s gotten me thinking about the causes of code deletion and when that is or is not a good thing.


It’s So Much Better Now

I’m in the process of retrofitting the Storyteller 3 React.js client to use Redux and Immutable.js in place of the homegrown Flux-like architecture using Postal.js that I originally built way, way back (in React.js time) at this time last year. I was just able to rip out several big, complicated JavaScript files that were replaced by much more concise, simpler, and hopefully less error prone code using Redux. The end result has been very positive, but you have to weigh that against the intermediate cost of making the changes; in this case, I’m really not sure it was a clear win.

My Judgement: I’m usually pretty happy when I’m able to replace some clumsy code with something simpler and smoother — but at what cost?

More on my redux experiences in a couple weeks when it’s all done. 


That Code Served Its Purpose

Last July I came back from a family vacation all rested up and raring to go on an effort to consolidate and cut down FubuMVC and the remaining ecosystem. Most of my work for that month was removing or simplifying features that were no longer considered valuable by my shop. In particular, FubuMVC had a lot of features especially geared toward building large server side rendered web applications, among them:

  • Html conventions to build matching HTML displays, headers, and labels for .Net class properties
  • Modularity through “Bottles” to create drop in assemblies that could add any kind of client content (views, CSS, JS, whatnot) or server elements to an existing web application
  • “Content Extensions” that allowed users to create extensible views

All of the features above had already provided value in previous projects, but were no longer judged necessary for the kind of applications that we build today using much more JS and far less server side rendering. In those cases, it felt more like the code was being retired after a decent run rather than any kind of failure.

My Judgement: It’s kind of a good feeling


What was I thinking? 

Some code you have to nuke just because it was awful or a massive waste of time that will never provide much value. I had a spell when I was younger as one of those traveling consultants flying out to a client site every Monday. On one occasion I ended up having to stay in Chicago over a three day weekend instead of getting to come home. Being more ambitious back then, I spent most of that weekend building a WinForms application to explore and diagnose problems with StructureMap containers. That particular project was a complete flop, and I’ve always regretted wasting the opportunity to go sightseeing in downtown Chicago.

I think there has to be a constantly running daemon process in your mind during any major coding effort that can tell you “this isn’t working” or “this shouldn’t be this hard” and shake you out of an approach or project that is probably headed toward failure.

My Judgement: Grumble. Fail fast next time and don’t pay the opportunity cost!


git reset --hard

Git makes it ridiculously easy to do quick, throwaway experiments in your codebase. Wanna see if you can remove a class without too much harm? No problem, just try it out, and if it flops, just reset or checkout or one of the million ways to do the same basic thing in git.

My Judgement: No harm, no foul. Surprisingly valuable for longer lived projects


I don’t want to support this anymore

When I was readying the StructureMap 3.0 release a couple of years ago, I purposely removed several old, oddball features in StructureMap that I just didn’t want to support any longer. In every case, there were other, simpler ways to accomplish what the user was trying to do without that feature. My criterion there was “do I groan anytime a user asks me a question about this feature?” If the answer was “yes”, I killed it.

I was helping my wife through a class on learning Python, and watching over her shoulder I think I have to admire Python’s philosophy of having only one way to do any kind of task. Compare that philosophy to the seemingly infinite number of ways you can create objects in Javascript. In the case of StructureMap 3, I deleted some custom fluent interfaces for conditional object construction based on runtime conditions that could easily be accomplished much more flexibly by just letting users provide C# Func’s. In one blow, I removed a now unnecessary feature that confused users and caused me headaches on the user list without moving backward in capability.

My Judgement: Mixed. You wish it wasn’t necessary to do it, but the result should be favorable in the end.

