Storyteller, Continuous Integration, and the Art of Failing Fast

Someone asked me today whether it was possible to use Storyteller 3 as part of their continuous integration builds. Fortunately, I was able to answer “yes” and point them to the documentation on running Storyteller specifications with the headless “st run” command. One of my primary goals for Storyteller 3 was to make our existing continuous integration suites faster and more reliable, and, for heaven’s sake, to fail fast instead of trying to execute specifications against a hopelessly broken environment. You might not have any intention of ever touching Storyteller itself, but the lessons we learned from earlier versions of Storyteller, and the resulting improvements in 3.0 that I’m describing in this post, should be useful for working with any kind of test automation tooling.

How Storyteller 3 Integrates with Continuous Integration

While you generally author and even execute Storyteller specifications at development time with the interactive editor web application, you can also run batches of specifications with the “st run” command from the command line. Because Storyteller exposes this command line interface, you should be able to incorporate it into any kind of CI server or build automation tooling.

The results are written to a single, self-contained HTML file that can be opened and browsed directly (the equivalent report in earlier versions was a mess).
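If your build scripting happens to be in C#, a minimal sketch of wiring this in might look like the following. The “src/MySpecs” project path is a placeholder, and the only contract assumed here is the usual CI convention that “st run” exits with a non-zero code when the build should fail:

```csharp
using System;
using System.Diagnostics;

// Shell out to Storyteller's headless runner. "src/MySpecs" is a
// placeholder path for your specification project.
using var st = Process.Start(new ProcessStartInfo
{
    FileName = "st",
    Arguments = "run src/MySpecs",
    UseShellExecute = false
})!;

st.WaitForExit();

// Propagate Storyteller's verdict so the CI server fails the build on errors.
Environment.Exit(st.ExitCode);
```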


Acceptance vs. Regression Specs

This has been a bit of a will-o’-the-wisp, always just out of reach, for most of my career, but ideally you’d like the Storyteller specifications to be expressed – if not completely implemented – before developers start work on any new feature or user story. If you really can pull off “acceptance test driven development,” you may very well be trying to execute Storyteller specifications in CI builds for features that aren’t actually done yet. That’s okay, though, because Storyteller lets you mark specifications with two different “lifecycle” states:

  1. Acceptance – the default state; it just tells Storyteller that the specification is a work in progress
  2. Regression – the functionality expressed in the specification is supposed to be working correctly

For CI builds, you can either run the acceptance specifications strictly for their informational value or leave them out for the sake of build times. Either way, acceptance specs do not count toward “st run” passing or failing the build. Any failure while running regression specifications, though, will always fail the build.
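To make that rule concrete, here is a conceptual sketch – my own illustration, not Storyteller’s internals – of how the pass/fail decision treats the two lifecycles:

```csharp
using System.Collections.Generic;
using System.Linq;

public enum Lifecycle { Acceptance, Regression }

public record SpecResult(string Name, Lifecycle Lifecycle, bool Succeeded);

public static class BuildGate
{
    // Acceptance results are informational only; any failed regression
    // spec fails the whole build.
    public static bool ShouldFailBuild(IEnumerable<SpecResult> results) =>
        results.Any(r => r.Lifecycle == Lifecycle.Regression && !r.Succeeded);
}
```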


Being Judicious with Retries

To deal with “flaky” tests that had a lot of timing issues due to copious amounts of asynchronous behavior, the original Storyteller team took some inspiration from jQuery and added the ability to make Storyteller retry failing specifications a certain number of times and accept any later successes.

You really shouldn’t need this feature, but it’s an imperfect world and you may very well need it anyway. What we found in earlier versions of Storyteller, though, was that the retries were too generous and made CI build times far too long when things went off the rails.

In Storyteller 3, we adopted more stringent guidelines for when and when not to retry specifications (sketched in code after this list):

  1. The new default behavior is to not allow retries at all. You now have to opt into retries, either on a specification-by-specification basis (recommended) or by supplying a default maximum retry count as a command line argument.
  2. Acceptance specifications are never retried.
  3. Specifications will never be retried if an execution detects “critical” or “catastrophic” errors. This distinction was meant to separate “timing errors” from cases where the system just flat out fails. The classic example we used when designing this behavior was getting an exception when trying to navigate to a new URL in a browser application.
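Putting those rules together, the retry decision boils down to something like the sketch below – again my own illustration, not Storyteller’s actual implementation:

```csharp
public static class RetryPolicy
{
    public static bool ShouldRetry(
        bool succeeded,
        bool isRegressionSpec,
        bool hadCriticalOrCatastrophicError,
        int attemptsSoFar,
        int maxRetries) // 0 by default: retries are strictly opt-in
    {
        if (succeeded) return false;
        if (!isRegressionSpec) return false;              // acceptance specs never retry
        if (hadCriticalOrCatastrophicError) return false; // hard failures bypass retries
        return attemptsSoFar <= maxRetries;
    }
}
```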


Failing Faster This Time

Prior to Storyteller 3, our CI builds could go on forever when the system or environment was non-functional. Like many acceptance testing tools – and in contrast to xUnit tools – Storyteller tries to run a specification from start to finish, even if an early step fails. This behavior is valuable when you have an expensive scenario setup feeding multiple assertions, because it maximizes the feedback you get while fixing failures. Unfortunately, it also killed us with runaway CI builds.

The canonical example my colleagues told me about was trying to navigate a browser to a new URL with WebDriver, having the navigation fail with some kind of “YSOD,” and Storyteller still waiting for certain elements to become visible – and then piling retries on top of the mess.

To alleviate this kind of pain, we invested a lot of time into making Storyteller 3 “fail fast” in its CI runs. Now, if Storyteller detects a “StorytellerCriticalException” or a “StorytellerCatastrophicException” (the entire system is unresponsive), Storyteller 3 will immediately stop the specification execution, bypass any possible retries, and return the results so far. Under the covers, we made Storyteller treat any error in Fixture setup or teardown as a critical exception.
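In rough terms – as an illustration rather than Storyteller’s real code – the execution loop went from “record the failure and continue” to something like this:

```csharp
using System;
using System.Collections.Generic;

public class CriticalException : Exception { }

public enum SpecOutcome { Completed, CriticalFailure }

public static class SpecRunner
{
    public static SpecOutcome RunSteps(IEnumerable<Action> steps)
    {
        foreach (var step in steps)
        {
            try
            {
                step();
            }
            catch (CriticalException)
            {
                // Fail fast: stop this spec immediately, skip any retries,
                // and return the results gathered so far.
                return SpecOutcome.CriticalFailure;
            }
            catch (Exception)
            {
                // An ordinary failure: record it, but keep executing the
                // remaining steps to maximize feedback from a single run.
            }
        }
        return SpecOutcome.Completed;
    }
}
```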

“Catastrophic” exceptions are caused by any error in trying to bootstrap the application or in system-wide setup or teardown. In this case, Storyteller 3 stops all execution and reports the results with the catastrophic exception message. You can also force a catastrophic exception from your own environment tests, which effectively slams the brakes on the current batch run (for things like “can’t connect to the database at all”).
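As a hedged sketch of that last point: the exception type name comes from the discussion above, but the class definition and constructor shown here are my own assumptions for illustration, so check the actual Storyteller API before copying this.

```csharp
using System;

// Placeholder declaration -- Storyteller defines the real type; the
// constructor signature here is an assumption.
public class StorytellerCatastrophicException : Exception
{
    public StorytellerCatastrophicException(string message) : base(message) { }
}

public static class EnvironmentChecks
{
    public static void AssertDatabaseIsReachable(Func<bool> canConnect)
    {
        if (!canConnect())
        {
            // Forcing a catastrophic exception stops the entire batch run
            // instead of letting every spec fail slowly against a dead system.
            throw new StorytellerCatastrophicException(
                "Can't connect to the database at all");
        }
    }
}
```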

This small change in logic has done a lot to stop runaway CI builds when things go off the rails.


Why so slow?

The major driver for the Storyteller 3 rewrite was making the automated testing builds on a very large project much faster. On top of all the optimization work inside Storyteller itself, we also invested in collecting performance metrics about test execution to understand which steps and system actions were really causing the slowness (early adopters of Storyteller 3 have consistently described the integrated performance data as their favorite feature).

While all of that performance data is embedded in the HTML results, you can also have it dumped to CSV files for easy import into tools like Excel or Access, or exported in Storyteller’s own JSON format.

By analyzing the raw performance data with simple Access reports, I was able to identify some of the performance hot spots in our large application: particularly slow HTTP endpoints, a browser application that was probably too chatty with the backend, and pages that were slow to load. I can’t say that we have all the performance issues solved yet, but we are now much better informed about the underlying problems.
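As an example of the kind of quick-and-dirty analysis the exported data enables – the file name and column layout below are purely hypothetical – you could surface the slowest step types straight from the CSV:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical column layout: spec name, step description, elapsed ms.
var rows = File.ReadLines("perf.csv")
    .Skip(1) // skip the header row
    .Select(line => line.Split(','))
    .Select(cells => (Step: cells[1], Millis: double.Parse(cells[2])));

// Surface the ten slowest step types by average execution time.
foreach (var (step, average) in rows
    .GroupBy(r => r.Step)
    .Select(g => (Step: g.Key, Average: g.Average(r => r.Millis)))
    .OrderByDescending(x => x.Average)
    .Take(10))
{
    Console.WriteLine($"{step}: {average:F0} ms average");
}
```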


Optimizing for Batch Execution

With Storyteller 3, I tried to incorporate every trick we could think of to squeeze more throughput out of the big CI builds. While we don’t yet fully support parallelizing specification runs (we will sooner or later), Storyteller 3 partially parallelizes batch runs by using a cascading series of producer/consumer queues to:

  1. Read in specification data
  2. “Plan” the specification by doing all necessary data coercion and attaching the raw spec inputs to the objects that will execute each step. Basically, do everything that can possibly be done before actually executing the specification.
  3. Execute specifications one at a time

The strategy above can help quite a bit if you need to run a large number of small specifications, but doesn’t help much at all if you have a handful of very slow specification executions.
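Here is a stripped-down illustration of that cascading producer/consumer shape – my own sketch using BlockingCollection, not Storyteller’s code; the “Specs” directory and the Plan/Execute placeholders are assumptions:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

var toPlan = new BlockingCollection<string>(); // raw spec data waiting to be planned
var toRun = new BlockingCollection<string>();  // fully planned specs waiting to execute

// Stage 1: read specification data off disk.
var reader = Task.Run(() =>
{
    foreach (var file in Directory.EnumerateFiles("Specs"))
        toPlan.Add(File.ReadAllText(file));
    toPlan.CompleteAdding();
});

// Stage 2: "plan" each spec -- do all the data coercion and binding up front.
var planner = Task.Run(() =>
{
    foreach (var raw in toPlan.GetConsumingEnumerable())
        toRun.Add(Plan(raw));
    toRun.CompleteAdding();
});

// Stage 3: execute specs one at a time while the earlier stages keep working.
foreach (var plan in toRun.GetConsumingEnumerable())
    Execute(plan);

await Task.WhenAll(reader, planner);

static string Plan(string raw) => raw; // placeholder for the real planning step
static void Execute(string plan) { }   // placeholder for the real execution step
```

Even in this simplified form, the reading and planning work overlaps with execution, which is exactly why the approach pays off for large numbers of small specifications.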

