Big Update on StructureMap 3.0 Progress

I can finally claim some very substantial progress on StructureMap 3.0 today. For background on the goals and big changes for the 3.0 release, see Kicking off StructureMap 3 from last year and some additions from last month when I started again. As of today, StructureMap 3.0 development is in the master branch on GitHub. If you need to get at StructureMap 2.6 level code, use the TwoSix branch.

What’s been done?

I removed the strong naming.
All the old [Obsolete] API methods have been removed.
The registration API has been greatly streamlined and there’s much more consistency internally now.
The nested container implementation has been completely redone. It’s much simpler, should be much faster because it’s doing much less on setup, and the old lifecycle confusion between the parent and nested containers has been fixed.
The “Profile” functionality has been completely redesigned and rebuilt. It’s also much more capable now than it was before.
The container spinup time *should* be much better because there’s so much less going on and a lot more decision making is done in a lazy way with memoization along the way. Lazy<T> FTW!
There are many more runtime “figure out what I could do” type possibilities now.
You can apply lifecycle scoping Instance by Instance instead of only at the PluginType level. That’s been a big gripe for years.
The Xml configuration has been heavily streamlined.
The old [PluginFamily] / [Pluggable] attributes have been completely ripped out.
Internally, the old PipelineGraph, InstanceFactory, ProfileManager architecture is all gone. The new PipelineGraph implementations just wrap one or more PluginGraph objects, so there’s vastly less data structure shuffling going on internally.
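
To make the “lazy with memoization” idea from the list above concrete, here is a minimal sketch of the general pattern using Lazy<T>. This is not StructureMap’s actual internals: BuildPlan and LazyPlanCache are hypothetical names invented for illustration, and the “expensive decision” is just a constructor call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for an expensive "how do I build this type?" decision.
public sealed class BuildPlan
{
    public BuildPlan(Type pluginType) { PluginType = pluginType; }
    public Type PluginType { get; }
}

// Memoizes one plan per type. Nothing is computed until a plan is first requested.
public sealed class LazyPlanCache
{
    private readonly ConcurrentDictionary<Type, Lazy<BuildPlan>> _plans =
        new ConcurrentDictionary<Type, Lazy<BuildPlan>>();

    private int _plansBuilt;

    // How many times the expensive factory has actually run.
    public int PlansBuilt => _plansBuilt;

    public BuildPlan PlanFor(Type type)
    {
        // GetOrAdd may create more than one Lazy under contention, but only one
        // is stored, and Lazy<T> in its default mode runs its factory once.
        var lazy = _plans.GetOrAdd(type, t => new Lazy<BuildPlan>(() =>
        {
            Interlocked.Increment(ref _plansBuilt);
            return new BuildPlan(t);
        }));

        return lazy.Value;
    }
}
```

Asking for the same type twice returns the same BuildPlan instance and leaves PlansBuilt at 1, so the setup cost is only paid for the types that actually get used.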

What’s left to do?

I’ve transcribed my own notes about outstanding work (minus the documentation) to the GitHub issues page. There are a few items that are going to need some serious forethought, but I think the biggest architectural changes are already done and that list is starting to be more of a punchlist. I would dearly love any kind of help, design input, additions, or feedback on the outstanding work. If you’re inclined to get involved and tackle some of the issues, I tried to label the issues for the effort level.

If you think of the issues as picking a sword fight, the tags line up like this:

“Easy Fix” – Facing a sheepherder who probably stole that heron mark blade he’s carrying
“Medium Effort” – Fighting a Trolloc
“Architectural Level Change” – Fade. I will likely need to be involved with any of these

Fairly soon, I’ll be making a call for folks to try out a prerelease version of StructureMap 3 in their existing applications. As part of that effort, I’d really like to get some feedback about the observed performance and see if we can beat on it enough to find any memory leak issues.

If you or someone you know is a multi-threading guru, I’d probably be interested in talking through some things with you in the codebase.

Docs? Someday? Maybe?

Hopefully someday soon. The FubuMVC core team will be relaunching a completely new website sometime in the next couple years with our own implementation of a readthedocs style infrastructure. I’m planning on making the new StructureMap documentation part of that website. Documentation will be in git where it’ll be easy to take in pull requests for additions and corrections, and you’ll be able to use either Html or Markdown for the content. We’ve already got a working mechanism to “slurp” code samples live out of a source code tree and put them into the web pages with formatting via pretty print to achieve “living” documentation this time around.

Last Thoughts

I haven’t paid attention to any of the “IoC Container Performance Shootout!” type blog posts in a long time, but StructureMap used to routinely come in well ahead of the other full-featured IoC containers (tools like Funq shouldn’t be considered apples to apples with StructureMap/Windsor/Ninject/Autofac/whatever. If you don’t support auto-wiring, rich lifecycle support, and maybe even interception, I say you don’t count as full-featured) in terms of performance. However, as I’ve torn into the StructureMap codebase with an eye towards better performance for the first time in years, I’ve found a scary amount of performance killing cruft code. My final thought is that as bad as the StructureMap code was (and trust me, it was), if it’s really faster than the other IoC containers, then what does that say about their code internals at that time? 😉

Big Proposed Changes for StructureMap 3

Just trying to round up more feedback as I go, here’s a handful of discussions I’ve started on the big proposed changes for StructureMap 3:

Please feel free to chime in here, twitter, or the list on any of these topics or any other thing you want for StructureMap 3.

Thanks,

Jeremy

Let’s try this again, StructureMap 3.0 is en route as of now

I’ll be honest, I haven’t worked much on StructureMap since I shelved my original 3.0/rewrite work in the summer of 2010, and yes, the documentation is almost worthless. Now that FubuMVC has reached that magic 1.0 mark I’m turning my attention back to StructureMap for a bit, but I want some feedback about what I’m thinking right now.

For background, read:

Kicking off StructureMap 3.0 — I just re-read this, and I’m still thinking all of the same things here and all the feedback is still valid.
Proposed StructureMap 2.7 Release

A month ago my plan was to do a small 2.7 release on the existing codebase to remove all the [Obsolete] API calls and grab some pull requests along the way. Having done that, I would then turn my attention back to the 3.0 codebase where I planned to essentially rewrite the core of StructureMap and retrofit the existing API on top of the new, cleaner core. A week or so into the work for the 2.7 release and I’ve changed my mind. First off, by the rules of semantic versioning, I should bump the major version to 3.0.0 when I make the breaking API changes. Secondly, I’m coming around to the idea of restructuring the existing code in place instead of a full rewrite.

To reiterate the major points, the 3.0 release means:

All [Obsolete] API calls are going away
Removing the strong naming — if you absolutely *have* to have this, maybe we can make separate nuget packages. I suggest we name that “structuremap.masochistic.”
Move to .Net 4.0. I don’t think it’s time to go to 4.5 yet and I don’t really want to mess with that anyway
Taking a dependency on FubuCore — if that causes pushback we’ll ilmerge it
Streamlining the Xml support
Rewrite the “Profile” feature completely
Make nested containers not be a crime against computer science
NOT adding every random brainfart “feature” that Windsor has
Make it faster
Make the diagnostics much better
Removing some obscure, clumsy features I never use and really wish you wouldn’t either

Additionally, we have a new “living documentation” infrastructure baking for the Fubu projects. I know some work already happened to transfer the StructureMap docs to Jekyll, but I’d far prefer to publish on the new fubu world website whenever that happens.

For right now, the 3.0 branch is in the original StructureMap repository at https://github.com/structuremap/structuremap/tree/three.

Proposed StructureMap 2.7 Release

So StructureMap 3 hasn’t really gotten going again, but I still have intentions of doing so this year. In the meantime, I’ve got a batch of pull requests stacked up in the StructureMap 2.6 codebase and it’s time for a new intermediate release. At this time, what I’d like to do is rev up to StructureMap 2.7 and do this:

Take in all the outstanding pull requests
Remove all [Obsolete] API members
Mark as [Obsolete] various parts of the registration API that I know I will not support in 3.0 (conditional construction comes to mind)
Mark all StructureMap attributes except for [DefaultConstructor] as [Obsolete] as I think we will dump all the circa 2003 attributes that you used to need to use.
Remove the strong naming because it’s death in combination with Nuget. If this is an issue for you, I will happily take a pull request to make a separate nuget package for a signed version of StructureMap
Ideally, I’d like to clean up the more coarse grained unit tests in a new namespace called “Acceptance” in order to get these ready for usage in StructureMap 3 and maybe provide a level of living documentation for later.
MAYBE: take a look at cleaning up the exception stack traces to give you more contextual information about where StructureMap caught an exception. We lost a lot of contextual information when I eliminated the Reflection.Emit usage in favor of compiling Expressions.

Thoughts?

Clean Database per Automated Test Run? Yes, please.

TL;DR We’re able to utilize RavenDb’s support for embedded databases, some IoC trickery, and our FubuMVC.RavenDb library to make automated testing far simpler by quickly spinning up a brand new database for each individual automated test to have complete control over the state of our system. Oh, and taking ASP.Net and relational databases out of the equation makes automated functional testing far easier too.

“Known inputs and expected outcomes” is the mantra of successful automated testing. This is generally pretty simple with unit tests and more granular integration tests, but sooner or later you’re going to want to exercise your application stack with a persistent database. You cannot sustain your sanity, much less be successful, while doing automated testing if you cannot easily put your system in a known state before you try to exercise the system. Stateful elements of your application architecture include things like queues, the file system, and in memory caches, but for this post I’m only concerned with controlling the state of the application database.

On my last several projects we’ve used some sort of common test setup action to roll back our database to a near empty state before a test adds the exact data to the database that it needs as part of the test execution (the “arrange” part of arrange, act, and assert). You can read more about the ugly stuff I’ve tried in the past at the bottom of this post, but I think we’ve finally arrived at a solution for this problem that is succeeding.

Our Solution

First, we’re using RavenDb as a schema-less document database. We also use StructureMap to compose the services in our system, and RavenDb’s IDocumentStore is built and scoped as a singleton. In functional testing scenarios, we run our entire application (FubuMVC website hosted with an embedded web server, RavenDb, our backend service) in the same AppDomain as our testing harness, so it’s very simple for us to directly alter the state of the application. Before each test, we:

Eject and dispose any preexisting instance of IDocumentStore from our main StructureMap container
Replace the default registration of IDocumentStore with a new, completely empty instance of RavenDb’s EmbeddedDocumentStore
Write a little bit of initial state into the new database (a couple pre-canned logins and tenants).
Continue to the rest of the test that will generally start by adding test specific data using our normal repository classes helpfully composed by StructureMap to use the new embedded database

I’m very happy with this solution for a couple different reasons. First, it’s lightning fast compared with other mechanics I’ve used and describe at the bottom of this post. Secondly, using a schema-less database means that we don’t have much maintenance work to do to keep this database cleansing mechanism up to date with new additions to our persistent domain model and event store — and I think this is a significant source of friction when testing against relational databases.

Show me some code!

I won’t get into too much detail, but we use StoryTeller2 as our test harness for functional testing. The “arrange” part of any of our functional tests gets expressed like this example, taken from one of our tests for our multi-tenancy support:

-----------------------------------------------
|If the system state is                       |
|The users are                                |
|Username |Password |Clients                  |
|User1    |Password1|ClientA, ClientB, ClientC|
|User2    |Password2|ClientA, ClientB         |
-----------------------------------------------

In the test expressed above, the only state in the system is exactly what I put into the “arrange” section of the test itself. The “If the system state is” DSL is implemented by a Fixture class that runs this little bit of code in its setup:

        public override void SetUp(ITestContext context)
        {
            // There’s a bit more than this going on here, but the service below
            // is part of our FubuPersistence library as a testing hook to
            // wipe the slate clean in a running application
            _reset = Retrieve<ICompleteReset>();
            _reset.ResetState();
        }

As long as my team is using our “If the system state is” fixture to setup the testing state, the application database will be set back to a known state before every single test run — making the automated tests far more reliable than other mechanisms I’ve used in the past.

The ICompleteReset interface originates from the FubuPersistence project that was designed in no small part to make it simpler to completely wipe out the state of your running application. The ResetState() method looks like this:

        public void ResetState()
        {
            // Shutdown any type of background process in the application 
            // that is stateful or polling before resetting the database
            _serviceResets.Each(x => {
                trace("Stopping services with {0}", x.GetType().Name);
                x.Stop();
            });
            // The call to replace the database
            trace("Clearing persisted state");
            _persistence.ClearPersistedState();
            // Load any basic state that has to exist for all tests.  
            // I’m thinking that this is nothing but a couple default 
            // login credentials and maybe some static lookup list
            // data
            trace("Loading initial data");
            _initialState.Load();
            // Restart any and all background processes to run against the newly
            // created database
            _serviceResets.Each(x => {
                trace("Starting services with {0}", x.GetType().Name);
                x.Start();
            });
        }

The method _persistence.ClearPersistedState() called above to rollback all persistence is implemented by our RavenDbPersistedState class. That method does this:

        public void ClearPersistedState()
        {
            // _container is the main StructureMap IoC container for the
            // running application.  The line below will
            // eject any existing IDocumentStore from the container
            // and dispose it
            _container.Model.For<IDocumentStore>().Default.EjectObject();
            // RavenDbSettings is another class from FubuPersistence
            // that just controls the very initial creation of a
            // RavenDb IDocumentStore object.  In this case, we’re
            // overriding the normal project configuration from
            // the App.config with instructions to use an
            // EmbeddedDocumentStore running completely
            // in memory. 
            _container.Inject(new RavenDbSettings
            {
                RunInMemory = true
            });
        }

The code above doesn’t necessarily create a new database, but we’ve set ourselves up to use a brand new embedded, in memory database whenever something does request a running database from the StructureMap container. I’m not going to show this code for the sake of brevity, but I think it’s important to note that the RavenDb database construction will use your normal mechanisms for bootstrapping and configuring an IDocumentStore including all the hundred RavenDb switches and pre-canned indices.
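
The “eject now, build later” behavior described above can be sketched without RavenDb or StructureMap at all. Everything in this example is a hypothetical stand-in: StoreSettings plays the role of RavenDbSettings, FakeStore plays the role of an IDocumentStore, and SingletonStoreSlot imitates a singleton-scoped registration. It is only meant to show the shape of the idea, not how the container implements it.

```csharp
using System;

// Hypothetical stand-in for RavenDbSettings.
public sealed class StoreSettings
{
    public bool RunInMemory { get; set; }
}

// Hypothetical stand-in for an IDocumentStore implementation.
public sealed class FakeStore : IDisposable
{
    public FakeStore(StoreSettings settings) { InMemory = settings.RunInMemory; }
    public bool InMemory { get; }
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

// Holds a single store that is only built the first time it is requested.
public sealed class SingletonStoreSlot
{
    private Lazy<FakeStore> _store;

    public SingletonStoreSlot(StoreSettings settings)
    {
        _store = NewLazy(settings);
    }

    public FakeStore Store => _store.Value;
    public bool IsBuilt => _store.IsValueCreated;

    // Dispose the current store (if one was ever built) and swap in new
    // settings. The replacement is not constructed until Store is read again.
    public void EjectAndReplace(StoreSettings settings)
    {
        if (_store.IsValueCreated) _store.Value.Dispose();
        _store = NewLazy(settings);
    }

    private static Lazy<FakeStore> NewLazy(StoreSettings settings)
    {
        return new Lazy<FakeStore>(() => new FakeStore(settings));
    }
}
```

After EjectAndReplace(new StoreSettings { RunInMemory = true }), the old store has been disposed, IsBuilt is false, and the next read of Store constructs a fresh in-memory instance. The cost of creating the database is deferred to the first consumer that actually needs it.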

All the code shown here is from the FubuPersistence repository on GitHub.

Conclusion

I’m generally happy with this solution. So far, it’s quick in execution and we haven’t required much maintenance as we’ve progressed other than more default data. Hopefully, this solution will be applicable and reusable in future projects out of the box. I would happily recommend a similar approach to other teams.

But, but, but…

If you did read this carefully, I think you’ll find some things to take exception to:

I’m assuming that you really are able to test functionality with bare minimum data sets to keep the setup work to a minimum and the performance at an acceptable level. This technique isn’t going to be useful for anything involving performance or load testing — but are you really all that concerned about functionality testing when you do that type of testing?
We’re not running our application in its deployed configuration when we collapse everything down to the same AppDomain. Why I think this is a good idea, the benefits, and how we do it are a topic for another blog post. Promise.
RavenDb is schema-less and that turns out to make a huge difference in how long it takes to spin up a new database from scratch compared to relational databases. Yes, there may be some pre-canned indices that need to get built up when you spin up the new embedded database, but with an empty database I don’t see that as a show stopper.

Other, less successful ways of controlling state I’ve used in the past

Over the years I’ve done automated testing against persisted databases with varying degrees of frustration. The worst possible thing you can do is to have everybody testing against a shared relational database in the development and testing environments. You either expect the database to be in a certain state at the start of the test, or you run a stored procedure to set up the tables you want to test against. I can’t even begin to tell you how unreliable this turns out to be when more than one person is running tests at the same time and fouling up the test runs. Unfortunately, many shops still try to do this and it’s a significant hurdle to clear when doing automated testing. Yes, you can try to play tricks with transactions to isolate the test data or try to use randomized data, but I’m not a believer in either approach.

Having an isolated relational database per developer, preferably on their own development box, is a marked improvement, but it adds a great deal of overhead to your project automation. Realistically, you need a developer to be able to build out the latest database on the fly from the latest source on their own box. That’s not a deal breaker with modern database migration tools, but it’s still a significant amount of work for your team. The bigger problem to me is how you tear down the existing state in a relational database to put it into a known state before running an automated test. You’ve got a couple choices:

Destroy the schema completely and rebuild it from scratch. Don’t laugh, I’ve seen people do this and the tests were as painfully slow as you can probably imagine. I suppose you could also script the database to rollback to a checkpoint or reattach a backed up copy of the database, but again, I’m never going to recommend that if you have other options.
Execute a set of commands that wipes most if not all of the data in a database before each test. I’ve done this before, and while it definitely helped create a known state in the system, this strategy performed very poorly and it took quite a bit of work to maintain the “clean database” script as the project progressed. As a project grows, the runtime of your automated test runs becomes very important to keep the feedback loop useful. Slow tests hamper the usefulness of automated testing.
Selectively clean out and write data to only the tables affected by a test. This is probably much faster, but I think it requires more one-off state setup code inside the tests themselves.

* As an aside, I really suggest keeping the project database data definition language scripts and/or migrations in the same source control system as the code so that it’s very easy to trace the version of the code running against which version of the database schema. The harsh reality in my experience is that the same software engineering rigor we generally use for our application code (source control, unit testing, TDD, continuous integration) is very often missing in regards to the relational database DDL and environment. If you’re a database guy talking to me at a conference, you better have your stuff together on this front before you dare tell me that “my developers can’t be trusted to access my database.”