Alba v5.0 is out! Easy recipes for integration testing ASP.Net web services

Alba is a small library to facilitate easier integration testing of web services built on the ASP.Net Core platform. It’s more or less a wrapper around the ASP.Net Core TestServer, but does a lot to make testing much easier than writing low-level code against TestServer directly. You would use Alba from within xUnit.Net or NUnit testing projects.

I outlined a proposal for Alba V5 a couple weeks back, and today I was able to publish the new Nuget. The big changes are:

  • The Alba documentation website was rebuilt with Vitepress and refreshed to reflect V5 changes
  • The old SystemUnderTest type was renamed AlbaHost to make the terminology more in line with ASP.Net Core
  • The Json serialization is done through the configured input and output formatters within your ASP.Net Core application — meaning that Alba works just fine regardless of whether your application uses Newtonsoft.Json or System.Text.Json for its Json serialization.
  • There is a new extension model that was added specifically to support testing web services that are authenticated with JWT bearer tokens.

For security, you can opt into:

  1. Stub out all of the authentication and use hard-coded claims
  2. Add a valid JWT to every request with user-defined claims
  3. Let Alba handle live JWT integration with your real OIDC identity server

Testing web services secured by JWT tokens with Alba v5

We’re working toward standing up a new OIDC infrastructure built around Identity Server 5, with a couple gigantic legacy monolith applications and potentially dozens of newer microservices needing to use this new identity server for authentication. We’ll have to have a good story for running our big web applications that will have this new identity server dependency at development time, but for right now I just want to focus on an automated testing strategy for our newer ASP.Net Core web services using the Alba library.

First off, Alba is a helper library for integration testing HTTP API endpoints in .Net Core systems. Alba wraps the ASP.Net Core TestServer while providing quite a few convenient helpers for setting up and verifying HTTP calls against your ASP.Net Core services. We will shortly be introducing Alba into my organization at MedeAnalytics as a way of doing much more integration testing at the API layer (think the middle layer of any kind of testing pyramid concept).

In my previous post I laid out some plans and proposals for a quickly forthcoming Alba v5 release, with the biggest improvement being a new model for being able to stub out OIDC authentication for APIs that are secured by JWT bearer tokens (I think I win programming bingo for that sentence!).

Before I show code, I should say that all of this code is in the v5 branch of Alba on GitHub, but not yet released as it’s very heavily in flight.

To start, I’m assuming that you have a web service project and a separate testing project for that web service. In your web application, bearer token authentication is set up something like this inside your Startup.ConfigureServices() method:

services.AddAuthentication("Bearer")
    .AddJwtBearer("Bearer", options =>
    {
        // A real application would pull all this information from configuration
        // of course, but I'm hardcoding it in testing
        options.Audience = "jwtsample";
        options.ClaimsIssuer = "myapp";
        
        // don't worry about this, our JwtSecurityStub is gonna switch it off in
        // tests
        options.Authority = "https://localhost:5001";

        options.TokenValidationParameters = new TokenValidationParameters
        {
            ValidateAudience = false,
            IssuerSigningKey = new SymmetricSecurityKey(Encoding.UTF8.GetBytes("some really big key that should work"))
        };
    });

And of course, you also have these lines of code in your Startup.Configure() method to add in ASP.Net Core middleware for authentication and authorization:

app.UseAuthentication();
app.UseAuthorization();

With these lines of setup code, you will not be able to hit any secured HTTP endpoint in your web service unless there is a valid JWT in the Authorization header of the incoming HTTP request. Moreover, with this configuration your service would need to make calls to the configured bearer token authority (https://localhost:5001 above). It’s going to be awkward and probably very brittle to depend on having the identity server spun up and running locally when our developers try to run API tests. It would obviously be helpful if there was a quick way to stub out the bearer token authentication in testing to automatically supply known claims so our developers can focus on developing their individual service’s functionality.
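For context on what a “valid JWT” even means here: a token is just three base64url-encoded segments (header, payload, signature), signed with the configured key. This little sketch uses nothing but the base .Net class libraries to hand-build a token matching the configuration above (assuming the HS256 algorithm that goes with a symmetric key; the "foo" claim value is just for illustration):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// base64url encoding: base64 without padding, with URL-safe characters
static string Base64Url(byte[] bytes) =>
    Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');

// Header and payload mirror the JwtBearerOptions from Startup above
var header  = Base64Url(Encoding.UTF8.GetBytes("{\"alg\":\"HS256\",\"typ\":\"JWT\"}"));
var payload = Base64Url(Encoding.UTF8.GetBytes("{\"iss\":\"myapp\",\"aud\":\"jwtsample\",\"foo\":\"bar\"}"));

// Sign with the same symmetric key the application validates against
using var hmac = new HMACSHA256(Encoding.UTF8.GetBytes("some really big key that should work"));
var signature = Base64Url(hmac.ComputeHash(Encoding.UTF8.GetBytes($"{header}.{payload}")));

var jwt = $"{header}.{payload}.{signature}";
Console.WriteLine(jwt); // would go in an "Authorization: Bearer ..." request header
```

A real token would also carry expiration (exp) and not-before (nbf) claims; this is only meant to show what a stub has to produce on your behalf for every request.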

That’s where Alba v5 comes in with its new JwtSecurityStub extension that will:

  1. Disable any validation interactions with an external OIDC authority
  2. Automatically add a valid JWT token to any request being sent through Alba
  3. Give developers fine-grained control over the claims attached to any specific request if there is logic that will vary by claim values

To demonstrate this new Alba functionality, let’s assume that you have a testing project that has a direct reference to the web service project. The direct project reference is important because you’ll want to spin up the “system under test” in a test fixture like this:

    public class web_api_authentication : IDisposable
    {
        private readonly IAlbaHost theHost;

        public web_api_authentication()
        {
            // This is calling your real web service's configuration
            var hostBuilder = Program.CreateHostBuilder(new string[0]);

            // This is a new Alba v5 extension that can "stub" out
            // JWT token authentication
            var jwtSecurityStub = new JwtSecurityStub()
                .With("foo", "bar")
                .With(JwtRegisteredClaimNames.Email, "guy@company.com");

            // AlbaHost was "SystemUnderTest" in previous versions of
            // Alba
            theHost = new AlbaHost(hostBuilder, jwtSecurityStub);
        }

I was using xUnit.Net in this sample, but Alba is agnostic about the actual testing library and we’ll use both NUnit and xUnit.Net at work.

In the code above I’ve bootstrapped the web service with Alba and attached the JwtSecurityStub. I’ve also established some baseline claims that will be added to every JWT token on all Alba scenario requests. The AlbaHost extends the IHost interface you’re already used to in .Net Core, but adds the important Scenario() method that you can use to run HTTP requests all the way through your entire application stack like this test:

        [Fact]
        public async Task post_to_a_secured_endpoint_with_jwt_from_extension()
        {
            // Building the input body
            var input = new Numbers
            {
                Values = new[] {2, 3, 4}
            };

            var response = await theHost.Scenario(x =>
            {
                // Alba deals with Json serialization for us
                x.Post.Json(input).ToUrl("/math");
                
                // Enforce that the HTTP Status Code is 200 Ok
                x.StatusCodeShouldBeOk();
            });

            var output = response.ResponseBody.ReadAsJson<Result>();
            output.Sum.ShouldBe(9);
            output.Product.ShouldBe(24);
        }

You’ll notice that I did absolutely nothing in regards to JWT set up or claims or anything. That’s because the JwtSecurityStub is taking care of everything for you. It’s:

  1. Reaching into your application’s bootstrapping to pluck out the right signing key so that it builds JWT token strings that can be validated with the right signature
  2. Turning off any token validation against an external OIDC authority
  3. Placing a unique, unexpired JWT token on each request that matches the issuer and authority configuration of your application

Now, to further control the claims used on any individual scenario request, you can use this new method in Scenario tests:

        [Fact]
        public async Task can_modify_claims_per_scenario()
        {
            var input = new Numbers
            {
                Values = new[] {2, 3, 4}
            };

            var response = await theHost.Scenario(x =>
            {
                // This is a custom claim that would only be used for the 
                // JWT token in this individual test
                x.WithClaim(new Claim("color", "green"));
                x.Post.Json(input).ToUrl("/math");
                x.StatusCodeShouldBeOk();
            });

            var principal = response.Context.User;
            principal.ShouldNotBeNull();
            
            principal.Claims.Single(x => x.Type == "color")
                .Value.ShouldBe("green");
        }

I’ve got plenty more ground to cover in how we’ll develop locally with our new identity server strategy, but I’m feeling pretty good about having a decent API testing strategy. All of this code is just barely written, so any feedback you might have would be very timely. Thanks for reading about some brand new code!

Proposal for Alba v5: JWT secured APIs, more JSON options, and other stuff

Just typed up a master GitHub issue for Alba v5, and I’ll take any possible feedback anywhere, but I’d prefer you make requests on GitHub. Here’s the GitHub issue I just banged out, copied and pasted for anybody interested:

I’m dipping into Alba over the next couple work days to add some improvements that we want at MedeAnalytics to test HTTP APIs that are going to be secured with JWT Bearer Token authentication that’s ultimately backed up by IdentityServer5. I’ve done some proof of concepts off to the side on how to deal with that in Alba tests with or without the actual identity server up and running, so I know it’s possible.

However, when I started looking into how to build a more formal “JWT Token” add on to Alba and catching up with GitHub issues on Alba, it’s led me to the conclusion that it’s time for a little overhaul of Alba as part of the JWT work that’s turning into a proposed Alba v5 release.

Here’s what I’m thinking:

  • Ditch all support < .Net 5.0 just because that makes things easier on me maintaining Alba. Selfishness FTW.
  • New Alba extension model, more on this below
  • Revamp the Alba bootstrapping so that it’s just an extension method hanging off the end of IHostBuilder that will create a new ITestingHost interface for the existing SystemUnderTest class. More on this below.
  • Decouple Alba’s JSON facilities from Newtonsoft.Json. A couple folks have asked if you can use System.Text.Json with Alba so far, so this time around we’ll try to dig into the application itself and find the JSON formatter from the system and make all the JSON serialization “just” use that instead of recreating anything
  • Enable easy testing of gRPC endpoints like we do for classic JSON services

DevOps changes

  • Replace the Rake build script with Bullseye
  • Move the documentation website from stdocs to VitePress with a similar set up as the new Marten website
  • Retire the existing AppVeyor CI (that works just fine), and move to GitHub Actions for CI, but add a new action for publishing Nugets

Bootstrapping Changes

Today you use code like this to bootstrap Alba against your ASP.Net Core application:

var system = new SystemUnderTest(WebApi.Program.CreateHostBuilder(new string[0]));

where Alba’s SystemUnderTest intercepts an IHostBuilder, adds some additional configuration, swaps out Kestrel for the ASP.Net Core TestServer, and starts the underlying IHost.

In Alba v5, I’d like to change this a little bit to being an extension method hanging off of IHostBuilder so you’d have something more like this:

var host = WebApi.Program.CreateHostBuilder(new string[0]).StartAlbaHost();

// or

var host = await WebApi.Program.CreateHostBuilder(new string[0]).StartAlbaHostAsync();

The host above would be this new interface (still the old SystemUnderTest under the covers):

public interface IAlbaHost : IHost
{
    Task<ScenarioResult> Scenario(Action<Scenario> configure);

    // various methods to register actions to run before or after any scenario is executed
}

Extension Model

Right now the only extension I have in mind is one for JWT tokens, but it probably goes into a separate Nuget, so here you go.

public interface ITestingHostExtension : IDisposable, IAsyncDisposable
{
    // Make any necessary registrations to the actual application
    IHostBuilder Configure(IHostBuilder builder);

    // Spin it up if necessary before any scenarios are run, also gives
    // an extension a chance to add any BeforeEach() / AfterEach() kind
    // of hooks to the Scenario or HttpContext
    ValueTask Start(ITestingHost host);
}

My thought is that ITestingHostExtension objects would be passed into the StartAlbaHost(params ITestingHostExtension[] extensions) method from above.

JWT Bearer Token Extension

This will probably need to be in a separate Alba.JwtTokens Nuget library.

  • Extension method on Scenario to add a bearer token to the Authorization header. Small potatoes.
  • Helper of some sort to fetch JWT tokens from an external identity server and use that token on further scenario runs. I’ve got this prototyped already
  • A new Alba extension that acts as a stubbed identity server. More below:

For the stubbed in identity server, what I’m thinking is that a registered IdentityServerStub extension would:

  • Spin up a small fake OIDC web application hosted in memory w/ Kestrel that responds to the basic endpoints to validate JWT tokens.
  • With this extension, Alba will quietly stick a JWT token into the Authorization header of each request so you can test API endpoints that are secured by bearer tokens
  • Reaches into the system under test’s IHostBuilder and sets the authority URL to the local URL of the stubbed OIDC server (I’ve already successfully prototyped this one)
  • Also reaches into the system under test to find the JwtBearerOptions as configured and uses the signing keys that are already configured for the application for the JWT generation. That should make the setup of this quite a bit simpler
  • The new JWT stub extension can be configured with baseline claims that would be part of any generated JWT
  • A new extension method on Scenario to add or override claims in the JWT tokens on a per-scenario basis to support fine-grained authorization testing in your API tests
  • An extension method on Scenario to disable the JWT header altogether so you can test out authentication failures
  • Probably need a facility of some sort to override the JWT expiration time as well

Testing effectively — with or without mocks or stubs

My team at MedeAnalytics is working with our development teams on a long term initiative to improve the effectiveness of our developer testing practices and to encourage more Test Driven Development in daily development. As part of that effort, it’s time for my organization to talk about how — and when — we use test doubles like mock objects or stubs in our daily work.

I think most developers have probably heard the terms mock or stub. Mocks and stubs are examples of “test doubles”: any kind of testing object or function that is substituted for a production dependency while testing, or in early development before the real dependency is available. At the moment, I’m concerned that we may be overusing mock objects, and especially dynamic mocking tools, when other types of test doubles could be easier to use, or when we should be switching to integration testing against the real dependencies.

Let’s first talk about the different kinds of tests that developers write, both as part of a Test Driven Development workflow and tests that you may add later to act as regression tests. When I was learning about developer testing, we talked about a pretty strict taxonomy of test types using the old Michael Feathers definition of what constituted a unit test. Unit tests typically meant that we tested one class at a time with all of its dependencies replaced with some kind of test double so we could isolate the functionality of just that one class. We’d also write some integration tests that ran part or all of the application stack, but that wasn’t emphasized very much at the time.

Truth be told, many of the mock-heavy unit tests I wrote back then didn’t provide a lot of value compared to the effort I put into authoring them, and I learned the hard way in longer lived codebases like StructureMap that I was better off in many cases breaking the “one class” rule of unit tests and writing far coarser-grained tests focused on usage scenarios, because the fine-grained unit tests actually made it much harder to evolve the internal structure of the code. In my later efforts, I switched to mostly testing through the public APIs all the way down the stack and got much better results.

Flash forward to today, and we talk a lot about the testing pyramid concept where code really needs to be covered by different types of tests for maximum effective test coverage – lots of small unit tests, a medium amount of a nebulously defined middle ground of integration tests, and a handful of full blown, end to end black box tests. The way that I personally over-simplify this concept is to say:

Test with the finest grained mechanism that tells you something important

Jeremy Miller (me!)

For any given scenario, I want developers to consciously choose the proper mix of testing techniques that really tells you the code fulfills its requirements correctly. At the end of the day, the code passing all of its integration tests should go a long way toward making us confident that the code is ready to ship to production.

I also want developers to be aware that unit tests that become tightly coupled to the implementation details can make it quite difficult to refactor that code later. In one ongoing project, one of my team members is doing a lot of work to optimize an expensive process. We’ve already talked about the need to do the majority of his testing from outside-in with integration tests so he will have more freedom to iterate on completely different internal mechanisms while he pursues performance optimization.

I think I would recommend that we all maybe stop thinking so much about unit vs integration tests and think more about tests being on a continuous spectrum between “sociable” vs “solitary” tests that Martin Fowler discusses in On the Diverse And Fantastical Shapes of Testing.

To sum up this section, I think there are two basic questions I want our developers to constantly ask themselves in regards to using any kind of test double like a mock object or a stub:

  1. Should we be using an integration or “sociable” test instead of a “solitary” unit test that uses test doubles?
  2. When writing a “solitary” unit test with a test double, which type of test double is easiest in this particular test?

In the past, we strongly preferred writing “solitary” tests with or without mock objects or other fakes because those tests were reliable and ran fast. That’s still a valid consideration, but I think these days it’s much easier to author more “sociable” tests that might even be using infrastructure like databases or the file system than it was when I originally learned TDD and developer testing. Especially if a team is able to use something like Docker containers to quickly spin up local development environments, I would very strongly recommend writing tests that work through the data layer or call HTTP endpoints in place of pure, Feathers-compliant unit tests.

Black box or UI tests through tools like Selenium are just a different ball game altogether and I’m going to just say that’s out of the scope of this post and get on with things.

As for when mock objects or stubs or any other kind of test double are or are not appropriate, I’m going to stand pat on what I’ve written in the past:

After all of that, I’d finally like to talk about all the different kinds of test doubles, but first I’d like to make a short digression into…

The Mechanics of Automated Tests

I think that most developers writing any kind of tests today are familiar with the Arrange-Act-Assert pattern and nomenclature. To review, most tests will follow a structure of:

  1. “Arrange” any initial state and inputs to the test. One of the hallmarks of a good test is being repeatable and creating a clear relationship between known inputs and expected outcomes.
  2. “Act” by executing a function or a method on a class or taking some kind of action in code
  3. “Assert” that the expected outcome has happened

In the event sourcing support in Marten V4, we have a subsystem called the “projection daemon” that constantly updates projected views based on incoming events. In one of its simpler modes, there is a class called SoloCoordinator that is simply responsible for starting up every single configured projected view agent when the projection daemon is initialized. To test the start up code of that class, we have this small test:

    public class SoloCoordinatorTests
    {
        [Fact]
        public async Task start_starts_them_all()
        {
            // "Arrange" in this case is creating a test double object
            // as an input to the method we're going to call below
            var daemon = Substitute.For<IProjectionDaemon>();
            var coordinator = new SoloCoordinator();

            // This is the "Act" part of the test
            await coordinator.Start(daemon, CancellationToken.None);

            // This is the "Assert" part of the test
            await daemon.Received().StartAllShards();
        }
    }

and the little bit of code it’s testing:

        public Task Start(IProjectionDaemon daemon, CancellationToken token)
        {
            _daemon = daemon;
            return daemon.StartAllShards();
        }

In the code above, the real production code for the IProjectionDaemon interface is very complicated and setting up a real one would require a lot more code. To short circuit that set up in the “arrange” part of the test, I create a “test double” for that interface using the NSubstitute library, my dynamic mock/stub/spy library of choice.

In the “assert” phase of the test I needed to verify that all of the known projected views were started up, and I did that by asserting through the mock object that the IProjectionDaemon.StartAllShards() method was called. I don’t necessarily care here that that specific method was called so much as that the SoloCoordinator sent a logical message to start all the projections.

In the “assert” part of a test you can either verify some expected change of state in the system or return value (state-based testing), or use what’s known as “interaction testing” to verify that the code being tested sent the expected messages or invoked the proper actions to its dependencies.
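To make the state-based vs interaction-based distinction concrete, here’s a tiny self-contained sketch (the Counter and IAuditor types are hypothetical, not from Marten) that verifies the same behavior both ways:

```csharp
using System;
using System.Collections.Generic;

public interface IAuditor { void Record(string message); }

// A hand-rolled test double that records the interactions it receives
public class RecordingAuditor : IAuditor
{
    public List<string> Messages { get; } = new();
    public void Record(string message) => Messages.Add(message);
}

// The "code under test": it changes its own state AND talks to a dependency
public class Counter
{
    private readonly IAuditor _auditor;
    public Counter(IAuditor auditor) => _auditor = auditor;
    public int Count { get; private set; }

    public void Increment()
    {
        Count++;
        _auditor.Record($"incremented to {Count}");
    }
}

public static class Demo
{
    public static void Main()
    {
        var auditor = new RecordingAuditor();
        var counter = new Counter(auditor);

        counter.Increment();

        // State-based assertion: verify the observable outcome
        Console.WriteLine(counter.Count == 1);                            // True

        // Interaction-based assertion: verify the message sent to the dependency
        Console.WriteLine(auditor.Messages.Contains("incremented to 1")); // True
    }
}
```

Both assertions pass here, but they fail for different reasons: the first breaks if the arithmetic is wrong, the second if Counter stops notifying its dependency.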

See an old post of mine from 2005 called TDD Design Starter Kit – State vs. Interaction Testing for more discussion on this subject. The old code samples and formatting are laughable, but I think the discussion about the concepts are still valid.

As an aside, you might ask why I bothered writing a test for such a simple piece of code? I honestly won’t bother writing unit tests for every case like this, but a piece of advice I read from (I think) Kent Beck was to write a test for any piece of code that could possibly break. Another rule of thumb is to write a test for any vitally important code regardless of how simple it is in order to remove project risk by locking it down through tests that could fail in CI if the code is changed. And lastly, I’d argue that it was worthwhile to write that test as documentation about what the code should be doing for later developers.

Mocks or Spies

Now that we’ve established the basic elements of automated testing and reviewed the difference between state-based and interaction-based testing, let’s go review the different types of test doubles. I’m going to be using the nomenclature describing different types of test doubles from the xUnit Patterns book by Gerard Meszaros (I was one of the original technical reviewers of that book and used the stipend to buy a 30GB iPod that I had for years). You can find his table explaining the different kinds of test doubles here.

As I explain the differences between these concepts, I recommend that you focus more on the role of a test double within a test than get too worked up about how things happen to be implemented and if we are or are not using a dynamic mocking tool like NSubstitute.

The most commonly used test double term is a “mock.” Some people use this term to mean any stand-in object created dynamically by a mocking tool, but the original definition was an object that can be used to verify the calls or interactions that the code under test makes to its dependencies.

To illustrate the usage of mock objects, let’s start with a small feature in Marten to seed baseline data in the database whenever a new Marten DocumentStore is created. That feature allows users to register objects implementing this interface to the configuration of a new DocumentStore:

    public interface IInitialData
    {
        Task Populate(IDocumentStore store);
    }

Here’s a simple unit test from the Marten codebase that uses NSubstitute to do interaction testing with what I’d still call a “mock object” to stand in for IInitialData objects in the test:

        [Fact]
        public void runs_all_the_initial_data_sets_on_startup()
        {

            // These three objects are mocks or spies
            var data1 = Substitute.For<IInitialData>();
            var data2 = Substitute.For<IInitialData>();
            var data3 = Substitute.For<IInitialData>();

            // This is part of a custom integration test harness
            // we use in the Marten codebase. It's configuring and
            // and spinning up a new DocumentStore. As part of the
            // DocumentStore getting initialized, we expect it to
            // execute all the registered IInitialData objects
            StoreOptions(_ =>
            {
                _.InitialData.Add(data1);
                _.InitialData.Add(data2);
                _.InitialData.Add(data3);
            });

            theStore.ShouldNotBeNull();

            // Verifying that the expected interactions
            // with the three mocks happened as expected
            data1.Received().Populate(theStore);
            data2.Received().Populate(theStore);
            data3.Received().Populate(theStore);
        }

In the test above, the “assertion” is simply that, at some point when a new Marten DocumentStore is initialized, it will call the three registered IInitialData objects to seed data. This test would fail if the IInitialData.Populate(IDocumentStore) method was not called on any of the mock objects. Mock objects are used specifically to do assertions about the interactions between the code under test and its dependencies.

In the original xUnit Patterns book, the author also identified another slightly different test double called a “spy” that recorded the inputs to itself that could be interrogated in the “Assert” part of a test. That differentiation did make sense years ago when early mock tools like RhinoMocks or NMock worked very differently than today’s tools like NSubstitute or FakeItEasy.

I used NSubstitute in the sample above to build a mock dynamically, but at other times I’ll roll a mock object by hand when it’s more convenient. Consider the common case of needing to verify that an important message or exception was logged (I honestly won’t always write tests for this myself, but it makes for a good example here).

Using a dynamic mocking tool (Moq in this case) to mock the ILogger<T> interface from the core .Net abstractions just to verify that an exception was logged could result in code like this:

_loggerMock.Verify(
    x => x.Log(
        LogLevel.Error,
        It.IsAny<EventId>(),
        It.IsAny<It.IsAnyType>(),
        It.IsAny<Exception>(),
        It.IsAny<Func<It.IsAnyType, Exception, string>>()),
    Times.AtLeastOnce);

In this case you have to use an advanced feature of mocking libraries called argument matchers as a kind of wildcard to match the expected call against data generated at runtime that you don’t really care about. As you can see, this code is a mess to write and to read. I wouldn’t say to never use argument matchers, but it’s a “guilty until proven innocent” kind of technique to me.

Instead, let’s write our own mock object for logging that will more easily handle the kind of assertions we need to do later (this wasn’t a contrived example, I really have used this):

    public class RecordingLogger<T> : ILogger<T>
    {
        public IDisposable BeginScope<TState>(TState state)
        {
            throw new NotImplementedException();
        }

        public bool IsEnabled(LogLevel logLevel)
        {
            return true;
        }

        public void Log<TState>(
            LogLevel logLevel, 
            EventId eventId, 
            TState state, 
            Exception exception, 
            Func<TState, Exception, string> formatter)
        {
            // Just add this object to the list of messages
            // received
            var message = new LoggedMessage
            {
                LogLevel = logLevel,
                EventId = eventId,
                State = state,
                Exception = exception,
                Message = formatter?.Invoke(state, exception)
            };

            Messages.Add(message);
        }

        public IList<LoggedMessage> Messages { get; } = new List<RecordingLogger<T>.LoggedMessage>();

        public class LoggedMessage
        {
            public LogLevel LogLevel { get; set; }

            public EventId EventId { get; set; }

            public object State { get; set; }

            public Exception Exception { get; set; }

            public string Message { get; set; }
        }

        public void AssertExceptionWasLogged()
        {
            // This uses Fluent Assertions to blow up if there
            // are no recorded errors logged
            Messages.Any(x => x.LogLevel == LogLevel.Error)
                .Should().BeTrue("No exceptions were logged");
        }
    }

With this hand-rolled mock object, the ugly code above that uses argument matchers just becomes this assertion in the tests:

// _logger is a RecordingLogger<T> that 
// was used as an input to the code under
// test
_logger.AssertExceptionWasLogged();

I’d argue that that’s much simpler to read and most certainly to write. I didn’t show it here, but you could have just interrogated the calls made to a dynamically generated mock object without using argument matchers; the syntax for that can be very ugly as well and I don’t recommend it in most cases.

To sum up, “mock objects” are used in the “Assert” portion of your test to verify that expected interactions were made with the dependencies of the code under test. You also don’t have to use a mocking tool like NSubstitute, and sometimes a hand-rolled mock class might be easier to consume and lead to easier-to-read tests.

Pre-Canned Data with Stubs

“Stubs” are just a way to replace real services with some kind of stand-in that supplies pre-canned data as test inputs. Whereas “mocks” are about interaction testing, “stubs” provide inputs for state-based testing. I won’t go into too much detail because I think this concept is pretty well understood, but here’s an example of using NSubstitute to whip up a stub in place of a full blown Marten query session in a test:

        [Fact]
        public async Task using_a_stub()
        {

            var user = new User {UserName = "jmiller",};

            // I'm stubbing out Marten's IQuerySession
            var session = Substitute.For<IQuerySession>();
            session.LoadAsync<User>(user.Id).Returns(user);

            var service = new ServiceThatLooksUpUsers(session);
            
            // carry out the Act and Assert parts of the test
        }

Again, “stub” refers to a role within the test and not how it was built. In memory database stand ins in tools like Entity Framework Core are another common example of using stubs.

Dummy Objects

A “dummy” is a test double whose only real purpose is to act as a stand-in service that does nothing, but allows your test to run without constant NullReferenceException problems. Going back to the ServiceThatLooksUpUsers in the previous section, let’s say that the service also depends on the .Net ILogger<T> abstraction for tracing within the service. We may not care about the log messages happening in some of our tests, but ServiceThatLooksUpUsers will blow up if it doesn’t have a logger, so we’ll use the built-in NullLogger<T> from .Net’s logging framework as a “dummy” like so:

        [Fact]
        public async Task using_a_dummy()
        {

            var user = new User {UserName = "jmiller",};

            // I'm stubbing out Marten's IQuerySession
            var session = Substitute.For<IQuerySession>();
            session.LoadAsync<User>(user.Id).Returns(user);

            var service = new ServiceThatLooksUpUsers(
                session,
                
                // Using a dummy logger
                new NullLogger<ServiceThatLooksUpUsers>());

            // carry out the Act and Assert parts of the test
        }

Summing it all up

I tried to cover a lot of ground, and to be honest, this was meant to be the first cut at a new “developer testing best practices” guide at work so it meanders a bit.

There’s a few things I would hope folks would get out of this post:

  • Mocks or stubs can sometimes be very helpful in writing tests, but can also cause plenty of heartburn in other cases.
  • Don’t hesitate to skip mock-heavy unit tests when some sort of integration test might be easier to write or do more to ascertain that the code actually works
  • Use the quickest feedback cycle you can get away with when trying to decide what kind of testing to do for any given scenario — and sometimes a judicious usage of a stub or mock object helps write tests that run faster and are easier to set up than integration tests with the real dependencies
  • Don’t get tunnel vision on using mocking libraries and forget that hand-rolled mocks or stubs can sometimes be easier to use within some tests

Dynamic Code Generation in Marten V4

Marten V4 extensively uses runtime code generation backed by Roslyn runtime compilation for dynamic code. This approach is much more powerful than source generators in what it allows us to actually do, but it can have significant memory usage and “cold start” problems (this seems to depend on the exact configuration, so it’s not a given that you’ll hit these issues). In this post I’ll show the facility we have to “generate ahead” the code to dodge the memory and cold start issues at production time.

Before V4, Marten had used the common model of building up Expression trees and compiling them into lambda functions at runtime to “bake” some of our dynamic behavior into fast running code. That does work and a good percentage of the development tools you use every day probably use that technique internally, but I felt that we’d already outgrown the dynamic Expression generation approach as it was, and the new functionality requested in V4 was going to substantially raise the complexity of what we were going to need to do.

Instead, I (Marten is a team effort, but I get all the blame for this one) opted to use the dynamic code generation and compilation approach using LamarCodeGeneration and LamarCompiler that I’d originally built for other projects. This allowed Marten to generate much more complex code than I thought was practical with other models (we could have used IL generation too of course, but that’s an exercise in masochism). If you’re interested, I gave a talk about these tools and the approach at NDC London 2019.

I do think this has worked out in terms of performance improvements at runtime and certainly helped to be able to introduce the new document and event store metadata features in V4, but there’s a catch. Actually two:

  1. The Roslyn compiler sucks down a lot of memory sometimes and doesn’t seem to ever release it. It’s gotten better with newer releases and it’s not consistent, but still.
  2. There’s a sometimes significant lag in cold start scenarios on the first time Marten needs to generate and compile code at runtime

What we could do, though, is provide what I call…

Marten’s “Generate Ahead” Strategy

To side step the problems with the Roslyn compilation, I developed a model (I did this originally in Jasper) to generate the code ahead of time and have it compiled into the entry assembly for the system. The last step is to direct Marten to use the pre-compiled types instead of generating the types at runtime.

Jumping straight into a sample console project to show off this functionality, I’m configuring Marten with the AddMarten() method you can see in this code on GitHub.

The important line of code you need to focus on here is this flag:

opts.GeneratedCodeMode = TypeLoadMode.LoadFromPreBuiltAssembly;

This flag in the Marten configuration directs Marten to first look in the entry assembly of the application for any types that it would normally try to generate at runtime, and if such a type exists, load it from the entry assembly and bypass any invocation of Roslyn. In a real application I think you’d wrap that call in an environment check something like this, so that it only applies when the application is running in production mode:

if (Environment.IsProduction())
{

    options.GeneratedCodeMode = 
        TypeLoadMode.LoadFromPreBuiltAssembly;
}

The next thing to notice is that I have to tell Marten ahead of time what the possible document types are, and even register any compiled query types, so that Marten will “know” what code to generate in the next section. The compiled query registration is new, but you already had to let Marten know about the document types to make the schema migration functionality work anyway.
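
As a sketch of what that up-front registration might look like inside the AddMarten() configuration — RegisterDocumentType() is Marten's existing registration API, while User, Project, and connectionString are placeholders for this example:

```csharp
services.AddMarten(opts =>
{
    // connectionString is a placeholder for your real
    // Postgresql connection string
    opts.Connection(connectionString);

    opts.GeneratedCodeMode = TypeLoadMode.LoadFromPreBuiltAssembly;

    // Tell Marten up front about every document type so that the
    // codegen commands know exactly what storage code to generate.
    // User and Project are hypothetical document types here.
    opts.RegisterDocumentType<User>();
    opts.RegisterDocumentType<Project>();

    // Compiled query types have to be registered ahead of time as
    // well -- that registration is new in V4, so check the Marten
    // documentation for the exact method name
});
```

Without these registrations, Marten would only discover document types lazily at runtime, which defeats the purpose of generating the code ahead of time.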

Generating and exporting the code ahead of time is done from the command line through an Oakton command. First though, add the LamarCodeGeneration.Commands Nuget to your entry project, which will also add a transitive reference to Oakton if you’re not already using it. This is all described in the Oakton getting started page, but you’ll need to change your Program.Main() method slightly to activate Oakton:

        // The return value needs to be Task<int> 
        // to communicate command success or failure
        public static Task<int> Main(string[] args)
        {
            return CreateHostBuilder(args)
                
                // This makes Oakton be your CLI
                // runner
                .RunOaktonCommands(args);
        }

If you’ll open up the command terminal of your preference at the root directory of the entry project, type this command to see the available Oakton commands:

dotnet run -- help

That’ll spit out a list of commands and the assemblies where Oakton looked for command types. You should see output similar to this:

Searching 'LamarCodeGeneration.Commands, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' for commands
Searching 'Marten.CommandLine, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' for commands

  -------------------------------------------------
    Available commands:
  -------------------------------------------------
        check-env -> Execute all environment checks against the application
          codegen -> Utilities for working with LamarCodeGeneration and LamarCompiler

and assuming you got that far, now type dotnet run -- help codegen to see the list of options with the codegen command.

If you just want to preview the generated code in the console, type:

dotnet run -- codegen preview

To just verify that the dynamic code can be successfully generated and compiled, use:

dotnet run -- codegen test

To actually export the generated code, use:

dotnet run -- codegen write

That command will write a single C# file at `/Internal/Generated/DocumentStorage.cs` for any document storage types and another at `/Internal/Generated/Events.cs` for the event storage and projections.

Just as a short cut to clear out any old code, you can use:

dotnet run -- codegen delete

If you’re curious, the generated code — and remember that it’s generated code so it’s going to be butt ugly — is going to look like this for the document storage, and this code for the event storage and projections.

The way that I see this being used is something like this:

  • The LoadFromPreBuiltAssembly option is only turned on in Production mode, and stays disabled at development time so that developers can iterate at will.
  • As part of the CI/CD process for a project, you’ll run the dotnet run -- codegen write command as an initial step, then proceed to the normal compilation and testing cycles. That will bake the generated code right into the compiled assemblies and enable you to also take advantage of any kind of AOT compiler optimizations
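
A CI/CD script following that recipe might look roughly like this (a sketch; the project path and build configuration names are assumptions for your own pipeline):

```shell
# Bake the generated code into the entry project first.
# "src/MyApp" is a placeholder for your entry project path.
dotnet run --project src/MyApp -- codegen write

# Then the normal build and test cycle compiles the exported
# code under /Internal/Generated into the entry assembly
dotnet build -c Release
dotnet test -c Release
```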

Duh. Why not source generators doofus?

Why didn’t we use source generators you may ask? The Roslyn-based approach in Marten is both much better and much worse than source generators. Source generators are part of the compilation process itself and wouldn’t have any kind of cold start problem like Marten has with the runtime Roslyn compilation approach. That part is obviously much better, plus there’s no weird two step compilation process at CI/CD time. But on the downside, source generators can only use information that’s available at compilation time, and the Marten code generation relies very heavily on type reflection, conventions applied at runtime, and configuration built up through a fluent interface API. I do not believe that we could use source generators for what we’ve done in Marten because of that dependency on runtime information.

Marten, the Generic Host Builder in .Net Core, and why this could be the golden age for OSS in .Net

Before I dive into some newly improved mechanics for quickly integrating Marten into just about any possible kind of .Net 5+ application, I want to take a trip through yesteryear and appreciate just how much better the .Net platform is today than it’s ever been. If you wanna skip down to the positive part of the post, just look for the sub heading “The better future for .Net is already here!”

I’ve been a .Net developer since the very ugly beginning when we were all just glad to be off VB6/COM and onto a “modern” programming language that felt like a better Java but not yet completely horrified at how disastrously bad WebForms was going to turn out to be or disappointed at how poorly all our new .Net stuff worked out when we tried to adopt Agile development practices like Test Driven Development.

I’ve also been very deeply invested in developing open source tools and frameworks on the .Net framework, starting with StructureMap, the very first production capable Inversion of Control library on .Net that first went into a production system in the summer of 2004.

As the maintainer of StructureMap throughout, I had a front row seat to the absolute explosion of adapter libraries for seemingly every possible permutation of framework, IoC tool, and logging library. And this happened because every framework of any note in the .Net world had a completely different way to abstract away the actual IoC container or allowed you to use different logging tools through the specific framework’s special adapter mechanisms.

I’m also old enough to remember when every tool took a direct dependency on log4net around the time they lost their public strong naming key and broke compatibility. That was loads of fun.

There also wasn’t a standard way to deal with application lifecycle events across different types of .Net applications or application frameworks (ASP.Net’s Global.ASAX being an exception I guess). Every framework at that time had a different way of being bootstrapped at application start up time, so developers had to do a lot of plumbing code as they pieced all this junk together with the myriad adapter libraries.

To recap the “old”, pre-Core .Net world:

  • Application framework developers had to create their own IoC and logging abstractions and support a plethora of adapter libraries as well as possibly having to create all new application lifecycle management facilities as part of their framework
  • Library developers had to support users through their usage of problematic framework integrations where the library in question was often used in non-optimal ways (my personal pet peeve, and I edited this post before publishing to eliminate some harsh criticism of certain well known .Net application frameworks).
  • Developers of actual .Net applications were rightfully intimidated by the complexity necessary to integrate open source tools, or simply struggled to do so as there was no standardization of how all those tools went about their integration into .Net applications.

I would argue that the situation in .Net as it was before .Net Core seriously impeded the development and adoption of open source tooling in our community — which also had the effect of slowing down the rate of innovation throughout the .Net community.

You might notice that I used the past tense quite a bit in the previous paragraphs, and that’s because I think…

The better future for .Net is already here!

It’s hard to overstate this, but the generic host builder (IHostBuilder and IHost) in .Net Core / .Net 5 and the related standard abstractions are hugely advantageous for the .Net ecosystem. I tried to explain why I think this in a recent episode of .Net Rocks, but it’s probably easier to demonstrate it by integrating Marten into a new .Net 5 web api project.

Assuming that you’re already familiar with the usage of the Startup class for configuring ASP.Net Core applications, I’m going to add Marten to the new service with the following code:

public void ConfigureServices(IServiceCollection services)
{
    // This is the absolute, simplest way to integrate Marten into your
    // .Net Core application with Marten's default configuration
    services.AddMarten(options =>
    {
        // Establish the connection string to your Marten database
        options.Connection(Configuration.GetConnectionString("Marten"));

        // If we're running in development mode, let Marten just take care
        // of all necessary schema building and patching behind the scenes
        if (Environment.IsDevelopment())
        {
            options.AutoCreateSchemaObjects = AutoCreate.All;
        }

        if (Environment.IsProduction())
        {
            // This is a new V4 feature I'll blog about soon...
            options.GeneratedCodeMode = TypeLoadMode.LoadFromPreBuiltAssembly;
        }

        // Turn on the async projection daemon in only one node
        options.Projections.AsyncMode = DaemonMode.HotCold;

        // Configure other elements of the event store or document types...
    });
}

Even if you’re a developer who is completely unfamiliar with Marten itself, the integration through AddMarten() follows the recent .Net idiom and would be conceptually similar to many other tools you probably already use. That by itself is a very good thing, as there’s less novelty for new Marten developers (there are of course matching UseMarten() signatures hanging off of IHostBuilder that are useful for other kinds of .Net projects).

I purposely used a few bells and whistles in Marten to demonstrate more capabilities of the generic host in .Net Core. To go into more detail, the code above:

  • Registers all the necessary Marten services with the proper service lifecycles in the underlying system container. And that works for any IoC container that supports the standard DI abstractions in the Microsoft.Extensions.DependencyInjection namespace. There isn’t any kind of Marten.Lamar or Marten.DryIoC or Marten.WhateverIoCToolYouHappenToLike adapter libraries.
  • I’ve also turned on Marten’s asynchronous projection process which will run as a long running task in the application by utilizing the standard .Net Core IHostedService mechanism whose lifecycle will be completely managed by the generic host in .Net.
  • We were able to do a little bit of environment specific configuration to let Marten do all necessary schema management automatically at development time, but not allow Marten to change database schemas in production. I also opted into a new Marten V4 optimization for production cold starts when the application is running in “Production” mode. That’s easy to do now using .Net’s standard IHostEnvironment mechanism at bootstrapping time.
  • It’s not obvious from that code, but starting in V4.0, Marten will happily log to the standard ILogger interface from .Net Core. Marten itself doesn’t care where or how you actually capture log information, but because .Net itself now specifies a single set of logging abstractions to rule them all, Marten can happily log to any commonly used logging mechanism (NLog, Serilog, the console, Azure logging, whatever) without any other kind of adapter.
  • And oh yeah, I had to add the connection string to a Postgresql database from .Net Core’s IConfiguration interface, which itself could be pulling information any number of ways, but our code doesn’t have to care about anything else but that one standard interface.
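
For reference, the connection string pulled by Configuration.GetConnectionString("Marten") would typically live in appsettings.json, something like this (a config sketch; the host and credential values are placeholders):

```json
{
  "ConnectionStrings": {
    "Marten": "Host=localhost;Port=5432;Database=marten;Username=postgres;Password=postgres"
  }
}
```

At runtime the same value could just as easily come from environment variables, user secrets, or a cloud configuration provider without the Marten integration code changing at all.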

As part of the team behind Marten, I feel like our jobs are easier now than they were just prior to .Net Core and the generic host builder, because now there are:

  • .Net standard abstractions for logging, configuration, and IoC service registration
  • The IHostedService mechanism is an easy, standard way in .Net to run background processes or even just to hook into application start up and tear down events with custom code
  • The IServiceCollection.Add[Tool Name]() and IHostBuilder.Use[Tool Name]() idioms provided us a way to integrate Marten into basically any kind of .Net Core / .Net 5/6 application through a mechanism that is likely to feel familiar to experienced .Net developers — which hopefully reduces the barrier to adoption for Marten

As some of you reading this may know, I was very heavily involved in building an alternative web development framework for .Net called FubuMVC well before .Net Core came on to the scene. When we built FubuMVC, we had to create from scratch a lot of the exact same functionality that’s in the generic host today to manage application lifecycle events, bootstrapping the application’s IoC container through its own IoC abstractions, and hooks for extension with other .Net libraries. I wouldn’t say that was the primary reason that FubuMVC failed as a project, but it certainly didn’t help.

Flash forward to today, and I’ve worked off and on on a complete replacement for FubuMVC on .Net Core called Jasper for asynchronous messaging and an alternative for HTTP services. That project isn’t going anywhere at the moment, but Jasper’s internals are much, much smaller than FubuMVC’s were for similar functionality because Jasper can rely on the functionality and standard abstractions of the generic host in .Net Core. That makes tools like Jasper much easier for .Net developers to write.

Unsurprisingly, there are plenty of design and implementation decisions that Microsoft made with the generic host builder that I don’t agree with. I can point to a few things that I believe FubuMVC did much better for composition a decade ago than ASP.Net Core does today (and vice versa too of course). I deeply resent how the ASP.Net Core team wrote the IoC compliance rules that were rammed down the throats of all us IoC library maintainers. You can also point out that there are cultural problems within the .Net community in regards to OSS and that Microsoft itself severely retards open source innovation in .Net by stomping on community projects at will.

All of that may be true, but from a strictly technical perspective, I think that it is now much easier for the .Net community to build and adopt innovative open source tools than it ever has been. I largely credit the work that’s been done with the generic host, the common idioms for tool integration, and the standard interfaces that came in the .Net Core wave for potentially opening up a new golden age for .Net OSS development.

Brief Update on Marten V4

The latest, greatest Marten V4 release candidate is up on Nuget this morning. Marten V4 has turned out to be a massive undertaking for an OSS project with a small team, but we’re not that far off from what’s turned into our “Duke Nukem Forever” release. We’re effectively “code complete” other than occasional bugs coming in from early users — which is fortunately a good thing as it’s helped us make some things more robust without impacting the larger community.

At this point it’s mostly a matter of completing the documentation updates for V4, and I’m almost refusing to work on new features until that’s complete. The Marten documentation website is moving to VitePress going forward, so it’ll be re-skinned with better search capabilities. We’re also taking this opportunity to reorganize the documentation with the hopes of making that site more useful for users.

Honestly, the biggest thing holding us back right now in my opinion is that I’ve been a little burned out and slow working on the documentation. Right now, I’m hoping we can pull the trigger on V4 in the next month or two.

Joining MedeAnalytics

I joined MedeAnalytics last week in a leadership role within their architecture team. Besides helping tackle strategic technical goals like messaging, test automation, and cloud architecture, I’m trying to play the Martin Fowler-style “bumble bee” role across development teams on topics like tools, techniques, and the technologies we use.

I’m hopeful that being in a strategic role where I’m constantly neck deep in technical challenges and learning new things will reboot my blogging a bit, but we’ll see. Right off the bat, I’ve been reading up on the C4 Model for describing software architecture and the Structurizr toolset as a way of reverse engineering our existing systems for my own edification. At least expect a positive review of both soon.

I don’t know how this will change my OSS work at the moment, other than I’ll be answering Lamar issues raised by my new MedeAnalytics colleagues a lot faster than I probably did for them before 🙂

Using Alba to Test ASP.Net Services

TL;DR: Alba is a handy library for integration tests against ASP.Net web services, and the rapid feedback it enables is very useful

One of our projects at Calavista right now is helping a client modernize and optimize a large .Net application, with the end goal being everything running on .Net 5 and an order of magnitude improvement in system throughput. As part of the effort to upgrade the web services, I took on a task to replace this system’s usage of IdentityServer3 with IdentityServer4, but still use the existing Marten-backed data storage for user membership information.

Great, but there’s just one problem. I’ve never used IdentityServer4 before and it changed somewhat between the IdentityServer3 code I was trying to reverse engineer and its current model. I ended up getting through that work just fine. A key element of doing that was using the Alba library to create a test harness so I could iterate through configuration changes quickly by rerunning tests on the new IdentityServer4 project. It didn’t start out this way, but Alba is essentially a wrapper around the ASP.Net TestServer and just acts as a utility to make it easier to write automated tests around the HTTP services in your web service projects.

I ended up starting two new .Net projects:

  1. A new web service that hosts IdentityServer4 and is configured to use user membership information from our client’s existing Marten/Postgresql database
  2. A new xUnit.Net project to hold integration tests against the new IdentityServer4 web service.

Let’s dive right into how I set up Alba and xUnit.Net as an automated test harness for our new IdentityServer4 service. If you start a new ASP.Net project with one of the built in project templates, you’ll get a Program file that’s the main entry point for the application and a Startup class that has most of the system’s bootstrapping configuration. The templates will generate this method that’s used to configure the IHostBuilder for the application:

public static IHostBuilder CreateHostBuilder(string[] args)
{
    return Host.CreateDefaultBuilder(args)
        .ConfigureWebHostDefaults(webBuilder =>
        {
            webBuilder.UseStartup<Startup>();
        });
}

For more information on what role of the IHostBuilder is within your application, see .NET Generic Host in ASP.NET Core.

That’s important, because that gives us the ability to stand up the application exactly as it’s configured in an automated test harness. Switching to the new xUnit.Net test project, I referenced the new web service project that will host IdentityServer4. Because spinning up your ASP.Net system can be relatively expensive, I only want to do that once and share the IHost between tests. That’s a perfect usage for xUnit.Net’s shared context support.

First I make what will be the shared test fixture context class for the integration tests shown below:

public class AppFixture : IDisposable
{
    public AppFixture()
    {
        // This is calling into the actual application's
        // configuration to stand up the application in 
        // memory
        var builder = Program.CreateHostBuilder(new string[0]);

        // This is the Alba "SystemUnderTest" wrapper
        System = new SystemUnderTest(builder);
    }
    
    public SystemUnderTest System { get; }

    public void Dispose()
    {
        System?.Dispose();
    }
}

The Alba SystemUnderTest wrapper is responsible for building the actual IHost object for your system, and does so using the in memory TestServer in place of Kestrel.

Just as a convenience, I like to create a base class for integration tests I tend to call IntegrationContext:

    public abstract class IntegrationContext 
        // This is some xUnit.Net mechanics
        : IClassFixture<AppFixture>
    {
        public IntegrationContext(AppFixture fixture)
        {
            System = fixture.System;
            
            // Just casting the IServiceProvider of the underlying
            // IHost to a Lamar IContainer as a convenience
            Container = (IContainer)fixture.System.Services;
        }
        
        public IContainer Container { get; }
        
        // This gives the tests access to run
        // Alba "Scenarios"
        public ISystemUnderTest System { get; }
    }

We’re using Lamar as the underlying IoC container in this application, and I wanted to use Lamar-specific IoC diagnostics in the tests, so I expose the main Lamar container off of the base class as just a convenience.

To finally turn to the tests, the very first thing to try with IdentityServer4 was to hit the descriptive discovery endpoint, just to see if the application was bootstrapping correctly and IdentityServer4 was functional *at all*. I started a new test class with this declaration:

    public class EndToEndTests : IntegrationContext
    {
        private readonly ITestOutputHelper _output;
        private IDocumentStore theStore;

        public EndToEndTests(AppFixture fixture, ITestOutputHelper output) : base(fixture)
        {
            _output = output;

            // I'm grabbing the Marten document store for the app
            // to set up user information later
            theStore = Container.GetInstance<IDocumentStore>();
        }

And then a new test just to exercise the discovery endpoint:

[Fact]
public async Task can_get_the_discovery_document()
{
    var result = await System.Scenario(x =>
    {
        x.Get.Url("/.well-known/openid-configuration");
        x.StatusCodeShouldBeOk();
    });
    
    // Just checking out what's returned
    _output.WriteLine(result.ResponseBody.ReadAsText());
}

The test above is pretty crude. All it does is try to hit the `/.well-known/openid-configuration` URL in the application and see that it returns a 200 OK HTTP status code.

I tend to run tests while I’m coding by using keyboard shortcuts. Most IDEs support some kind of “re-run the last test” keyboard shortcut. Using that, my preferred workflow is to run the test once, then assuming that the test is failing the first time, work in a tight cycle of making changes and constantly re-running the test(s). This turned out to be invaluable as it took me a couple iterations of code changes to correctly re-create the old IdentityServer3 configuration into the new IdentityServer4 configuration.

Moving on to doing a simple authentication, I wrote a test like this one to exercise the system with known credentials:

[Fact]
public async Task get_a_token()
{
    // All I'm doing here is wiping out the existing database state
    // with Marten and using a little helper method to add a new
    // user to the database as part of the test setup
    // This would obviously be different in your system
    theStore.Advanced.Clean.DeleteAllDocuments();
    await theStore
        .CreateUser("aquaman@justiceleague.com", "dolphin", "System Administrator");
    
    
    var body =
        "client_id=something&client_secret=something&grant_type=password&scope=ourscope%20profile%20openid&username=aquaman@justiceleague.com&password=dolphin";

    // The test would fail here if the status
    // code was NOT 200
    var result = await System.Scenario(x =>
    {
        x.Post
            .Text(body)
            .ContentType("application/x-www-form-urlencoded")
            .ToUrl("/connect/token");
    });

    // As a convenience, I made a new class called ConnectData
    // just to deserialize the IdentityServer4 response into 
    // a .Net object
    var token = result.ResponseBody.ReadAsJson<ConnectData>();
    token.access_token.Should().NotBeNull();
    token.token_type.Should().Be("Bearer");
}

public class ConnectData
{
    public string access_token { get; set; }
    public int expires_in { get; set; }
    public string token_type { get; set; }
    public string scope { get; set; }
    
}

Now, this test took me several iterations to work through until I found exactly the right way to configure IdentityServer4 and adjusted our custom Marten backing identity store (IResourceOwnerPasswordValidator and IProfileService in IdentityServer4 world) until the tests pass. I found it extremely valuable to be able to debug right into the failing tests as I worked, and even needed to take advantage of JetBrains Rider’s capability to debug through external code to understand how IdentityServer4 itself worked. I’m very sure that I was able to get through this work much faster by iterating through tests as opposed to just trying to run the application and driving it through something like Postman or through the connected user interface.

Improvements to Event Sourcing in Marten V4

Marten V4 is still in flight, but everything I’m showing in this post is in the latest alpha (4.0.0-alpha.6) released to Nuget.

TL;DR: Marten V4 has some significant improvements to its event sourcing support that will help developers deal with potential concurrency issues. The related event projection support in Marten V4 is also significantly more robust than previous versions.

A Sample Project Management Event Store

Imagine, if you will, a simplistic application that uses Marten's event sourcing functionality to do project management tracking with a conceptual CQRS architecture. The domain includes tracking each project task within the greater project. In this case, I'm choosing to model the activity of a single project and its related tasks as a stream of events like:

  1. ProjectStarted
  2. TaskRecorded
  3. TaskStarted
  4. TaskFinished
  5. ProjectCompleted

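For reference, those five events can be modeled as plain C# classes. The property shapes below are inferred from the handlers later in this post, so treat them as an assumption rather than the canonical definitions:

```csharp
using System;

// Plain CLR event types for the project stream.
// Property shapes are inferred from the command handlers below.
public class ProjectStarted
{
    public string Name { get; set; }
    public DateTimeOffset Timestamp { get; set; }
}

public class TaskRecorded
{
    public string Title { get; set; }
    public DateTimeOffset Timestamp { get; set; }
}

public class TaskStarted
{
    public int Number { get; set; }
    public DateTimeOffset Timestamp { get; set; }
}

public class TaskFinished
{
    public int Number { get; set; }
    public DateTimeOffset Timestamp { get; set; }
}

public class ProjectCompleted
{
    public DateTimeOffset Timestamp { get; set; }
}
```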
Starting a new Event Stream

Using event sourcing as our persistence strategy, the real system state is the raw events that model state changes or transitions of our project. As an example, let’s say that our system records and initializes a “stream” of events for a new project with this command handler:

    public class NewProjectCommand
    {
        public string Name { get; set; }
        public string[] Tasks { get; set; }
        public Guid ProjectId { get; set; } = Guid.NewGuid();
    }

    public class NewProjectHandler
    {
        private readonly IDocumentSession _session;

        public NewProjectHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(NewProjectCommand command)
        {
            var timestamp = DateTimeOffset.Now;
            var started = new ProjectStarted  
            {
                Name = command.Name, 
                Timestamp = timestamp
            };

            var tasks = command.Tasks
                .Select(name => new TaskRecorded {
                    Timestamp = timestamp, 
                    Title = name
                });

            // This tells Marten that it should create a new project
            // stream
            _session.Events.StartStream(command.ProjectId, started);

            // This quietly appends events to the event stream
            // created in the line of code above
            _session.Events.Append(command.ProjectId, tasks);

            // The actual persistence to Postgresql happens here
            return _session.SaveChangesAsync();
        }
    }

The call to StartStream() above tells Marten to create a new event stream with the supplied project id. As a long-requested improvement for V4, StartStream() now guarantees that the transaction cannot succeed if another project event stream with that project id already exists. This helps the system prevent users from accidentally duplicating projects, at least by project id anyway.

Assuming that there is no existing stream with the project id, Marten will create a new record to track the new project stream and individual records in Postgresql to persist the raw event data along with metadata like the stream version, time stamps, and the event type.
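To make the guarantee concrete, here is a minimal in-memory sketch of the behavior (a stand-in for illustration only; Marten enforces this with a database transaction, not in-process state):

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in illustrating the StartStream() guarantee:
// starting a stream with an id that already exists must fail.
public class InMemoryEventStore
{
    private readonly Dictionary<Guid, List<object>> _streams
        = new Dictionary<Guid, List<object>>();

    public void StartStream(Guid streamId, params object[] events)
    {
        if (_streams.ContainsKey(streamId))
            // Marten surfaces this as a failed transaction at SaveChangesAsync() time
            throw new InvalidOperationException($"Stream {streamId} already exists");

        _streams[streamId] = new List<object>(events);
    }

    public void Append(Guid streamId, params object[] events)
        => _streams[streamId].AddRange(events);

    public IReadOnlyList<object> Fetch(Guid streamId) => _streams[streamId];
}
```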

But we need a “read side” projection of the Projects!

So great, we can persist the raw events, and there are plenty of valuable things we can do later with those events. Sooner or later, though, our application is going to need to know the current state of an ongoing project for user screens or validation logic, so we need some way to compile the events for a single project into some kind of "here's the current state of the Project" data structure — and that's where Marten's support for projections comes into play.

To model a “projected” view of a project from its raw events, I’m creating a single Project class to model the full state of a single, ongoing project. To reduce the number of moving parts, I’m going to make Project be “self-aggregating” such that it’s responsible for mutating itself based on incoming events:

    public class Project
    {
        private readonly List<ProjectTask> _tasks = new List<ProjectTask>();

        public Project(ProjectStarted started)
        {
            Version = 1;
            Name = started.Name;
            StartedTime = started.Timestamp;
        }

        // This gets set by Marten from the Guid stream identity
        public Guid Id { get; set; }

        public long Version { get; set; }
        public DateTimeOffset StartedTime { get; private set; }
        public DateTimeOffset? CompletedTime { get; private set; }

        public string Name { get; private set; }

        public ProjectTask[] Tasks
        {
            get
            {
                return _tasks.ToArray();
            }
            set
            {
                _tasks.Clear();
                _tasks.AddRange(value);
            }
        }

        public void Apply(TaskRecorded recorded, IEvent e)
        {
            Version = e.Version;
            var task = new ProjectTask
            {
                Title = recorded.Title,
                // Max() throws on an empty list, so guard the very first task
                Number = _tasks.Count == 0 ? 1 : _tasks.Max(x => x.Number) + 1,
                Recorded = recorded.Timestamp
            };

            _tasks.Add(task);
        }

        public void Apply(TaskStarted started, IEvent e)
        {
            Version = e.Version;
            var task = _tasks.FirstOrDefault(x => x.Number == started.Number);

            // Remember this isn't production code:)
            if (task != null) task.Started = started.Timestamp;
        }

        public void Apply(TaskFinished finished, IEvent e)
        {
            Version = e.Version;
            var task = _tasks.FirstOrDefault(x => x.Number == finished.Number);

            // Remember this isn't production code:)
            if (task != null) task.Finished = finished.Timestamp;
        }

        public void Apply(ProjectCompleted completed, IEvent e)
        {
            Version = e.Version;
            CompletedTime = completed.Timestamp;
        }
    }

I didn't choose to use it here, but I want to point out that you can use immutable aggregates with Marten V4. In that case you'd simply have the Apply() methods return a new Project object (as far as Marten is concerned, anyway). Functional programming enthusiasts will cheer the built-in support for immutability, some of you will point out that immutability leads to less efficient code through extra object allocations and more ceremony, and I will happily say that Marten V4 lets you make that decision to suit your own needs and preferences;-)
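As a minimal sketch of the immutable style, using C# records and `with` expressions on a deliberately simplified aggregate shape of my own (not the Project class above):

```csharp
using System;

// A simplified, immutable take on the aggregate: each Apply()
// returns a new instance instead of mutating in place.
public record ProjectCompleted(DateTimeOffset Timestamp);

public record ImmutableProject(string Name, long Version, DateTimeOffset? CompletedTime)
{
    public ImmutableProject Apply(ProjectCompleted completed, long eventVersion)
        => this with { Version = eventVersion, CompletedTime = completed.Timestamp };
}
```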

Also see Event Sourcing with Marten V4: Aggregated Projections for more information on other alternatives for expressing aggregated projections in Marten.

Working in our project management domain, I'd like the Project aggregate document to be updated every time events are captured for a project. I'll add the Project document as a self-aggregating, inline projection in my Marten system like this:

    public class Startup
    {
        public void ConfigureServices(IServiceCollection services)
        {
            services.AddMarten(opts =>
            {
                opts.Connection("some connection");

                // Direct Marten to update the Project aggregate
                // inline as new events are captured
                opts.Events
                    .Projections
                    .SelfAggregate<Project>(ProjectionLifecycle.Inline);

            });
        }
    }

As an example of how inline projections come into play, let's look at another sample command handler that adds a new task with a given title to an ongoing project:

    public class CreateTaskCommand
    {
        public Guid ProjectId { get; set; }
        public string Title { get; set; }
    }

    public class CreateTaskHandler
    {
        private readonly IDocumentSession _session;

        public CreateTaskHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CreateTaskCommand command)
        {
            var recorded = new TaskRecorded
            {
                Timestamp = DateTimeOffset.UtcNow, 
                Title = command.Title
            };

            _session.Events.Append(command.ProjectId, recorded);
            return _session.SaveChangesAsync();
        }
    }

When SaveChangesAsync() is called above, Marten will issue a single database commit that captures a new TaskRecorded event. In that same commit Marten will fetch the persisted Project document for that project event stream, apply the changes from the new event, and persist the Project document that reflects the information in the events.
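Conceptually, that "apply the changes from the new event" step is a left fold over the stream's events. This stand-alone sketch, with deliberately trimmed-down event and view types, shows the shape of that replay:

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down event and view types just for this sketch
public class ProjectStarted { public string Name { get; set; } }
public class TaskRecorded { public string Title { get; set; } }

public class ProjectView
{
    public string Name { get; set; }
    public List<string> Tasks { get; } = new List<string>();
    public long Version { get; set; }
}

public static class Replay
{
    // Fold each event into the current state of the view, in
    // stream order, bumping the version per event
    public static ProjectView Fold(IEnumerable<object> events)
    {
        var view = new ProjectView();
        foreach (var e in events)
        {
            view.Version++;
            switch (e)
            {
                case ProjectStarted s: view.Name = s.Name; break;
                case TaskRecorded t: view.Tasks.Add(t.Title); break;
            }
        }
        return view;
    }
}
```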

Let's say that we have a user interface in our project management system that allows users to review and edit projects and tasks. Independent of the event streams, the "query" side of our application can happily retrieve the latest data for Project documents through Marten's built-in document database functionality, as in this simple MVC controller endpoint:

    public class ProjectController: ControllerBase
    {
        private readonly IQuerySession _session;

        public ProjectController(IQuerySession session)
        {
            _session = session;
        }

        [HttpGet("/api/project/{projectId}")]
        public Task<Project> Get(Guid projectId)
        {
            return _session.LoadAsync<Project>(projectId);
        }
    }

For a future blog post, I’ll show you a new Marten V4 feature for ultra efficient “read side” web services by streaming JSON data directly from Postgresql straight down the HTTP response body without ever wasting any time with JSON serialization.

Concurrency Problems Everywhere!!!

Don’t name a software project “Genesis”. Hubristic project names lead to massive project failures.

The “inline” projection strategy is easy to use in Marten, but it’s vulnerable to oddball concurrency problems if you’re not careful in your system design. Let’s say that we have two coworkers named Jack and Jill using our project management system. Jack pulls up the data for the “Genesis Project” in the UI client, then gets pulled into a hallway conversation (remember when we used to have those in the before times?). While he’s distracted, Jill makes some changes to the “Genesis Project” to record some tasks she and Jack had talked about earlier and saves the changes to the server. Jack finally gets back to his machine and makes basically the same edits to “Genesis” that Jill already did and saves the data. If we build our project management system naively, we’ve now allowed Jack and Jill to duplicate work and the “Genesis Project” task management is all messed up.

One way to prevent that concurrent change issue is to detect that the project has been changed by Jill when Jack tries to persist his duplicate changes and give him the chance to update his screen with the latest data before pushing new project task changes.

Going back to the Project aggregate document: as Marten appends events to a stream, it assigns an incrementing numeric version number to each event within that stream. Let's say that in our project management system we always want to know the latest version of the project. Marten V4 finally allows you to use event metadata like the event version number within inline projections (a long-running request from some of our users). You can see that in action in the method below that updates Project based on a TaskStarted event. Notice that the method also takes in the IEvent object that gives us access to the event metadata:

public void Apply(TaskStarted started, IEvent e)
{
    // Update the Project document based on the event version
    Version = e.Version;
    var task = _tasks.FirstOrDefault(x => x.Number == started.Number);

    // Remember this isn't production code:)
    if (task != null) task.Started = started.Timestamp;
}

Now, having the stream version number on the Project document turns out to be very helpful for concurrency issues. Since we're worried about a Project event stream getting out of sync when the system receives concurrent updates, you can pass the current version (conveniently kept up to date on the Project read-side document) down to the user interface, and have the user interface send back what it thinks is the current version of the project when it submits updates. If the underlying project event stream has changed since the user interface fetched the original data, we can have Marten reject the additional events. This is a form of offline optimistic concurrency check: we assume that everything is hunky dory, and let the infrastructure reject the changes as an exceptional case.
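Stripped of the database mechanics, the version check itself amounts to something like this sketch (not Marten's actual implementation):

```csharp
using System;

public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

public static class OptimisticCheck
{
    // Reject the append if the stream has moved past the version
    // the client believed it was editing
    public static void AssertVersion(long actualVersion, long expectedVersion)
    {
        if (actualVersion != expectedVersion)
            throw new ConcurrencyException(
                $"Expected stream version {expectedVersion} but found {actualVersion}");
    }
}
```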

To put that into motion, let’s say that our project management user interface posts this command up to the server to close a single project task:

    public class CompleteTaskCommand
    {
        public Guid ProjectId { get; set; }
        public int TaskNumber { get; set; }
        
        // This is the version of the project data
        // that was being edited in the user interface
        public long ExpectedVersion { get; set; }
    }

In the command handler, we can direct Marten to reject the new events being appended if the stream has been changed in between the user interface having fetched the project data and submitting its updates to the server:

    public class CompleteTaskHandler
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            _session.Events.Append(
                command.ProjectId,

                // Using this overload will make Marten do
                // an optimistic concurrency check against
                // the existing version of the project event
                // stream as it commits
                command.ExpectedVersion,

                @event);
            return _session.SaveChangesAsync();
        }
    }

In the CompleteTaskHandler above, the call to SaveChangesAsync() will throw an EventStreamUnexpectedMaxEventIdException if the project event stream has advanced past the version assumed by the originator of the command in command.ExpectedVersion. To make our system more resilient, we would need to catch that Marten exception and carry out the proper application workflow.

What I showed above is pretty draconian in terms of which edits it allows through. In other cases you may get by with a simpler workflow that just guarantees that a single Project aggregate is only being updated by one process at a time. Marten V4 comes through with a couple of new ways to append events with more control over concurrent updates to a single event stream.

Let’s stick to optimistic checks for now and look at the new AppendOptimistic() method for appending events to an existing event stream with a rewritten version of our CompleteTaskCommand handling:

    public class CompleteTaskHandler2
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler2(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            // If some other process magically zips
            // in and updates this project event stream
            // between the call to AppendOptimistic()
            // and SaveChangesAsync(), Marten will detect
            // that and reject the transaction
            _session.Events.AppendOptimistic(
                command.ProjectId,
                @event);

            return _session.SaveChangesAsync();
        }
    }

We would still need to catch the Marten exceptions and handle them somehow. I would typically handle that with a messaging or command execution framework like MassTransit or my own Jasper project, both of which come with robust exception handling and retry capabilities.
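If you'd rather not pull in a full framework, a small retry helper communicates the idea. ConcurrencyConflictException below is a hypothetical stand-in for whatever exception type your infrastructure throws on a conflict:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the infrastructure's concurrency exception
public class ConcurrencyConflictException : Exception { }

public static class Retry
{
    // Re-run the command a few times when a concurrency conflict
    // is detected, then let the exception escape
    public static async Task WithRetries(Func<Task> action, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await action();
                return;
            }
            catch (ConcurrencyConflictException) when (attempt < maxAttempts)
            {
                // Conflict detected: loop around and try the command again.
                // A real handler would re-fetch the latest state first.
            }
        }
    }
}
```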

At this point everything I’ve shown for concurrency control involves optimistic locks that result in exceptions being thrown and transactions being rejected. For another alternative, Marten V4 leverages Postgresql’s robust row locking support with this version of our CompleteTaskCommand handler that uses the new V4 AppendExclusive() method:

    public class CompleteTaskHandler3
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler3(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            // This tries to acquire an exclusive
            // lock on the stream identified by
            // command.ProjectId in the database
            // so that only one process at a time
            // can update this event stream
            _session.Events.AppendExclusive(
                command.ProjectId,
                @event);
            return _session.SaveChangesAsync();
        }
    }

This is heavier in terms of its impact on the database, but simpler to implement as there is no flow control by exceptions to worry about.
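In spirit, AppendExclusive() serializes writers per stream. This in-memory sketch with one SemaphoreSlim per stream id shows the effect; the real implementation relies on Postgresql row locking rather than in-process locks:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// In-process illustration of per-stream exclusive writes.
// Marten achieves this with Postgresql row locks instead.
public class ExclusiveStreamWriter
{
    private readonly ConcurrentDictionary<Guid, SemaphoreSlim> _locks =
        new ConcurrentDictionary<Guid, SemaphoreSlim>();

    public async Task AppendExclusive(Guid streamId, Func<Task> append)
    {
        var gate = _locks.GetOrAdd(streamId, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            // Only one writer at a time gets past this point per stream
            await append();
        }
        finally
        {
            gate.Release();
        }
    }
}
```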

I should also note that both the AppendOptimistic() and AppendExclusive() methods will verify that the stream already exists and throw an exception if the stream does not.