Marten V6 is Out! And the road to Wolverine 1.0

Marten 6.0 came out last week. Rather than describe that, just take a look at Oskar’s killer release notes write up on GitHub for V6. This also includes some updates to the Marten documentation website. Oskar led the charge on this release, so big thanks are due to him — in no small part by allowing me to focus on Wolverine by taking the brunt of the “Critter Stack” Discord rooms. The healthiness of the Marten community shows up with a slew of new contributors in this release.

With Marten 6.0 out, it’s on to finally getting to Wolverine 1.0:

Wolverine has lingered for way, way too long for my taste in a pre-1.0 status, but it’s getting closer. A couple weeks ago I felt like Wolverine 1.0 was very close as soon as the documentation was updated, but then I kept hearing repeated feedback about how early adopters want or need first class database multi-tenancy support as part of their Wolverine + Marten experience. A lesser number want some sort of EF Core + Wolverine multi-tenancy, but I’m going to put that aside just for now.

Cool, so I started jotting down what first class support for multi-tenancy through multiple databases was going to entail:

  • Some way to communicate the message tenant information through to Wolverine with message metadata. Easy money, that didn’t take much.
  • A little bit of change to the Marten transactional middleware in Wolverine to be tenant aware. Cool, that’s pretty small. Especially after a last minute change I made in Marten 6.0 specifically to support Wolverine.
  • Uh, oh, the durable inbox/outbox support in Wolverine will require specific table storage in every single tenant database, and you’d probably also want an “any tenant” master database as well for transactions that aren’t for a specific tenant. Right off the bat, this is much more complex than the other bullet points above. Wolverine could try to stretch its current “durability agent” strategy for multiple databases, but it’s a little too greedy on database connection usage and I was getting some feedback from potential users who were concerned by exactly that issue. At that point, I thought it would be helpful to reduce the connection usage, which…
  • Led me to wanting an approach where only one running node was processing the inbox/outbox recovery instead of each node hammering the database with advisory locks to figure out if anything needed to be recovered from previous nodes that shut down before finishing their work. Which now led me to wanting…
  • Some kind of leadership election in Wolverine, which now means that Wolverine needs durable storage for all the active nodes and the assignments to each node — which is functionality I wanted to build out soon regardless for Marten’s “async projection” scalability.

So to get the big leadership election, durability agent assignment across nodes, and finally back to the multi-tenancy support in Wolverine, I’ve got a bit of work to get through. It’s going well so far, but it’s time consuming because of the sheer number of details and the necessity of rigorously testing piece by piece before trying to put it all together end to end.

There are a few other loose ends for Wolverine 1.0, but the work described up above is the main battle right now before Wolverine efforts shift to documentation and finally a formal 1.0 release. Famous last words of a fool, but I’m hoping to roll out Wolverine 1.0 during the NDC Oslo conference in a couple weeks.

Isolating Side Effects from Wolverine Handlers

For easier unit testing, it’s often valuable to separate responsibilities of “deciding” what to do from the actual “doing.” The side effect facility in Wolverine is an example of this strategy. You will need Wolverine 0.9.17 that just dropped for this feature.

At times, you may wish to make Wolverine message handlers (or HTTP endpoints) be pure functions as a way of making the handler code itself easier to test or even just to understand. All the same, your application will almost certainly be interacting with the outside world of databases, file systems, and external infrastructure of all types. Not to worry though, Wolverine has some facility to allow you to declare the side effects as return values from your handler.

To make this concrete, let’s say that we’re building a message handler that will take in some textual content and an id, and then try to write that text to a file at a certain path. In our case, we want to be able to easily unit test the logic that “decides” what content and what file path a message should be written to without ever having any usage of the actual file system (which is notoriously irritating to use in tests).

First off, I’m going to create a new “side effect” type for writing a file like this:

// ISideEffect is a Wolverine marker interface
public class WriteFile : ISideEffect
{
    public string Path { get; }
    public string Contents { get; }

    public WriteFile(string path, string contents)
    {
        Path = path;
        Contents = contents;
    }

    // Wolverine will call this method. 
    public Task ExecuteAsync(PathSettings settings)
    {
        if (!Directory.Exists(settings.Directory))
        {
            Directory.CreateDirectory(settings.Directory);
        }
        
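        // Write inside the configured directory that was just ensured above
        return File.WriteAllTextAsync(System.IO.Path.Combine(settings.Directory, Path), Contents);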
    }
}

And the matching message type, message handler, and a settings class for configuration:

// An options class
public class PathSettings
{
    public string Directory { get; set; } 
        = Environment.CurrentDirectory.AppendPath("files");
}

public record RecordText(Guid Id, string Text);

public class RecordTextHandler
{
    public WriteFile Handle(RecordText command)
    {
        return new WriteFile(command.Id + ".txt", command.Text);
    }
}

At runtime, Wolverine is generating this code to handle the RecordText message:

    public class RecordTextHandler597515455 : Wolverine.Runtime.Handlers.MessageHandler
    {
        public override System.Threading.Tasks.Task HandleAsync(Wolverine.Runtime.MessageContext context, System.Threading.CancellationToken cancellation)
        {
            var recordTextHandler = new CoreTests.Acceptance.RecordTextHandler();
            var recordText = (CoreTests.Acceptance.RecordText)context.Envelope.Message;
            var pathSettings = new CoreTests.Acceptance.PathSettings();
            var outgoing1 = recordTextHandler.Handle(recordText);
            
            // Placed by Wolverine's ISideEffect policy
            return outgoing1.ExecuteAsync(pathSettings);
        }
    }

To explain what is happening up above, when Wolverine sees that any return value from a message handler implements the Wolverine.ISideEffect interface, Wolverine knows that that value should have a method named either Execute() or ExecuteAsync() that should be executed instead of treating the return value as a cascaded message. The method discovery is completely by method name, and it’s perfectly legal to use arguments for any of the same types available to the actual message handler like:

  • Service dependencies from the application’s IoC container
  • The actual message
  • Any objects created by middleware
  • CancellationToken
  • Message metadata from Envelope
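Because Handle() only returns a value, the “deciding” half can be checked without touching the file system. Here is a minimal sketch of such a check, assuming a local stand-in for the ISideEffect marker interface so it compiles without a Wolverine reference, a copy of WriteFile trimmed down to its data, and a hypothetical RecordTextHandlerChecks helper that is not part of Wolverine:

```csharp
using System;

// Stand-in for Wolverine's ISideEffect marker interface (assumption:
// declared locally so the sketch needs no Wolverine package reference)
public interface ISideEffect { }

public record RecordText(Guid Id, string Text);

// Trimmed copy of the side effect from above: just the data,
// with the ExecuteAsync() method left out
public class WriteFile : ISideEffect
{
    public string Path { get; }
    public string Contents { get; }

    public WriteFile(string path, string contents)
    {
        Path = path;
        Contents = contents;
    }
}

public class RecordTextHandler
{
    public WriteFile Handle(RecordText command)
    {
        return new WriteFile(command.Id + ".txt", command.Text);
    }
}

// Hypothetical helper standing in for a unit test method
public static class RecordTextHandlerChecks
{
    // Returns true when the handler chose the expected file name and contents
    public static bool DecidesFileNameAndContents()
    {
        var id = Guid.NewGuid();
        var effect = new RecordTextHandler().Handle(new RecordText(id, "some text"));

        return effect.Path == id + ".txt" && effect.Contents == "some text";
    }
}
```

In a real test project this would probably be an xUnit [Fact] with assertions on the Path and Contents properties, and no file is ever written.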

Taking this functionality further, here’s a new example from the WolverineFx.Marten library that exploits this new side effect model to allow you to start event streams or store/insert/update documents from a side effect return value without having to directly touch Marten’s IDocumentSession:

public static class StartStreamMessageHandler
{
    // This message handler is creating a brand new Marten event stream
    // of aggregate type NamedDocument. No services, no async junk,
    // pure function mechanics. You could unit test the method by doing
    // state based assertions on the StartStream object coming back out
    public static StartStream Handle(StartStreamMessage message)
    {
        return MartenOps.StartStream<NamedDocument>(message.Id, new AEvent(), new BEvent());
    }
    
    public static StartStream Handle(StartStreamMessage2 message)
    {
        return MartenOps.StartStream<NamedDocument>(message.Id, new CEvent(), new BEvent());
    }
}

As I get a little more time and maybe ambition, I want to start blogging more about how Wolverine is quite different from the “IHandler of T” model tools like MediatR, MassTransit, or NServiceBus. The “pure function” usage above potentially makes for a big benefit in terms of testability and longer term maintainability.

Compound Handlers in Wolverine

Last week I started a new series of blog posts about Wolverine capabilities.

Today I’m going to continue with a contrived example from the “payment ingestion service,” this time on what I’m so far calling “compound handlers” in Wolverine. When building a system with any amount of business logic or workflow logic, there’s some philosophical choices that Wolverine is trying to make:

  • To maximize testability, business or workflow logic — as much as possible — should be in pure functions that are easily testable in isolated unit tests. In other words, you should be able to test this code without integration tests or mock objects. Just data in, and state-based assertions.
  • Of course your message handler will absolutely need to read data from our database in the course of actually handling messages. It’ll also need to write data to the underlying database. Yet we still want to push toward the pure function approach for all logic. To get there, I like Jim Shore’s A-Frame metaphor for how code should be organized to isolate business logic away from infrastructure and into nicely testable code.
  • I certainly didn’t set out this way years ago when what’s now Wolverine was first theorized, but Wolverine is trending toward using more functional decomposition with fewer abstractions rather than “traditional” class centric C# usage with lots of interfaces, constructor injection, and IoC usage. You’ll see what I mean when we hit the actual code.

I don’t think that mock objects are evil per se, but they’re absolutely over-used in our industry. All I’m trying to suggest in this post is to structure code such that you don’t have to depend on stubs or any other kind of fake to set up test inputs to business or workflow logic code.

Consider the case of a message handler that needs to process a command message to apply a payment to principal within an existing loan. Depending on the amount and the account in question, the handler may need to raise domain events for early principal payment penalties (or alerts or whatever you actually do in this situation). That logic is going to need to know about both the related loan and account information in order to make that decision. The handler will also make changes to the loan to reflect the payment made as well, and commit those changes back to the database.

Just to sum things up, this message handler needs to:

  1. Look up loan and account data
  2. Use that data to carry out the business logic
  3. Potentially persist the changed state

Alright, on to the handler, which I’m going to accomplish with a single class that uses two separate methods:

public record PayPrincipal(Guid LoanId, decimal Amount, DateOnly EffectiveDate);

public static class PayPrincipalHandler
{
    // Wolverine will call this method first by naming convention.
    // If you prefer being more explicit, you can use any name you like and decorate
    // this with [Before] 
    public static async Task<(Account, LoanInformation)> LoadAsync(PayPrincipal command, IDocumentSession session,
        CancellationToken cancellation)
    {
        Account? account = null;
        var loan = await session
            .Query<LoanInformation>()
            .Include<Account>(x => x.AccountId, a => account = a)
            .Where(x => x.Id == command.LoanId)
            .FirstOrDefaultAsync(token: cancellation);

        if (loan == null) throw new UnknownLoanException(command.LoanId);
        if (account == null) throw new UnknownAccountException(loan.AccountId);
        
        return (account, loan);
    }

    // This is the main handler, but it's able to use the data built
    // up by the first method
    public static IEnumerable<object> Handle(
        // The command
        PayPrincipal command,
        
        // The information loaded from the LoadAsync() method above
        LoanInformation loan, 
        Account account,
        
        // We need this only to mark items as changed
        IDocumentSession session)
    {
        // The next post will switch this to event sourcing I think

        var status = loan.AcceptPrincipalPayment(command.Amount, command.EffectiveDate);
        switch (status)
        {
            case PrincipalStatus.BehindSchedule:
                // Maybe send an alert? Act on this in some other way?
                yield return new PrincipalBehindSchedule(loan.Id);
                break;
            
            case PrincipalStatus.EarlyPayment:
                if (!account.AllowsEarlyPayment)
                {
                    // Maybe just a notification?
                    yield return new EarlyPrincipalPaymentDetected(loan.Id);
                }

                break;
        }

        // Mark the loan as needing to be persisted
        session.Store(loan);
    }
}

Wolverine itself is weaving in the call first to LoadAsync(), and piping the results of that method to the inputs of the inner Handle() method, which now gets to be almost a pure function with just the call to IDocumentSession.Store() being “impure” — but at least that one single method is relatively painless to mock.

The point of doing this is really just to make the main Handle() method where the actual business logic is happening be very easily testable with unit tests as you can just push in the Account and Loan information. Especially in cases where there’s likely many permutations of inputs leading to different behaviors, it’s very advantageous to be able to walk right up to just the business rules and push inputs right into that, then do assertions on the messages returned from the Handle() function and/or assert on modifications to the Loan object.
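To make “push inputs in, assert on what comes out” concrete, here is a rough sketch of that kind of check. Every domain type in it is a hypothetical stand-in, because the real LoanInformation, Account, and event types aren’t shown in this post. The payment rule inside AcceptPrincipalPayment() is invented purely for illustration, and Decide() is a simplified copy of Handle() that leaves out the IDocumentSession argument and the session.Store() call so that it compiles without Marten:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// All types below are hypothetical stand-ins for the real domain model
public enum PrincipalStatus { OnSchedule, BehindSchedule, EarlyPayment }

public record PayPrincipal(Guid LoanId, decimal Amount, DateOnly EffectiveDate);
public record PrincipalBehindSchedule(Guid LoanId);
public record EarlyPrincipalPaymentDetected(Guid LoanId);

public class Account
{
    public bool AllowsEarlyPayment { get; set; }
}

public class LoanInformation
{
    public Guid Id { get; set; }
    public decimal ExpectedPayment { get; set; }
    public decimal Principal { get; set; }

    // Invented rule: compare the payment to the expected amount
    public PrincipalStatus AcceptPrincipalPayment(decimal amount, DateOnly effectiveDate)
    {
        Principal -= amount;
        if (amount < ExpectedPayment) return PrincipalStatus.BehindSchedule;
        if (amount > ExpectedPayment) return PrincipalStatus.EarlyPayment;
        return PrincipalStatus.OnSchedule;
    }
}

public static class PayPrincipalLogic
{
    // Same branching as Handle() above, minus the document session
    public static IEnumerable<object> Decide(PayPrincipal command, LoanInformation loan, Account account)
    {
        var status = loan.AcceptPrincipalPayment(command.Amount, command.EffectiveDate);
        switch (status)
        {
            case PrincipalStatus.BehindSchedule:
                yield return new PrincipalBehindSchedule(loan.Id);
                break;

            case PrincipalStatus.EarlyPayment:
                if (!account.AllowsEarlyPayment)
                {
                    yield return new EarlyPrincipalPaymentDetected(loan.Id);
                }

                break;
        }
    }
}

// Hypothetical helper standing in for a unit test method
public static class PayPrincipalChecks
{
    public static bool EarlyPaymentIsFlaggedWhenAccountDisallowsIt()
    {
        var loan = new LoanInformation { Id = Guid.NewGuid(), ExpectedPayment = 100m, Principal = 1000m };
        var account = new Account { AllowsEarlyPayment = false };
        var command = new PayPrincipal(loan.Id, 250m, new DateOnly(2023, 5, 1));

        // ToList() forces the iterator to run; nothing executes until it is enumerated
        var messages = PayPrincipalLogic.Decide(command, loan, account).ToList();

        return messages.Count == 1
               && messages[0] is EarlyPrincipalPaymentDetected detected
               && detected.LoanId == loan.Id
               && loan.Principal == 750m;
    }
}
```

One detail worth remembering when testing iterator-style handlers like this: the body doesn’t run until the returned sequence is enumerated, so the test has to enumerate it before asserting on changes to the loan.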

TL;DR: Repository abstractions over persistence tooling can cause more harm than good.

Also notice that I directly used a reference to the Marten IDocumentSession rather than wrapping some kind of IRepository<Loan> or IAccountRepository abstraction right around Marten. That was very purposeful. I think those abstractions, especially narrow, entity-centric abstractions around basic CRUD or load methods, cause more harm than good in nontrivial enterprise systems. In the case above, I was using a touch of advanced, Marten-specific behavior to load related documents in one network round trip as a performance optimization. That’s the exact kind of powerful ability of specific persistence tools that’s thrown away by generic “IRepository of T” strategies “just in case we decide to change database technologies later” that I believe to be harmful in larger enterprise systems. Moreover, I think that kind of abstraction bloats the codebase and leads to poorly performing systems.

Wolverine 0.9.13: Contextual Logging and More

We’re dogfooding Wolverine at work and the Critter Stack Discord is pretty active right now. All of that means that issues and opportunities to improve Wolverine are coming in fast right now. I just pushed Wolverine 0.9.13 (the Nugets are all named “WolverineFx” something because someone is squatting on the “Wolverine” name in Nuget).

First, quick thanks to Robin Arrowsmith for finding and fixing an issue with Wolverine’s Azure Service Bus support. And a more general thank you to the nascent Wolverine community for being so helpful working with me in Discord to improve Wolverine.

A few folks are reporting various issues with Wolverine handler discovery. To help alleviate whatever those issues turn out to be, Wolverine has a new mechanism to troubleshoot “why is my handler not being found by Wolverine?!?” issues.

We’re converting a service at work that lives within a giant distributed system that’s using NServiceBus for messaging today, so weirdly enough, there’s some important improvements for Wolverine’s interoperability with NServiceBus.

This will be worth a full blog post soon, but there’s some ability to add contextual logging about your domain (account numbers, tenants, product numbers, etc.) to Wolverine’s open telemetry and/or logging support. My personal goal here is to have all the necessary and valuable correlation between system activity, performance, and logged problems without forcing the development team to write repetitive code throughout their message handler code.

And one massive bug fix for how Wolverine generates runtime code in conjunction with your IoC service registrations for objects created by Wolverine itself. That’s a huge amount of technical mumbo jumbo that amounts to “even though Jeremy really doesn’t approve, you can inject Marten IDocumentSession or EF Core DbContext objects into repository classes while still using Wolverine transactional middleware and outbox support.” See this issue for more context. It’s a hugely important fix for folks who choose to use Wolverine with a typical, .NET Onion/Clean architecture with lots of constructor injection, repository wrappers, and making the garbage collection work like crazy at runtime.

Critter Stack Roadmap (Marten, Wolverine, Weasel)

This post is mostly an attempt to gather feedback from anyone out there interested enough to respond. Comment here, or better yet, tell us and the community what you’re interested in in the Critter Stack Discord community.

The so called “Critter Stack” is Marten, Wolverine, and a host of smaller, shared supporting projects within the greater JasperFx umbrella. Marten has been around for a while now, just hit the “1,000 closed pull request” milestone, and will reach the 4 million download mark sometime next week. Wolverine is getting some early adopter love right now, and the feedback has been very encouraging to me.

The goal for this year is to make the Critter Stack the best technical choice for a CQRS with Event Sourcing style architecture across every technical ecosystem — and a strong candidate for server side development on the .NET platform for other types of architectural strategies. That’s a bold goal, and there’s a lot to do to fill in missing features and increase the ability of the Critter Stack to scale up to extremely large workloads. To keep things moving, the core team banged out our immediate road map for the next couple months:

  1. Marten 6.0 within a couple weeks. This isn’t a huge release in terms of API changes, but sets us up for the future
  2. Wolverine 1.0 shortly after. I think I’m to the point of saying the main priority is finishing the documentation website and conducting some serious load and chaos testing against the Rabbit MQ and Marten integrations (weirdly enough the exact technical stack we’ll be using at my job)
  3. Marten 6.1: Formal event subscription mechanisms as part of Marten (ability to selectively publish events to a listener of some sort or a messaging broker). You can do this today as shown in Oskar’s blog post, but it’s not a first class citizen and not as efficient as it should be. Plus you’d want both “hot” and “cold” subscriptions.
  4. Wolverine 1.1: Direct support for the subscription model within Marten so that you have ready recipes to publish events from Marten with Wolverine’s messaging capabilities. Technically, you can already do this with Wolverine + Marten’s outbox integration, but that only works through Wolverine handlers. Adding the first class recipe for “Marten to Wolverine messaging” I think will make it awfully easy to get up and going with event subscriptions fast.

Right now, Marten 6 and Wolverine 1.0 have lingered for awhile, so it’s time to get them out. After that, subscriptions seem to be the biggest source of user questions and requests right now, so that’s the obvious next thing to do. After that though, here’s a rundown of some of the major initiatives we could pursue in either Marten or Wolverine this year (and some straddle the line):

  • End to end multi-tenancy support in Wolverine, Marten, and ASP.Net Core. Marten has strong support for multi-tenancy, but users have to piece things together themselves within their applications. Wolverine’s Marten integration is currently limited to only one Marten database per application
  • Hot/cold storage for active vs archived events. This is all about massive scalability for the event sourcing storage
  • Sharding the asynchronous projections to distribute work across multiple running nodes. More about scaling the event sourcing
  • Zero down time projection rebuilds. Big user ask. Probably also includes trying to optimize the heck out of the performance of this feature too
  • More advanced message broker feature support. AWS SNS support. Azure Service Bus topics support. Message batching in Rabbit MQ
  • Improving the Linq querying in Marten. At some point soon, I’d like to try to utilize the sql/json support within Postgresql to try to improve the Linq query performance and fill in more gaps in the support. Especially for querying within child collections. And better Select() transform support. That’s a neverending battle.
  • Optimized serverless story in Wolverine. Not exactly sure what this means, but I’m thinking to do something that tries to drastically reduce the “cold start” time
  • Open Telemetry support within Marten. It’s baked in with Wolverine, but not Marten yet. I think that’s going to be an opt in feature though
  • More persistence options within Wolverine. I’ll always be more interested in the full Wolverine + Marten stack, but I’d be curious to try out DynamoDb or CosmosDb support as well

There’s tons of other things to possibly do, but that list is what I’m personally most interested in our community getting to this year. No way there’s enough bandwidth for everything, so it’s time to start asking folks what they want out of these tools in the near future.

Useful Tricks with Lamar for Integration Testing

Earlier this week I started a new series of blog posts about Wolverine capabilities.

Today I’m taking a left turn in Albuquerque to talk about how to deal with injecting fake services in integration test scenarios for external service gateways in Wolverine applications using some tricks in the underlying Lamar IoC container — or really just anything that turns out to be difficult to deal with in automated tests.

Since this is a headless service, I’m not too keen on introducing Alba or WebApplicationFactory and all their humongous tail of ASP.Net Core dependencies. Instead, I made a mild change to the Program file of the main application to revert back to the “old” .NET 6 style of bootstrapping instead of the newer, implied Program.Main() style strictly to facilitate integration testing:

public static class Program
{
    public static Task<int> Main(string[] args)
    {
        return CreateHostBuilder().RunOaktonCommands(args);
    }

    // This method is a really easy way to bootstrap the application
    // in testing later
    public static IHostBuilder CreateHostBuilder()
    {
        return Host.CreateDefaultBuilder()
            .UseWolverine((context, opts) =>
            {
                // And a lot of necessary configuration here....
            });
    }
}

Now, I’m going to start a new xUnit.Net project to test the main application (NUnit or MSTest would certainly be viable as well). In the testing project, I want to test the payment ingestion service from the prior blog posts with basically the exact same set up as the main application, with the exception of replacing the service gateway for the “very unreliable 3rd party service” with a stub that we can control at will during testing. That stub could look like this:

// More on this later...
public interface IStatefulStub
{
    void ClearState();
}

public class ThirdPartyServiceStub : IThirdPartyServiceGateway, IStatefulStub
{
    public Dictionary<Guid, LoanInformation> LoanInformation { get; } = new();
    
    public Task<LoanInformation> FindLoanInformationAsync(Guid loanId, CancellationToken cancellation)
    {
        if (LoanInformation.TryGetValue(loanId, out var information))
        {
            return Task.FromResult(information);
        }

        // I suppose you'd throw a more specific exception type, but I'm lazy, so....
        throw new ArgumentOutOfRangeException(nameof(loanId), "Unknown loan id");
    }

    public Task PostPaymentScheduleAsync(PaymentSchedule schedule, CancellationToken cancellation)
    {
        PostedSchedules.Add(schedule);
        return Task.CompletedTask;
    }

    public List<PaymentSchedule> PostedSchedules { get; } = new();
    public void ClearState()
    {
        PostedSchedules.Clear();
        LoanInformation.Clear();
    }
}

Now that we have a usable stub for later, let’s build up a test harness for our application. Right off the bat, I’m going to say that we won’t even try to run integration tests in parallel, so I’m going for a shared context that bootstraps the application’s IHost:

public class AppFixture : IAsyncLifetime
{
    public async Task InitializeAsync()
    {
        // This is bootstrapping the actual application using
        // its implied Program.Main() set up
        Host = await Program.CreateHostBuilder()
            // This is from Lamar, this will override the service registrations
            // no matter what order registrations are done. This was specifically
            // intended for automated testing scenarios
            .OverrideServices(services =>
            {
                // Override the existing application's registration with a stub
                // for the third party service gateway
                services.AddSingleton<IThirdPartyServiceGateway>(ThirdPartyService);
            }).StartAsync();

    }

    // Just a convenient way to get at this later
    public ThirdPartyServiceStub ThirdPartyService { get; } = new();

    public IHost Host { get; private set; }
 
    public Task DisposeAsync()
    {
        return Host.StopAsync();
    }
}

So a couple comments about the code up above:

  • I’m delegating to the Program.CreateHostBuilder() method from our real application to create an IHostBuilder that is exactly the application itself. I think it’s important to do integration tests as close to the real application as possible so you don’t get false positives or false negatives from some sort of different bootstrapping or configuration of the application.
  • That being said, it’s absolutely going to be a pain in the ass to use the real “unreliable 3rd party service” in integration testing, so it would be very convenient to have a nice, easily controlled stub or “spy” we can use to capture data sent to the 3rd party or to set up responses from the 3rd party service
  • And no, we don’t know if our application actually works end to end if we use the whitebox testing approach, and there are very likely going to be unforeseen issues when we integrate with the real 3rd party service. All that being said, it’s very helpful to first know that our code works exactly the way we intended it to before we tackle fully end to end tests.
  • But if this were a real project, I’d spike the actual 3rd party gateway code ASAP because that’s likely where the major project risk is. In the real life project this was based on, that gateway code was not under my purview at first and I might have gotten myself temporarily banned from the client site after finally snapping at the developer “responsible” for that after about a year of misery. Moving on!
  • Lamar is StructureMap’s descendant, but it’s nowhere near as loosey-goosey flexible about runtime service overrides as StructureMap. That was very purposeful on my part as that led to Lamar having vastly better (1-3 orders of magnitude improvement) performance, and also to reduce my stress level by simplifying the Lamar usage over StructureMap’s endlessly complicated rules for service overrides. Long story short, that requires you to think through in advance a little bit about what services are going to be overridden in tests and to frankly use that sparingly compared to what was easy in StructureMap years ago.

Next, I’ll add the necessary xUnit ICollectionFixture type that I almost always forget to do at first unless I’m copy/pasting code from somewhere else:

[CollectionDefinition("integration")]
public class ScenarioCollection : ICollectionFixture<AppFixture>
{
     
}

Now, I like to have a base class for integration tests that just adds a tiny bit of reusable helpers and lifecycle methods to clean up the system state before each test:

public abstract class IntegrationContext : IAsyncLifetime
{
    public IntegrationContext(AppFixture fixture)
    {
        Host = fixture.Host;
        Store = Host.Services.GetRequiredService<IDocumentStore>();
        ThirdPartyService = fixture.ThirdPartyService;
    }

    public ThirdPartyServiceStub ThirdPartyService { get; set; }

    public IHost Host { get; }
    public IDocumentStore Store { get; }

    async Task IAsyncLifetime.InitializeAsync()
    {
        // Using Marten, wipe out all data and reset the state
        // back to exactly what we described in InitialAccountData
        await Store.Advanced.ResetAllData();
        
        // Clear out all the stateful stub state too!
        // First, I'm getting at the broader Lamar service
        // signature to do Lamar-specific things...
        var container = (IContainer)Host.Services;

        // Find every possible service that's registered in Lamar that implements
        // the IStatefulStub interface, resolve them, and loop through them 
        // like so
        foreach (var stub in container.Model.GetAllPossible<IStatefulStub>())
        {
            stub.ClearState();
        }
    }
 
    // This is required because of the IAsyncLifetime 
    // interface. Note that I do *not* tear down database
    // state after the test. That's purposeful
    public Task DisposeAsync()
    {
        return Task.CompletedTask;
    }

}

And now, some comments about that bit of code. You generally want a clean slate of system state going into each test, and our stub for the 3rd party system is stateful, so we’d want to clear it out between tests to keep from polluting the next test. That’s what the IStatefulStub interface and the call to GetAllPossible() are helping us do with the Lamar container. If the system grows and we use more stubs, we can use that mechanism to have a one stop shop to clear out any stateful objects in the container between tests.
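As a small illustration of how that could scale, here is a sketch with two more stubs sharing the same interface. EmailGatewayStub, RateTableStub, and the StubReset helper are hypothetical names invented for this example, and the loop runs over a plain sequence rather than a Lamar container so the snippet has no package dependencies:

```csharp
using System;
using System.Collections.Generic;

public interface IStatefulStub
{
    void ClearState();
}

// Hypothetical stub for some outbound email gateway
public class EmailGatewayStub : IStatefulStub
{
    public List<string> SentMessages { get; } = new();

    public void Send(string message) => SentMessages.Add(message);

    public void ClearState() => SentMessages.Clear();
}

// Hypothetical cache-like stub holding canned lookup data
public class RateTableStub : IStatefulStub
{
    public Dictionary<string, decimal> Rates { get; } = new();

    public void ClearState() => Rates.Clear();
}

public static class StubReset
{
    // Same loop as in IntegrationContext, but over any sequence of stubs
    public static void ClearAll(IEnumerable<IStatefulStub> stubs)
    {
        foreach (var stub in stubs)
        {
            stub.ClearState();
        }
    }
}
```

In the real harness the sequence would come from the container, so a new stub only has to implement IStatefulStub and be registered to get picked up by the reset.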

Lastly, here’s a taste of how the full test harness might be used:

public class ASampleTestHarness : IntegrationContext
{
    public ASampleTestHarness(AppFixture fixture) : base(fixture)
    {
    }

    [Fact]
    public async Task how_the_test_might_work()
    {
        // Do the Arrange and Act part of the tests....
        await Host.InvokeMessageAndWaitAsync(new PaymentValidated(new Payment()));

        // Our test *should* have posted a single payment schedule
        // within the larger workflow, and this will blow up if there's
        // none or many
        var schedule = ThirdPartyService.PostedSchedules.Single();
        
        // Write assertions against the expected data for the schedule maybe?
    }
}

The InvokeMessageAndWaitAsync() method is baked into Wolverine’s test automation support.

Summary and next time…

I don’t like piecing together special application bootstrapping in the test automation projects, as that tends to drift apart from the actual application over time. Instead, I’d rather use the application’s own bootstrapping — in this case how it builds up an IHostBuilder — then apply some limited number of testing overrides.

Lamar has a couple helpers for test automation, including the OverrideServices() method and the GetAllPossible() helper that can be useful for clearing out state between tests in stubs or caches or who knows what else in a systematic way.

So far I’ve probably mostly blogged about things that Wolverine does that other tools like NServiceBus, MassTransit, or MediatR do as well. Next time out, I want to go completely off road where those tools can’t follow and into Wolverine’s “compound handler” strategy for maximum testability using Jim Shore’s A-Frame Architecture approach.

Resiliency with Wolverine

Yesterday I started a new series of blog posts about Wolverine capabilities with:

To review, I was describing a project I worked on years ago that involved some interactions with a very unreliable 3rd party web service system to handle payments originating from a flat file:

Just based on that diagram above, and admittedly some bad experiences in the shake down cruise of the historical system that diagram was based on, here’s some of the things that can go wrong:

  • The system blows up and dies while the payments from a particular file are only half way processed
  • Transient errors from database connectivity. Network hiccups
  • File IO errors from reading the flat files (I tend to treat direct file system access a lot like a poisonous snake due to very bad experiences early in my career)
  • HTTP errors from timeouts calling the web service
  • The 3rd party system is under distress and performing very poorly, such that a high percentage of requests are timing out
  • The 3rd party system can be misconfigured after code migrations on its system so that it’s technically “up” and responsive, but nothing actually works
  • The 3rd party system is completely down

Man, it’s a scary world sometimes!

Let’s say right now that our goal, as much as possible, is to have a system that:

  1. Is able to recover from errors without losing any ongoing work
  2. Never permanently gets into an inconsistent state, e.g. a file is marked as completely read, but somehow some of the payments from that file got lost along the way
  3. Rarely needs manual intervention from production support to recover work or restart work
  4. Notifies production support when, heaven forbid, something does happen that the system can’t recover from

Now let’s go onto how to utilize Wolverine features to satisfy those goals in the face of all the potential problems I identified.

What if the system dies halfway through a file?

If you read through the last post, I used the local queueing mechanism in Wolverine to effectively create a producer/consumer workflow. Great! But what if the current process manages to die before all the ongoing work is completed? That’s where the durable inbox support in Wolverine comes in.

Pulling Marten in as our persistence strategy (but EF Core with either Postgresql or Sql Server is fully supported for this use case as well), I’m going to set up the application to opt into durable inbox mechanics for all locally queued messages like so (after adding the WolverineFx.Marten Nuget):

using Marten;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Oakton;
using Wolverine;
using Wolverine.Marten;
using WolverineIngestionService;

return await Host.CreateDefaultBuilder()
    .UseWolverine((context, opts) =>
    {
        // There'd obviously be a LOT more set up and service registrations
        // to be a real application

        var connectionString = context.Configuration.GetConnectionString("marten");
        opts.Services
            .AddMarten(connectionString)
            .IntegrateWithWolverine();

        // I want all local queues in the application to be durable
        opts.Policies.UseDurableLocalQueues();
        opts.LocalQueueFor<PaymentValidated>().Sequential();
        opts.LocalQueueFor<PostPaymentSchedule>().Sequential();
    }).RunOaktonCommands(args);

And with those changes, all in flight messages in the local queues are also stored durably in the backing database. If the application process happens to fail in flight, the persisted messages will fail over to either another running node or be picked up by restarting the system process.

So far, so good? Onward…

Getting over transient hiccups

Sometimes database interactions will fail with transient errors and may very well succeed if retried later. This is especially common when the database is under stress. Wolverine’s error handling policies easily accommodate that, and in this case I’m going to add some retry capabilities for basic database exceptions like so:

        // Retry on basic database exceptions with some cooldown time in
        // between retries
        opts
            .Policies
            .OnException<NpgsqlException>()
            .Or<MartenCommandException>()
            .RetryWithCooldown(100.Milliseconds(), 250.Milliseconds(), 500.Milliseconds());

        opts
            .OnException<TimeoutException>()
            .RetryWithCooldown(250.Milliseconds(), 500.Milliseconds());

Notice how I’ve specified some “cooldown” times for subsequent failures. This is more or less an example of exponential back off error handling that’s meant to effectively throttle a distressed subsystem to allow it to catch up and recover.
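To make the cooldown idea concrete, here’s a minimal sketch in plain C# of what a “retry with cooldown” policy roughly amounts to. This is not Wolverine’s implementation; the `CooldownRetry` class, its `ExecuteAsync()` method, and the `isTransient` filter are all hypothetical names made up for illustration:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration only, not part of Wolverine
public static class CooldownRetry
{
    // Tries the action once, then once more after each cooldown.
    // Returns the number of attempts it took to succeed.
    public static async Task<int> ExecuteAsync(
        Func<Task> action,
        Func<Exception, bool> isTransient,
        params TimeSpan[] cooldowns)
    {
        var attempts = 0;
        while (true)
        {
            attempts++;
            try
            {
                await action();
                return attempts;
            }
            // Only swallow the exception if it looks transient *and*
            // there is still a cooldown left. Otherwise it propagates
            catch (Exception e) when (isTransient(e) && attempts <= cooldowns.Length)
            {
                // Wait a little longer after each consecutive failure
                await Task.Delay(cooldowns[attempts - 1]);
            }
        }
    }
}
```

With three cooldown values like the 100, 250, and 500 millisecond settings above, a sketch like this would make at most four attempts before giving up and letting the exception escape.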

Now though, not every exception implies that the message may magically succeed at a later time, so in that case…

Walk away from bad apples

Over time we can recognize exceptions that pretty well mean that the message can never succeed. In that case we should just throw out the message instead of allowing it to suck down resources by being retried multiple times. Wolverine happily supports that as well. Let’s say that a payment message can never work if it refers to an account that cannot be found, so let’s do this:

        // Just get out of there if the account referenced by a message
        // does not exist!
        opts
            .OnException<UnknownAccountException>()
            .Discard();

I should also note that Wolverine is writing to your application log when this happens.

Circuit Breakers to give the 3rd party system a timeout

As I’ve repeatedly said in this blog series so far, the “very unreliable 3rd party system” was somewhat less than reliable. What we found in practice was that the service would fail in bunches when it fell behind, but could recover over time. However, what would happen — even with the exponential back off policy — was that when the system was distressed it still couldn’t recover in time and continuing to pound it with retries just led to everything ending up in dead letter queues where it eventually required manual intervention to recover. That was exhausting and led to much teeth gnashing (and fingers pointed at me in angry meetings). In response to that, Wolverine comes with circuit breaker support as shown below:

        // These are the queues that handle calls to the 3rd party web service
        opts.LocalQueueFor<PaymentValidated>()
            .Sequential()
            
            // Add the circuit breaker
            .CircuitBreaker(cb =>
            {
                // If the conditions are met to stop processing messages,
                // Wolverine will attempt to restart in 5 minutes
                cb.PauseTime = 5.Minutes();

                // Stop listening if there are more than 20% failures 
                // over the tracking period
                cb.FailurePercentageThreshold = 20;

                // Consider the failures over the past minute
                cb.TrackingPeriod = 1.Minutes();
                
                // Get specific about what exceptions should
                // be considered as part of the circuit breaker
                // criteria
                cb.Include<TimeoutException>();

                // This is our fault, so don't shut off the listener
                // when this happens
                cb.Exclude<InvalidRequestException>();
            });
        
        opts.LocalQueueFor<PostPaymentSchedule>()
            .Sequential()
            
            // Or the defaults might be just fine
            .CircuitBreaker();

With the set up above, if Wolverine detects too high a rate of message failures in a given time, it will completely stop message processing for that particular local queue. Since we’ve isolated the message processing for the two types of calls to the 3rd party web service, we’re allowing everything else to continue when the circuit breaker stops message processing. Do note that the circuit breaker functionality will try to restart message processing later after the designated pause time. Hopefully the pause time allows for the 3rd party system to recover — or for production support to make it recover. All of this without making all the backed up messages continuously fail and end up landing in the dead letter queues where it will take manual intervention to recover the work in progress.
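If the mechanics of a circuit breaker are new to you, the decision logic can be sketched in a few lines of plain C#. To be clear, this is a hypothetical illustration and not how Wolverine implements it; the `FailureRateBreaker` class and the `minimumSamples` guard are my own assumptions for the sketch:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch for illustration only, not Wolverine's implementation
public class FailureRateBreaker
{
    private readonly Queue<(DateTimeOffset Time, bool Failed)> _results = new();
    private readonly TimeSpan _trackingPeriod;
    private readonly TimeSpan _pauseTime;
    private readonly double _thresholdPercent;
    private readonly int _minimumSamples;
    private DateTimeOffset? _pausedUntil;

    public FailureRateBreaker(TimeSpan trackingPeriod, TimeSpan pauseTime,
        double thresholdPercent, int minimumSamples)
    {
        _trackingPeriod = trackingPeriod;
        _pauseTime = pauseTime;
        _thresholdPercent = thresholdPercent;
        _minimumSamples = minimumSamples;
    }

    // The listener should not process messages while this is true
    public bool IsPaused(DateTimeOffset now)
        => _pausedUntil.HasValue && now < _pausedUntil.Value;

    public void Record(bool failed, DateTimeOffset now)
    {
        _results.Enqueue((now, failed));

        // Forget anything older than the tracking period
        while (_results.Count > 0 && now - _results.Peek().Time > _trackingPeriod)
        {
            _results.Dequeue();
        }

        // Don't trip on a tiny sample, like one failure out of one message
        if (_results.Count < _minimumSamples) return;

        var failures = 0;
        foreach (var result in _results)
        {
            if (result.Failed) failures++;
        }

        var percentage = 100.0 * failures / _results.Count;
        if (percentage > _thresholdPercent)
        {
            // Trip the breaker and start fresh after the pause
            _pausedUntil = now + _pauseTime;
            _results.Clear();
        }
    }
}
```

The important design point is that the breaker looks at the failure rate over a window of time rather than at any single message, so one unlucky timeout doesn’t shut everything down.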

Hold the line, the 3rd party system is broken!

On top of everything else, the “very unreliable 3rd party system” was easily misconfigured at the drop of a hat such that it would become completely nonfunctional even though it appeared to be responsive. When this happened, every single message to that service would fail. So again, instead of letting all our pending work end up in the dead letter queue, let’s instead completely pause all message handling on the current local queue (wherever the error happened) if we can tell from the exception that the 3rd party system is nonfunctional like so:

        // If we encounter this specific exception with this particular error code,
        // it means that the 3rd party system is 100% nonfunctional even though it appears
        // to be up, so let's pause all processing for 10 minutes
        opts.OnException<ThirdPartyOperationException>(e => e.ErrorCode == 235)
            .Requeue().AndPauseProcessing(10.Minutes());

Summary and next time!

It’s helpful to assign work within message handlers in such a way as to maximize your error handling options. Think hard about what actions in your system are prone to failure and may deserve to be their own individual message handler and messaging endpoint to allow for exact error handling policies, like the way I used a circuit breaker on the queues that handled calls to the unreliable 3rd party service.

For my next post in this series, I think I want to make a diversion into integration testing using a stand in stub for the 3rd party service using the application setup with Lamar.

Wolverine’s New HTTP Endpoint Model

UPDATE: If you pull down the sample code, it’s not quite working with Swashbuckle yet. It *does* publish the metadata and the actual endpoints work, but it’s not showing up in the OpenAPI spec. Always something.

I just published Wolverine 0.9.10 to Nuget (after a much bigger 0.9.9 yesterday). There are several bug fixes, some admitted breaking changes to advanced configuration items, and one significant change to the “mediator” behavior that’s described in the section at the very bottom of this post.

The big addition is a new library that enables Wolverine’s runtime model directly for HTTP endpoints in ASP.Net Core services without having to jump through the typical sequence of delegating from a Minimal API method to Wolverine’s mediator functionality like this:

app.MapPost("/items/create", (CreateItemCommand cmd, IMessageBus bus) => bus.InvokeAsync(cmd));

app.MapPost("/items/create2", (CreateItemCommand cmd, IMessageBus bus) => bus.InvokeAsync<ItemCreated>(cmd));

Instead, Wolverine now has the WolverineFx.Http library to use Wolverine’s runtime model, including its unique middleware approach, directly from HTTP endpoints.

Shamelessly stealing the Todo sample application from the Minimal API documentation, let’s build a similar service with WolverineFx.Http, but I’m also going to switch to Marten for persistence just out of personal preference.

To bootstrap the application, I used the dotnet new webapi model, then added the WolverineFx.Marten and WolverineFx.Http nugets. The application bootstrapping for basic integration of Wolverine, Marten, and the new Wolverine HTTP model becomes:

using Marten;
using Oakton;
using Wolverine;
using Wolverine.Http;
using Wolverine.Marten;

var builder = WebApplication.CreateBuilder(args);

// Adding Marten for persistence
builder.Services.AddMarten(opts =>
    {
        opts.Connection(builder.Configuration.GetConnectionString("Marten"));
        opts.DatabaseSchemaName = "todo";
    })
    .IntegrateWithWolverine()
    .ApplyAllDatabaseChangesOnStartup();

// Wolverine usage is required for WolverineFx.Http
builder.Host.UseWolverine(opts =>
{
    // This middleware will apply to the HTTP
    // endpoints as well
    opts.Policies.AutoApplyTransactions();
    
    // Setting up the outbox on all locally handled
    // background tasks
    opts.Policies.UseDurableLocalQueues();
});

// Learn more about configuring Swagger/OpenAPI at https://aka.ms/aspnetcore/swashbuckle
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();

// Configure the HTTP request pipeline.
if (app.Environment.IsDevelopment())
{
    app.UseSwagger();
    app.UseSwaggerUI();
}

// Let's add in Wolverine HTTP endpoints to the routing tree
app.MapWolverineEndpoints();

return await app.RunOaktonCommands(args);

Do note that the only thing in that sample that pertains to WolverineFx.Http itself is the call to IEndpointRouteBuilder.MapWolverineEndpoints().

Let’s move on to “Hello, World” with a new Wolverine HTTP endpoint from this class we’ll add to the sample project:

public class HelloEndpoint
{
    [WolverineGet("/")]
    public string Get() => "Hello.";
}

At application startup, WolverineFx.Http will find the HelloEndpoint.Get() method and treat it as a Wolverine HTTP endpoint with the route pattern GET: / specified in the [WolverineGet] attribute.

As you’d expect, that route will write the return value back to the HTTP response and behave as specified by this Alba specification:

[Fact]
public async Task hello_world()
{
    var result = await _host.Scenario(x =>
    {
        x.Get.Url("/");
        x.Header("content-type").SingleValueShouldEqual("text/plain");
    });
    
    result.ReadAsText().ShouldBe("Hello.");
}

Moving on to the actual Todo problem domain, let’s assume we’ve got a class like this:

public class Todo
{
    public int Id { get; set; }
    public string? Name { get; set; }
    public bool IsComplete { get; set; }
}

In a sample class called TodoEndpoints let’s add an HTTP service endpoint for listing all the known Todo documents:

[WolverineGet("/todoitems")]
public static Task<IReadOnlyList<Todo>> Get(IQuerySession session) 
    => session.Query<Todo>().ToListAsync();

As you’d guess, this method will serialize all the known Todo documents from the database into the HTTP response and return a 200 status code. In this particular case the code is a little bit noisier than the Minimal API equivalent, but that’s okay, because you can happily use Minimal API and WolverineFx.Http together in the same project. WolverineFx.Http, however, will shine in more complicated endpoints.

Consider this endpoint just to return the data for a single Todo document:

// Wolverine can infer the 200/404 status codes for you here
// so there's no code noise just to satisfy OpenAPI tooling
[WolverineGet("/todoitems/{id}")]
public static Task<Todo?> GetTodo(int id, IQuerySession session, CancellationToken cancellation) 
    => session.LoadAsync<Todo>(id, cancellation);

At this point it’s effectively de rigueur for any web service to support OpenAPI documentation directly in the service. Fortunately, WolverineFx.Http is able to glean most of the necessary metadata to support OpenAPI documentation with Swashbuckle from the method signature up above. The method up above will also cleanly set a status code of 404 if the requested Todo document does not exist.

Now, the bread and butter for WolverineFx.Http is using it in conjunction with Wolverine itself. In this sample, let’s create a new Todo based on submitted data, but also publish a new event message with Wolverine to do some background processing after the HTTP call succeeds. And, oh, yeah, let’s make sure this endpoint is actively using Wolverine’s transactional outbox support for consistency:

[WolverinePost("/todoitems")]
public static async Task<IResult> Create(CreateTodo command, IDocumentSession session, IMessageBus bus)
{
    var todo = new Todo { Name = command.Name };
    session.Store(todo);

    // Going to raise an event within our system to be processed later
    await bus.PublishAsync(new TodoCreated(todo.Id));
    
    return Results.Created($"/todoitems/{todo.Id}", todo);
}

The endpoint code above is automatically enrolled in the Marten transactional middleware by simple virtue of having a dependency on Marten’s IDocumentSession. By also taking in the IMessageBus dependency, WolverineFx.Http is wrapping the transactional outbox behavior around the method so that the TodoCreated message is only sent after the database transaction succeeds.
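If the outbox idea is unfamiliar, the ordering guarantee can be sketched in plain C#. This is only a hypothetical in-memory illustration of the “messages go out after the commit succeeds” rule; `OutboxSketch` is a made up name, and a real outbox like Wolverine’s also persists the outgoing messages durably, which this sketch does not attempt:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in to illustrate the ordering only
public class OutboxSketch
{
    private readonly List<object> _pending = new();

    // What actually got "sent" to the outside world
    public List<object> Sent { get; } = new();

    // "Publishing" inside the unit of work only records the message
    public void Publish(object message) => _pending.Add(message);

    // Messages are released only if the commit did not throw
    public void Commit(Action commitDatabaseTransaction)
    {
        try
        {
            commitDatabaseTransaction();
        }
        catch
        {
            // The transaction failed, so nothing should go out
            _pending.Clear();
            throw;
        }

        Sent.AddRange(_pending);
        _pending.Clear();
    }
}
```

The point is that a failed database transaction can never leave behind a message announcing a change that didn’t actually happen.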

Lastly for this page, consider the need to update a Todo from a PUT call. Your HTTP endpoint may vary its handling and response by whether or not the document actually exists. Just to show off Wolverine’s “compound handler” functionality and also how WolverineFx.Http supports middleware, consider this more complex endpoint:

public static class UpdateTodoEndpoint
{
    public static async Task<(Todo? todo, IResult result)> LoadAsync(UpdateTodo command, IDocumentSession session)
    {
        var todo = await session.LoadAsync<Todo>(command.Id);
        return todo != null 
            ? (todo, new WolverineContinue()) 
            : (todo, Results.NotFound());
    }

    [WolverinePut("/todoitems")]
    public static void Put(UpdateTodo command, Todo todo, IDocumentSession session)
    {
        // Copy the submitted values from the command onto the loaded document
        todo.Name = command.Name;
        todo.IsComplete = command.IsComplete;
        session.Store(todo);
    }
}

In the WolverineFx.Http model, the generated code checks any IResult object returned from a bit of middleware. If that IResult is anything other than the built in WolverineContinue type, Wolverine executes it and stops all further processing. This is intended to enable validation or authorization type middleware where you may need to filter calls to the inner HTTP handler.
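Conceptually, the short circuiting described above behaves something like the plain C# sketch below. Wolverine actually does this through generated code, so take this as a rough mental model only; `ISketchResult`, `ContinueResult`, `NotFoundResult`, and `EndpointPipeline` are hypothetical stand-ins for IResult, WolverineContinue, and the generated handler:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for IResult and WolverineContinue
public interface ISketchResult
{
    string Describe();
}

public sealed class ContinueResult : ISketchResult
{
    public string Describe() => "continue";
}

public sealed class NotFoundResult : ISketchResult
{
    public string Describe() => "404";
}

public static class EndpointPipeline
{
    // Runs each middleware step in order. The first result that is not
    // a "continue" becomes the response and the endpoint never runs
    public static string Run(
        IEnumerable<Func<ISketchResult>> middleware,
        Func<string> endpoint)
    {
        foreach (var step in middleware)
        {
            var result = step();
            if (result is not ContinueResult)
            {
                return result.Describe();
            }
        }

        return endpoint();
    }
}
```

In the PUT sample, the LoadAsync() method plays the role of the middleware step, and the Put() method is the endpoint that only runs when the Todo was found.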

With the sample application out of the way, here’s a rundown of the significant things about this library:

  • It’s actually a pretty small library in the greater scheme of things, and all it really does is connect ASP.Net Core’s endpoint routing to the Wolverine runtime model. Wolverine’s runtime model is likely going to be somewhat more efficient than Minimal API and much more efficient than MVC Core
  • It can be happily combined with Minimal API, MVC Core, or any other ASP.Net Core model that exploits endpoint routing, even within the same application
  • Wolverine is allowing you to use the Minimal API IResult model
  • The JSON serialization is strictly System.Text.Json and uses the same options as Minimal API within an ASP.Net Core application
  • It’s possible to use Wolverine’s middleware strategy with the HTTP endpoints
  • Wolverine is trying to glean necessary metadata from the method signatures to feed OpenAPI usage within ASP.Net Core without developers having to jump through hoops adding attributes or goofy TypedResult noise code just for Swashbuckle
  • This model plays nicely with Wolverine’s transactional outbox model for common cases where you need to both make database changes and publish additional messages for background processing in the same HTTP call. That’s a bit of important functionality that I feel is missing or is clumsy at best in many leading .NET server side technologies.

For the handful of you reading this that still remember FubuMVC, Wolverine’s HTTP model retains some of FubuMVC’s old strengths in terms of still not ramming framework concerns into your application code, but learned some hard lessons from FubuMVC’s ultimate failure:

  • FubuMVC was an ambitious, sprawling framework that was trying to be its own ecosystem with its own bootstrapping model, logging abstractions, and even IoC abstractions. WolverineFx.Http is just a citizen within the greater ASP.Net Core ecosystem and uses common .NET abstractions, concepts, and idiomatic naming conventions at every possible turn
  • FubuMVC relied too much on conventions, which was great when the convention was exactly what you needed, and kinda hurtful when you needed something besides the exact conventions. Not to worry, WolverineFx.Http lets you drop right down to the HttpContext level at will or use any of the IResult objects in existing ASP.Net Core whenever the Wolverine conventions don’t fit.
  • FubuMVC could technically be used with old ASP.Net MVC, but it was a Frankenstein’s monster to pull off. Wolverine can be mixed and matched at will with either Minimal API, MVC Core, or even other OSS projects that exploit ASP.Net Core endpoint routing.
  • Wolverine is trying to play nicely in terms of OpenAPI metadata and security related metadata for usage of standard ASP.Net Core middleware like the authorization or authentication middleware
  • FubuMVC’s “Behavior” model gave you a very powerful “Russian Doll” middleware ability that was maximally flexible — and also maximally inefficient in runtime. Wolverine’s runtime model takes a very different approach to still allow for the “Russian Doll” flexibility, but to do so in a way that is more efficient at runtime than basically every other commonly used framework today in the .NET community.
  • When things went boom in FubuMVC, you got monumentally huge stack traces that could overwhelm developers who hadn’t had a week’s worth of good nights’ sleep. It sounds minor, but Wolverine is valuable in the sense that the stack traces from HTTP (or message handler) failures will have very minimal Wolverine related framework noise in the stack trace for easier readability by developers.

Big Change to In Memory Mediator Model

I’ve been caught off guard a bit by how folks have mostly been interested in Wolverine as an alternative to MediatR with typical usage like this where users just delegate to Wolverine in memory within a Minimal API route:

app.MapPost("/items/create2", (CreateItemCommand cmd, IMessageBus bus) => bus.InvokeAsync<ItemCreated>(cmd));

With the corresponding message handler being this:

public class ItemHandler
{
    // This attribute applies Wolverine's EF Core transactional
    // middleware
    [Transactional]
    public static ItemCreated Handle(
        // This would be the message
        CreateItemCommand command,

        // Any other arguments are assumed
        // to be service dependencies
        ItemsDbContext db)
    {
        // Create a new Item entity
        var item = new Item
        {
            Name = command.Name
        };

        // Add the item to the current
        // DbContext unit of work
        db.Items.Add(item);

        // This event being returned
        // by the handler will be automatically sent
        // out as a "cascading" message
        return new ItemCreated
        {
            Id = item.Id
        };
    }
}

Prior to the latest release, the ItemCreated event in the handler above when used from IMessageBus.InvokeAsync<ItemCreated>() was not published as a message because my original assumption was that in that case you were using the return value explicitly as a return value. Early users have been surprised that the ItemCreated was not published as a message, so I just changed the behavior to do so to make the cascading message behavior be more consistent and what folks seem to actually want.

New Wolverine Release & Future Plans

After plenty of Keystone Cops shenanigans with CI automation today that made me question my own basic technical competency, there’s a new Wolverine 0.9.8 release on Nuget with a variety of fixes and some new features. The documentation website was also re-published.

First, some thanks:

  • Wojtek Suwala made several fixes and improvements to the EF Core integration
  • Ivan Milosavljevic helped fix several hanging tests on CI, built the MemoryPack integration, and improved the FluentValidation integration
  • Anthony made his first OSS contribution (?) to help fix quite a few issues with the documentation
  • My boss and colleague Denys Grozenok for all his support with reviewing docs and reporting issues
  • Kebin for improving the dead letter queue mechanics

The highlights:

Dogfooding baby!

Conveniently enough, I’m part of a little officially sanctioned skunkworks team at work experimenting with converting a massive distributed monolithic application to the full Marten + Wolverine “critter stack.” I’m very encouraged by the effort so far, and it’s driven some recent features in Wolverine’s execution model to handle complexity in enterprise systems. More on that soon.

It’s also pushing the story for interoperability with NServiceBus on the other end of Rabbit MQ queues. Strangely enough, no one is interested in trying to convert a humongous distributed system to Wolverine in one round of work. Go figure.

When will Wolverine hit 1.0?

There’s a little bit of awkwardness in that Marten V6.0 (don’t worry, that’s a much smaller release than 4/5) needs to be released first and I haven’t been helping Oskar & Babu with that recently, but I think we’ll be able to clear that soon.

My “official” plan is to finish the documentation website by the end of February and make the 1.0 release by March 1st. Right now, Wolverine is having its tires kicked by plenty of early users and there’s plenty of feedback (read: bugs or usability issues) coming in that I’m trying to address quickly. Feature wise, the only things I’m hoping to have done by 1.0 are:

  • Using more native capabilities of Azure Service Bus, Rabbit MQ, and AWS SQS for dead letter queues and delayed messaging. That’s mostly to solidify some internal abstractions.
  • It’s a stretch goal, but have Wolverine support Marten’s multi-tenancy through a database per tenant strategy. We’ll want that for internal MedeAnalytics usage, so it might end up being a priority
  • Some better integration with ASP.Net Core Minimal API

Automating Integration Tests using the “Critter Stack”

This builds on the previous blog posts in this list:

Integration Testing, but How?

Some time over the holidays Jim Shore released an updated version of his excellent paper Testing Without Mocks: A Pattern Language. He also posted this truly massive thread with some provocative opinions about test automation strategies:

I think it’s a great thread over all, and the paper is chock full of provocative thoughts about designing for testability. Moreover, some of the older content in that paper is influencing the direction of my own work with Wolverine. I’ve also made it recommended reading for the developers in my own company.

All that being said, I strongly disagree with the approach he describes for integration testing with “nullable infrastructure” and eschewing DI/IoC for composition in favor of just willy nilly hard coding things because “DI is scary” or whatever. My strong preference and also where I’ve had the most success is to purposely choose to rely on development technologies that lend themselves to low friction, reliable, and productive integration testing.

And as it just so happens, the “critter stack” tools (Marten and Wolverine) that I work on are purposely designed for testability and include several features specifically to make integration testing more effective for applications using these tools.

Integration Testing with the Critter Stack

From my previous blog posts linked up above, I’ve been showing a very simplistic banking system to demonstrate the usage of Wolverine with Marten. For a testing scenario, let’s go back to part of this message handler for a WithdrawFromAccount message that will effect changes on an Account document entity and potentially send out other messages to perform other actions:

    [Transactional] 
    public static async Task Handle(
        WithdrawFromAccount command, 
        Account account, 
        IDocumentSession session, 
        IMessageContext messaging)
    {
        account.Balance -= command.Amount;
     
        // This just marks the account as changed, but
        // doesn't actually commit changes to the database
        // yet. That actually matters as I hopefully explain
        session.Store(account);
 
        // Conditionally trigger other, cascading messages
        if (account.Balance > 0 && account.Balance < account.MinimumThreshold)
        {
            await messaging.SendAsync(new LowBalanceDetected(account.Id));
        }
        else if (account.Balance < 0)
        {
            await messaging.SendAsync(new AccountOverdrawn(account.Id), new DeliveryOptions{DeliverWithin = 1.Hours()});
         
            // Give the customer 10 days to deal with the overdrawn account
            await messaging.ScheduleAsync(new EnforceAccountOverdrawnDeadline(account.Id), 10.Days());
        }
        
        // "messaging" is a Wolverine IMessageContext or IMessageBus service 
        // Do the deliver within rule on individual messages
        await messaging.SendAsync(new AccountUpdated(account.Id, account.Balance),
            new DeliveryOptions { DeliverWithin = 5.Seconds() });
    }

For a little more context, I’ve set up a Minimal API endpoint to delegate to this command like so:

// One Minimal API endpoint that just delegates directly to Wolverine
app.MapPost("/accounts/withdraw", (WithdrawFromAccount command, IMessageBus bus) => bus.InvokeAsync(command));

In the end here, I want a set of integration tests that works through the /accounts/withdraw endpoint, through all ASP.NET Core middleware, all configured Wolverine middleware or policies that wrap around that handler above, and verifies the expected state changes in the underlying Marten Postgresql database as well as any messages that I would expect to go out. And oh, yeah, I’d like those tests to be completely deterministic.

First, a Shared Test Harness

I’m starting to be interested in moving back to NUnit for the first time in years strictly for integration testing because I’m starting to suspect it would give you more control over the test fixture lifecycle in ways that are frequently valuable in integration testing.

Now, before writing the actual tests, I’m going to build an integration test harness for this system. I prefer to use xUnit.Net these days as my test runner, so we’re going to start with building what will be a shared fixture to run our application within integration tests. To be able to test through HTTP endpoints, I’m also going to add another JasperFx project named Alba to the testing project (See Alba for Effective ASP.Net Core Integration Testing for more information):

public class AppFixture : IAsyncLifetime
{
    public async Task InitializeAsync()
    {
        // Workaround for Oakton with WebApplicationBuilder
        // lifecycle issues. Doesn't matter to you w/o Oakton
        OaktonEnvironment.AutoStartHost = true;
        
        // This is bootstrapping the actual application using
        // its implied Program.Main() set up
        Host = await AlbaHost.For<Program>(x =>
        {
            // I'm overriding some service registrations just for testing
            x.ConfigureServices(services =>
            {
                // Let's just take any pesky message brokers out of
                // our integration tests for now so we can work in
                // isolation
                services.DisableAllExternalWolverineTransports();
                
                // Just putting in some baseline data for our database
                // There's usually *some* sort of reference data in 
                // enterprise-y systems
                services.InitializeMartenWith<InitialAccountData>();
            });
        });
    }

    public IAlbaHost Host { get; private set; }

    public Task DisposeAsync()
    {
        return Host.DisposeAsync().AsTask();
    }
}

There’s a bit to unpack in that class above, so let’s start:

  • A .NET IHost can be expensive to set up in memory, so in any kind of sizable system I will try to share one single instance of that between integration tests.
  • The AlbaHost mechanism is using WebApplicationFactory to bootstrap our application. This mechanism allows us to make some modifications to the application’s normal bootstrapping for test specific setup, and I’m exploiting that here.
  • The `DisableAllExternalWolverineTransports()` method is a built in extension method in Wolverine that disables all sending to, and listening on, external transports like Rabbit MQ. That’s not to say that Rabbit MQ itself is necessarily impossible to use within automated tests (Wolverine even comes with some help for that in testing as well), but it’s certainly easier to create our tests without having to worry about messages coming and going from outside. Don’t worry though, because we’ll still be able to verify the messages that should be sent out later.
  • I’m using Marten’s “initial data” functionality, which is a way of establishing baseline data (usually reference data, but for testing you might also include a baseline set of test user data). For more context, `InitialAccountData` is shown below:

public class InitialAccountData : IInitialData
{
    public static Guid Account1 = Guid.NewGuid();
    public static Guid Account2 = Guid.NewGuid();
    public static Guid Account3 = Guid.NewGuid();
    
    public Task Populate(IDocumentStore store, CancellationToken cancellation)
    {
        // Pass the token along so the bulk insert can be cancelled
        return store.BulkInsertAsync(accounts().ToArray(), cancellation: cancellation);
    }

    private IEnumerable<Account> accounts()
    {
        yield return new Account
        {
            Id = Account1,
            Balance = 1000,
            MinimumThreshold = 500
        };
        
        yield return new Account
        {
            Id = Account2,
            Balance = 1200
        };

        yield return new Account
        {
            Id = Account3,
            Balance = 2500,
            MinimumThreshold = 100
        };
    }
}

Next, just a little more xUnit.Net overhead. To make a shared fixture across multiple test classes with xUnit.Net, I add this little marker class:

[CollectionDefinition("integration")]
public class ScenarioCollection : ICollectionFixture<AppFixture>
{
    
}

I have to look this up every single time I use this functionality.
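
For reference, here is a minimal sketch of how a test class opts into that shared fixture. The class and test names are hypothetical; the only pieces taken from the code above are the "integration" collection name and the AppFixture type. A test class joins the collection with the [Collection] attribute and receives the fixture through its constructor:

```csharp
using Xunit;

// The name passed to [Collection] must match the name given to
// [CollectionDefinition] on the marker class
[Collection("integration")]
public class when_the_application_is_running // hypothetical test class
{
    private readonly AppFixture _fixture;

    // xUnit.Net creates a single AppFixture for the whole collection
    // and passes that same instance to every test class in it
    public when_the_application_is_running(AppFixture fixture)
    {
        _fixture = fixture;
    }

    [Fact]
    public void the_host_has_been_started()
    {
        Assert.NotNull(_fixture.Host);
    }
}
```

Test classes in the same xUnit.Net collection do not run in parallel with each other, which is helpful here because they all share one application host and one database.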

For integration testing, I like to have a slim base class that I tend to quite originally call “IntegrationContext” like this one:

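// Ties every subclass to the shared AppFixture through the
// "integration" collection defined above
[Collection("integration")]
public abstract class IntegrationContext : IAsyncLifetime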
{
    public IntegrationContext(AppFixture fixture)
    {
        Host = fixture.Host;
        Store = Host.Services.GetRequiredService<IDocumentStore>();
    }
    
    public IAlbaHost Host { get; }
    public IDocumentStore Store { get; }
    
    public async Task InitializeAsync()
    {
        // Using Marten, wipe out all data and reset the state
        // back to exactly what we described in InitialAccountData
        await Store.Advanced.ResetAllData();
    }

    // This is required because of the IAsyncLifetime 
    // interface. Note that I do *not* tear down database
    // state after the test. That's purposeful
    public Task DisposeAsync()
    {
        return Task.CompletedTask;
    }
}

Other than simply connecting real test fixtures to the ASP.Net Core system under test (the IAlbaHost), this IntegrationContext utilizes another bit of Marten functionality to completely reset the database state back to only the data defined by the InitialAccountData so that we always have known data in the database before tests execute.

By and large, I find NoSQL databases to be more easily usable in automated testing than purely relational databases because it’s generally easier to tear down and rebuild databases with NoSQL. When I’m having to use a relational database in tests, I opt for Jimmy Bogard’s Respawn library to do the same kind of reset, but it’s substantially more work to use than Marten’s built in functionality.

In the case of Marten, we very purposely designed in the ability to reset the database state for integration testing scenarios from the very beginning. Add this functionality to the easy ability to run the underlying Postgresql database in a local Docker container for isolated testing, and I’ll claim that Marten is very usable within test automation scenarios with no real need to try to stub out the database or use some kind of low fidelity fake in memory database in testing.

See My Opinions on Data Setup for Functional Tests for more explanation of why I’m doing the database state reset before all tests, but never immediately afterward. And also why I think it’s important to place test data setup directly into tests rather than trying to rely on any kind of external, expected data set (when possible).

From my first pass at writing the sample test that’s coming in the next section, I discovered the need for one more helper method on IntegrationContext to make HTTP calls to the system while also tracking background Wolverine activity as shown below:

    // This method allows us to make HTTP calls into our system
    // in memory with Alba, but do so within Wolverine's test support
    // for message tracking to both record outgoing messages and to ensure
    // that any cascaded work spawned by the initial command is completed
    // before passing control back to the calling test
    protected async Task<(ITrackedSession, IScenarioResult)> TrackedHttpCall(Action<Scenario> configuration)
    {
        IScenarioResult result = null;
        
        // The outer part is tying into Wolverine's test support
        // to "wait" for all detected message activity to complete
        var tracked = await Host.ExecuteAndWaitAsync(async () =>
        {
            // The inner part here is actually making an HTTP request
            // to the system under test with Alba
            result = await Host.Scenario(configuration);
        });

        return (tracked, result);
    }

The method above gives me access to the complete history of Wolverine messages during the activity including all outgoing messages spawned by the HTTP call. It also delegates to Alba to run HTTP requests in memory and gives me access to the Alba wrapped response for easy interrogation of the response later (which I don’t need in the following test, but would frequently in other tests).

See Test Automation Support from the Wolverine documentation for more information on the integration testing support baked into Wolverine.

Writing the first integration test

The first “happy path” test verifies that a call through the web service reaches the Wolverine message handler and withdraws from an account without tripping any kind of low balance condition. It might look like this:

public class when_debiting_an_account : IntegrationContext
{
    public when_debiting_an_account(AppFixture fixture) : base(fixture)
    {
    }

    [Fact]
    public async Task should_decrease_the_account_balance_happy_path()
    {
        // Drive in known data, so the "Arrange"
        var account = new Account
        {
            Balance = 2500,
            MinimumThreshold = 200
        };

        await using (var session = Store.LightweightSession())
        {
            session.Store(account);
            await session.SaveChangesAsync();
        }

        // The "Act" part of the test.
        var (tracked, _) = await TrackedHttpCall(x =>
        {
            // Send a JSON post with the WithdrawFromAccount command through the HTTP endpoint
            // BUT, it's all running in process
            x.Post.Json(new WithdrawFromAccount(account.Id, 1200)).ToUrl("/accounts/withdraw");

            // This is the default behavior anyway, but still good to show it here
            x.StatusCodeShouldBeOk();
        });
        
        // Finally, let's do the "assert"
        await using (var session = Store.LightweightSession())
        {
            // Load the newly persisted copy of the data from Marten
            var persisted = await session.LoadAsync<Account>(account.Id);
            persisted.Balance.ShouldBe(1300); // Started with 2500, debited 1200
        }

        // And also assert that an AccountUpdated message was published as well
        var updated = tracked.Sent.SingleMessage<AccountUpdated>();
        updated.AccountId.ShouldBe(account.Id);
        updated.Balance.ShouldBe(1300);

    }
}

The test above follows the basic “arrange, act, assert” model. In order, the test:

  1. Writes a brand new Account document to the Marten database
  2. Makes an HTTP call to the system to POST a WithdrawFromAccount command to our system using our TrackedHttpCall method that also tracks Wolverine activity during the HTTP call
  3. Verifies that the Account data was changed in the database the way we expected
  4. Verifies that an expected outgoing message was published as part of the activity

It was a lot of initial set up to get to the point where we could write tests, but I’m going to argue in the next section that we’ve done a lot to reduce the friction in writing additional integration tests for our system in a reliable way.
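
To give a feel for how little ceremony an additional test needs once the harness exists, here is a sketch of a second test for a low balance condition. It reuses IntegrationContext, TrackedHttpCall, WithdrawFromAccount and AccountUpdated from above. The test names, and the LowBalanceDetected message type with its AccountId property, are assumptions made for illustration: the sketch presumes the application defines such a message and that the handler publishes it when a withdrawal leaves the balance below the MinimumThreshold.

```csharp
using System.Threading.Tasks;
using Marten;
using Shouldly;
using Xunit;

public class when_a_withdrawal_drops_below_the_threshold : IntegrationContext
{
    public when_a_withdrawal_drops_below_the_threshold(AppFixture fixture) : base(fixture)
    {
    }

    [Fact]
    public async Task should_update_the_balance_and_flag_the_low_balance()
    {
        // Arrange: a new account whose balance will end up under its threshold
        var account = new Account
        {
            Balance = 1000,
            MinimumThreshold = 500
        };

        await using (var session = Store.LightweightSession())
        {
            session.Store(account);
            await session.SaveChangesAsync();
        }

        // Act: same in process HTTP call as the happy path test
        var (tracked, _) = await TrackedHttpCall(x =>
        {
            x.Post.Json(new WithdrawFromAccount(account.Id, 700)).ToUrl("/accounts/withdraw");
        });

        // Assert: the persisted balance reflects the withdrawal
        await using (var session = Store.LightweightSession())
        {
            var persisted = await session.LoadAsync<Account>(account.Id);
            persisted.Balance.ShouldBe(300); // Started with 1000, withdrew 700
        }

        tracked.Sent.SingleMessage<AccountUpdated>().Balance.ShouldBe(300);

        // Assumption: LowBalanceDetected is a message type in the application
        // and the handler publishes it for this condition
        tracked.Sent.SingleMessage<LowBalanceDetected>().AccountId.ShouldBe(account.Id);
    }
}
```

Everything specific to this scenario is the starting data, the command, and the assertions; the host bootstrapping, database reset and message tracking all come from the shared harness.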

Avoiding the Selenium as Golden Hammer Anti-Pattern

Playwright or Cypress.io may prove to be better options than Selenium over time (I’m bullish on Playwright myself), but the main point is really that only depending on end to end tests through the browser can easily be problematic and inefficient.

Before I go back to defending why I think the testing approach and tooling shown in this post is very effective, let’s build up an all too real strawman of inefficient and maybe even ineffective test automation:

  • All your integration tests are blackbox, end to end tests that use Selenium to drive a web browser
  • These tests can only be executed externally to the application when the application is deployed to a development or testing environment. In the worst case scenario — which is also unfortunately common — the Selenium tests cannot be easily executed locally on demand
  • The tests are prone to failures due to UI changes
  • The tests are prone to intermittent “blinking” failures due to asynchronous behavior in the UI where test assertions happen before actions are completed in the application. This is a source of major friction and poor results in large scale Selenium testing that has been endemic in every single shop or project where I’ve used or seen Selenium used over the past decade — including in my current role.
  • The end to end tests are slow compared to finer grained unit tests or smaller whitebox integration tests that do not have to use the browser
  • Test failures are often difficult to diagnose since the tests are running out of process without direct access to the actual application. Some folks try to alleviate this issue with screenshots of the browser or in more advanced usages, trying to correlate the application logs to the test runs
  • Test failures often happen because related test databases are not in the expected state

I’m laying it on pretty thick here, but I think that I’m getting my point across that only relying on Selenium based browser testing is potentially very inefficient and sometimes ineffective. Now, let’s consider how the “critter stack” tools and the testing approach I used up above solve some of the issues I raised just above:

  • Postgresql itself is very easy to run in Docker containers or, if you have to, to install locally. That makes it friendly for automated testing where you really, really want to have isolated testing infrastructure and avoid sharing any kind of stateful resource between testing processes
  • Marten in particular has built in support for setting up known database states going into automated tests. This is invaluable for integration testing
  • Executing directly against HTTP API endpoints is much faster than browser testing with something like Selenium. Faster executing tests == faster feedback cycles == better development throughput and delivery, period
  • Running the tests completely in process with the application such as we did with Alba makes debugging test failures much easier for developers than trying to solve Selenium failures in a CI environment
  • Using the Alba + xUnit.Net (or NUnit etc) approach means that the integration tests can live with the application code and can be executed on demand whenever. That shifts the testing “left” in the development cycle compared to the slower Selenium running on CI only cycle. It also helps developers quickly spot check potential issues.
  • By embedding the integration tests directly in the codebase, you’re much less likely to get the drift between the application itself and automated tests that frequently arises from Selenium centric approaches.
  • This approach gets developers involved with the test automation efforts. I strongly believe that it’s impossible for large scale test automation to work whatsoever without developer involvement
  • Whitebox tests are simply much more efficient than the blackbox model. This statement is likely to get me yelled at by real testing professionals, but it’s still true

This post took way, way too long to write compared to how I thought it would go. I’m going to make a little bonus followup on using Lamar of all things for other random test state resets.