Marten Event Sourcing Gets Some New Tools

JasperFx Software has gotten the chance this work to build out several strategic improvements to both Marten and Wolverine through collaborations with our clients who have had some specific needs. This has been highly advantageous because it’s helped push some significant, long planned technical improvements while getting all important feedback as clients integrate the new features. Today I’d like to throw out a couple valuable features and capabilities that Marten has gained as part of recent client work.

“Side Effects” in Projections

In a recent post called Multi Step Workflows with the Critter Stack I talked about using Wolverine sagas (really process managers if you have to be precise about the pattern name because I’m slopping about interchanging “saga” and “process manager”) for long running workflows. In that post I talked about how an incoming file would be:

Broken up into batches of rows
Each batch would be validated as a separately handled message for some parallelization and more granular retries
When there were validation results recorded for each record batch, the file processing itself would either stop with a call back message summarizing the failures to the upstream sender or continue to the next stage.

As it turns out, event sourcing with a projected aggregate document for the state of the file import turns out to be another good way to implement this workflow, especially with the new “side effects” model recently introduced in Marten at the behest of a JasperFx client.

In this usage. let’s say that we have this aggregated state for a file being imported:

public class FileImportState
{

    // Identity for this saga within our system
    public Guid Id { get; set; }
    public string FileName { get; set; }
    public string PartnerTrackingNumber { get; set; }
    public DateTimeOffset Created { get; set; } = DateTimeOffset.UtcNow;
    public List<RecordBatchTracker> RecordBatches { get; set; } = new();

    public FileImportStage Stage { get; set; } = FileImportStage.Validating;
}

The FileImportState would be updated by appending events like BatchValidated, with Marten “projecting” those events in the rolled up state of the entire file. In Marten’s async daemon process that runs projections in a background process, Marten is processing a certain range (think events 10000 to 11000) at a time. As the daemon processes events into a projection for the FileImportState, it’s grouping the events in that range into event “slices” that are grouped by file id.

For managing the workflow, we can now append all new events as a “side effect” of processing an event slice in the daemon as the aggregation data is updated in the background. Let’s say that we have a single stream projection for our FileImportState aggregation like this below:

public class FileImportProjection : SingleStreamProjection<FileImportState>
{
    // Other Apply / Create methods to update the state of the 
    // FileImportState aggregate document

    public override ValueTask RaiseSideEffects(IDocumentOperations operations, IEventSlice<FileImportState> slice)
    {
        var state = slice.Aggregate;
        if (state.Stage == FileImportStage.Validating &&
            state.RecordBatches.All(x => x.ValidationStatus != RecordStatus.Pending))
        {
            // At this point, the file is completely validated, and we can decide what should happen next with the
            // file
            
            // Are there any validation message failures?
            var rejected = state.RecordBatches.SelectMany(x => x.ValidationMessages).ToArray();
            if (rejected.Any())
            {
                // Append a validation failed event to the stream
                slice.AppendEvent(new ValidationFailed());
                
                // Also, send an outgoing command message that summarizes
                // the validation failures
                var message = new FileRejectionSummary()
                {
                    FileName = state.FileName,
                    Messages = rejected,
                    TrackingNumber = state.PartnerTrackingNumber
                };
                
                // This will "publish" a message once the daemon
                // has successfully committed all changes for the 
                // current batch of events
                // Unsurprisingly, there's a Wolverine integration 
                // for this
                slice.PublishMessage(message);
            }
            else
            {
                slice.AppendEvent(new ValidationSucceeded());
            }
        }

        return new ValueTask();
    }
}

And unsurprisingly, there is also the ability to “publish” outgoing messages as part of processing through asynchronous projections with an integration to Wolverine available.

This feature has long, long been planned and I was glad to get the chance to build it out this fall for a client. I’m happy to say that this is in production for them — after the obligatory shakedown cruise and some bug fixes.

Optimized Projection Rebuilds

Another JasperFx client has a system where they retrofitted Marten into an in flight system using event sourcing for a very large data set, but didn’t take advantage of many Marten capabilities including the ability to effectively pre-build or “snapshot” projected data to optimize system state reads.

With a little bit of work in their system, we knew we would be able to introduce the new projection snapshotting into their system with Marten’s blue/green deployment model for projections where Marten would immediately start trying to pre-build a new projection (or new version of an existing projection) from scratch. Great! Except we knew that was potentially going to be a major performance problem until the projection caught up to the current “high water mark” of the event store.

To ease the cost of introducing a new, persisted projection on top of ~100 million events, we built out Marten’s new optimized projection rebuild feature. To demonstrate what I mean, let’s first opt into using this feature (it had to be opt in because it forces users to made additive changes to existing database tables):

builder.Services.AddMarten(opts =>
{
    opts.Connection("some connection string");

    // Opts into a mode where Marten is able to rebuild single
    // stream projections faster by building one stream at a time
    // Does require new table migrations for Marten 7 users though
    opts.Events.UseOptimizedProjectionRebuilds = true; 
});

Now, when our users redeploy their system with the new snapshotted projection running with Marten’s Async workflow for the first time Marten will see that the projection has not been processed before, so will try to use an “optimized rebuild mode.” Since we’ve turned on optimized projection rebuilds, for a single stream projection, Marten runs the projection in “rebuild” mode by:

First building a new table to track each event stream that relates to the aggregate type in question, but builds this table in reverse order of when each stream has been changed. The whole point of that is to make sure our optimized rebuild process is dealing with the most recently changed event streams so that the system can perform well even while the rebuild process in running
The rebuild process rebuilds the aggregates event stream by event stream as a way of minimizing the number of database reads and writes it takes to rebuild the single stream projection. Compare that to the previous, naive “left fold” approach where it just works from event sequence = 1 to the high water mark and constantly writes and reads back the same projection document as its encountered throughout the event store
When the optimized rebuild is complete, it switches the projection to running in its normal, continuous mode from the point at which the rebuild started

That’s a lot of words and maybe some complicated explanation, but the point is that Marten makes it possible to introduce new projections to a large, in flight system without incurring system downtime or even inconsistent data showing up to users.

Other Recent Improvements for Clients

Some other recent work that JasperFx has done for our clients includes:

Masking private data in event data
The “quick append” model for the event store that has done a lot to reduce runtime problems with Marten’s asynchronous projection execution
Caching in aggregation projections as a way to improve throughput in the async daemon
A lot of various support for strong typed identifiers in all kinds of nook and crannies of the Marten feature set
The utilization of PostgreSQL table partitioning for more performant systems using Marten in some circumstances

Summary

I was part of a discussion slash argument a couple weeks ago about whether or not it was necessary to use an off the shelf event sourcing library or framework like Marten, or if you were just fine rolling your own. While I’d gladly admit that you can easily build purely a storage subsystem for events, it’s not even remotely feasible to quickly roll your own tooling that matches advanced features in Marten such as the work I presented here.

Specification Usage with Marten for Repository-Free Development

I’ll jump into real discussions about architecture later in this post, but let’s say that we’re starting the development of a new software system. And for a variety of reasons I’ll try to discuss later, we want to eschew the usage of repository abstractions and be able to use all the power of our persistence tooling, which in our case is Marten of course. We’re also going to leverage a Vertical Slice Architecture approach for our codebase (more on this later).

In some cases, we might very well hit complicated database queries or convoluted LINQ expressions that are duplicated across different command or query handler “slices” within our system. Or maybe we just want some workflow code to be cleaner and easier to understand that it would be if we embedded a couple dozen lines of ugly LINQ expression code directly into the workflow code.

Enter the Specification pattern, which you’ve maybe seen from Steve Smith’s work, but I’ve run across a few times over the years. The Specification pattern is just the encapsulation of reusable query of some sort into a custom type. Marten has direct support baked in for the specification pattern through the older compiled query mechanism and the newer, more flexible query plan feature.

First, here’s an example of a compiled query:

public class FindUserByAllTheThings: ICompiledQuery<User>
{
    public string Username { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public Expression<Func<IMartenQueryable<User>, User>> QueryIs()
    {
        return query =>
            query.Where(x => x.FirstName == FirstName && Username == x.UserName)
                .Where(x => x.LastName == LastName)
                .Single();
    }
}

To execute the query above, it’s this syntax on any Marten IQuerySession or IDocumentSession:

        // theSession is an IQuerySession 
        var user = await theSession.QueryAsync(new FindUserByAllTheThings
        {
            Username = "jdm", FirstName = "Jeremy", LastName = "Miller"
        });

Compiled queries are obviously a weird API, but they come with a bit of a performance boost by being able to “remember” the LINQ parsing and SQL construction inside of Marten. Think of Marten compiled queries as the equivalent to a stored procedure — but maybe with more performance advantages.

Marten compiled queries do come with some significant limitations in usefulness as they really don’t allow for any runtime flexibility. To that end, Marten introduced the query plan idea as a more generic specification implementation that can support anything that Marten itself can do.

A “query plan” is just an implementation of this interface:

public interface IQueryPlan<T>
{
    Task<T> Fetch(IQuerySession session, CancellationToken token);
}

// and optionally, this too:
public interface IBatchQueryPlan<T>
{
    Task<T> Fetch(IBatchedQuery query);
}

And executed against Marten with this method on the IQuerySession API:

Task<T> QueryByPlanAsync<T>(IQueryPlan<T> plan, CancellationToken token = default);

As you’d probably guess, it’s just a little bit of double dispatch in terms of its implementation, but in concept this gives you the ability to create reusable query plans against Marten that enables the usage of anything that Marten itself can do — including in some cases, the ability to enroll inside of Marten batched querying for better performance.

Not that I want to run around encouraging the copious usage of dynamic mock objects in your unit tests, but it is very feasible to mock the usage of query plans or compiled query objects against Marten’s IQuerySession in a way that is not even remotely feasible for trying to directly mock Marten’s LINQ provider. And even though it is highly not recommended by me and probably completely moronic to do so, folks really do try to use mock objects around LINQ.

I originally built the query plan implementation in Marten after working with a JasperFx client who had some significant opportunities to improve their codebase by ditching the typical Clean/Onion Architecture usage of repository abstractions over Marten. Their current repository usage is mostly the kind of silly passthrough queries that irritate me about Clean Architecture codebases, but a handful of very complicated queries that are reused across multiple use cases. The query plan idea was a way of allowing them to encapsulate the big, crazy queries in a single place that could be shared across different handlers, but didn’t force them into using a repository.

An Aside on the Don’t Repeat Yourself Principle

The old DRY principle is a bit of a double edged sword. It’s absolutely true that creating duplication of functionality in your system can frequently hurt as rules change over time or you encounter bugs that have to be addressed in multiple places — while inevitably missing some of those places sometimes. It’s still valuable to remove duplication of logic or behavior that crops up in your system. It’s also very true that some attempts to “DRY” up code can lead to extra complexity that makes your system harder to understand and does more bad to good. Or the work to DRY up code just doesn’t pay off enough. Unfortunately, my only advice is to take things on a case by case basis. I certainly don’t buy off into any kind of black and white “share nothing” philosophy for modular monoliths, micro services, or vertical slices.

An Aside on Clean/Onion Architecture

Let’s just dive right in by me stating that I loathe the Clean/Onion Architecture approach as it is typically used by real teams in the real world as a prescriptive layered architecture that scatters related code through umpteen million separate projects. I especially dislike the copious usage of the “Repository” pattern in these templates for a handful of reasons around the useless passthroughs or accidentally causing chatty interaction between the application and database that can kill performance.

Mostly though, my strong preference is to adopt the “Vertical Slice Architecture” mantra of keeping closely related code together. For persistence code, I’d ideally like to even drop the query code in the same files — or at least the same namespace folder — as the business logic for the command or query handler that uses the data from the queries. My thinking here is that I want the system to be as easy to reason about as possible, and that includes being able to easily understand the database calls that result from handling a query or command. And honestly, I’d also like developers to just be able to write code for a feature at a time in one place without jumping all over the codebase to follow some architect’s idea of proper code organization.

When I’d use the Repository Pattern

I would maybe choose to use the “Repository” pattern to wrap my system’s underlying persistence tooling in certain conditions. Offhand, I thought of these scenarios so far:

Maybe some particular query logic is very involved and I deem it to be helpful to move that code into its own “single responsibility” method/function/class
Maybe the underlying persistence tooling is tedious of difficult to use, and by abstracting that low level access behind a repository abstraction I’m making the rest of the code simpler and probably even enhancing testability — but I think I’d strongly recommend against adopting persistence tooling that’s like that in the first place if you can possibly help it!
If there’s some sort of caching layer maybe in between your code and the persistence tooling
To eliminate some code duplication of query logic between use cases — but the point of this blog post is going to be about using the “Specification” pattern as an alternative to eliminate duplication without having to resort to a repository abstraction

Summarizing My Preferred Approach

My default approach for my own development and my strong advice for Marten users is to largely eschew repository patterns and any other kind of abstraction wrapper around Marten’s main IQuerySession or IDocumentSession APIs. My thinking goes along the lines of:

The Marten API just isn’t that complicated to begin with
You should never even dream that LINQ providers are even remotely equivalent between tools, so the idea that you’re going to be able to swap out persistence tooling and the LINQ queries will “just work” with the next tool is a pipe dream
I think it’s very rare to swap out databases underneath an existing application anyway, and you’re pretty well in for at least a partial rewrite if you try to no matter what kind of Clean/Onion/Ports and Adapters style abstractions you’ve written anyway. Sure, maybe you can swap between two different, but very similar relational databases, but why would you bother? Except possibly for the “let’s save hosting costs by moving from Sql Server to PostgreSQL” move that lots of people discuss but never really do.
As I tried to explain in my post Network Round Trips are Evil, it’s frequently important or at least valuable to get at the more advanced features of your persistence tooling to improve performance, with Marten’s feature set for batched querying or including related documents being some of the first examples that spring to mind. And that’s not an imaginary use case, because I’m currently working with a JasperFx client whose codebase could probably be more performant if they utilized those features, but first we’re going to have to unwind some repository abstractions just to get at those Marten capabilities

Part of my prescriptive advice for being more successful in systems development is to eschew the usage of the old, classic “Repository” pattern and just use the actual persistence tooling API in your code with some exceptions of course for complicated querying, to eliminate duplication, or maybe to add in some caching or validation outside of the persistence tooling. More on those exceptions soon.

The newer query plan feature in Marten gives us specification pattern support that allows us to reuse or just encapsulate complicated query logic in a way that makes it easy to reuse across vertical slices.

Message Broker per Tenant with Wolverine

The new feature shown in this post was built by JasperFx Software as part of a client engagement. This is exactly the kind of novel or challenging issue we frequently help our clients solve. If there’s something in your shop’s ongoing efforts where you could use some extra technical help, reach out to sales@jasperfx.net and we’ll be happy to talk with you.

Wolverine 3.4 was released today with a large new feature for multi-tenancy through asynchronous messaging. This feature set was envisioned for usage in an IoT system using the full “Critter Stack” (Marten and Wolverine) where “our system” is centralized in the cloud, but has to communicate asynchronously with physical devices deployed at different client sites:

The system in question already uses Marten’s support for separating per tenant information into separate PostgreSQL databases. Wolverine itself works with Marten’s multi-tenancy to make that a seamless process within Wolverine messaging workflows. All of that arguably quite robust already support was envisioned to be running within either HTTP web services or asynchronous messaging workflows completely controlled by the deployed application and its peer services. What’s new with Wolverine 3.4 is the ability to isolate the communication with remote client (tenant) devices and the centralized, cloud deployed “our system.”

We can isolate the traffic between each client site and our system first by using a separate Rabbit MQ broker or at least a separate virtual host per tenant as implied in the code sample from the docs below:

var builder = Host.CreateApplicationBuilder();

builder.UseWolverine(opts =>
{
    // At this point, you still have to have a *default* broker connection to be used for 
    // messaging. 
    opts.UseRabbitMq(new Uri(builder.Configuration.GetConnectionString("main")))
        
        // This will be respected across *all* the tenant specific
        // virtual hosts and separate broker connections
        .AutoProvision()

        // This is the default, if there is no tenant id on an outgoing message,
        // use the default broker
        .TenantIdBehavior(TenantedIdBehavior.FallbackToDefault)

        // Or tell Wolverine instead to just quietly ignore messages sent
        // to unrecognized tenant ids
        .TenantIdBehavior(TenantedIdBehavior.IgnoreUnknownTenants)

        // Or be draconian and make Wolverine assert and throw an exception
        // if an outgoing message does not have a tenant id
        .TenantIdBehavior(TenantedIdBehavior.TenantIdRequired)

        // Add specific tenants for separate virtual host names
        // on the same broker as the default connection
        .AddTenant("one", "vh1")
        .AddTenant("two", "vh2")
        .AddTenant("three", "vh3")

        // Or, you can add a broker connection to something completel
        // different for a tenant
        .AddTenant("four", new Uri(builder.Configuration.GetConnectionString("rabbit_four")));

    // This Wolverine application would be listening to a queue
    // named "incoming" on all virtual hosts and/or tenant specific message
    // brokers
    opts.ListenToRabbitQueue("incoming");

    opts.ListenToRabbitQueue("incoming_global")
        
        // This opts this queue out from being per-tenant, such that
        // there will only be the single "incoming_global" queue for the default
        // broker connection
        .GlobalListener();

    // More on this in the docs....
    opts.PublishMessage<Message1>()
        .ToRabbitQueue("outgoing").GlobalSender();
});

With this solution, we now have a “global” Rabbit MQ broker we can use for all internal communication or queueing within “our system”, and a separate Rabbit MQ virtual host for each tenant. At runtime, when a message tagged with a tenant id is published out of “our system” to a “per tenant” queue or exchange, Wolverine is able to route it to the correct virtual host for that tenant id. Likewise, Wolverine is listening to the queue named “incoming” on each virtual host (plus the global one), and automatically tags messages coming from the per tenant virtual host queues with the correct tenant id to facilitate the full Marten/Wolverine workflow downstream as the incoming messages are handled.

Now, let’s switch it up and use Azure Service Bus instead to basically do the same thing. This time though, we can register additional tenants to use a separate Azure Service Bus fully qualified namespace or connection string:

var builder = Host.CreateApplicationBuilder();

builder.UseWolverine(opts =>
{
    // One way or another, you're probably pulling the Azure Service Bus
    // connection string out of configuration
    var azureServiceBusConnectionString = builder
        .Configuration
        .GetConnectionString("azure-service-bus");

    // Connect to the broker in the simplest possible way
    opts.UseAzureServiceBus(azureServiceBusConnectionString)

        // This is the default, if there is no tenant id on an outgoing message,
        // use the default broker
        .TenantIdBehavior(TenantedIdBehavior.FallbackToDefault)

        // Or tell Wolverine instead to just quietly ignore messages sent
        // to unrecognized tenant ids
        .TenantIdBehavior(TenantedIdBehavior.IgnoreUnknownTenants)

        // Or be draconian and make Wolverine assert and throw an exception
        // if an outgoing message does not have a tenant id
        .TenantIdBehavior(TenantedIdBehavior.TenantIdRequired)

        // Add new tenants by registering the tenant id and a separate fully qualified namespace
        // to a different Azure Service Bus connection
        .AddTenantByNamespace("one", builder.Configuration.GetValue<string>("asb_ns_one"))
        .AddTenantByNamespace("two", builder.Configuration.GetValue<string>("asb_ns_two"))
        .AddTenantByNamespace("three", builder.Configuration.GetValue<string>("asb_ns_three"))

        // OR, instead, add tenants by registering the tenant id and a separate connection string
        // to a different Azure Service Bus connection
        .AddTenantByConnectionString("four", builder.Configuration.GetConnectionString("asb_four"))
        .AddTenantByConnectionString("five", builder.Configuration.GetConnectionString("asb_five"))
        .AddTenantByConnectionString("six", builder.Configuration.GetConnectionString("asb_six"));
    
    // This Wolverine application would be listening to a queue
    // named "incoming" on all Azure Service Bus connections, including the default
    opts.ListenToAzureServiceBusQueue("incoming");

    // This Wolverine application would listen to a single queue
    // at the default connection regardless of tenant
    opts.ListenToAzureServiceBusQueue("incoming_global")
        .GlobalListener();
    
    // Likewise, you can override the queue, subscription, and topic behavior
    // to be "global" for all tenants with this syntax:
    opts.PublishMessage<Message1>()
        .ToAzureServiceBusQueue("message1")
        .GlobalSender();

    opts.PublishMessage<Message2>()
        .ToAzureServiceBusTopic("message2")
        .GlobalSender();
});

This is a lot to take in, but the major point is to keep client messages completely separate from each other while also enabling the seamless usage of multi-tenanted workflows all the way through the Wolverine & Marten pipeline. As we deal with the inevitable teething pains, the hope is that the behavioral code within the Wolverine message handlers never has to be concerned with any kind of per-tenant bookkeeping. For more information, see:

Multi-Tenancy with Database per Tenant from the Marten documentation
Multi-Tenancy with Wolverine
Multi-Tenancy and Marten from the Wolverine documentation
Multi-Tenancy with Rabbit MQ from the Wolverine documentation
Multi-Tenancy with Azure Service Bus from the Wolverine documentation

And as I typed all of that out, I do fully realize that there would be some value in having a comprehensive “Multi-Tenancy with the Critter Stack” guide in one place.

Summary

I honestly don’t know if this feature set gets a lot of usage, but it came out of what’s been a very productive collaboration with JasperFx’s original customer as we’ve worked together on their IoT system. Quite a bit of improvements to Wolverine have come about as a direct reaction to friction or opportunities that we’ve spotted with our collaboration.

As far as multi-tenancy goes, I think the challenges for the Critter Stack toolset has been to give our users all the power they need to keep data and now messaging completely separate across tenants while relentlessly removing repetitive code ceremony or usability issues. My personal philosophy is that lower ceremony code is an important enabler of successful software development efforts over time.

Network Round Trips are Evil, So Batch Your Queries When You Can

JasperFx Software frequently helps our customers wring better performance or scalability out of our customer’s systems. A somewhat frequent opportunity for improving the responsiveness and throughput of systems is merely identifying ways to batch up requests from middle tier, server side code to the backing database or databases. There’s a certain amount of overhead in making any network round trips between processes, and it often pays off in terms of performance to batch up queries or commands to reduce the number of network round trips.

Today I’m merely going to focus on Marten as a persistence tool and a bit on Wolverine as “Mediator” and show some ways that Marten reduces network round trips. Just know though that this general idea of reducing network round trips by batching up database queries or commands is certainly going to apply to improving performance with any other persistence tooling.

Batching Writes

First off, let’s just look at doing a mixed bag of “writes” with a Marten session to add, delete, or modify user data:

    public static async Task modify_some_users(IDocumentSession session)
    {
        // Mixed bag of document operations
        session.Insert(new User{FirstName = "Hans", LastName = "Gruber"});
        session.Store(new User{FirstName = "John", LastName = "McClane"});
        session.DeleteWhere<User>(x => x.LastName == "Miller");

        session.Patch<User>(x => x.LastName == "May").Set(x => x.Nickname, "Mayday");

        // Let's append some events too just for fun!
        session.Events.StartStream<User>(new UserCreated("Harry", "Ellis"));

        // Commit all the changes
        await session.SaveChangesAsync();
    }

What’s important to note in the code up above is that all the logical operations to insert, “upsert”, delete, patch, or start event streams is batched up into a single database round trip when session.SaveChangesAsync() is called. In the early days of Marten we tried a lot of different things to improve throughput in Marten, including alternative serializers, reducing string concatenation, code generation techniques, and alternative data structures internally. Our consistent finding was that the single biggest improvements always came from reducing network round trips, with alternative JSON serializers being a distant second, and every other factor far behind that.

If you’re curious about the technical underpinnings, Marten 7+ is creating a single NpgsqlBatch for all the commands and even using positional parameters because that’s a touch more efficient for the interaction with PostgreSQL.

Moving to another example, let’s say that you have workflow where you need to apply logical changes to a batch of Item entities using a mix of Marten and Wolverine. Here’s a first, naive cut at this handler:

public static class ApproveItemsHandler
{
    // I'm passing in CancellationToken because:
    // a. It's probably a good idea anyway
    // b. That's how Wolverine "enforces" message timeouts
    public static async Task HandleAsync(
        ApproveItems message,
        IDocumentSession session,
        CancellationToken token)
    {
        foreach (var id in message.Ids)
        {
            var existing = await session.LoadAsync<Item>(id, token);
            if (existing != null)
            {
                existing.Approved = true;
                session.Store(existing);
            }
        }

        await session.SaveChangesAsync(token);
    }
}

Now, let’s assume that we could easily be getting 100-1000 different ids of Item entities to approve at any one time, which would make this operation chatty and potentially slow. Let’s make it a little worse though and add in Wolverine as a “mediator” to handle each individual Item inline:

public static class ApproveItemHandler
{
    public static async Task HandleAsync(
        ApproveItem message, 
        IDocumentSession session, 
        CancellationToken token)
    {
        var existing = await session.LoadAsync<Item>(message.Id, token);
        if (existing == null) return;

        existing.Approved = true;

        await session.SaveChangesAsync(token);
    }
}

public static class ApproveItemsHandler
{
    // I'm passing in CancellationToken because:
    // a. It's probably a good idea anyway
    // b. That's how Wolverine "enforces" message timeouts
    public static async Task HandleAsync(
        ApproveItems message,
        IMessageBus bus,
        CancellationToken token)
    {
        foreach (var id in message.Ids)
        {
            await bus.InvokeAsync(new ApproveItem(id), token);
        }
    }
}

In terms of performance, the second version is even worse. We compounded the existing chattiness problem with looking up each Item individually by separating out the database “writes” to separate database calls and separate transactions within “Wolverine as Mediator” usage through that InvokeAsync()call. You should be aware that when you use any kind of in process “Mediator” tool like Wolverine, MediatR, Brighter, or MassTransit’s in process mediator functionality that each call to InvokeAsync() involves a certain amount of overhead and very likely means a nested transaction that gets committed independently from the parent message handling or HTTP request that triggered the InvokeAsync() call. I think I might go so far as to say that calling IMessageBus.InvokeAsync() from another message handler is a “guilty until proven innocent” type of approach.

I’d of course argue here that the performance may or may not end up being a big deal, but not having a transactional boundary around the original message processing can easily lead to inconsistent state in our system if any of the individual Item updates fail.

Let’s make one last version of this batch approve item handler with an eye toward reducing network round trips and keeping a strongly consistent transaction boundary around all the approvals (meaning they all succeed or all fail, no in between “who knows what really happened” state):

public static class ApproveItemsHandler
{
    // I'm passing in CancellationToken because:
    // a. It's probably a good idea anyway
    // b. That's how Wolverine "enforces" message timeouts
    public static async Task HandleAsync(
        ApproveItems message,
        IDocumentSession session,
        CancellationToken token)
    {
        // Find all the related items in *one* network round trip
        var items = await session.LoadManyAsync<Item>(token, message.Ids);
        foreach (var item in items)
        {
            item.Approved = true;
            session.Store(item);
        }

        await session.SaveChangesAsync().ConfigureAwait(false);
    }
}

In the usage above, we’re making one database call to fetch the matching Item entities, and updating all of the impacted Item entities in a single batched database command within the IDocumentSession.SaveChangesAsync(). This version should almost always be much faster than the earlier versions where we issued individual queries for each Item, plus we have better transactional consistency in the case of system errors.

Lastly of course for the sake of completeness, we could just do this with one network round trip:

public static class ApproveItemsHandler
{
    // Assuming here that Wolverine "auto-transaction"
    // middleware is in place
    public static void Handle(
        ApproveItems message,
        IDocumentSession session)
    {
        session
            .Patch<Item>(x => x.Id.IsOneOf(message.Ids))
            .Set(x => x.Approved, true);
    }
}

That last version eliminates the usage of current state to validate the operation first or give us any indication of what exactly was changed, but hey, that’s the fastest possible way to code this with Marten and it might be suitable sometimes in your own system.

Batch Querying

Marten has strong support for batch querying where you can combine any number of disparate queries in a batch to the database, and read the results one at a time afterward. Here’s an example from the Marten documentation, but just know that session in this case is a Marten IQuerySession:

// Start a new IBatchQuery from an active session
var batch = session.CreateBatchQuery();

// Fetch a single document by its Id
var user1 = batch.Load<User>("username");

// Fetch multiple documents by their id's
var admins = batch.LoadMany<User>().ById("user2", "user3");

// User-supplied sql
var toms = batch.Query<User>("where first_name == ?", "Tom");

// Where with Linq
var jills = batch.Query<User>().Where(x => x.FirstName == "Jill").ToList();

// Any() queries
var anyBills = batch.Query<User>().Any(x => x.FirstName == "Bill");

// Count() queries
var countJims = batch.Query<User>().Count(x => x.FirstName == "Jim");

// The Batch querying supports First/FirstOrDefault/Single/SingleOrDefault() selectors:
var firstInternal = batch.Query<User>().OrderBy(x => x.LastName).First(x => x.Internal);

// Kick off the batch query
await batch.Execute();

// All of the query mechanisms of the BatchQuery return
// Task's that are completed by the Execute() method above
var internalUser = await firstInternal;
Debug.WriteLine($"The first internal user is {internalUser.FirstName} {internalUser.LastName}");

That’s a little more code and complexity than you might have otherwise if you just make the queries independently, but there’s some significant performance gains to be made from batching queries.

This is a much, much longer discussion than I have ambition for today, but the rampant usage of repository abstractions around raw persistence tooling like Marten has a tendency to knock out more powerful functionality like query batching. That’s especially compounded with “noun-centric” code organization where you may have IOrderRepository and IInvoiceRepository wrapping your raw persistence tooling, but yet frequently have logical operations that deal with both Order and Invoice data at the same time. With Wolverine especially, I’m pushing JasperFx clients and our users to try to get away with eschewing these kinds of abstractions and leaning hard into Wolverine’s “A-Frame Architecture” approach so you can utilize the full power of Marten (or EF Core or RavenDb or whatever else you actually use).

What I can tell you is that for a current JasperFx client, we’re looking in the long run to collapse and simplify and inline their current usage of Railway Programming and MediatR-calling-other-MediatR handlers as a way to enable us to utilize query batching to optimize some of their very complicated operations that today end up being very chatty between the server and database.

Including Related Entities when Querying

There are plenty of times you’ll have an operation in your system that needs information from multiple, related entity types. Marten provides its version of Include() in its LINQ provider as a way to batch query related documents in fewer network round trips, and hence better performance like this example from the tests:

[Fact]
public async Task simple_include_for_a_single_document()
{
    var user = new User();
    var issue = new Issue { AssigneeId = user.Id, Title = "Garage Door is busted" };

    using var session = theStore.IdentitySession();
    session.Store<object>(user, issue);
    await session.SaveChangesAsync();

    using var query = theStore.QuerySession();

    // The following query will fetch both the Issue document
    // and the related User document for the Issue in one
    // network round trip
    User included = null;
    var issue2 = query
        .Query<Issue>()
        .Include<User>(x => included = x).On(x => x.AssigneeId)
        .Single(x => x.Title == issue.Title);

    included.ShouldNotBeNull();
    included.Id.ShouldBe(user.Id);

    issue2.ShouldNotBeNull();
}

I’ll refer you to the documentation for more alternative usages, but just know that Marten has this capability and it’s a valuable way to improve performance in your system by reducing the number of network roundtrips between your code and the backend.

Marten’s Include() functionality was originally inspired/copied from RavenDb. We’ve unfortunately had some confusion in the past from folks coming over from EF Core where its Include() means something very different. Oh, and just to pull aside the curtain, it’s not doing any kind of JOIN behind the scenes, but a temporary table + multiple SELECT() statements.

Summary

I just wanted to get a handful of things across in this post:

Network round trips can easily be expensive and a contributing factor in poor system performance. Reducing the number of network round trips by batching queries can sometimes pay off overall even if that sometimes means more complex code
Marten has several features specifically meant to improve system performance by batching database queries that you can utilize. Both Marten and Wolverine are absolutely built with this philosophy of reducing network round trips as much as possible
Any coding or architectural strategy that results in excessive layering, long call stacks (A calls B that calls C that calls D that finally calls to a database), or really just obfuscates your understanding of how system operations lead to increased numbers of network round trips can easily be harmful to your system’s performance because you can’t easily “see” what your system is really doing

Personal Identifiable Information Masking in Marten

JasperFx Software helps our customers be more successful with their usage of the “Critter Stack” tools (or any other server side .NET tooling you might be using). The work in this post was delivered for a JasperFx customer to help protect their customer’s private information. If you need or want any help with event sourcing, Event Driven Architecture, or automated testing, drop us a note and we’d be happy to talk with you about what JasperFx can do for you.

I defy you to say the title of this post out loud in rapid succession without stumbling over it.

According to the U.S. Department of Labor, “Personal Identifiable Information” (PII) is defined as:

Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means.

Increasingly, Marten users are running into requirements to be able to “forget” PII that is persisted within a Marten database. For the document storage, I think this is relatively easy to do with a host of existing functionality including the partial update functionality that Marten got (back) in V7. For the event store though, there wasn’t anything built in that would have made it easy to erase or “mask” protected information within the persisted event data — until now!

The Marten 7.31 adds a new capability to erase or mask PII data within the event store.

For a variety of reasons, you may wish to remove or mask sensitive data elements in a Marten database without necessarily deleting the information as a whole. Documents can be amended with Marten’s Patching API. With event data, you now have options to reach into the event data and rewrite selected members as well as to add custom headers. First, start by defining data masking rules by event type like so:

var builder = Host.CreateApplicationBuilder();
builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));

    // By a single, concrete type
    opts.Events.AddMaskingRuleForProtectedInformation<AccountChanged>(x =>
    {
        // I'm only masking a single property here, but you could do as much as you want
        x.Name = "****";
    });

    // Maybe you have an interface that multiple event types implement that would help
    // make these rules easier by applying to any event type that implements this interface
    opts.Events.AddMaskingRuleForProtectedInformation<IAccountEvent>(x => x.Name = "****");

    // Little fancier
    opts.Events.AddMaskingRuleForProtectedInformation<MembersJoined>(x =>
    {
        for (int i = 0; i < x.Members.Length; i++)
        {
            x.Members[i] = "*****";
        }
    });
});

That’s strictly a configuration time effort. Next, you can apply the masking on demand to any subset of events with the IDocumentStore.Advanced.ApplyEventDataMasking() API. First, you can apply the masking for a single stream:

public static Task apply_masking_to_streams(IDocumentStore store, Guid streamId, CancellationToken token)
{
    return store
        .Advanced
        .ApplyEventDataMasking(x =>
        {
            x.IncludeStream(streamId);

            // You can add or modify event metadata headers as well
            // BUT, you'll of course need event header tracking to be enabled
            x.AddHeader("masked", DateTimeOffset.UtcNow);
        }, token);
}

As a finer grained operation, you can specify an event filter (Func<IEvent, bool>) within an event stream to be masked with this overload:

public static Task apply_masking_to_streams_and_filter(IDocumentStore store, Guid streamId, CancellationToken token)
{
    return store
        .Advanced
        .ApplyEventDataMasking(x =>
        {
            // Mask selected events within a single stream by a user defined criteria
            x.IncludeStream(streamId, e => e.EventTypesAre(typeof(MembersJoined), typeof(MembersDeparted)));

            // You can add or modify event metadata headers as well
            // BUT, you'll of course need event header tracking to be enabled
            x.AddHeader("masked", DateTimeOffset.UtcNow);
        }, token);
}

Note that regardless of what events you specify, only events that match a pre-registered masking rule will have the header changes applied.

To apply the event data masking across streams on an arbitrary grouping, you can use a LINQ expression as well:

public static Task apply_masking_by_filter(IDocumentStore store, Guid[] streamIds)
{
    return store.Advanced.ApplyEventDataMasking(x =>
        {
            x.IncludeEvents(e => e.EventTypesAre(typeof(QuestStarted)) && e.StreamId.IsOneOf(streamIds));
        });
}

Finally, if you are using multi-tenancy, you can specify the tenant id as part of the same fluent interface:

public static Task apply_masking_by_tenant(IDocumentStore store, string tenantId, Guid streamId)
{
    return store
        .Advanced
        .ApplyEventDataMasking(x =>
        {
            x.IncludeStream(streamId);

            // Specify the tenant id, and it doesn't matter
            // in what order this appears in
            x.ForTenant(tenantId);
        });
}

Here’s a couple more facts you might need to know:

The masking rules can only be done at configuration time (as of right now)
You can apply multiple masking rules for certain event types, and all will be applied when you use the masking API
The masking has absolutely no impact on event archiving or projected data — unless you rebuild the projection data after applying the data masking of course

Summary

The Marten team is at least considering support for crypto-shredding in Marten 8.0, but no definite plans have been made yet. It might fit into the “Critter Stack 2025” release cycle that we’re just barely starting.

Multi Step Workflows with the Critter Stack

I’m working with a JasperFx Software client who is in the beginning stages of building a pretty complex, multi-step file import process that is going to involve several different services. For the sake of example code in this post, let’s say that we have the (much simplified from my client’s actual logical workflow) workflow from the diagram above:

External partners (or customers) are sending us an Excel sheet with records that our system will need to process and utilize within our downstream systems (invoices? payments? people? transactions?)
For the sake of improved throughput, the incoming file is broken into batches of records so the smaller batches can be processed in parallel
Each batch needs to be validated by the “Validation Service”
When each batch has been completely validated:
- If there are any errors, send a rejection summary about the entire file to the original external partner
- If there are no errors, try to send each record batch to “Downstream System #1”
When each batch has been completely accepted or rejected by “Downstream System #1”
- If there are any rejections, send a rejection summary about the entire file to the original external partner
- If all batches are accepted by “Downstream System #1”, try to send each record batch to “Downstream System #2”
When each batch has been completely accepted or rejected by “Downstream System #2”
- If there are any rejections, send a rejection summary about the entire file to the original external partner and a message to “Downstream System #1” to reverse each previously accepted records in the file
- If all batches are accepted by “Downstream System #2”, send a successful receipt message to the original external partner and archive the intermediate state

Right off the bat, I think we can identify a couple needs and challenges:

We need some way to track the current, in process state of an individual file and where all the various batches are in that process
At every point, make decisions about what to do next in the workflow based on the current state of the file based on incremental process. And to make this as clear as possible, I think it’s extremely valuable to be able to clearly write, read, unit test, and reason about this workflow code without any significant coupling to the surrounding infrastructure.
The whole system should be resilient in the face of the expected transient hiccups like a database getting overwhelmed or a downstream system being temporarily down and “work” should never get lost or hopefully even require human intervention at runtime
Especially for large files, we absolutely better be prepared for some challenging concurrency issues when lots of incoming messages attempt to update that central file import processing state
Make it all performance too of course!

Alright, so we’re definitely using both Marten for persistence and Wolverine for the workflow and messaging between services for all of this. The first basic approach for the state management is to use Wolverine’s stateful saga support with Marten. In that case we might have a saga type in Marten something like this:

// Again, express the stages in terms of your
// business domain instead of technical terms,
// but you'll do better than me on this front!
public enum FileImportStage
{
    Validating,
    Downstream1,
    Downstream2,
    Completed
}

// As long as it's JSON serialization friendly, you can happily
// tighten up the access here all you want, but I went for quick and simple
public class FileImportSaga : 
    // Only necessary marker type for Wolverine here
    Saga, 
    
    // Opts into tracked version concurrency for Marten
    // We probably want in this case
    IRevisioned
{
    // Identity for this saga within our system
    public Guid Id { get; set; }
    public string FileName { get; set; }
    public string PartnerTrackingNumber { get; set; }
    public DateTimeOffset Created { get; set; } = DateTimeOffset.UtcNow;
    public List<RecordBatchTracker> RecordBatches { get; set; } = new();

    public FileImportStage Stage { get; set; } = FileImportStage.Validating;
    
    // Much more in just a bit...
}

Inside our system, we can start a new FileImportSaga and launch the first set of messages to validate each batch of records with this handler that reacts to a request to import a new file:

public record ImportFile(string fileName);

// This could have been done inside the FileImportSaga as well,
// but I think I'd rather keep that focused on the state machine
// and workflow logic
public static class FileImportHandler
{
    public static async Task<(FileImportSaga, OutgoingMessages)> Handle(
        ImportFile command, 
        IFileImporter importer,
        CancellationToken token)
    {
        var saga = await importer.ReadAsync(command.fileName, token);
        var messages = new OutgoingMessages();
        messages.AddRange(saga.CreateValidationMessages());

        return (saga, messages);
    }
}

public interface IFileImporter
{
    Task<FileImportSaga> ReadAsync(string fileName, CancellationToken token);
}

Let’s say that we’re receiving messages back from the Validation Message like this:

public record ValidationResult(Guid Id, Guid BatchId, ValidationMessage[] Messages);

public record ValidationMessage(int RecordNumber, string Message);

Quick note, if Wolverine is handling the messaging in the downstream systems, it’s helping make this easier by tracking the saga id in message metadata from upstream to downstream and back to the upstream through response messages. Otherwise you’d have to track the saga id on the incoming messages.

We could process the validation results in our saga one at a time like so:

    // Use Wolverine's cascading message feature here for the next steps
    public IEnumerable<object> Handle(ValidationResult validationResult)
    {
        var currentBatch = RecordBatches
            .FirstOrDefault(x => x.Id == validationResult.BatchId);
        
        // We'd probably rig up Wolverine error handling so that it either discards
        // a message in this case or immediately moves it to the dead letter queue
        // because there's no sense in trying to retry a message that can never be
        // processed successfully
        if (currentBatch == null) throw new UnknownBatchException(Id, validationResult.BatchId);
        currentBatch.ReadValidationResult(validationResult);
        
        var currentValidationStatus = determineValidationStatus();
        switch (currentValidationStatus)
        {
            case RecordStatus.Pending:
                yield break;
            
            case RecordStatus.Accepted:
                Stage = FileImportStage.Downstream1;
                foreach (var batch in RecordBatches)
                {
                    yield return new RequestDownstream1Processing(Id, batch.Id, batch.Records);
                }

                break;
            
            case RecordStatus.Rejected:
                // This saga is complete
                MarkCompleted();
                
                // Tell the original sender that this file is rejected
                // I'm assuming that Wolverine will get the right information
                // back to the original sender somehhow
                yield return BuildRejectionMessage();
                break;
            
        }
    }
    
    private RecordStatus determineValidationStatus()
    {
        if (RecordBatches.Any(x => x.ValidationStatus == RecordStatus.Pending))
        {
            return RecordStatus.Pending;
        }

        if (RecordBatches.Any(x => x.ValidationStatus == RecordStatus.Rejected))
        {
            return RecordStatus.Rejected;
        }

        return RecordStatus.Accepted;
    }

First off, I’m going to argue that the way that Wolverine supports its stateful sagas and its cascading message feature make the workflow logic pretty easy to unit test in isolation from all the infrastructure. That part is good, right? But what’s maybe not great is that we could easily be getting a bunch of those ValidationResult messages back for the same file at the same time because they’re handled in parallel, so we really need to be prepared for that.

We could rely on the Wolverine/Marten combination’s support for optimistic concurrency and just retry ValidationResult messages that fail because of caught ConcurrencyException, but that’s potentially thrashing the database and the application pretty hard. We could also solve this problem in a “sledgehammer to crack a nut” kind of way by using Wolverine’s strictly ordered listener approach that would force the file import status messages to be processed in order on a single running node:

builder.Host.UseWolverine(opts =>
{
    opts.UseRabbitMq(builder.Configuration.GetConnectionString("rabbitmq"));

    opts.ListenToRabbitQueue("file-import-updates")
        
        // Single file, serialized access across the
        // entire running application cluster!
        .ListenWithStrictOrdering();
});

That solves the concurrency issue in a pretty hard core way, but it’s not going to terribly performant because you’ve eliminated all concurrency between different files and you’re making the system constantly load, then save the FileImportSaga data for intermediate steps. Let’s adjust this and incorporate Wolverine’s new message batching feature.

First off, let’s add a new validation batch message like so:

public record ValidationResultBatch(Guid Id, ValidationResult[] Results);

And a new message handler on our saga type for that new message type:

    public IEnumerable<object> Handle(ValidationResultBatch batch)
    {
        var groups = batch.Results.GroupBy(x => x.BatchId);
        foreach (var group in groups)
        {
            var currentBatch = RecordBatches
                .FirstOrDefault(x => x.Id == group.Key);

            foreach (var result in group)
            {
                currentBatch.ReadValidationResult(result);
            }
        }

        return DetermineNextStepsAfterValidation();
    }

    // I pulled this out as a helper, but also, it's something
    // you probably want to unit test in isolation on just the FileImportSaga
    // class to nail down the workflow logic w/o having to do an integration
    // test
    public IEnumerable<object> DetermineNextStepsAfterValidation()
    {
        var currentValidationStatus = determineValidationStatus();
        switch (currentValidationStatus)
        {
            case RecordStatus.Pending:
                yield break;
            
            case RecordStatus.Accepted:
                Stage = FileImportStage.Downstream1;
                foreach (var batch in RecordBatches)
                {
                    yield return new RequestDownstream1Processing(Id, batch.Id, batch.Records);
                }

                break;
            
            case RecordStatus.Rejected:
                // This saga is complete
                MarkCompleted();
                
                // Tell the original sender that this file is rejected
                // I'm assuming that Wolverine will get the right information
                // back to the original sender somehhow
                yield return BuildRejectionMessage();
                break;
            
        }
    }

And lastly, we need to tell Wolverine how to do the message batching, which I’ll do first with this code:

public class ValidationResultBatcher : IMessageBatcher
{
    public IEnumerable<Envelope> Group(IReadOnlyList<Envelope> envelopes)
    {
        var groups = envelopes
            .GroupBy(x => x.Message.As<ValidationResult>().Id)
            .ToArray();
        
        foreach (var group in groups)
        {
            var message = new ValidationResultBatch(group.Key, group.OfType<ValidationResult>().ToArray());

            // It's important here to pass along the group of envelopes that make up 
            // this batched message for Wolverine's transactional inbox/outbox
            // tracking
            yield return new Envelope(message, group);
        }
    }

    public Type BatchMessageType => typeof(ValidationResultBatch);
}

Then lastly, in your Wolverine configuration in your Program file (or a helper method that’s called from Program), you’d tell Wolverine about the batching strategy like so:

builder.Host.UseWolverine(opts =>
{
    // Other Wolverine configuration...
    
    opts.BatchMessagesOf<ValidationResult>(x =>
    {
        x.Batcher = new ValidationResultBatcher();
        x.BatchSize = 100;
    });
});

With the message batching, you’re potentially putting less load on the database and improving performance by simply making fewer reads and writes over all. You might still have some concurrency concerns, so you have more options to control the parallelization of the ValidationResultBatch messages running locally like this in your UseWolverine() configuration:

    opts.LocalQueueFor<ValidationResultBatch>()
        
        // You *could* do this to completely prevent
        // concurrency issues
        .Sequential()

        // Or depend on some level of retries on concurrency
        // exceptions and let it parallelize work by file
        .MaximumParallelMessages(5);

We could choose to accept some risk of concurrent access to an individual FileImportSaga (unlikely after the batching, but still), so let’s add some better optimistic concurrency checking with our friend Marten. For any given Saga type that’s persisted with Marten, just implement the IRevisioned interface to let Wolverine know to opt into Marten’s concurrency protection like so:

public class FileImportSaga : 
    // Only necessary marker type for Wolverine here
    Saga, 
    
    // Opts into tracked version concurrency for Marten
    // We probably want in this case
    IRevisioned

That’s it, that’s all you need to do. What this does for you is create a check by Wolverine & Marten together that during the processing of any message on a FileImportSaga that no other message was successfully processed against that FileImportSaga between loading the initial copy of the saga at the time the transaction is committed. If Marten detects a concurrency violation upon the commit, it rejects the transaction and throws a ConcurrencyException. We can handle that with a series of retries to just have Wolverine retry the message from the new state with this error handling policy that I’m going to make specific to our FileImportSaga like so:

public class FileImportSaga : 
    // Only necessary marker type for Wolverine here
    Saga, 
    
    // Opts into tracked version concurrency for Marten
    // We probably want in this case
    IRevisioned
{
    public static void Configure(HandlerChain chain)
    {
        // Retry the message over again at least 3 times
        // with the specified wait times
        chain.OnException<ConcurrencyException>()
            .RetryWithCooldown(100.Milliseconds(), 250.Milliseconds(), 250.Milliseconds());
    }

    // ... the rest of FileImportSaga

See Wolverine’s error handling facilities for more information.

So now we’ve got the beginnings of a multi-step process using Wolverine’s stateful saga support. We’ve also taken some care to protect our file import process against concurrency concerns. And we’ve done all of this in a way where we can quite handily test the workflow logic by just doing state-based tests against the FileImportSaga with no database or message broker infrastructure in sight before we waste any time trying to debug the whole shebang.

Summary

The key takeaway I hope you get from this is that the full Critter Stack has some significant tooling to help you build complex, multi-step workflows. Pair that with the easy getting started stories that both tools have, and I think you have a toolset that allows you to quickly start while also scaling up to more complex needs when you need that.

As so very often happens, this blog post was bigger than I thought it would be, and I’m breaking it up into a series of a follow ups. In the next version of this post, we’ll take the same logical FileImportSaga and do the logical workflow tracking with Marten event sourcing to track the state and use some cool new Marten functionality for the workflow logic inside of Marten projections.

This might take a bit to get to, but I’ll also revisit this original implementation and talk about some extra Marten functionality to further optimize performance by baking in archiving through Marten soft-deletes and its support for PostgreSQL table partitioning.

So historically I’m actually pretty persnickety about being precise about technical terms and design pattern names, but I’m admittedly sloppy about calling something a “Saga” when maybe it’s technically a “Process Manager” and I got jumped online about that by a celebrity programmer. Sorry, not sorry?

Scaling Event Projections and Subscriptions with the Critter Stack

The feature set shown in this post was built earlier this year at the behest of a JasperFx Software client who has some unusually high data throughput and wanted to have some significant ability to scale up Marten and Wolverine‘s ability to handle a huge number of incoming events. We originally put this into what was meant to be a paid add on product, but after consultation with the rest of the Critter Stack core team and other big users, we’ve decided that it would be best for this functionality to be in the OSS core of Wolverine.

JasperFx Software is currently working with a client who has a system with around 75 million events in their database and the expectation that that database could double soon. At the same time, they need to be running around 15-20 different event projections continuously running asynchronously to build read side views. To put it mildly, they’re going to want some serious ability for Marten (with a possible helping hand from Wolverine) to handle that data in a performant manner.

Before Marten 7.0, Marten could only run projections with a “hot/cold” ownership mode that resulted in every possible projection running on a single application node within the cluster. So, not that awesome for scalability to say the least. With 7.0, Marten can do some load distribution of different projections, but it’s not terribly predictable and has no guarantee of spreading the load out.

Enter Wolverine 3.0 (RC-2 in this case) and its new ability to distribute event projections and subscriptions throughout an application cluster. With this option, as shown below:

opts.Services.AddMarten(m =>
    {
        m.DisableNpgsqlLogging = true;
        m.Connection(Servers.PostgresConnectionString);
        m.DatabaseSchemaName = "csp";

        // This was taken from Wolverine test code
        // Imagine there being far more projections and
        // subscriptions
        m.Projections.Add<TripProjection>(ProjectionLifecycle.Async);
        m.Projections.Add<DayProjection>(ProjectionLifecycle.Async);
        m.Projections.Add<DistanceProjection>(ProjectionLifecycle.Async);
    })
    .IntegrateWithWolverine(m =>
    {
        // This makes Wolverine distribute the registered projections
        // and event subscriptions evenly across a running application
        // cluster
        m.UseWolverineManagedEventSubscriptionDistribution = true;
    });

Using the UseWolverineManagedEventSubscriptionDistribution() option in place of Marten’s own async daemon management will give you a load distribution more like this:

Using this model, Wolverine can spread the asynchronous load to more running nodes so you can hopefully get a lot more throughput in your asynchronous projections without overloading any one node.

With this option, Wolverine is going to ensure that every single known asynchronous event projection and every event subscription is running on exactly one running node within your application cluster. Moreover, Wolverine will purposely stop and restart projections or subscriptions to purposely spread the running load across your entire cluster of running nodes.

In the case of using multi-tenancy through separate databases per tenant with Marten, this Wolverine “agent distribution” will assign the work by tenant databases, meaning that all the running projections and subscriptions for a single tenant database will always be running on a single application node. This was done with the theory that this affinity would hopefully reduce the number of used database connections over all.

If a node is taken offline, Wolverine will detect that the node is no longer accessible and try to move start the missing projection/subscription agents on another active node.

If you run your application on only a single server, Wolverine will of course run all projections and subscriptions on just that one server.

Some other facts about this integration:

Wolverine’s agent distribution does indeed work with per-tenant database multi-tenancy
Wolverine does automatic health checking at the running node level so that it can fail over assigned agents
Wolverine can detect when new nodes come online and redistribute work
Wolverine is able to support blue/green deployment and only run projections or subscriptions on active nodes where a capability is present. This just means that you can add all new projections or subscriptions, or even just new versions of a projection or subscription on some application nodes in order to do try “blue/green deployment.”
This capability does depend on Wolverine’s built-in leadership election — which fortunately got a lot better in Wolverine 3.0

Future Plans

While this functionality will be in the OSS core of Wolverine 3.0, we plan to add quite a bit of support to further monitor and control this feature with the planned “Critter Watch” management console tool we (JasperFx) are building. We’re planning to allow users to:

Visualize and monitor which projections and/or subscriptions are running on which application node
See a correlation to performance metrics being emitted to the Open Telemetry tool of your choice — with Prometheus PromQL compatible tools being supported first
Be able to create affinity groups between projections or subscriptions that might be using the same event data as a possible optimization
Allow individual projections or subscriptions to be paused or restarted
Trigger manual projection rebuilds at runtime
Trigger “rewinds” of subscriptions at runtime

We’re also early in planning to port the Marten event sourcing support to additional database engines. The above functionality will be available for those other database engines when we get there.

This functionality was originally conceived of something like 5-6 years ago, and it’s personally very exciting to me to finally see it out in the wild!

Critter Stack 2025

I realize the title sounds a little too similar to somebody else’s 2025 platform proposals, but let’s please just overlook that

This is a “vision board” document I wrote up and shared with our core team (Anne, JT, Babu, and Jeffry) as well as some friendly users and JasperFx Software customers. I dearly want to step foot into January 2025 with the “Critter Stack” as a very compelling choice for any shop about to embark on any kind of Event Driven Architecture — especially with the usage of Event Sourcing as part of a system’s persistence strategy. Moreover, I want to arrive at a point where the “Critter Stack” actually convinces organizations to choose .NET just to take advantage of our tooling. I’d be grateful for any feedback.

As of now, the forthcoming Wolverine 3.0 release is almost to the finish line, Marten 7 is probably just about done growing, and work on “Critter Watch” (JasperFx Software’s envisioned management console tooling for the “Critter Stack”) is ramping up. Now is a good time to detail a technical vision for the “Critter Stack” moving into 2025.

The big goals are:

Simplify the “getting started” story for using the “Critter Stack”. Not just in getting a new codebase up, but going all the way to how a Critter Stack app could be deployed and opting into all the best practices. My concern is that there are getting to be way too many knobs and switches scattered around that have to be addressed to really make performance and deployment robust.
Deliver a usable “Critter Watch” MVP
Expand the “Critter Stack” to more database options, with Sql Server and maybe CosmosDb being the leading contenders and DynamoDb or CockroachDb being later possibilities
Streamline the dependency tree. Find a way to reduce the number of GitHub repositories and Nugets if possible. Both for our maintenance overhead and also to try to simplify user setup

The major initiatives are:

Marten 8.0
Wolverine 4.0
“Critter Watch” and CritterStackPro.Projections (actually scratch the second part, that’s going to roll into the Wolverine OSS core, coming soon)
Ermine 1.0 – the Sql Server port of the Marten event store functionality
Out of the box project templates for Wolverine/Marten/Ermine usages – following the work done already by Jeffry Gonzalez
Future CosmosDb backed event store and Wolverine integration — but I’m getting a lot of mixed feedback about whether Sql Server or CosmosDb should be a higher priority

Opportunities to grow the Critter Stack user base:

Folks who are concerned about DevOps issues. “Critter Watch” and maybe more templates that show how to apply monitoring, deployment steps, and Open Telemetry to existing Critter Stack systems. The key point here is a whole lot of focus on maintainability and sustainability of the event sourcing and messaging infrastructure
Get more interest from mainstream .NET developers. Improve the integration of Wolverine and maybe Marten/Ermine as well with EF Core. This could include reaching parity with Marten for middleware support, side effects, and multi-tenancy models using EF Core. Also, maybe, hear me out, take a heavy drink, there could be an official Marten/Ermine projection integration to write projection data to EF Core? I know of at least one Critter Stack user who would use that. At this point, I’m leaning heavily toward getting Wolverine 3.0 out and mostly tackle this in the Wolverine 4.0 timeframe this fall
Expand to Sql Server for more “pure” Microsoft shops. Adding databases to the general Wolverine / Event Sourcing support (the assumption here is that the document database support in Marten would be too much work to move)
Introduce Marten and Wolverine to more people, period. Moar “DevRel” type activity! More learning videos. I’ll keep trying to do more conferences and podcasts. More sample applications. Some ideas for new samples might be a sample application with variations using each transport, using Wolverine inside of a modular monolith with multiple Marten stores and/or EF DbContexts, HTTP services, background processing. Maybe actually invest in some SEO for the websites.

Ecosystem Realignment

With major releases coming up with both Marten 8.0 and Wolverine 4.0 and the forthcoming Ermine, there’s an “opportunity” to change the organization of the code to streamline the number of GitHub repositories and Nugets floating around while also centralizing more code. There’s also an opportunity to centralize a lot of infrastructure code that could help the Ermine effort go much faster. Lastly, there are some options like code generation settings and application assembly determination that are today independently configured for Marten and Wolverine which repeatedly trips up our users (and flat out annoys me when I build sample apps).

We’re actively working to streamline the configuration code, but in the meantime, the current thinking about some of this is in the GitHub issue for JasperFx Ecosystem Dependency Reorganization. The other half of that is the content in the next section.

Projection Model Reboot

This refers to the “Reboot Projection Model API” in the Marten GitHub issue list. The short tag line is to move toward enabling easier usage of folks just writing explicit code. I also want us to tackle the absurdly confusing API for “multi-stream projections” as well. This projection model will be shared across Marten, Ermine (Sql Server-backed event store), and any future CosmosDb/DynamoDb/CockroachDb event stores.

Wrapping up Marten 7.0

Marten 7 introduced a crazy amount of new functionality on top of the LINQ rewrite, the connection management rewrite, and introduction of Polly into the core. Besides some (important) ongoing work for JasperFx clients, the remainder of Marten 7 is hopefully just:

Mark all synchronous APIs that invoke database access as [Obsolete]
Make a pass over the projection model and see how close to the projection reboot you could get. Make anything that doesn’t conform to the new ideal be [Obsolete] with nudges
Introduce the new standard code generation / application assembly configuration in JasperFx.CodeGeneration today. Mark Marten’s version of that as [Obsolete] with a pointer to using the new standard – which is hopefully very close minus namespaces to where it will be in the end

Wrapping up Wolverine 3.0

Introduce the new standard code generation / application assembly configuration in JasperFx.CodeGeneration today. Mark Marten’s version of that as [Obsolete] with a pointer to using the new standard – which is hopefully very close minus namespaces to where it will be in the end
Put a little more error handling in for code generation problems just to make it easier to fix issues later
Maybe, reexamine what work could be done to make modular monoliths easier with Wolverine and/or Marten
Maybe, consider adding back into scope improvements for EF Core with Wolverine – but I’m personally tempted to let that slide to the Wolverine 4 work

Summary

The Critter Stack core & I plus the JasperFx Software folks have a pretty audaciously ambitious plan for next year. I’m excited for it, and I’ll be talking about it in public as much as y’all will let me get away with it!

Improved Command Line Tooling with Oakton

I know, command line parsing libraries are about the least exciting tooling in the entire software universe, and there are dozens of perfectly competent ones out there. Oakton though, is heavily used throughout the entire “Critter Stack” (Marten, Weasel, and Wolverine plus other tools) to provide command line utilities directly to any old .NET Core application that happens to be bootstrapped with one of the many ways to arrive at an IHost. Oakton’s key advantage over other command line parsing tools is its ability to easily add extension commands to a .NET application in external assemblies. And of course, as part of the entire JasperFx / Critter Stack philosophy of developer tooling, Oakton’s very concept was originally created to enhance the testability of custom command line tooling. Unlike some other tools *cough* System.CommandLine *cough*.

Oakton also has some direct framework-ish elements for environment checks and the stateful resource model used very heavily all the way through Marten and Wolverine to provide the very best development time experience possible when using our tools.

Today the extended JasperFx / Critter Stack community released Oakton 6.2 with some new, hopefully important use cases. First off, the stateful resource model that we use to setup, teardown, or just check “configured stateful resources” in our system like database schemas or message broker queues just got the concept of dependencies between resources such that you can control which resources are setup first.

Next, Oakton finally got a couple easy to use recipes for utilizing IoC services in Oakton commands (it was possible, just maybe a little higher ceremony that some folks prefer). The first way, assuming that you’re running Oakton from one of the many flavors of IHostBuilder or IHost like so:

// This would be the last line in your Program.Main() method
// "app" in this case is a WebApplication object, but there
// are other extension methods for headless services
return await app.RunOaktonCommands(args);

You can build an Oakton command class that uses “setter injection” to get IoC services like so:

public class MyDbCommand : OaktonAsyncCommand<MyInput>
{
    // Just assume maybe that this is an EF Core DbContext
    [InjectService]
    public MyDbContext DbContext { get; set; }
    
    public override Task<bool> Execute(MyInput input)
    {
        // do stuff with DbContext from up above
        return Task.FromResult(true);
    }
}

Just know that when you do this and execute a command that has decorated properties for services, Oakton is:

Building your system’s IHost
Creating a new IServiceScope from your application’s DI container, or in other words, a scoped container
Building your command object and setting all the dependencies on your command object by resolving each dependency from the scoped container created in the previous step
Executing the command as normal
Disposing the scoped container and the IHost, effectively in a try/finally so that Oakton is always cleaning up after the application

In other words, Oakton is largely taking care of annoying issues like object disposal cleanup, scoping, and actually building the IHost if necessary.

Oakton’s Future

The Critter Stack Core team & I are charting a course for our entire ecosystem I’m calling “Critter Stack 2025” that’s hoping to greatly reduce the technical challenges in adopting our tool set. As part of that, what’s now Oakton is likely to move into a new shared library (I think it’s just going to be called “JasperFx”) between the various critters (and hopefully new critters for 2025!). Oakton itself will probably get a temporary life as a shim to the new location as a way to ease the transition for existing users. There’s a balance between actively improving your toolset for potential new users and not disturbing existing users too much. We’re still working on whatever that balance ends up being.

Multi-Tenancy in Wolverine Messaging

Building and maintaining a large, hosted system that requires multi-tenancy comes with a fair number of technical challenges. JasperFx Software has helped several of our clients achieve better results with their particular multi-tenancy challenges with Marten and Wolverine, and we’re available to do the same for your shop! Drop us a message on our Discord server or email us at sales@jasperfx.net to start a conversation.

This is continuing a series about multi-tenancy with Marten, Wolverine, and ASP.Net Core:

What is it and why do you care?
Marten’s “Conjoined” Model
Database per Tenant with Marten
Multi-Tenancy in Wolverine Messaging (this post)
Multi-Tenancy in Wolverine Web Services (future)
Using Partitioning for Better Performance with Multi-Tenancy and Marten (future)
Multi-Tenancy in Wolverine with EF Core & Sql Server (future, and honestly, future functionality as part of Wolverine 4.0)
Dynamic Tenant Creation and Retirement in Marten and Wolverine (definitely in the future)

Let’s say that you’re using the Marten + PostgreSQL combination for your system’s persistence needs in a web service application. Let’s also say that you want to keep the customer data within your system in completely different databases per customer company (or whatever makes sense in your system). Lastly, let’s say that you’re using Wolverine for asynchronous messaging and as a local “mediator” tool. Fortunately, Wolverine by itself has some important built in support for multi-tenancy with Marten that’s going to make your system a lot easier to build.

Let’s get started by just showing a way to opt into multi-tenancy with separate databases using Marten and its integration with Wolverine for middleware, saga support, and the all important transactional outbox support:

// Adding Marten for persistence
builder.Services.AddMarten(m =>
    {
        // With multi-tenancy through a database per tenant
        m.MultiTenantedDatabases(tenancy =>
        {
            // You would probably be pulling the connection strings out of configuration,
            // but it's late in the afternoon and I'm being lazy building out this sample!
            tenancy.AddSingleTenantDatabase("Host=localhost;Port=5433;Database=tenant1;Username=postgres;password=postgres", "tenant1");
            tenancy.AddSingleTenantDatabase("Host=localhost;Port=5433;Database=tenant2;Username=postgres;password=postgres", "tenant2");
            tenancy.AddSingleTenantDatabase("Host=localhost;Port=5433;Database=tenant3;Username=postgres;password=postgres", "tenant3");
        });

        m.DatabaseSchemaName = "mttodo";
    })
    .IntegrateWithWolverine(masterDatabaseConnectionString:connectionString);

Just for the sake of completion, here’s some sample Wolverine configuration that pairs up with the above:

// Wolverine usage is required for WolverineFx.Http
builder.Host.UseWolverine(opts =>
{
    // This middleware will apply to the HTTP
    // endpoints as well
    opts.Policies.AutoApplyTransactions();

    // Setting up the outbox on all locally handled
    // background tasks
    opts.Policies.UseDurableLocalQueues();
});

Now that we’ve got that basic setup for Marten and Wolverine, let’s move on to the first issue, how the heck does Wolverine “know” which tenant should be used? In a later post I’ll show how Wolverine.HTTP has built in tenant id detection, but for now, let’s pretend that you’re already taking care of tenant id detection from incoming HTTP requests some how within your ASP.Net Core pipeline and you just need to pass that into a Wolverine message handler that is being executed from within an MVC Core controller (“Wolverine as Mediator”):

[HttpDelete("/todoitems/{tenant}/longhand")]
public async Task Delete(
    string tenant,
    DeleteTodo command,
    IMessageBus bus)
{
    // Invoke inline for the specified tenant
    await bus.InvokeForTenantAsync(tenant, command);
}

By using the IMessageBus.InvokeForTenantAsync() method, we’re invoking a command inline, but telling Wolverine what the tenant id is. The command handler might look something like this:

// Keep in mind that we set up the automatic
// transactional middleware usage with Marten & Wolverine
// up above, so there's just not much to do here
public static class DeleteTodoHandler
{
    public static void Handle(DeleteTodo command, IDocumentSession session)
    {
        session.Delete<Todo>(command.Id);
    }
}

Not much going on there in our code, but Wolverine is helping us out here by:

Seeing the tenant id value that we passed in before that Wolverine is tracking in its own Envelope structure (Wolverine’s version of Envelope Wrapper from the venerable EIP book)
Creates the Marten IDocumentSession for that tenant id value, which will be reading and writing to the correct tenant database underneath Marten

Now, let’s make this a little more complex by also publishing an event message in that message handler for the DeleteTodo message:

public static class TodoCreatedHandler
{
    public static TodoDeleted Handle(DeleteTodo command, IDocumentSession session)
    {
        session.Delete<Todo>(command.Id);
        
        // This 
        return new TodoDeleted(command.Id);
    }
}

public record TodoDeleted(int TodoId);

Assuming that the TodoDeleted message is being published to a “durable” endpoint, Wolverine is using its transactional outbox integration with Marten to persist the outgoing message in the same tenant database and same transaction as the deletion we’re doing in that command handler. In other words, Wolverine is able to use the tenant databases for its outbox support with no other configuration necessary than what we did up above in the calls to AddMarten() and UseWolverine().

Moreover, Wolverine is even able to use its “durability agent” against all the tenant databases to ensure that any work that is somehow stranded by crashed processes.

Lastly, the TodoDeleted event message cascaded above from our message handler would be tracked throughout Wolverine with the tenant id of the original DeleteToDo command message so that you can do multi-part workflows through Wolverine while tracks the tenant id and utilizes the correct tenant database through Marten all along the way.

Summary

Building solutions with multi-tenancy can be complicated, but the Wolverine + Marten combination can make it a lot easier.