Marten Improvements in 7.34

Through a combination of Marten community members and in collaboration with some JasperFx Software clients, we’re able to push some new fixes and functionality in Marten 7.34 just today.

For the F# Person in your Life

You can now use F# Option types in LINQ Where() clauses in Marten. Check out the pull request for that to see samples. The LINQ provider code is just a difficult problem domain, and I can’t tell you how grateful I am to have gotten the community pull request for this.

Fetch the Latest Aggregate

Marten has had the FetchForWriting() API for awhile now as our recommended way to build CQRS command handlers with Marten event sourcing as I wrote about recently in CQRS Command Handlers with Marten. Great, but…

  1. What if you just want a read only view of the current data for an aggregate projection over a single event stream and wouldn’t mind a lighter weight API than FetchForWriting()?
  2. What if in your command handler you used FetchForWriting(), but now you want to return the now updated version of your aggregate projection for the caller of the command? And by the way, you want this to be as performant as possible no matter how the projection is configured.

Now you’re in luck, because Marten 7.34 adds the new FetchLatest() API for both of the bullets above.

Let’s pretend we’re building an invoicing system with Marten event sourcing and have this “self-aggregating” version of an Invoice:

public record InvoiceCreated(string Description, decimal Amount);

public record InvoiceApproved;
public record InvoiceCancelled;
public record InvoicePaid;
public record InvoiceRejected;

public class Invoice
{
    public Invoice()
    {
    }

    public static Invoice Create(IEvent<InvoiceCreated> created)
    {
        return new Invoice
        {
            Amount = created.Data.Amount,
            Description = created.Data.Description,

            // Capture the timestamp from the event
            // metadata captured by Marten
            Created = created.Timestamp,
            Status = InvoiceStatus.Created
        };
    }

    public int Version { get; set; }

    public decimal Amount { get; set; }
    public string Description { get; set; }
    public Guid Id { get; set; }
    public DateTimeOffset Created { get; set; }
    public InvoiceStatus Status { get; set; }

    public void Apply(InvoiceCancelled _) => Status = InvoiceStatus.Cancelled;
    public void Apply(InvoiceRejected _) => Status = InvoiceStatus.Rejected;
    public void Apply(InvoicePaid _) => Status = InvoiceStatus.Paid;
    public void Apply(InvoiceApproved _) => Status = InvoiceStatus.Approved;
}

And for now, we’re going to let our command handlers just use a Live aggregation of the Invoice from the raw events on demand:

var builder = Host.CreateApplicationBuilder();
builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));

    // Just telling Marten upfront that we will use
    // live aggregation for the Invoice aggregate
    // This would be the default anyway if you didn't explicitly
    // register Invoice this way, but doing so let's
    // Marten "know" about Invoice for code generation
    opts.Projections.LiveStreamAggregation<Projections.Invoice>();
});

Now we can get at the latest, greatest, current view of the Invoice that is consistent with the captured events for that invoice stream at this very moment with this usage:

public static async Task read_latest(
    // Watch this, only available on the full IDocumentSession
    IDocumentSession session,
    Guid invoiceId)
{
    var invoice = await session
        .Events.FetchLatest<Projections.Invoice>(invoiceId);
}

The usage of the API above would be completely unchanged if you were to switch the projection lifecycle of the Invoice to be either Inline (where the view is updated in the database at the same time new events are captured) or Async. That usage gives you a little bit of what we called “reversibility” in the XP days, which just means that you’re easily able to change your mind later about exactly what projection lifecycle you want to use for Invoice views.

The main reason that FetchLatest() was envisioned, however, was to pair up with FetchForWriting() in command handlers. It’s turned out to be a common use case that folks want their command handlers to:

  1. Use the current state of the projected aggregate for the event stream to…
  2. “Decide” what new events should be appended to this stream based on the incoming command and existing state of the aggregate
  3. Save the changes
  4. Return a now updated version of the projected aggregate for the event stream with the newly captured events reflected in the projected aggregate.

There is going to be a slicker integration of this with Wolverine’s aggregate handler workflow with Marten by early next week, but for now, let’s pretend we’re working with Marten from within maybe ASP.Net Minimal API and want to just work that way. Let’s say that we have a little helper method for a mini-“Decider” pattern implementation for our Invoice event streams like this one:

public static class MutationExtensions
{
    public static async Task<Projections.Invoice> MutateInvoice(this IDocumentSession session, Guid id, Func<Projections.Invoice, IEnumerable<object>> decider,
        CancellationToken token = default)
    {
        var stream = await session.Events.FetchForWriting<Projections.Invoice>(id, token);

        // Decide what new events should be appended based on the current
        // state of the aggregate and application logic
        var events = decider(stream.Aggregate);
        stream.AppendMany(events);

        // Persist any new events
        await session.SaveChangesAsync(token);

        return await session.Events.FetchLatest<Projections.Invoice>(id, token);
    }
}

Which could be used something like:

public static Task Approve(IDocumentSession session, Guid invoiceId)
{
    // I'd maybe suggest taking the lambda being passed in
    // here out somewhere where it's easy to test
    // Wolverine does that for you, so maybe just use that!
    return session.MutateInvoice(invoiceId, invoice =>
    {
        if (invoice.Status != InvoiceStatus.Approved)
        {
            return [new InvoiceApproved()];
        }

        return [];
    });
}

New Marten System Level “Archived” Event

Much more on this soon with an example more end to end with Wolverine explaining how this would add value for performance and testability.

Marten now has a built in event named Archived that can be appended to any event stream:

namespace Marten.Events;

/// <summary>
/// The presence of this event marks a stream as "archived" when it is processed
/// by a single stream projection of any sort
/// </summary>
public record Archived(string Reason);

When this event is appended to an event stream and that event is processed through any type of single stream projection for that event stream (snapshot or what we used to call a “self-aggregate”, SingleStreamProjection, or CustomProjection with the AggregateByStream option), Marten will automatically mark that entire event stream as archived as part of processing the projection. This applies for both Inline and Async execution of projections.

Let’s try to make this concrete by building a simple order processing system that might include this aggregate:

public class Item
{
    public string Name { get; set; }
    public bool Ready { get; set; }
}

public class Order
{
    // This would be the stream id
    public Guid Id { get; set; }

    // This is important, by Marten convention this would
    // be the
    public int Version { get; set; }

    public Order(OrderCreated created)
    {
        foreach (var item in created.Items)
        {
            Items[item.Name] = item;
        }
    }

    public void Apply(IEvent<OrderShipped> shipped) => Shipped = shipped.Timestamp;
    public void Apply(ItemReady ready) => Items[ready.Name].Ready = true;

    public DateTimeOffset? Shipped { get; private set; }

    public Dictionary<string, Item> Items { get; set; } = new();

    public bool IsReadyToShip()
    {
        return Shipped == null && Items.Values.All(x => x.Ready);
    }
}

Next, let’s say we’re having the Order aggregate snapshotted so that it’s updated every time new events are captured like so:

var builder = Host.CreateApplicationBuilder();
builder.Services.AddMarten(opts =>
{
    opts.Connection("some connection string");

    // The Order aggregate is updated Inline inside the
    // same transaction as the events being appended
    opts.Projections.Snapshot<Order>(SnapshotLifecycle.Inline);

    // Opt into an optimization for the inline aggregates
    // used with FetchForWriting()
    opts.Projections.UseIdentityMapForAggregates = true;
})

// This is also a performance optimization in Marten to disable the
// identity map tracking overall in Marten sessions if you don't
// need that tracking at runtime
.UseLightweightSessions();

Now, let’s say as a way to keep our application performing as well as possible, we’d like to be aggressive about archiving shipped orders to keep the “hot” event storage table small. One way we can do that is to append the Archived event as part of processing a command to ship an order like so:

public static async Task HandleAsync(ShipOrder command, IDocumentSession session)
{
    var stream = await session.Events.FetchForWriting<Order>(command.OrderId);
    var order = stream.Aggregate;

    if (!order.Shipped.HasValue)
    {
        // Mark it as shipped
        stream.AppendOne(new OrderShipped());

        // But also, the order is done, so let's mark it as archived too!
        stream.AppendOne(new Archived("Shipped"));

        await session.SaveChangesAsync();
    }
}

If an Order hasn’t already shipped, one of the outcomes of that command handler executing is that the entire event stream for the Order will be marked as archived.

Marten Event Sourcing Gets Some New Tools

JasperFx Software has gotten the chance this work to build out several strategic improvements to both Marten and Wolverine through collaborations with our clients who have had some specific needs. This has been highly advantageous because it’s helped push some significant, long planned technical improvements while getting all important feedback as clients integrate the new features. Today I’d like to throw out a couple valuable features and capabilities that Marten has gained as part of recent client work.

“Side Effects” in Projections

In a recent post called Multi Step Workflows with the Critter Stack I talked about using Wolverine sagas (really process managers if you have to be precise about the pattern name because I’m slopping about interchanging “saga” and “process manager”) for long running workflows. In that post I talked about how an incoming file would be:

  1. Broken up into batches of rows
  2. Each batch would be validated as a separately handled message for some parallelization and more granular retries
  3. When there were validation results recorded for each record batch, the file processing itself would either stop with a call back message summarizing the failures to the upstream sender or continue to the next stage.

As it turns out, event sourcing with a projected aggregate document for the state of the file import turns out to be another good way to implement this workflow, especially with the new “side effects” model recently introduced in Marten at the behest of a JasperFx client.

In this usage. let’s say that we have this aggregated state for a file being imported:

public class FileImportState
{

    // Identity for this saga within our system
    public Guid Id { get; set; }
    public string FileName { get; set; }
    public string PartnerTrackingNumber { get; set; }
    public DateTimeOffset Created { get; set; } = DateTimeOffset.UtcNow;
    public List<RecordBatchTracker> RecordBatches { get; set; } = new();

    public FileImportStage Stage { get; set; } = FileImportStage.Validating;
}

The FileImportState would be updated by appending events like BatchValidated, with Marten “projecting” those events in the rolled up state of the entire file. In Marten’s async daemon process that runs projections in a background process, Marten is processing a certain range (think events 10000 to 11000) at a time. As the daemon processes events into a projection for the FileImportState, it’s grouping the events in that range into event “slices” that are grouped by file id.

For managing the workflow, we can now append all new events as a “side effect” of processing an event slice in the daemon as the aggregation data is updated in the background. Let’s say that we have a single stream projection for our FileImportState aggregation like this below:

public class FileImportProjection : SingleStreamProjection<FileImportState>
{
    // Other Apply / Create methods to update the state of the 
    // FileImportState aggregate document

    public override ValueTask RaiseSideEffects(IDocumentOperations operations, IEventSlice<FileImportState> slice)
    {
        var state = slice.Aggregate;
        if (state.Stage == FileImportStage.Validating &&
            state.RecordBatches.All(x => x.ValidationStatus != RecordStatus.Pending))
        {
            // At this point, the file is completely validated, and we can decide what should happen next with the
            // file
            
            // Are there any validation message failures?
            var rejected = state.RecordBatches.SelectMany(x => x.ValidationMessages).ToArray();
            if (rejected.Any())
            {
                // Append a validation failed event to the stream
                slice.AppendEvent(new ValidationFailed());
                
                // Also, send an outgoing command message that summarizes
                // the validation failures
                var message = new FileRejectionSummary()
                {
                    FileName = state.FileName,
                    Messages = rejected,
                    TrackingNumber = state.PartnerTrackingNumber
                };
                
                // This will "publish" a message once the daemon
                // has successfully committed all changes for the 
                // current batch of events
                // Unsurprisingly, there's a Wolverine integration 
                // for this
                slice.PublishMessage(message);
            }
            else
            {
                slice.AppendEvent(new ValidationSucceeded());
            }
        }

        return new ValueTask();
    }
}

And unsurprisingly, there is also the ability to “publish” outgoing messages as part of processing through asynchronous projections with an integration to Wolverine available.

This feature has long, long been planned and I was glad to get the chance to build it out this fall for a client. I’m happy to say that this is in production for them — after the obligatory shakedown cruise and some bug fixes.

Optimized Projection Rebuilds

Another JasperFx client has a system where they retrofitted Marten into an in flight system using event sourcing for a very large data set, but didn’t take advantage of many Marten capabilities including the ability to effectively pre-build or “snapshot” projected data to optimize system state reads.

With a little bit of work in their system, we knew we would be able to introduce the new projection snapshotting into their system with Marten’s blue/green deployment model for projections where Marten would immediately start trying to pre-build a new projection (or new version of an existing projection) from scratch. Great! Except we knew that was potentially going to be a major performance problem until the projection caught up to the current “high water mark” of the event store.

To ease the cost of introducing a new, persisted projection on top of ~100 million events, we built out Marten’s new optimized projection rebuild feature. To demonstrate what I mean, let’s first opt into using this feature (it had to be opt in because it forces users to made additive changes to existing database tables):

builder.Services.AddMarten(opts =>
{
    opts.Connection("some connection string");

    // Opts into a mode where Marten is able to rebuild single
    // stream projections faster by building one stream at a time
    // Does require new table migrations for Marten 7 users though
    opts.Events.UseOptimizedProjectionRebuilds = true; 
});

Now, when our users redeploy their system with the new snapshotted projection running with Marten’s Async workflow for the first time Marten will see that the projection has not been processed before, so will try to use an “optimized rebuild mode.” Since we’ve turned on optimized projection rebuilds, for a single stream projection, Marten runs the projection in “rebuild” mode by:

  1. First building a new table to track each event stream that relates to the aggregate type in question, but builds this table in reverse order of when each stream has been changed. The whole point of that is to make sure our optimized rebuild process is dealing with the most recently changed event streams so that the system can perform well even while the rebuild process in running
  2. The rebuild process rebuilds the aggregates event stream by event stream as a way of minimizing the number of database reads and writes it takes to rebuild the single stream projection. Compare that to the previous, naive “left fold” approach where it just works from event sequence = 1 to the high water mark and constantly writes and reads back the same projection document as its encountered throughout the event store
  3. When the optimized rebuild is complete, it switches the projection to running in its normal, continuous mode from the point at which the rebuild started

That’s a lot of words and maybe some complicated explanation, but the point is that Marten makes it possible to introduce new projections to a large, in flight system without incurring system downtime or even inconsistent data showing up to users.

Other Recent Improvements for Clients

Some other recent work that JasperFx has done for our clients includes:

Summary

I was part of a discussion slash argument a couple weeks ago about whether or not it was necessary to use an off the shelf event sourcing library or framework like Marten, or if you were just fine rolling your own. While I’d gladly admit that you can easily build purely a storage subsystem for events, it’s not even remotely feasible to quickly roll your own tooling that matches advanced features in Marten such as the work I presented here.

Build Resilient Systems with Wolverine’s Transactional Outbox

JasperFx Software is completely open for business to help you get the best possible results with the “Critter Stack” tools or really any type of server side .NET development efforts. A lot of what I’m writing about is inspired by work we’ve done with our ongoing clients.

I think I’m at the point where I believe and say that leaning on asynchronous messaging is the best way to create truly resilient back end systems. And when I mean “resilient” here, I mean the system is best able to recover from errors it encounters at runtime or performance degradation or even from subsystems being down and still function without human intervention. A system incorporating asynchronous messaging and at least some communication through queues can apply retry policies for errors and utilize patterns like circuit breakers or dead letter queues to avoid losing in flight work.

There’s more to this of course, like:

  • Being able to make finer grained error handling policies around individual steps
  • Dead letter queues and replay of messages
  • Not having “temporal coupling” between systems or subsystems
  • Back pressure mechanics
  • Even maybe being able to better reason about the logical processing steps in an asynchronous model with formal messaging as opposed to just really deep call stacks in purely synchronous code

Wolverine certainly comes with a full range of messaging options and error handling options for resiliency, but a key feature that does lead to Wolverine adoption is its support for the transactional outbox (and inbox) pattern.

What’s the Transactional Outbox all about?

The transactional outbox pattern is an important part of your design pattern toolkit for almost any type of backend system that involves both database persistence and asynchronous work or asynchronous messaging. If you’re not already familiar with the pattern, just consider this message handler (using Wolverine) from a banking system that uses both Wolverine’s transactional middleware and transactional outbox integration (with Marten and PostgreSQL):

public Task<Account> LoadAsync(IDocumentSession session, DebitAccount command)
        => session.LoadAsync<Acount>(command.AccountId);

[Transactional]
public static async Task Handle(
    DebitAccount command,
    Account account,
    IDocumentSession session,
    IMessageContext messaging)
{
    account.Balance -= command.Amount;

    // This just marks the account as changed, but
    // doesn't actually commit changes to the database
    // yet. That actually matters as I hopefully explain
    session.Store(account);

    // Conditionally trigger other, cascading messages
    if (account.Balance > 0 && account.Balance < account.MinimumThreshold)
    {
        await messaging.SendAsync(new LowBalanceDetected(account.Id));
    }
    else if (account.Balance < 0)
    {
        await messaging.SendAsync(new AccountOverdrawn(account.Id), new DeliveryOptions{DeliverWithin = 1.Hours()});

        // Give the customer 10 days to deal with the overdrawn account
        await messaging.ScheduleAsync(new EnforceAccountOverdrawnDeadline(account.Id), 10.Days());
    }

    // "messaging" is a Wolverine IMessageContext or IMessageBus service
    // Do the deliver within rule on individual messages
    await messaging.SendAsync(new AccountUpdated(account.Id, account.Balance),
        new DeliveryOptions { DeliverWithin = 5.Seconds() });
}

You’ll notice up above that the handler both:

  1. Modifies a banking account based on the command and persists those changes to the database
  2. Potentially sends out messages in regard to that account

What the “outbox” is doing for us around this message handler is guaranteeing that:

  • The outgoing messages I registered with the IMessageBus service above are only actually sent to messaging brokers or local queues after the database transaction is successful. Think of the messaging outbox as kind of queueing the outgoing messages as part of your unit of work (which is really implemented by the Marten IDocumentSession up above.
  • The outgoing messages are actually persisted to the same database as the account data as part of a native database transactions
  • As part of a background process, the Wolverine outbox subsystem will make sure the message gets recovered and sent event if — and hate to tell you, but this absolutely does happen in the real world — the running process somehow shuts down unexpectedly between the database transaction succeeding and the messages actually getting successfully sent through local Wolverine queues or remotely sent through messaging brokers like Rabbit MQ or Azure Service Bus.
  • Also as part of the background processing, Wolverine’s outbox is also making sure that persisted, outgoing messages really do get sent out eventually in the case of the messaging broker being temporarily unavailable or network issues — and this is 100% something that actually happens in production, so the ability to recover messages is an awfully important feature for building robust systems.

To sum things up, a good implementation of the transactional outbox pattern in your system can be a great way to make your system be more resilient and “self heal” in the face of inevitable problems in production. As important, the usage of a transactional outbox can do a lot to prevent subtle race condition bugs at runtime from messages getting processed against inconsistent database state before database transactions have completed — and folks, this also absolutely happens in real systems. Ask me how I know:-)

Alright, now that we’ve established what it is, let’s look at some ways in which Wolverine makes its transactional outbox easy to adopt and use — and we’ll show a simpler version of the message handler above, but we just have to introduce more Wolverine concepts.

Setting up the Outbox in Wolverine

If you are using the full “Critter Stack” combination of Marten + Wolverine, you just add both Marten & Wolverine to your application and tie them together with the IntegrateWithWolverine() call from the WolverineFx.Marten Nuget as shown below:

var builder = WebApplication.CreateBuilder(args);

// Adds in some command line diagnostics
builder.Host.ApplyOaktonExtensions();

builder.Services.AddAuthentication("Test");
builder.Services.AddAuthorization();

builder.Services.AddMarten(opts =>
    {
        // You always have to tell Marten what the connection string to the underlying
        // PostgreSQL database is, but this is the only mandatory piece of 
        // configuration
        var connectionString = builder.Configuration.GetConnectionString("postgres");
        opts.Connection(connectionString);
    })
    // This adds middleware support for Marten as well as the 
    // transactional middleware support we'll introduce in a little bit...
    .IntegrateWithWolverine();

builder.Host.UseWolverine();

That does of course require some PostgreSQL tables for the Wolverine outbox storage to function, but Wolverine in this case is able to pull the connection and schema information (the schema can be overridden if you choose) from its Marten integration. In normal development mode, Wolverine — like Marten — is able to apply database migrations itself on the fly so you can just work.

Switching the SQL Server and EF Core combination with Wolverine, you have this setup:

var builder = WebApplication.CreateBuilder(args);

// Just the normal work to get the connection string out of
// application configuration
var connectionString = builder.Configuration.GetConnectionString("sqlserver");

// If you're okay with this, this will register the DbContext as normally,
// but make some Wolverine specific optimizations at the same time
builder.Services.AddDbContextWithWolverineIntegration<ItemsDbContext>(
    x => x.UseSqlServer(connectionString), "wolverine");

// Add DbContext that is not integrated with outbox
builder.Services.AddDbContext<ItemsDbContextWithoutOutbox>(
    x => x.UseSqlServer(connectionString));

builder.Host.UseWolverine(opts =>
{
    // Setting up Sql Server-backed message storage
    // This requires a reference to Wolverine.SqlServer
    opts.PersistMessagesWithSqlServer(connectionString, "wolverine");

    // Set up Entity Framework Core as the support
    // for Wolverine's transactional middleware
    opts.UseEntityFrameworkCoreTransactions();

    // Enrolling all local queues into the
    // durable inbox/outbox processing
    opts.Policies.UseDurableLocalQueues();
});

Likewise, Wolverine is able to build the necessary schema objects for SQL Server on application startup so that the outbox integration “just works” in local development or testing environments. I should note that in all cases, Wolverine provides command line tools to export SQL scripts for these schema objects that you could use within database migration tools like Grate.

Outbox Usage within Message Handlers

Honestly, just to show a lower ceremony version of a Wolverine handler, let’s take the message handler from up above and use Wolverine’s “cascading message” capability to express the same logic for choosing which messages to send out as well as expression the database operation.

Before I show the handler, let me call out a couple things first:

  • Wolverine has an “auto transaction” middleware policy you can opt into to apply transaction handling for Marten, EF Core, or RavenDb around your handler code. This is helpful to keep your handler code simpler and often to allow you to write synchronous code
  • The “outbox” sending kicks in with any messages sent to an endpoint (local queue, Rabbit MQ exchange, AWS SQS queue, Kafka topic) that is configured as “durable” in Wolverine. You can read more about the Wolverine routing here. Do know though that within any application or even within a single handler, you can mix and match durable routes with “fire and forget” endpoints as desired.
  • There’s another concept in Wolverine called “side effects” that I’m going to use just to say “I want this document stored as part of this logical transaction.” It’s yet another thing in Wolverine’s bag of tricks to help you write pure functions for message handlers as a way to maximize the testability of your application code.

This time, we’re going to write a pure function for the handler:

public static class DebitAccountHandler
{
    public static Task<Account> LoadAsync(IDocumentSession session, DebitAccount command)
        => session.LoadAsync<Account>(command.AccountId);
    
    public static async (IMartenOp, OutgoingMessages) Handle(
        DebitAccount command,
        Account account)
    {
        account.Balance -= command.Amount;

        // This just tracks outgoing, or "cascading" messages
        var messages = new OutgoingMessages();

        // Conditionally trigger other, cascading messages
        if (account.Balance > 0 && account.Balance < account.MinimumThreshold)
        {
            messages.Add(new LowBalanceDetected(account.Id));
        }
        else if (account.Balance < 0)
        {
            messages.Add(new AccountOverdrawn(account.Id), new DeliveryOptions{DeliverWithin = 1.Hours()});

            // Give the customer 10 days to deal with the overdrawn account
            messages.Delay(new EnforceAccountOverdrawnDeadline(account.Id), 10.Days());
        }

        // Do the deliver within rule on individual messages
        messages.Add(new AccountUpdated(account.Id, account.Balance),
            new DeliveryOptions { DeliverWithin = 5.Seconds() });

        return (MartenOps.Store(account), messages);
    }
}

When Wolverine executes the DebitCommand, it’s trying to commit a single database transaction with the contents of the Account entity being persisted and any outgoing messages in that OutgoingMessages collection that are routed to a durable Wolverine endpoint. When the transaction succeeds, Wolverine “releases” the outgoing messages to the sending agents within the application, where the persisted message data gets deleted from the database when Wolverine is able to successfully send the message.

Outbox Usage within MVC Core Controllers

Like all messaging frameworks in the .NET space that I’m aware of, the transactional outbox mechanics are pretty well transparent from message handler code. More recently though, the .NET ecosystem has caught up (finally) with the need to expose transactional outbox mechanics outside of a message handler.

A very common use cases is needing to both make database writes and trigger asynchronous work through messages from HTTP web services. For this example, let’s assume the usage of MVC Core Controller classes, but the mechanics I’m showing are similar for Minimal API or other alternative endpoint models in the ASP.Net Core ecosystem.

Assuming the usage of Marten + Wolverine, you can send messages with an outbox through the IMartenOutbox service that somewhat wraps the two tools together like this:

    [HttpPost("/orders/itemready")]
    public async Task Post(
        [FromBody] MarkItemReady command,
        [FromServices] IDocumentSession session,
        [FromServices] IMartenOutbox outbox
    )
    {
        // This is important!
        outbox.Enroll(session);

        // Fetch the current value of the Order aggregate
        var stream = await session
            .Events

            // We're also opting into Marten optimistic concurrency checks here
            .FetchForWriting<Order>(command.OrderId, command.Version);

        var order = stream.Aggregate;

        if (order.Items.TryGetValue(command.ItemName, out var item))
        {
            item.Ready = true;

            // Mark that the this item is ready
            stream.AppendOne(new ItemReady(command.ItemName));
        }
        else
        {
            // Some crude validation
            throw new InvalidOperationException($"Item {command.ItemName} does not exist in this order");
        }

        // If the order is ready to ship, also emit an OrderReady event
        if (order.IsReadyToShip())
        {
            // Publish a cascading command to do whatever it takes
            // to actually ship the order
            // Note that because the context here is enrolled in a Wolverine
            // outbox, the message is registered, but not "released" to
            // be sent out until SaveChangesAsync() is called down below
            await outbox.PublishAsync(new ShipOrder(command.OrderId));
            stream.AppendOne(new OrderReady());
        }

        // This will also persist and flush out any outgoing messages
        // registered into the context outbox
        await session.SaveChangesAsync();
    }

With EF Core + Wolverine, it’s similar, but just a touch more ceremony using IDbContextOutbox<T> as a convenience wrapper around an EF Core DbContext:

    [HttpPost("/items/create2")]
    public async Task Post(
        [FromBody] CreateItemCommand command,
        [FromServices] IDbContextOutbox<ItemsDbContext> outbox)
    {
        // Create a new Item entity
        var item = new Item
        {
            Name = command.Name
        };

        // Add the item to the current
        // DbContext unit of work
        outbox.DbContext.Items.Add(item);

        // Publish a message to take action on the new item
        // in a background thread
        await outbox.PublishAsync(new ItemCreated
        {
            Id = item.Id
        });

        // Commit all changes and flush persisted messages
        // to the persistent outbox
        // in the correct order
        await outbox.SaveChangesAndFlushMessagesAsync();
    }

I personally think the usage of the outbox outside of Wolverine message handlers is a little bit more awkward than I’d ideally prefer (I also feel this way about the NServiceBus or MassTransit equivalents of this usage, but it’s nice that both of those tools do have this important functionality too), so let’s introduce Wolverine’s HTTP endpoint model to write lower ceremony code while still opting into outbox mechanics from web services.

Outbox Usage within Wolverine HTTP

This is beyond annoying, but the libraries and namespaces in Wolverine are all named “Wolverine.*”, but the Nuget packages are named “WolverineFx.*” because some clown is squatting on the “Wolverine” name in Nuget and we didn’t realize that until it was too late and we’d committed to the projection name. Grr.

Wolverine also has an add on model in the WolverineFx.Http Nuget that allows you to use the basics of the Wolverine runtime execution model for HTTP services. One of the advantages of Wolverine.HTTP endpoints is the same kind of pure function model as the message handlers that I believe to be a much lower ceremony programming model than MVC Core or even Minimal API.

Maybe more valuable though, Wolverine.HTTP endpoints support the exact same transactional middleware and outbox integration as the message handlers. That also allows us to use “cascading messages” to publish messages out of our HTTP endpoint handlers without having to deal with asynchronous code or injecting IoC services. Just plain old pure functions in many cases like so:

public static class TodoCreationEndpoint
{
    [WolverinePost("/todoitems")]
    public static (TodoCreationResponse, TodoCreated) Post(CreateTodo command, IDocumentSession session)
    {
        var todo = new Todo { Name = command.Name };

        // Just telling Marten that there's a new entity to persist,
        // but I'm assuming that the transactional middleware in Wolverine is
        // handling the asynchronous persistence outside of this handler
        session.Store(todo);

        // By Wolverine.Http conventions, the first "return value" is always
        // assumed to be the Http response, and any subsequent values are
        // handled independently
        return (
            new TodoCreationResponse(todo.Id),
            new TodoCreated(todo.Id),
        );
    }
}

The Wolverine.HTTP model gives us a way to build HTTP endpoints with Wolverine’s typical, low ceremony coding model (most of the OpenAPI metadata can be gleaned from the method signatures of the endpoints, further obviating the need for repetitive ceremony code that so frequently litters MVC Core code) with easy usage of Wolverine’s transactional outbox.

I should also point out that even if you aren’t using any kind of message storage or durable endpoints, Wolverine will not actually send messages until any database transaction has completed successfully. Think of this as a non-durable, in memory outbox built into your HTTP endpoints.

Summary

The transactional outbox pattern is a valuable tool for helping create resilient systems, and Wolverine makes it easy to use within your system code. I’m frequently working with clients who aren’t utilizing a transactional outbox even when they’re using asynchronous work or trying to cascade work as “domain events” published from other transactions. It’s something I always call out when I see it, but it’s frequently hard to introduce all new infrastructure in existing projects or within tight timelines — and let’s be honest, timelines are always tight.

I think my advice is to be aware of this need upfront when you are picking out the technologies you’re going to use as the foundation for your architecture. To be blunt, a lot of shops I think are naively opting into MediatR as a core tool without realizing the important functionality it is completely missing in order to build a resilient system — like a transactional outbox. You can, and many people do, complement MediatR with a real messaging tool like MassTransit.

Instead, you could just use Wolverine that basically does both “mediator” and asynchronous messaging with one programming model of handlers and does so with a potentially lower ceremony and higher productivity coding model than any of those other tools in .NET.

Specification Usage with Marten for Repository-Free Development

I’ll jump into real discussions about architecture later in this post, but let’s say that we’re starting the development of a new software system. And for a variety of reasons I’ll try to discuss later, we want to eschew the usage of repository abstractions and be able to use all the power of our persistence tooling, which in our case is Marten of course. We’re also going to leverage a Vertical Slice Architecture approach for our codebase (more on this later).

In some cases, we might very well hit complicated database queries or convoluted LINQ expressions that are duplicated across different command or query handler “slices” within our system. Or maybe we just want some workflow code to be cleaner and easier to understand that it would be if we embedded a couple dozen lines of ugly LINQ expression code directly into the workflow code.

Enter the Specification pattern, which you’ve maybe seen from Steve Smith’s work, but I’ve run across a few times over the years. The Specification pattern is just the encapsulation of reusable query of some sort into a custom type. Marten has direct support baked in for the specification pattern through the older compiled query mechanism and the newer, more flexible query plan feature.

First, here’s an example of a compiled query:

public class FindUserByAllTheThings: ICompiledQuery<User>
{
    public string Username { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public Expression<Func<IMartenQueryable<User>, User>> QueryIs()
    {
        return query =>
            query.Where(x => x.FirstName == FirstName && Username == x.UserName)
                .Where(x => x.LastName == LastName)
                .Single();
    }
}

To execute the query above, it’s this syntax on any Marten IQuerySession or IDocumentSession:

        // theSession is an IQuerySession 
        var user = await theSession.QueryAsync(new FindUserByAllTheThings
        {
            Username = "jdm", FirstName = "Jeremy", LastName = "Miller"
        });

Compiled queries are obviously a weird API, but they come with a bit of a performance boost by being able to “remember” the LINQ parsing and SQL construction inside of Marten. Think of Marten compiled queries as the equivalent to a stored procedure — but maybe with more performance advantages.

Marten compiled queries do come with some significant limitations in usefulness as they really don’t allow for any runtime flexibility. To that end, Marten introduced the query plan idea as a more generic specification implementation that can support anything that Marten itself can do.

A “query plan” is just an implementation of this interface:

public interface IQueryPlan<T>
{
    Task<T> Fetch(IQuerySession session, CancellationToken token);
}

// and optionally, this too:
public interface IBatchQueryPlan<T>
{
    Task<T> Fetch(IBatchedQuery query);
}

And executed against Marten with this method on the IQuerySession API:

Task<T> QueryByPlanAsync<T>(IQueryPlan<T> plan, CancellationToken token = default);

As you’d probably guess, it’s just a little bit of double dispatch in terms of its implementation, but in concept this gives you the ability to create reusable query plans against Marten that enables the usage of anything that Marten itself can do — including in some cases, the ability to enroll inside of Marten batched querying for better performance.

Not that I want to run around encouraging the copious usage of dynamic mock objects in your unit tests, but it is very feasible to mock the usage of query plans or compiled query objects against Marten’s IQuerySession in a way that is not even remotely feasible for trying to directly mock Marten’s LINQ provider. And even though it is highly not recommended by me and probably completely moronic to do so, folks really do try to use mock objects around LINQ.

I originally built the query plan implementation in Marten after working with a JasperFx client who had some significant opportunities to improve their codebase by ditching the typical Clean/Onion Architecture usage of repository abstractions over Marten. Their current repository usage is mostly the kind of silly passthrough queries that irritate me about Clean Architecture codebases, but a handful of very complicated queries that are reused across multiple use cases. The query plan idea was a way of allowing them to encapsulate the big, crazy queries in a single place that could be shared across different handlers, but didn’t force them into using a repository.

An Aside on the Don’t Repeat Yourself Principle

The old DRY principle is a bit of a double edged sword. It’s absolutely true that creating duplication of functionality in your system can frequently hurt as rules change over time or you encounter bugs that have to be addressed in multiple places — while inevitably missing some of those places sometimes. It’s still valuable to remove duplication of logic or behavior that crops up in your system. It’s also very true that some attempts to “DRY” up code can lead to extra complexity that makes your system harder to understand and does more bad to good. Or the work to DRY up code just doesn’t pay off enough. Unfortunately, my only advice is to take things on a case by case basis. I certainly don’t buy off into any kind of black and white “share nothing” philosophy for modular monoliths, micro services, or vertical slices.

An Aside on Clean/Onion Architecture

Let’s just dive right in by me stating that I loathe the Clean/Onion Architecture approach as it is typically used by real teams in the real world as a prescriptive layered architecture that scatters related code through umpteen million separate projects. I especially dislike the copious usage of the “Repository” pattern in these templates for a handful of reasons around the useless passthroughs or accidentally causing chatty interaction between the application and database that can kill performance.

Mostly though, my strong preference is to adopt the “Vertical Slice Architecture” mantra of keeping closely related code together. For persistence code, I’d ideally like to even drop the query code in the same files — or at least the same namespace folder — as the business logic for the command or query handler that uses the data from the queries. My thinking here is that I want the system to be as easy to reason about as possible, and that includes being able to easily understand the database calls that result from handling a query or command. And honestly, I’d also like developers to just be able to write code for a feature at a time in one place without jumping all over the codebase to follow some architect’s idea of proper code organization.

When I’d use the Repository Pattern

I would maybe choose to use the “Repository” pattern to wrap my system’s underlying persistence tooling in certain conditions. Offhand, I thought of these scenarios so far:

  • Maybe some particular query logic is very involved and I deem it to be helpful to move that code into its own “single responsibility” method/function/class
  • Maybe the underlying persistence tooling is tedious of difficult to use, and by abstracting that low level access behind a repository abstraction I’m making the rest of the code simpler and probably even enhancing testability — but I think I’d strongly recommend against adopting persistence tooling that’s like that in the first place if you can possibly help it!
  • If there’s some sort of caching layer maybe in between your code and the persistence tooling
  • To eliminate some code duplication of query logic between use cases — but the point of this blog post is going to be about using the “Specification” pattern as an alternative to eliminate duplication without having to resort to a repository abstraction

Summarizing My Preferred Approach

My default approach for my own development and my strong advice for Marten users is to largely eschew repository patterns and any other kind of abstraction wrapper around Marten’s main IQuerySession or IDocumentSession APIs. My thinking goes along the lines of:

  1. The Marten API just isn’t that complicated to begin with
  2. You should never even dream that LINQ providers are even remotely equivalent between tools, so the idea that you’re going to be able to swap out persistence tooling and the LINQ queries will “just work” with the next tool is a pipe dream
  3. I think it’s very rare to swap out databases underneath an existing application anyway, and you’re pretty well in for at least a partial rewrite if you try to no matter what kind of Clean/Onion/Ports and Adapters style abstractions you’ve written anyway. Sure, maybe you can swap between two different, but very similar relational databases, but why would you bother? Except possibly for the “let’s save hosting costs by moving from Sql Server to PostgreSQL” move that lots of people discuss but never really do.
  4. As I tried to explain in my post Network Round Trips are Evil, it’s frequently important or at least valuable to get at the more advanced features of your persistence tooling to improve performance, with Marten’s feature set for batched querying or including related documents being some of the first examples that spring to mind. And that’s not an imaginary use case, because I’m currently working with a JasperFx client whose codebase could probably be more performant if they utilized those features, but first we’re going to have to unwind some repository abstractions just to get at those Marten capabilities

Part of my prescriptive advice for being more successful in systems development is to eschew the usage of the old, classic “Repository” pattern and just use the actual persistence tooling API in your code with some exceptions of course for complicated querying, to eliminate duplication, or maybe to add in some caching or validation outside of the persistence tooling. More on those exceptions soon.

The newer query plan feature in Marten gives us specification pattern support that allows us to reuse or just encapsulate complicated query logic in a way that makes it easy to reuse across vertical slices.

Message Broker per Tenant with Wolverine

The new feature shown in this post was built by JasperFx Software as part of a client engagement. This is exactly the kind of novel or challenging issue we frequently help our clients solve. If there’s something in your shop’s ongoing efforts where you could use some extra technical help, reach out to sales@jasperfx.net and we’ll be happy to talk with you.

Wolverine 3.4 was released today with a large new feature for multi-tenancy through asynchronous messaging. This feature set was envisioned for usage in an IoT system using the full “Critter Stack” (Marten and Wolverine) where “our system” is centralized in the cloud, but has to communicate asynchronously with physical devices deployed at different client sites:

The system in question already uses Marten’s support for separating per tenant information into separate PostgreSQL databases. Wolverine itself works with Marten’s multi-tenancy to make that a seamless process within Wolverine messaging workflows. All of that arguably quite robust already support was envisioned to be running within either HTTP web services or asynchronous messaging workflows completely controlled by the deployed application and its peer services. What’s new with Wolverine 3.4 is the ability to isolate the communication with remote client (tenant) devices and the centralized, cloud deployed “our system.”

We can isolate the traffic between each client site and our system first by using a separate Rabbit MQ broker or at least a separate virtual host per tenant as implied in the code sample from the docs below:

var builder = Host.CreateApplicationBuilder();

builder.UseWolverine(opts =>
{
    // At this point, you still have to have a *default* broker connection to be used for 
    // messaging. 
    opts.UseRabbitMq(new Uri(builder.Configuration.GetConnectionString("main")))
        
        // This will be respected across *all* the tenant specific
        // virtual hosts and separate broker connections
        .AutoProvision()

        // This is the default, if there is no tenant id on an outgoing message,
        // use the default broker
        .TenantIdBehavior(TenantedIdBehavior.FallbackToDefault)

        // Or tell Wolverine instead to just quietly ignore messages sent
        // to unrecognized tenant ids
        .TenantIdBehavior(TenantedIdBehavior.IgnoreUnknownTenants)

        // Or be draconian and make Wolverine assert and throw an exception
        // if an outgoing message does not have a tenant id
        .TenantIdBehavior(TenantedIdBehavior.TenantIdRequired)

        // Add specific tenants for separate virtual host names
        // on the same broker as the default connection
        .AddTenant("one", "vh1")
        .AddTenant("two", "vh2")
        .AddTenant("three", "vh3")

        // Or, you can add a broker connection to something completel
        // different for a tenant
        .AddTenant("four", new Uri(builder.Configuration.GetConnectionString("rabbit_four")));

    // This Wolverine application would be listening to a queue
    // named "incoming" on all virtual hosts and/or tenant specific message
    // brokers
    opts.ListenToRabbitQueue("incoming");

    opts.ListenToRabbitQueue("incoming_global")
        
        // This opts this queue out from being per-tenant, such that
        // there will only be the single "incoming_global" queue for the default
        // broker connection
        .GlobalListener();

    // More on this in the docs....
    opts.PublishMessage<Message1>()
        .ToRabbitQueue("outgoing").GlobalSender();
});

With this solution, we now have a “global” Rabbit MQ broker we can use for all internal communication or queueing within “our system”, and a separate Rabbit MQ virtual host for each tenant. At runtime, when a message tagged with a tenant id is published out of “our system” to a “per tenant” queue or exchange, Wolverine is able to route it to the correct virtual host for that tenant id. Likewise, Wolverine is listening to the queue named “incoming” on each virtual host (plus the global one), and automatically tags messages coming from the per tenant virtual host queues with the correct tenant id to facilitate the full Marten/Wolverine workflow downstream as the incoming messages are handled.

Now, let’s switch it up and use Azure Service Bus instead to basically do the same thing. This time though, we can register additional tenants to use a separate Azure Service Bus fully qualified namespace or connection string:

var builder = Host.CreateApplicationBuilder();

builder.UseWolverine(opts =>
{
    // One way or another, you're probably pulling the Azure Service Bus
    // connection string out of configuration
    var azureServiceBusConnectionString = builder
        .Configuration
        .GetConnectionString("azure-service-bus");

    // Connect to the broker in the simplest possible way
    opts.UseAzureServiceBus(azureServiceBusConnectionString)

        // This is the default, if there is no tenant id on an outgoing message,
        // use the default broker
        .TenantIdBehavior(TenantedIdBehavior.FallbackToDefault)

        // Or tell Wolverine instead to just quietly ignore messages sent
        // to unrecognized tenant ids
        .TenantIdBehavior(TenantedIdBehavior.IgnoreUnknownTenants)

        // Or be draconian and make Wolverine assert and throw an exception
        // if an outgoing message does not have a tenant id
        .TenantIdBehavior(TenantedIdBehavior.TenantIdRequired)

        // Add new tenants by registering the tenant id and a separate fully qualified namespace
        // to a different Azure Service Bus connection
        .AddTenantByNamespace("one", builder.Configuration.GetValue<string>("asb_ns_one"))
        .AddTenantByNamespace("two", builder.Configuration.GetValue<string>("asb_ns_two"))
        .AddTenantByNamespace("three", builder.Configuration.GetValue<string>("asb_ns_three"))

        // OR, instead, add tenants by registering the tenant id and a separate connection string
        // to a different Azure Service Bus connection
        .AddTenantByConnectionString("four", builder.Configuration.GetConnectionString("asb_four"))
        .AddTenantByConnectionString("five", builder.Configuration.GetConnectionString("asb_five"))
        .AddTenantByConnectionString("six", builder.Configuration.GetConnectionString("asb_six"));
    
    // This Wolverine application would be listening to a queue
    // named "incoming" on all Azure Service Bus connections, including the default
    opts.ListenToAzureServiceBusQueue("incoming");

    // This Wolverine application would listen to a single queue
    // at the default connection regardless of tenant
    opts.ListenToAzureServiceBusQueue("incoming_global")
        .GlobalListener();
    
    // Likewise, you can override the queue, subscription, and topic behavior
    // to be "global" for all tenants with this syntax:
    opts.PublishMessage<Message1>()
        .ToAzureServiceBusQueue("message1")
        .GlobalSender();

    opts.PublishMessage<Message2>()
        .ToAzureServiceBusTopic("message2")
        .GlobalSender();
});

This is a lot to take in, but the major point is to keep client messages completely separate from each other while also enabling the seamless usage of multi-tenanted workflows all the way through the Wolverine & Marten pipeline. As we deal with the inevitable teething pains, the hope is that the behavioral code within the Wolverine message handlers never has to be concerned with any kind of per-tenant bookkeeping. For more information, see:

And as I typed all of that out, I do fully realize that there would be some value in having a comprehensive “Multi-Tenancy with the Critter Stack” guide in one place.

Summary

I honestly don’t know if this feature set gets a lot of usage, but it came out of what’s been a very productive collaboration with JasperFx’s original customer as we’ve worked together on their IoT system. Quite a bit of improvements to Wolverine have come about as a direct reaction to friction or opportunities that we’ve spotted with our collaboration.

As far as multi-tenancy goes, I think the challenges for the Critter Stack toolset has been to give our users all the power they need to keep data and now messaging completely separate across tenants while relentlessly removing repetitive code ceremony or usability issues. My personal philosophy is that lower ceremony code is an important enabler of successful software development efforts over time.

Messaging with Wolverine using Apache Pulsar

As part of the Wolverine 3.0 release a couple weeks back, Wolverine gained a lightweight messaging transport option with Apache Pulsar.

“Lightweight” just meaning “it doesn’t have a lot of features yet”

To get started, first add this Nuget to your system:

dotnet add WolverineFx.Pulsar

And just like that, you’re ready to start adding publishing rules and subscriptions to Pulsar topics in a very idiomatic Wolverine way:

var builder = Host.CreateApplicationBuilder();
builder.UseWolverine(opts =>
{
    opts.UsePulsar(c =>
    {
        var pulsarUri = builder.Configuration.GetValue<Uri>("pulsar");
        c.ServiceUrl(pulsarUri);
        
        // Any other configuration you want to apply to your
        // Pulsar client
    });

    // Publish messages to a particular Pulsar topic
    opts.PublishMessage<Message1>()
        .ToPulsarTopic("persistent://public/default/one")
        
        // And all the normal Wolverine options...
        .SendInline();

    // Listen for incoming messages from a Pulsar topic
    opts.ListenToPulsarTopic("persistent://public/default/two")
        
        // And all the normal Wolverine options...
        .Sequential();
});

It’s a minimal implementation for right now (no conventional routing topology for example), but we’ll happily enhance this transport option if there’s interest. To be honest, the Pulsar transport has been hanging out inside the Wolverine codebase for years, but never got released for whatever reason. Someone asked about this awhile back, so here we go!

Assuming that the US still exists tomorrow and I’m not trying to move my family to Canada, I’ll follow up with Wolverine’s new, fully robust transport option for Google Pubsub.

Network Round Trips are Evil, So Batch Your Queries When You Can

JasperFx Software frequently helps our customers wring better performance or scalability out of our customer’s systems. A somewhat frequent opportunity for improving the responsiveness and throughput of systems is merely identifying ways to batch up requests from middle tier, server side code to the backing database or databases. There’s a certain amount of overhead in making any network round trips between processes, and it often pays off in terms of performance to batch up queries or commands to reduce the number of network round trips.

Today I’m merely going to focus on Marten as a persistence tool and a bit on Wolverine as “Mediator” and show some ways that Marten reduces network round trips. Just know though that this general idea of reducing network round trips by batching up database queries or commands is certainly going to apply to improving performance with any other persistence tooling.

Batching Writes

First off, let’s just look at doing a mixed bag of “writes” with a Marten session to add, delete, or modify user data:

    public static async Task modify_some_users(IDocumentSession session)
    {
        // Mixed bag of document operations
        session.Insert(new User{FirstName = "Hans", LastName = "Gruber"});
        session.Store(new User{FirstName = "John", LastName = "McClane"});
        session.DeleteWhere<User>(x => x.LastName == "Miller");

        session.Patch<User>(x => x.LastName == "May").Set(x => x.Nickname, "Mayday");

        // Let's append some events too just for fun!
        session.Events.StartStream<User>(new UserCreated("Harry", "Ellis"));

        // Commit all the changes
        await session.SaveChangesAsync();
    }

What’s important to note in the code up above is that all the logical operations to insert, “upsert”, delete, patch, or start event streams is batched up into a single database round trip when session.SaveChangesAsync() is called. In the early days of Marten we tried a lot of different things to improve throughput in Marten, including alternative serializers, reducing string concatenation, code generation techniques, and alternative data structures internally. Our consistent finding was that the single biggest improvements always came from reducing network round trips, with alternative JSON serializers being a distant second, and every other factor far behind that.

If you’re curious about the technical underpinnings, Marten 7+ is creating a single NpgsqlBatch for all the commands and even using positional parameters because that’s a touch more efficient for the interaction with PostgreSQL.

Moving to another example, let’s say that you have workflow where you need to apply logical changes to a batch of Item entities using a mix of Marten and Wolverine. Here’s a first, naive cut at this handler:

public static class ApproveItemsHandler
{
    // I'm passing in CancellationToken because:
    // a. It's probably a good idea anyway
    // b. That's how Wolverine "enforces" message timeouts
    public static async Task HandleAsync(
        ApproveItems message,
        IDocumentSession session,
        CancellationToken token)
    {
        foreach (var id in message.Ids)
        {
            var existing = await session.LoadAsync<Item>(id, token);
            if (existing != null)
            {
                existing.Approved = true;
                session.Store(existing);
            }
        }

        await session.SaveChangesAsync(token);
    }
}

Now, let’s assume that we could easily be getting 100-1000 different ids of Item entities to approve at any one time, which would make this operation chatty and potentially slow. Let’s make it a little worse though and add in Wolverine as a “mediator” to handle each individual Item inline:

public static class ApproveItemHandler
{
    public static async Task HandleAsync(
        ApproveItem message, 
        IDocumentSession session, 
        CancellationToken token)
    {
        var existing = await session.LoadAsync<Item>(message.Id, token);
        if (existing == null) return;

        existing.Approved = true;

        await session.SaveChangesAsync(token);
    }
}

public static class ApproveItemsHandler
{
    // I'm passing in CancellationToken because:
    // a. It's probably a good idea anyway
    // b. That's how Wolverine "enforces" message timeouts
    public static async Task HandleAsync(
        ApproveItems message,
        IMessageBus bus,
        CancellationToken token)
    {
        foreach (var id in message.Ids)
        {
            await bus.InvokeAsync(new ApproveItem(id), token);
        }
    }
}

In terms of performance, the second version is even worse. We compounded the existing chattiness problem with looking up each Item individually by separating out the database “writes” to separate database calls and separate transactions within “Wolverine as Mediator” usage through that InvokeAsync()call. You should be aware that when you use any kind of in process “Mediator” tool like Wolverine, MediatR, Brighter, or MassTransit’s in process mediator functionality that each call to InvokeAsync() involves a certain amount of overhead and very likely means a nested transaction that gets committed independently from the parent message handling or HTTP request that triggered the InvokeAsync() call. I think I might go so far as to say that calling IMessageBus.InvokeAsync() from another message handler is a “guilty until proven innocent” type of approach.

I’d of course argue here that the performance may or may not end up being a big deal, but not having a transactional boundary around the original message processing can easily lead to inconsistent state in our system if any of the individual Item updates fail.

Let’s make one last version of this batch approve item handler with an eye toward reducing network round trips and keeping a strongly consistent transaction boundary around all the approvals (meaning they all succeed or all fail, no in between “who knows what really happened” state):

public static class ApproveItemsHandler
{
    // I'm passing in CancellationToken because:
    // a. It's probably a good idea anyway
    // b. That's how Wolverine "enforces" message timeouts
    public static async Task HandleAsync(
        ApproveItems message,
        IDocumentSession session,
        CancellationToken token)
    {
        // Find all the related items in *one* network round trip
        var items = await session.LoadManyAsync<Item>(token, message.Ids);
        foreach (var item in items)
        {
            item.Approved = true;
            session.Store(item);
        }

        await session.SaveChangesAsync().ConfigureAwait(false);
    }
}

In the usage above, we’re making one database call to fetch the matching Item entities, and updating all of the impacted Item entities in a single batched database command within the IDocumentSession.SaveChangesAsync(). This version should almost always be much faster than the earlier versions where we issued individual queries for each Item, plus we have better transactional consistency in the case of system errors.

Lastly of course for the sake of completeness, we could just do this with one network round trip:

public static class ApproveItemsHandler
{
    // Assuming here that Wolverine "auto-transaction"
    // middleware is in place
    public static void Handle(
        ApproveItems message,
        IDocumentSession session)
    {
        session
            .Patch<Item>(x => x.Id.IsOneOf(message.Ids))
            .Set(x => x.Approved, true);
    }
}

That last version eliminates the usage of current state to validate the operation first or give us any indication of what exactly was changed, but hey, that’s the fastest possible way to code this with Marten and it might be suitable sometimes in your own system.

Batch Querying

Marten has strong support for batch querying where you can combine any number of disparate queries in a batch to the database, and read the results one at a time afterward. Here’s an example from the Marten documentation, but just know that session in this case is a Marten IQuerySession:

// Start a new IBatchQuery from an active session
var batch = session.CreateBatchQuery();

// Fetch a single document by its Id
var user1 = batch.Load<User>("username");

// Fetch multiple documents by their id's
var admins = batch.LoadMany<User>().ById("user2", "user3");

// User-supplied sql
var toms = batch.Query<User>("where first_name == ?", "Tom");

// Where with Linq
var jills = batch.Query<User>().Where(x => x.FirstName == "Jill").ToList();

// Any() queries
var anyBills = batch.Query<User>().Any(x => x.FirstName == "Bill");

// Count() queries
var countJims = batch.Query<User>().Count(x => x.FirstName == "Jim");

// The Batch querying supports First/FirstOrDefault/Single/SingleOrDefault() selectors:
var firstInternal = batch.Query<User>().OrderBy(x => x.LastName).First(x => x.Internal);

// Kick off the batch query
await batch.Execute();

// All of the query mechanisms of the BatchQuery return
// Task's that are completed by the Execute() method above
var internalUser = await firstInternal;
Debug.WriteLine($"The first internal user is {internalUser.FirstName} {internalUser.LastName}");

That’s a little more code and complexity than you might have otherwise if you just make the queries independently, but there’s some significant performance gains to be made from batching queries.

This is a much, much longer discussion than I have ambition for today, but the rampant usage of repository abstractions around raw persistence tooling like Marten has a tendency to knock out more powerful functionality like query batching. That’s especially compounded with “noun-centric” code organization where you may have IOrderRepository and IInvoiceRepository wrapping your raw persistence tooling, but yet frequently have logical operations that deal with both Order and Invoice data at the same time. With Wolverine especially, I’m pushing JasperFx clients and our users to try to get away with eschewing these kinds of abstractions and leaning hard into Wolverine’s “A-Frame Architecture” approach so you can utilize the full power of Marten (or EF Core or RavenDb or whatever else you actually use).

What I can tell you is that for a current JasperFx client, we’re looking in the long run to collapse and simplify and inline their current usage of Railway Programming and MediatR-calling-other-MediatR handlers as a way to enable us to utilize query batching to optimize some of their very complicated operations that today end up being very chatty between the server and database.

Including Related Entities when Querying

There are plenty of times you’ll have an operation in your system that needs information from multiple, related entity types. Marten provides its version of Include() in its LINQ provider as a way to batch query related documents in fewer network round trips, and hence better performance like this example from the tests:

[Fact]
public async Task simple_include_for_a_single_document()
{
    var user = new User();
    var issue = new Issue { AssigneeId = user.Id, Title = "Garage Door is busted" };

    using var session = theStore.IdentitySession();
    session.Store<object>(user, issue);
    await session.SaveChangesAsync();

    using var query = theStore.QuerySession();

    // The following query will fetch both the Issue document
    // and the related User document for the Issue in one
    // network round trip
    User included = null;
    var issue2 = query
        .Query<Issue>()
        .Include<User>(x => included = x).On(x => x.AssigneeId)
        .Single(x => x.Title == issue.Title);

    included.ShouldNotBeNull();
    included.Id.ShouldBe(user.Id);

    issue2.ShouldNotBeNull();
}

I’ll refer you to the documentation for more alternative usages, but just know that Marten has this capability and it’s a valuable way to improve performance in your system by reducing the number of network roundtrips between your code and the backend.

Marten’s Include() functionality was originally inspired/copied from RavenDb. We’ve unfortunately had some confusion in the past from folks coming over from EF Core where its Include() means something very different. Oh, and just to pull aside the curtain, it’s not doing any kind of JOIN behind the scenes, but a temporary table + multiple SELECT() statements.

Summary

I just wanted to get a handful of things across in this post:

  1. Network round trips can easily be expensive and a contributing factor in poor system performance. Reducing the number of network round trips by batching queries can sometimes pay off overall even if that sometimes means more complex code
  2. Marten has several features specifically meant to improve system performance by batching database queries that you can utilize. Both Marten and Wolverine are absolutely built with this philosophy of reducing network round trips as much as possible
  3. Any coding or architectural strategy that results in excessive layering, long call stacks (A calls B that calls C that calls D that finally calls to a database), or really just obfuscates your understanding of how system operations lead to increased numbers of network round trips can easily be harmful to your system’s performance because you can’t easily “see” what your system is really doing

Never mind, Lamar is going to continue

A couple months ago I wrote Retiring Lamar and the Ghost of IoC Containers Past as we were closing in on decoupling Wolverine 3.0 from Lamar (since completed) and I was already getting sick of edge case bugs introduced by Microsoft from their inexplicably wacky approach for keyed services. Since releasing Wolverine 3.0 without its previous coupling to Lamar, I’ve recommended to several users and clients to just go put back Lamar because of various annoyances with .NET’s built in ServiceProvider. There’s just too many places where Lamar is significantly less finicky than ServiceProvider and I’m personally missing Lamar’s “it should just work” attitude when being forced to use ServiceProvider or helping other folks who just upgraded to Wolverine 3.0.

Long story short, I change my mind about ending Lamar support and I’m actually starting Lamar 14 today as part of the Critter Stack 2025 initiative. Sorry for the churn folks.

Personal Identifiable Information Masking in Marten

JasperFx Software helps our customers be more successful with their usage of the “Critter Stack” tools (or any other server side .NET tooling you might be using). The work in this post was delivered for a JasperFx customer to help protect their customer’s private information. If you need or want any help with event sourcing, Event Driven Architecture, or automated testing, drop us a note and we’d be happy to talk with you about what JasperFx can do for you.

I defy you to say the title of this post out loud in rapid succession without stumbling over it.

According to the U.S. Department of Labor, “Personal Identifiable Information” (PII) is defined as:

Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means.

Increasingly, Marten users are running into requirements to be able to “forget” PII that is persisted within a Marten database. For the document storage, I think this is relatively easy to do with a host of existing functionality including the partial update functionality that Marten got (back) in V7. For the event store though, there wasn’t anything built in that would have made it easy to erase or “mask” protected information within the persisted event data — until now!

The Marten 7.31 adds a new capability to erase or mask PII data within the event store.

For a variety of reasons, you may wish to remove or mask sensitive data elements in a Marten database without necessarily deleting the information as a whole. Documents can be amended with Marten’s Patching API. With event data, you now have options to reach into the event data and rewrite selected members as well as to add custom headers. First, start by defining data masking rules by event type like so:

var builder = Host.CreateApplicationBuilder();
builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));

    // By a single, concrete type
    opts.Events.AddMaskingRuleForProtectedInformation<AccountChanged>(x =>
    {
        // I'm only masking a single property here, but you could do as much as you want
        x.Name = "****";
    });

    // Maybe you have an interface that multiple event types implement that would help
    // make these rules easier by applying to any event type that implements this interface
    opts.Events.AddMaskingRuleForProtectedInformation<IAccountEvent>(x => x.Name = "****");

    // Little fancier
    opts.Events.AddMaskingRuleForProtectedInformation<MembersJoined>(x =>
    {
        for (int i = 0; i < x.Members.Length; i++)
        {
            x.Members[i] = "*****";
        }
    });
});

That’s strictly a configuration time effort. Next, you can apply the masking on demand to any subset of events with the IDocumentStore.Advanced.ApplyEventDataMasking() API. First, you can apply the masking for a single stream:

public static Task apply_masking_to_streams(IDocumentStore store, Guid streamId, CancellationToken token)
{
    return store
        .Advanced
        .ApplyEventDataMasking(x =>
        {
            x.IncludeStream(streamId);

            // You can add or modify event metadata headers as well
            // BUT, you'll of course need event header tracking to be enabled
            x.AddHeader("masked", DateTimeOffset.UtcNow);
        }, token);
}

As a finer grained operation, you can specify an event filter (Func<IEvent, bool>) within an event stream to be masked with this overload:

public static Task apply_masking_to_streams_and_filter(IDocumentStore store, Guid streamId, CancellationToken token)
{
    return store
        .Advanced
        .ApplyEventDataMasking(x =>
        {
            // Mask selected events within a single stream by a user defined criteria
            x.IncludeStream(streamId, e => e.EventTypesAre(typeof(MembersJoined), typeof(MembersDeparted)));

            // You can add or modify event metadata headers as well
            // BUT, you'll of course need event header tracking to be enabled
            x.AddHeader("masked", DateTimeOffset.UtcNow);
        }, token);
}

Note that regardless of what events you specify, only events that match a pre-registered masking rule will have the header changes applied.

To apply the event data masking across streams on an arbitrary grouping, you can use a LINQ expression as well:

public static Task apply_masking_by_filter(IDocumentStore store, Guid[] streamIds)
{
    return store.Advanced.ApplyEventDataMasking(x =>
        {
            x.IncludeEvents(e => e.EventTypesAre(typeof(QuestStarted)) && e.StreamId.IsOneOf(streamIds));
        });
}

Finally, if you are using multi-tenancy, you can specify the tenant id as part of the same fluent interface:

public static Task apply_masking_by_tenant(IDocumentStore store, string tenantId, Guid streamId)
{
    return store
        .Advanced
        .ApplyEventDataMasking(x =>
        {
            x.IncludeStream(streamId);

            // Specify the tenant id, and it doesn't matter
            // in what order this appears in
            x.ForTenant(tenantId);
        });
}

Here’s a couple more facts you might need to know:

  • The masking rules can only be done at configuration time (as of right now)
  • You can apply multiple masking rules for certain event types, and all will be applied when you use the masking API
  • The masking has absolutely no impact on event archiving or projected data — unless you rebuild the projection data after applying the data masking of course

Summary

The Marten team is at least considering support for crypto-shredding in Marten 8.0, but no definite plans have been made yet. It might fit into the “Critter Stack 2025” release cycle that we’re just barely starting.

Sending Messages to the Original Sender with Wolverine

Yesterday I blogged about a small, convenience feature we snuck into he release of Wolverine 3.0 last week for a JasperFx Software customer I wrote about in Combo HTTP Endpoint and Message Handler with Wolverine 3.0. Today I’d like to show some additions to Wolverine 3.0 just to improve its ability to send responses back to the original sending application or raise other messages in response to problems.

One of Wolverine’s main functions is to be an asynchronous messaging framework where we expect messages to come into our Wolverine systems through messaging brokers like Azure Service Bus or Rabbit MQ or AWS SQS from another system (or you can message to yourself too of course). A frequent question from users is what if there’s a message that can’t be processed for some reason and there’s a need to send a message back to the originating system or to create some kind of alert message to a support person to intervene?

Let’s start with the assumption that at least some problems can be found with validation rules early in message processing such that you can determine early that a message is not able to be processed — and if this happens, send a message back to the original sender telling it (or a person) so. In the Wolverine documentation, we have this middleware for looking up account information for any message that implements an IAccountCommand interface:

// This is *a* way to build middleware in Wolverine by basically just
// writing functions/methods. There's a naming convention that
// looks for Before/BeforeAsync or After/AfterAsync
public static class AccountLookupMiddleware
{
    // The message *has* to be first in the parameter list
    // Before or BeforeAsync tells Wolverine this method should be called before the actual action
    public static async Task<(HandlerContinuation, Account?)> LoadAsync(
        IAccountCommand command,
        ILogger logger,

        // This app is using Marten for persistence
        IDocumentSession session,

        CancellationToken cancellation)
    {
        var account = await session.LoadAsync<Account>(command.AccountId, cancellation);
        if (account == null)
        {
            logger.LogInformation("Unable to find an account for {AccountId}, aborting the requested operation", command.AccountId);
        }

        return (account == null ? HandlerContinuation.Stop : HandlerContinuation.Continue, account);
    }
}

Now, let’s change the middleware up above to send a notification message back to whatever the original sender is if the referenced account cannot be found. For the first attempt, let’s do it by directly injecting IMessageContext (IMessageBus, but with some specific API additions we need in this case) from Wolverine like so:

public static class AccountLookupMiddleware
{
    // The message *has* to be first in the parameter list
    // Before or BeforeAsync tells Wolverine this method should be called before the actual action
    public static async Task<(HandlerContinuation, Account?)> LoadAsync(
        IAccountCommand command,
        ILogger logger,

        // This app is using Marten for persistence
        IDocumentSession session,
        
        IMessageContext bus,

        CancellationToken cancellation)
    {
        var account = await session.LoadAsync<Account>(command.AccountId, cancellation);
        if (account == null)
        {
            logger.LogInformation("Unable to find an account for {AccountId}, aborting the requested operation", command.AccountId);

            // Send a message back to the original sender, whatever that happens to be
            await bus.RespondToSenderAsync(new InvalidAccount(command.AccountId));

            return (HandlerContinuation.Stop, null);
        }

        return (HandlerContinuation.Continue, account);
    }
}

Okay, hopefully not that bad. Now though, let’s utilize Wolverine’s OutgoingMessages type to relay that message with this functionally equivalent code:

public static class AccountLookupMiddleware
{
    // The message *has* to be first in the parameter list
    // Before or BeforeAsync tells Wolverine this method should be called before the actual action
    public static async Task<(HandlerContinuation, Account?, OutgoingMessages)> LoadAsync(
        IAccountCommand command,
        ILogger logger,

        // This app is using Marten for persistence
        IDocumentSession session,

        CancellationToken cancellation)
    {
        var messages = new OutgoingMessages();
        var account = await session.LoadAsync<Account>(command.AccountId, cancellation);
        if (account == null)
        {
            logger.LogInformation("Unable to find an account for {AccountId}, aborting the requested operation", command.AccountId);

            messages.RespondToSender(new InvalidAccount(command.AccountId));
            return (HandlerContinuation.Stop, null, messages);
        }

        // messages would be empty here
        return (HandlerContinuation.Continue, account, messages);
    }
}

As of Wolverine 3.0, you’re now able to send messages from “before / validate” middleware by either using IMessageBus/IMessageContext or OutgoingMessages. This is in addition to the older functionality to possibly send messages on certain message failures, as shown below in a sample from the Wolverine documentation on custom error handling policies:

theReceiver = await Host.CreateDefaultBuilder()
    .UseWolverine(opts =>
    {
        opts.ListenAtPort(receiverPort);
        opts.ServiceName = "Receiver";

        opts.Policies.OnException<ShippingFailedException>()
            .Discard().And(async (_, context, _) =>
            {
                if (context.Envelope?.Message is ShipOrder cmd)
                {
                    await context.RespondToSenderAsync(new ShippingFailed(cmd.OrderId));
                }
            });
    }).StartAsync();

Summary

You’ve got options! Wolverine does have a concept of “respond to sender” if you’re sending messages between Wolverine applications that will let you easily send a new message inside a message handler or message handler exception handling policy back to the original sender. This functionality also works, admittedly in a limited capacity, with interoperability between MassTransit and Wolverine through Rabbit MQ.