Multi-Tenancy: Database per Tenant with Marten

This is continuing a series about multi-tenancy with Marten, Wolverine, and ASP.Net Core:

What is it and why do you care?
Marten’s “Conjoined” Model
Database per Tenant with Marten (this post)

In the previous post we learned how to keep all the document or event data for each tenant in the same database, but using Marten’s “conjoined multi-tenancy” model to keep the data separated. This time out, let’s go for a much higher degree of separation by using a completely different database for each tenant with Marten.

Marten has a couple different recipes for “database per tenant multi-tenancy”, but let’s start with the simplest possible model where we’ll explicitly tell Marten about every single tenant by its id (the tenant_id values) and a connection string to that tenant’s specific database:

var builder = Host.CreateApplicationBuilder();
var configuration = builder.Configuration;
builder.Services.AddMarten(opts =>
{
    // Setting up Marten to "know" about five different tenants
    // and the database connection string for each
    opts.MultiTenantedDatabases(tenancy =>
    {
        tenancy.AddSingleTenantDatabase(configuration.GetConnectionString("tenant1"), "tenant1");
        tenancy.AddSingleTenantDatabase(configuration.GetConnectionString("tenant2"), "tenant2");
        tenancy.AddSingleTenantDatabase(configuration.GetConnectionString("tenant3"), "tenant3");
        tenancy.AddSingleTenantDatabase(configuration.GetConnectionString("tenant4"), "tenant4");
        tenancy.AddSingleTenantDatabase(configuration.GetConnectionString("tenant5"), "tenant5");
    });
});

using var host = builder.Build();
await host.StartAsync();

Just like in the post on conjoined tenancy, you can open a Marten document session (Marten’s unit of work abstraction for most typical operations) by supplying the tenant id like so:

// This was a recent convenience method added to 
// Marten to fetch the IDocumentStore singleton
var store = host.DocumentStore();

// Open up a Marten session to the database for "tenant1"
await using var session = store.LightweightSession("tenant1");

With that session object above, you can query all the data in that one specific tenant, or write Marten documents or events to that tenant database — and only that tenant database.

Now, to answer some questions you might have:

Marten’s DocumentStore is still a singleton registered service in your application’s IoC container that “knows” about multiple databases that are assumed to be identical. DocumentStore is an expensive object to create, and an important part of Marten’s multi-tenancy strategy was to ensure that you only every needed one — even with multiple tenant databases
Marten is able to track the schema object creation completely separate for each tenant database, so the “it just works” default mode where Marten is completely able to do database migrations for you on the fly also “just works” with the multi-tenancy by separate database approach
Marten’s (really Weasel‘s) command line tooling is absolutely able to handle multiple tenant databases. You can either migrate or patch all databases, or one database at a time through the command line tools
Marten’s Async Daemon background processing of event projections is perfectly capable of managing the execution against multiple databases as well
We’ll get into this in a later post, but it’s also possible to do two layers of multi-tenancy by combining both separate databases and conjoined multi-tenancy

Moving to a bit more complex case, let’s use Marten’s relatively recent “master table tenancy” model that will locate a table of tenant identifiers to tenant database connection strings in a table in a “master” database:

var builder = Host.CreateApplicationBuilder();
var configuration = builder.Configuration;
builder.Services.AddMarten(opts =>
{
    var tenantDatabaseConnectionString = configuration.GetConnectionString("tenants");
    opts.MultiTenantedDatabasesWithMasterDatabaseTable(tenantDatabaseConnectionString);
});

using var host = builder.Build();
await host.StartAsync();

The usage at runtime is identical to any other kind of multi-tenancy in Marten, but this model gives you the ability to add new tenants and tenant database at runtime without any down time. Marten will still be able to recognize a new tenant id and apply any necessary database changes at runtime.

Summary and What’s Next

Using separate databases for each tenant is a great way to create an even more rigid separation of data. You might opt for this model as a way to:

Scale your system better by effectively sharding your customer databases into smaller databases
Potentially reduce hosting costs by placing high volume tenants on different hardware than lower volume tenants
Meet more rigid security requirements for less risk of tenant data being exposed incorrectly

To the last point, I’ve heard of several cases where regulatory concerns have trumped technical concerns and led teams to choose the tenant per database approach.

Of course, the obvious potential downsides are more complex deployments, more things to go wrong, and maybe higher hosting costs if you’re not careful. Yeah, I know I said that’s a potential cost savings, that sword can cut both ways, so just be aware of potential hosting cost changes.

As for what’s next, actually quite a bit! In subsequent posts we’ll dig into Wolverine’s multi-tenancy support, detecting the tenant id from HTTP requests, two level tenancy in Marten because’s that’s possible, and even Wolverine’s ability to spawn virtual actors by tenant id.

For my fellow Gen X’ers out there who keep hearing the words “keep the data separated” and naturally have this song stuck in your head:

One Year of JasperFx Software

After about 25 years or so in the software industry, I finally founded my own business named JasperFx Software last year at about this time with the general strategy of building a services and product company around the “Critter Stack” tools (Marten, Wolverine, Weasel, and soon to be some others) as well as any or all consulting opportunities around server side .NET development or automated testing. After a year, it’s time for a little retrospective.

How’s it going?

Everything has taken longer than I’d hoped, and technical or company milestones have taken longer than I wanted. To quote the old adage, “it’s darkest right before the dawn.” I was so discouraged about the company right before Christmas that I was strongly considering giving up. Then JasperFx signed a couple big deals that gave me enough space and revenue to keep going, and even allowed JasperFx to bring on Babu Annamalai as a consultant and collaborator.

I’ve long been telling folks who ask me how it’s going that the technology side looks good, but the business is going just well enough to be encouraging but not well enough to feel confident yet. Right now I think I can finally say the business is strong enough to be thinking much more about how to grow and become sustainable than working on exit strategies. To put it more succinctly, I think that potential clients can trust that JasperFx Software is going to be around as a technical partner.

The big goals for this next year are to grow enough to be able to bring on full time colleagues for round the clock support and to continue to grow the “Critter Stack” platform. A big part of that is the planned “Critter Stack Pro” and “CritterWatch” commercial add on products I’m hoping will be demonstrable by the end of the 3rd quarter this year.

Client Work Highlights

JasperFx Software doesn’t exist without its clients — and we could always use more! Here’s some of the highlights of our client work in the past year:

By my count just now, we have worked with clients headquartered in six different countries so far, but I’ve already lost track of which countries our client contacts are located in
Event sourcing is relatively new, so there’s been a lot of engagement with various clients about the best ways to model their domains with events or how to best use projections for read or write models
I’ve gotten to work very closely with a client building an IoT system using event sourcing with Marten, messaging and HTTP services with Wolverine, and Rabbit MQ. That work has led to several improvements especially with Wolverine that have helped reduce repetitive code
JasperFx has helped several clients set up automated testing strategies and mentored teams on unit testing practices. Honestly, I think this might be the one way in which we’ve delivered the most “bang for the buck” for our clients. Automated testing strategies and designing for testability are primary areas of expertise within JasperFx Software — and that might be a little lacking in the larger software community these days
Multiple clients have complicated requirements for multi-tenancy in their systems, and that’s mightily pushed forward capabilities for both Marten and Wolverine in reaction. The specific highlights have been dynamic multi-tenancy in both tools, multi-level multi-tenancy, HTTP integration, and an effective virtual actor capability per tenant for Wolverine
Helped several clients with support contracts with JasperFx Software to use the tools more effectively. This has typically involved issues around transactional integrity, using a transactional outbox, error handling, and generally how to make systems be more robust or more performant
Concurrency issues are quite common, and JasperFx has worked with several of our customers to either make their systems more resilient to concurrency or to reduce potential issues around concurrency through design changes. This has been a common enough issue that I’m going to build out a new conference talk gathering the issues and possible solutions called “Surviving Concurrency in Event Driven Architectures”
We greatly improved Marten’s integration behind GraphQL endpoints using Hot Chocolate for a client, and we’re continuing on with strategies to use Wolverine for GraphQL mutations with Hot Chocolate
JasperFx delivered an early release of the forthcoming “Critter Stack Pro” tools for a client to greatly improve their ability to scale up to very large numbers of expected event data by better distributing load throughout an application cluster
Wolverine got an MQTT integration as the request of a JasperFx client
I helped a client kick off a big new initiative (that will ultimately be built with Ruby, so that’s been interesting) by facilitating multi-day Event Storming workshops
Helping multiple clients evaluate their current application code looking for areas of concern and potential remediation or modernization efforts

If you have a problem, and trust me, you can easily find us, maybe you can engage with JasperFx Software for your own development efforts with an email to sales@jasperfx.net. Or by contacting me directly through Discord, what’s left of Twitter, or wherever.

The State of the Critter Stack

Most of the time I’m focused on what is not already good, soft spots, flaws, outright bugs, and holes in the capabilities of Marten or Wolverine. Occasionally folks pop into our Discord channels just to say positive things, but most of the interactions there or on GitHub are with folks who are having trouble with something at that very moment. Last week I spent some serious time comparing the Critter Stack’s capabilities to some other toolsets for Event Driven Architecture in the .NET ecosystem and a bigger offering in the JVM ecosystem. And you know what?

Marten is already the most robust and feature rich toolset for Event Sourcing in the .NET ecosystem. Sure, you can roll the barebones basics for Event Sourcing yourself, but there are a lot of significantly challenging technical issues around projected state data, subscriptions to event data, concurrency protections, and instrumentation that Marten already supports today in a robust way that you would otherwise have to build out yourself. Likewise, you could cobble a lot of what Marten already does today with various libraries (none of which are documented as well or as widely used as Marten) or other event stores, but you’d still end up writing a lot more non-trivial code to glue it all together and fill in holes than you would by just using Marten.
Wolverine is a unique server side application framework that can be used to dramatically reduce the amount of boilerplate code that is so common in server side .NET development. Moreover, it’s the perfect toolset for really using the “vertical slice architecture” approach in a simpler way. And if used idiomatically, Wolverine can help you structure your code for easy unit testing by separating your application logic from infrastructure concerns without ridiculously high ceremony Hexagonal/Clean/Onion Architectures.
Used together, the Critter Stack is arguably the most feature complete solution for Event Driven Architecture in .NET today — but I’ll weasel this a bit by saying that Actor-based platforms are so conceptually different in approach that I wouldn’t do a straight up comparison
The Critter Stack tools can lead to much better testability, both for unit testing and for integration testing with infrastructural elements, that help our users deliver software faster and with more confidence. I believe that the Critter Stack’s approach and support for automated testing has no peers in the .NET space.

The Critter Stack suite is going to grow this year with commercial add on libraries for increased scalability and with a custom monitoring and management tool codenamed “CritterWatch.” If all goes according to plan (and it won’t), the Critter Stack is going to also grow this year with event sourcing alternatives based on at least Sql Server as the underlying persistence with CosmosDb and/or DynamoDb following afterward.

How’d I get here?

Hey, this section is going to get into some pretty serious TMI, so feel free to skip any of this.

For various reasons, maybe because I’m terrible at organizational politics, disinterest in what I was working on, bad luck, or all of the above, I never managed to find a role or a company where I felt like it was my shop and I was happy right where I was at. A noticeable trend that always frustrated me was that I was frequently much better known and more professionally respected in the outside world that where I happened to be employed at the moment:

Writing a little retrospective on my first year of my own company. Looking back I counted *four* different companies I worked at in the past 15 years where I was theoretically the chief software architect and had absolutely zero relationship with the CTO of the company.
— Jeremy D. Miller (@jeremydmiller) May 28, 2024

I’ve known for a long time that I enjoyed my OSS work building shared libraries and frameworks for other developers much more than what I happened to be doing for my real job. What I’ve long wished I could do was to was get to build development tools, preferably where I was one of the primary drivers of the company and I had a major hand in driving the vision. I thought I’d absolutely blown my chance when a very ambitious, prior project failed miserably years ago, and I had been admittedly pretty adrift in my career in the following years.

I had a couple possible opportunities to join Microsoft in their Dev Div division when I was younger, but there was always some personal reason not to take those positions. I’ve occasionally wished that would have worked out, but that was ages ago.

A couple years ago I told a therapist that I thought I was going through a mid-life crisis, and he laughingly told me that I just had some as yet unfulfilled ambitions. About the same time I got a message from another developer then working at a software tools company asking why we weren’t already monetizing Marten because he thought it was already better than certain commercial offerings. Those two incidents, other friends founding their own company at the same time, plus a lot of encouragement from my wife, and a few other breaks led me to finally go off and take a chance on myself and start JasperFx Software. I won’t lie and say it’s been all rainbows and unicorns and that I haven’t struggled mightily with stress this past year, but it’s trending upward right now and I love being able to roll out of bed knowing that I’m working on my technical vision (for at least some of the day!) or working to help clients who respect my input and contributions to their work.

Multi-Tenancy: Marten’s “Conjoined” Model

This is continuing a series about multi-tenancy with Marten, Wolverine, and ASP.Net Core:

What is it and why do you care?
Marten’s “Conjoined” Model (this post)

Let’s say that you definitely have the need for multi-tenanted storage in your system, but don’t expect enough data to justify splitting the tenant data over multiple databases, or maybe you just really don’t want to mess with all the extra overhead of multiple databases.

“Conjoined” is a term I personally coined for Marten years ago and isn’t anything that’s an “official” term in the industry. I’m not aware of any widely used pattern name for this strategy, but there surely is somewhere since this is so common.

This is where Marten’s “Conjoined” multi-tenancy model comes into play. Let’s say that we have a little document in our system named User just to store information about our users:

public class User
{
    public User()
    {
        Id = Guid.NewGuid();
    }

    public List<Friend> Friends { get; set; }

    public string[] Roles { get; set; }
    public Guid Id { get; set; }
    public string UserName { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public string? Nickname { get; set; }
    public bool Internal { get; set; }
    public string Department { get; set; } = "";
    public string FullName => $"{FirstName} {LastName}";
    public int Age { get; set; }

    public DateTimeOffset ModifiedAt { get; set; }

    public void From(User user)
    {
        Id = user.Id;
    }

    public override string ToString()
    {
        return $"{nameof(FirstName)}: {FirstName}, {nameof(LastName)}: {LastName}";
    }
}

Now, the User document certainly needs to be tracked within a single logical tenant, so I’m going to tell Marten to do exactly that:

        // This is the same syntax to configuring Marten
        // by IServiceCollection.AddMarten()
        using var store = DocumentStore.For(opts =>
        {
            // other configuration

            // Make *only* the User document be stored by tenant
            opts.Schema.For<User>().MultiTenanted();
        });

In the case above, I am only telling Marten to make the User document be multi-tenanted as it’s frequently valuable — and certainly possible — for some reference documents to be common for all tenants. If instead we just wanted to say “all documents and the event store should be multi-tenanted,” we can do this:

        using var store = DocumentStore.For(opts =>
        {
            // other configuration

            opts.Policies.AllDocumentsAreMultiTenanted();
            opts.Events.TenancyStyle = TenancyStyle.Conjoined;
        });

Either way, if we’ve established that User should be multi-tenanted, Marten will add a tenant_id column to the storage table for the User document like this:

DROP TABLE IF EXISTS public.mt_doc_user CASCADE;
CREATE TABLE public.mt_doc_user (
    tenant_id           varchar                     NOT NULL DEFAULT '*DEFAULT*',
    id                  uuid                        NOT NULL,
    data                jsonb                       NOT NULL,
    mt_last_modified    timestamp with time zone    NULL DEFAULT (transaction_timestamp()),
    mt_version          uuid                        NOT NULL DEFAULT (md5(random()::text || clock_timestamp()::text)::uuid),
    mt_dotnet_type      varchar                     NULL,
CONSTRAINT pkey_mt_doc_user_tenant_id_id PRIMARY KEY (tenant_id, id)
);

As of Marten 7, Marten also places the tenant_id first in the primary key for more efficient index usage when querying large data tables.

You might also notice that Marten adds tenant_id to the primary key for the table. Marten will happily allow you to use the same identity for documents in different tenants. And even though that’s unlikely with a Guid as the identity, it’s very certainly possible with other identity strategies and early Marten users hit that occasionally.

Let’s see the conjoined tenancy in action:

        // I'm creating a session specifically for a tenant id of
        // "tenant1"
        using var session1 = store.LightweightSession("tenant1");

        // My youngest & I just saw the Phantom Menace in the theater
        var user = new User { FirstName = "Padme", LastName = "Amidala" };

        // Marten itself assigns the identity at this point
        // if the document doesn't already have one
        session1.Store(user);
        await session1.SaveChangesAsync();

        // Let's open a session to a completely different tenant
        using var session2 = store.LightweightSession("tenant2");

        // Try to find the same user we just persisted in the other tenant...
        var user2 = await session2.LoadAsync<User>(user.Id);

        // And it shouldn't exist!
        user2.ShouldBeNull();

In the very last call to Marten to try to load the same User, but from the “tenant2” tenant used this SQL:

select d.id, d.data from public.mt_doc_user as d where id = $1 and d.tenant_id = $2
  : f746f237-ed4f-4aaa-b805-ad05f7ae2cd3
  : tenant2

If you squint really hard, you can see that Marten automatically stuck in a second WHERE filter for the current tenant id. Moreover, if we switch to LINQ and try to query that way like so:

        var user3 = await session2.Query<User>().SingleOrDefaultAsync(x => x.Id == user.Id);
        user3.ShouldBeNull();

Marten is still quietly sticking in that tenant_id == [tenant id] filter for us with this SQL:

select d.id, d.data from public.mt_doc_user as d where (d.tenant_id = $1 and d.id = $2) LIMIT $3;
  $1: tenant2
  $2: bfc53828-d56b-4fea-8d93-e8a22fe2db40
  $3: 2

If you really, really need to do this, you can query across tenants with some special Marten LINQ helpers:

        var all = await session2
            .Query<User>()
            
            // Notice AnyTenant()
            .Where(x => x.AnyTenant())
            .ToListAsync();
        
        all.ShouldContain(x => x.Id == user.Id);

Or for specific tenants:

        var all = await session2
            .Query<User>()

            // Notice the Where()
            .Where(x => x.TenantIsOneOf("tenant1", "tenant2", "tenant3"))
            .ToListAsync();

        all.ShouldContain(x => x.Id == user.Id);

Summary

While I don’t think folks should willy nilly build out the “Conjoined” model from scratch without some caution, Marten’s model is pretty robust after 8-9 years of constant use from a large, unfortunately for me the maintainer, creative user base.

I didn’t discuss the Event Sourcing functionality in this post, but do note that Marten’s conjoined tenancy model also applies to Marten’s event store and the projected documents built by Marten as well.

In the next post, we’ll branch out to using different databases for different tenants.

Multi-Tenancy: What is it and why do you care?

I’m always on the lookout for ideas about how to endlessly promote both Marten & Wolverine. Since I’ve been fielding a lot of questions, issues, and client requests around multi-tenancy support in both tools the past couple weeks, now seems to be a good time for a new series exploring the existing foundation in both critter stack tools for handling quite a few basic to advanced multi-tenancy scenarios. But first, let’s start by just talking about what the phrase “multi-tenancy” even means for architecting software systems.

In the course of building systems, you’re frequently going to have a single system that needs to serve different sets of users or clients. Some examples I’ve run across have been systems that need to segregate data for different partner companies, different regions within the same company, or just flat out different users like online email services do today.

I don’t know the origin of the terminology, but we refer to those logical separations within the system data as “tenants.”

My youngest is very quickly outgrowing Dr. Seuss books, but we still read “Because a Bug went Kachoo!” above

It’s certainly going to be important many times to keep the data accessed through the system segregated so that nobody is able to access data that they should not. For example, I probably shouldn’t be able to read you email inbox when I lot into my gmail account. For another example from my early career, I worked with an early web application system that was used to gather pricing quotes from my very large manufacturing company’s suppliers for a range of parts. Due to a series of unfortunate design decisions (because a bug went kachoo!), that application did a very poor job being able to segregate data, and I figured out that some of our suppliers were able to see the quoted prices from their competitors and get unfair advantages.

So we can all agree that mixing up the data between users who shouldn’t see each other’s data is a bad thing, so what can we do about that? The most extreme solution is to just flat out deploy a completely different set of servers for each segregated group of users as shown below:

While there are some valid reasons once in awhile to do the completely separate deployments, that’s potentially a lot of overhead and extra hosting costs. At best, this is probably only viable for a finite number of deployments (Gmail is certainly not building out a separate web server for every one of us with a Gmail account for example).

When a single deployed system is able to serve different tenants, we call that “multi-tenancy.”

According to Wikipedia:

Software multitenancy is a software architecture in which a single instance of software runs on a server and serves multiple tenants.

With multi-tenancy, we’re ensuring that one single deployment of the logical service can handle requests for multiple tenants without allowing users from one tenant to inappropriately see data from other tenants.

Roughly speaking, I’m familiar with three different ways to achieve multi-tenancy.

The first approach is to use one database for all tenant data, but to use some sort of tenant id field that just denotes which tenant the data belongs to. This is what I termed “Conjoined Tenancy” in Marten. This approach is simpler in terms of the database deployment and database change management because after all, there’s only one of them! It is potentially more complex within your codebase because your persistence layer will always need to apply filters on the data being modified and accessed by the user and whichever tenant they are part of.

There’s some inherent risk with this approach as developers aren’t perfectly omniscient, and there’s always a chance that we miss some scenarios and let data leak out inappropriately to the wrong users. I think this approach is much more viable when using persistence tooling that has strong support (like Marten!) for this type of “conjoined multi-tenancy” and mostly takes care of the tenancy filters for you.

The second approach is to use a separate schema for each tenant within the same database. I’ve never used this approach myself, and I’m not aware of any tooling in my own .NET ecosystem that supports this approach out of the box. I think this would be a good approach if you were building something on top of a relational database from scratch with a custom data layer — but I think it would be a lot of extra overhead managing the database schema migrations.

The third way to do multi-tenancy is to use a separate database for each tenant, but the single deployed system is smart enough to connect to the correct database throughout its persistence layer based on the current user (or through metadata on messages as we’ll see in a later entry on Wolverine multi-tenancy). This approach is shown below:

There’s of course some challenges to this approach as well. First off, there’s more databases to worry about, and subsequently more overhead for database migrations and management. This approach does give you rock solid data segregation between tenants, and I’ve heard of strong business or regulatory requirements to take this approach even when the data volume wouldn’t require this. As my last statement hints at, we all know that the system database is very commonly the bottleneck for performance and scalability, so segregating different tenant data into separate databases may be a good way to improve the scalability of your system.

It’s obviously going to be more difficult to do any kind of per-tenant data rollup or summary with the separate database approach, but some cloud providers have specialized infrastructure for per tenant database multi-tenancy.

A Note about Scalability

I was taught very early on that an effective way to scale systems was to design for any given server to be able to handle all the possible types of operations, then add more servers to the horizontal cluster. I think at the time this was in reaction to several systems we had where teams had tried to scale bigger systems by segregating all operations for one region to one set of servers, and a different set of servers for other regions. The end result was an explosion of deployed servers and frequently having servers absolutely pegged on CPU or memory while North America factories were in full swing while the servers tasked with handling factories on the Pacific Rim were completely dormant when their factories were closed. An architecture that can spread all the work across the cluster of running nodes might often be a much cheaper solution in the end than standing up many more nodes that can only service a subset of tenants.

Then again, you might also want to prioritize some tenants over others, so take everything I just said with a grain of “it depends” salt.

Thar be Dragons!

In the next set of posts, I’ll get into first Marten, then Wolverine capabilities for multi-tenancy, but just know first that there’s a significant set of challenges ahead:

Managing multiple database schemas if using separate databases per tenant
Needing to use per-tenant filters if using the conjoined storage model for query segregation — and trust me as the author of a persistence tool, there’s plenty of edge case dragons here
Detecting the current tenant based on HTTP requests or messaging metadata
Communicating the tenant information when using asynchronous messaging
Querying across tenants
Dynamically spinning up new tenant databases at runtime — or tearing them down! — or even moving them at runtime?!?
Isolated data processing by tenant database
Multi-level tenancy!?! JasperFx helped a customer build this out with Marten
Transactional outbox support in a multi-tenanted work — which Wolverine can do today!

The two “Critter Stack” tools help with most of these challenges today, and I’ll get around to some discussion about future work to help fill in the more advanced usages that some real users are busy running into right now.

Wolverine’s Test Support Diagnostics

I’m working on fixing a reported bug with Wolverine today and its event forwarding from Marten feature. I can’t say that I yet know why this should-be-very-straightforward-and-looks-exactly-like-the-currently-passing-tests bug is happening, but it’s a good time to demonstrate Wolverine’s automated testing support and even how it can help you to understand test failures.

First off, and I’ll admit that there’s some missing context here, I’m setting up a system such that when this message handler is executed:

public record CreateShoppingList();

public static class CreateShoppingListHandler
{
    public static string Handle(CreateShoppingList _, IDocumentSession session)
    {
        var shoppingListId = CombGuidIdGeneration.NewGuid().ToString();
        session.Events.StartStream<ShoppingList>(shoppingListId, new ShoppingListCreated(shoppingListId));
        return shoppingListId;
    }
}

The configured Wolverine + Marten integration should be kicking in to publish the event appended in the handler above to a completely different handler shown below with the Marten IEvent wrapper so that you can use Marten event store metadata within the secondary, cascaded message:

public static class IntegrationHandler
{
    public static void Handle(IEvent<ShoppingListCreated> _)
    {
        // Don't need a body here, and I'll show why not
        // next
    }
}

Knowing those two things, here’s the test I wrote to reproduce the problem:

    [Fact]
    public async Task publish_ievent_of_t()
    {
        // The "Arrange"
        using var host = await Host.CreateDefaultBuilder()
            .UseWolverine(opts =>
            {
                opts.Policies.AutoApplyTransactions();

                opts.Services.AddMarten(m =>
                {
                    m.Connection(Servers.PostgresConnectionString);
                    m.DatabaseSchemaName = "forwarding";

                    m.Events.StreamIdentity = StreamIdentity.AsString;
                    m.Projections.LiveStreamAggregation<ShoppingList>();
                }).UseLightweightSessions()
                .IntegrateWithWolverine()
                .EventForwardingToWolverine();;
            }).StartAsync();
        
        // The "Act". This method is an extension method in Wolverine
        // specifically for facilitating integration testing that should
        // invoke the given message with Wolverine, then wait until all
        // additional "work" is complete
        var session = await host.InvokeMessageAndWaitAsync(new CreateShoppingList());

        // And finally, just assert that a single message of
        // type IEvent<ShoppingListCreated> was executed
        session.Executed.SingleMessage<IEvent<ShoppingListCreated>>()
            .ShouldNotBeNull();
    }

And now, when I run the test — which “helpfully” reproduces reported bug from earlier today — I get this output:

System.Exception: No messages of type Marten.Events.IEvent<MartenTests.Bugs.ShoppingListCreated> were received

System.Exception
No messages of type Marten.Events.IEvent<MartenTests.Bugs.ShoppingListCreated> were received
Activity detected:

----------------------------------------------------------------------------------------------------------------------
| Message Id                             | Message Type                          | Time (ms)   | Event               |
----------------------------------------------------------------------------------------------------------------------
| 018f82a9-166d-4c71-919e-3bcb04a93067   | MartenTests.Bugs.CreateShoppingList   |          873| ExecutionStarted    |
| 018f82a9-1726-47a6-b657-2a59d0a097cc   | System.String                         |         1057| NoRoutes            |
| 018f82a9-17b1-4078-9997-f6117fd25e5c   | EventShoppingListCreated              |         1242| Sent                |
| 018f82a9-166d-4c71-919e-3bcb04a93067   | MartenTests.Bugs.CreateShoppingList   |         1243| ExecutionFinished   |
| 018f82a9-17b1-4078-9997-f6117fd25e5c   | EventShoppingListCreated              |         1243| Received            |
| 018f82a9-17b1-4078-9997-f6117fd25e5c   | EventShoppingListCreated              |         1244| NoHandlers          |
----------------------------------------------------------------------------------------------------------------------

EDIT: If I’d read this more closely before, I would have noticed that the problem was somewhere different than the routing that I first suspected from a too casual read.

The textual table above is Wolverine telling me what it did do during the failed test. In this case, the output does tip me off that there’s some kind of issue with the internal message routing in Wolverine that should be applying some special rules for IEvent<T> wrappers, but was not in this case. While that work fixing the real bug continues for me, what I hope you get out of this is how Wolverine is trying to help you diagnose test failures by providing diagnostic information about what was actually happening internally during all the asynchronous processing. As a long veteran of test automation efforts, I will vociferously say that it’s important for test automation harnesses to be able to adequately explain the inevitable test failures. Like Wolverine helpfully does.

Now, back to work trying to actually fix the problem…

Scheduled Message Delivery with Wolverine

Wolverine has the ability to schedule the delivery of messages for a later time. While Wolverine certainly isn’t trying to be Hangfire or Quartz.Net, the message scheduling in Wolverine today is valuable for “timeout” messages in sagas, or “retry this evening” type scenarios, or reminders of all sorts.

If using the Azure Service Bus transport, scheduled messages sent to Azure Service Bus queues or topics will use native Azure Service Bus scheduled delivery. For everything else today, Wolverine is doing the scheduled delivery for you. To make those scheduled messages be durable (i.e. not completely lost when the application is shut down), you’re going to want to add message persistence to your Wolverine application as shown in the sample below using SQL Server:

// This is good enough for what we're trying to do
// at the moment
builder.Host.UseWolverine(opts =>
{
    // Just normal .NET stuff to get the connection string to our Sql Server database
    // for this service
    var connectionString = builder.Configuration.GetConnectionString("SqlServer");
    
    // Telling Wolverine to build out message storage with Sql Server at 
    // this database and using the "wolverine" schema to somewhat segregate the 
    // wolverine tables away from the rest of the real application
    opts.PersistMessagesWithSqlServer(connectionString, "wolverine");
    
    // In one fell swoop, let's tell Wolverine to make *all* local
    // queues be durable and backed up by Sql Server 
    opts.Policies.UseDurableLocalQueues();
});

Finally, with all that said, here’s one of the ways to schedule message deliveries:

    public static async Task use_message_bus(IMessageBus bus)
    {
        // Send a message to be sent or executed at a specific time
        await bus.ScheduleAsync(new DebitAccount(1111, 100), DateTimeOffset.UtcNow.AddDays(1));

        // Or do the same, but this time express the time as a delay
        await bus.ScheduleAsync(new DebitAccount(1111, 225), 1.Days());
        
        // ScheduleAsync is really just syntactic sugar for this:
        await bus.PublishAsync(new DebitAccount(1111, 225), new DeliveryOptions { ScheduleDelay = 1.Days() });
    }

Or, if you want to utilize Wolverine’s cascading message functionality to keep most if not all of your handler method signatures “pure”, you can use this syntax within message handlers or HTTP endpoints:

    public static IEnumerable<object> Consume(Incoming incoming)
    {
        // Delay the message delivery by 10 minutes
        yield return new Message1().DelayedFor(10.Minutes());

        // Schedule the message delivery for a certain time
        yield return new Message2().ScheduledAt(new DateTimeOffset(DateTime.Today.AddDays(2)));
    }

Finally, one last alternative that was primarily meant for saga usage, subclassing TimeoutMessage like so:

public record EnforceAccountOverdrawnDeadline(Guid AccountId) : TimeoutMessage(10.Days()), IAccountCommand;

By subclassing TimeoutMessage, the message type above is “scheduled” for a later time when it’s returned as a cascading message.

Wolverine’s HTTP Model Does More For You

One of the things I’m wrestling with right now is frankly how to sell Wolverine as a server side toolset. Yes, it’s technically a message library like MassTransit or NServiceBus. It can also be used as “just” a mediator tool like MediatR. With Wolverine.HTTP, it’s even an alternative HTTP endpoint framework that’s technically an alternative to FastEndpoints, MVC Core, or Minimal API. You’ve got to categorize Wolverine somehow, and we humans naturally understand something new by comparing it to some older thing we’re already familiar with. In the case of Wolverine, it’s drastically selling the toolset short by comparing it to any of the older application frameworks I rattled off above because Wolverine fundamentally does much more to remove code ceremony, improve testability throughout your codebase, and generally just let you focus more on core application functionality than older application frameworks.

This post was triggered by a conversation I had with a friend last week who told me he was happy with his current toolset for HTTP API creation and couldn’t imagine how Wolverine’s HTTP endpoint model could possibly reduce his efforts. Challenge accepted!

For just this moment, consider a simplistic HTTP service that works on this little entity:

public record Counter(Guid Id, int Count);

Now, let’s build an HTTP endpoint that will:

Receive route arguments for the Counter.Id and the current tenant id because of course let’s say that we’re using multi-tenancy with a separate database per tenant
Try to look up the existing Counter entity by its id from the right tenant database
If the entity doesn’t exist, return a status code 404 and get out of there
If the entity does exist, increment the Count property and save the entity to the database and return a status code 204 for a successful request with an empty body

Just to make it easier on me because I already had this example code, we’re going to use Marten for persistence which happens to have much stronger multi-tenancy built in than EF Core. Knowing all that, here’s a sample MVC Core controller to implement the functionality I described above:

public class CounterController : ControllerBase
{
    [HttpPost("/api/tenants/{tenant}/counters/{id}")]
    [ProducesResponseType(204)] // empty response
    [ProducesResponseType(404)]
    public async Task<IResult> Increment(
        Guid id, 
        string tenant, 
        [FromServices] IDocumentStore store)
    {
        // Open a Marten session for the right tenant database
        await using var session = store.LightweightSession(tenant);
        var counter = await session.LoadAsync<Counter>(id, HttpContext.RequestAborted);
        if (counter == null)
        {
            return Results.NotFound();
        }
        else
        {
            counter = counter with { Count = counter.Count + 1 };
            await session.SaveChangesAsync(HttpContext.RequestAborted);
            return Results.Empty;
        }
    }
}

I’m completely open to recreating the multi-tenancy support from the Marten + Wolverine combo for EF Core and SQL Server through Wolverine, but I’m shamelessly waiting until another company is willing to engage with JasperFx Software to deliver that.

Alright, now let’s switch over to using Wolverine.HTTP with its WolverineFx.Http.Marten add on Nuget setup. Let’s drink some Wolverine koolaid and write a functionally identical endpoint the Wolverine way:

You need Wolverine 2.7.0 for this by the way!

    [WolverinePost("/api/tenants/{tenant}/counters/{id}")]
    public static IMartenOp Increment([Document(Required = true)] Counter counter)
    {
        counter = counter with { Count = counter.Count + 1 };
        return MartenOps.Store(counter);
    }

Seriously, this is the same functionality and even the same generated OpenAPI documentation. Some things to note:

Wolverine is able to derive much more of the OpenAPI documentation from the type signatures and from policies applied to the endpoint method, like…
The usage of the Document(Required = true) tells Wolverine that it will be trying to load a document of type Counter from Marten, and by default it’s going to do that through a route argument named “id”. The Required property tells Wolverine to return a 404 NotFound status code automatically if the Counter document doesn’t exist. This attribute usage also applies some OpenAPI smarts to tag the route as potentially returning a 404
The return value of the method is an IMartenOp “side effect” just saying “go save this document”, which Wolverine will do as part of this endpoint execution. Using the side effect makes this method a nice, simple pure function that’s completely synchronous. No wrestling with async Task, await, or schlepping around CancellationToken every which way
Because Wolverine can see there will not be any kind of response body, it’s going to use a 204 status code to denote the empty body and tag the OpenAPI with that as well.
There is absolutely zero Reflection happening at runtime because Wolverine is generating and compiling code at runtime (or ahead of time for faster cold starts) that “bakes” in all of this knowledge for fast execution
Wolverine + Marten has a far more robust support for multi-tenancy all the way through the technology stack than any other application framework I know of in .NET (web frameworks, mediators, or messaging libraries), and you can see that evident in the code above where Marten & Wolverine would already know how to detect tenant usage in an HTTP request and do all the wiring for you all the way through the stack so you can focus on just writing business functionality.

To make this all more concrete, here’s the generated code:

// <auto-generated/>
#pragma warning disable
using Microsoft.AspNetCore.Routing;
using System;
using System.Linq;
using Wolverine.Http;
using Wolverine.Marten.Publishing;
using Wolverine.Runtime;

namespace Internal.Generated.WolverineHandlers
{
    // START: POST_api_tenants_tenant_counters_id_inc2
    public class POST_api_tenants_tenant_counters_id_inc2 : Wolverine.Http.HttpHandler
    {
        private readonly Wolverine.Http.WolverineHttpOptions _wolverineHttpOptions;
        private readonly Wolverine.Runtime.IWolverineRuntime _wolverineRuntime;
        private readonly Wolverine.Marten.Publishing.OutboxedSessionFactory _outboxedSessionFactory;

        public POST_api_tenants_tenant_counters_id_inc2(Wolverine.Http.WolverineHttpOptions wolverineHttpOptions, Wolverine.Runtime.IWolverineRuntime wolverineRuntime, Wolverine.Marten.Publishing.OutboxedSessionFactory outboxedSessionFactory) : base(wolverineHttpOptions)
        {
            _wolverineHttpOptions = wolverineHttpOptions;
            _wolverineRuntime = wolverineRuntime;
            _outboxedSessionFactory = outboxedSessionFactory;
        }



        public override async System.Threading.Tasks.Task Handle(Microsoft.AspNetCore.Http.HttpContext httpContext)
        {
            var messageContext = new Wolverine.Runtime.MessageContext(_wolverineRuntime);
            // Building the Marten session
            await using var documentSession = _outboxedSessionFactory.OpenSession(messageContext);
            if (!System.Guid.TryParse((string)httpContext.GetRouteValue("id"), out var id))
            {
                httpContext.Response.StatusCode = 404;
                return;
            }


            var counter = await documentSession.LoadAsync<Wolverine.Http.Tests.Bugs.Counter>(id, httpContext.RequestAborted).ConfigureAwait(false);
            // 404 if this required object is null
            if (counter == null)
            {
                httpContext.Response.StatusCode = 404;
                return;
            }

            
            // The actual HTTP request handler execution
            var martenOp = Wolverine.Http.Tests.Bugs.CounterEndpoint.Increment(counter);

            if (martenOp != null)
            {
                
                // Placed by Wolverine's ISideEffect policy
                martenOp.Execute(documentSession);

            }

            
            // Commit any outstanding Marten changes
            await documentSession.SaveChangesAsync(httpContext.RequestAborted).ConfigureAwait(false);

            
            // Have to flush outgoing messages just in case Marten did nothing because of https://github.com/JasperFx/wolverine/issues/536
            await messageContext.FlushOutgoingMessagesAsync().ConfigureAwait(false);

            // Wolverine automatically sets the status code to 204 for empty responses
            if (!httpContext.Response.HasStarted) httpContext.Response.StatusCode = 204;
        }

    }

    // END: POST_api_tenants_tenant_counters_id_inc2
    
    
}

Summary

Wolverine isn’t “just another messaging library / mediator / HTTP endpoint alternative.” Rather, Wolverine is a completely different animal that while fulfilling those application framework roles for server side .NET, potentially does a helluva lot more than older frameworks to help you write systems that are maintainable, testable, and resilient. And do all of that with a lot less of the typical “Clean/Onion/Hexagonal Architecture” cruft that shines in software conference talks and YouTube videos but helps lead teams into a morass of unmaintainable code in larger systems in the real world.

But yes, the Wolverine community needs to find a better way to communicate how Wolverine adds value above and beyond the more traditional server side application frameworks in .NET. I’m completely open to suggestions — and fully aware that some folks won’t like the “magic” in the “drank all the Wolverine Koolaid” approach I used.

You can of course use Wolverine with 100% explicit code and none of the magic.

Controlling Parallelism with Wolverine Background Processing

A couple weeks back I started a new blog series meant to explore Wolverine’s capabilities for background processing. Working in very small steps and only one new concept at a time, the first time out just showed how to set up Wolverine inside a new ASP.Net Core web api service and immediately use it for offloading some processing from HTTP endpoints to background processing by using Wolverine’s local queues and message handlers for background processing. In the follow up post, I added durability to the background processing so that our work being executed in the background would be durable even in the face of application restarts.

In this post, let’s look at how Wolverine allows you to either control the parallelism of your background processing, or restrict the processing to be strictly sequential.

To review, in previous posts we were “publishing” a SignupRequest message from a Minimal API endpoint to Wolverine like so:

app.MapPost("/signup", (SignUpRequest request, IMessageBus bus) 
    => bus.PublishAsync(request));

In this particular case, our application has a message handler for SignupRequest, so Wolverine has a sensible default behavior of publishing the message to a local, in memory queue where each message will be processed in a separate thread from the original HTTP request, and do so asynchronously in the background.

So far, so good? By default, each message type gets its own local, in memory queue, with a default “maximum degree of parallelism” equal to the number of detected processors (Environment.ProcessorCount). In addition, the local queues do not enforce strict ordering by default.

But now, what if you do need to strict sequential ordering? Or if you want to restrict or expand the number of parallel messages that can be processed? Or the get really wild, constrain some messages to running sequentially while other messages run in parallel?

First, let’s see how we could alter the parallelism of our SignUpRequest to an absurd degree and say that up to 20 messages could theoretically be processed at one time by the system. We’ll do that by breaking into the UseWolverine() configuration and adding this:

builder.Host.UseWolverine(opts =>
{
    // The other stuff...

    // Make the SignUpRequest messages be published with even 
    // more parallelization!
    opts.LocalQueueFor<SignUpRequest>()
        
        // A maximum of 20 at a time because why not!
        .MaximumParallelMessages(20);
});

Easy enough, but now let’s say that we want all logical event messages in our system to be handled in the sequential order that our process publishes these messages. An easy way to do that with Wolverine is to have each event message type implement Wolverine’s IEvent marker interface like so:

public record Event1 : IEvent;
public record Event2 : IEvent;
public record Event3 : IEvent;

To be honest, the IEvent and corresponding IMessage and ICommand interfaces were added to Wolverine originally just to make it easier to transition a codebase from using NServiceBus to Wolverine, but those types have little actual meaning to Wolverine. The only way that Wolverine even uses them is for the purpose of “knowing” that a type is an outbound message so that Wolverine can preview the message routing for a type implementing one of these interfaces automatically in diagnostics.

Revisiting our UseWolverine() code block again, we’ll add that publishing rule like this:

builder.Host.UseWolverine(opts =>
{
    // Other stuff...

    opts.Publish(x =>
    {
        x.MessagesImplementing<IEvent>();
        x.ToLocalQueue("events")
            // Force every event message to be processed in the 
            // strict order they are enqueued, and one at a 
            // time
            .Sequential();
        });
});

With the code above, our application would be publishing every single message where the message type implements IEvent to that one local queue named “events” that has been configured to process messages in strict sequential order.

Summary and What’s Next

Wolverine makes it very easy to do work in background processing within your application, and even to easily control the desired parallelism in your application, or to make a subset of messages be processed in strict sequential order when that’s valuable instead.

To be honest, this series is what I go to when I feel like I need to write more Critter Stack content for the week, so it might be a minute or two before there’s a follow up. There’ll be at least two more posts, one on scheduling message execution and an example of using the local processing capabilities in Wolverine to implement the producer/consumer pattern.

Scaling Marten with PostgreSQL Read Replicas

JasperFx Software is open for business and offering consulting services (like helping you craft scalability strategies!) and support contracts for both Marten and Wolverine so you know you can feel secure taking a big technical bet on these tools and reap all the advantages they give for productive and maintainable server side .NET development.

First off, big thanks to Jaedyn Tonee for actually doing all the work I’m writing about here. JT recently accepted a long standing “offer” to be part of the official “Critter Stack Core Team.”

Marten 7 embraced several new-ish features in Npgsql, including the NpgsqlDataSource concept for connection management. This opened Marten up for a couple other capabilities like integrating Marten with .NET Aspire. It also enabled Marten to utilize PostgreSQL Read Replicas for read only query usage. Read Replicas are valuable both for high availability of database systems, and to offload heavy read intensive operations off of the main database server and onto read replicas.

To opt into read replicas with Marten, you need to utilize the new MultiHostNpgsqlDataSource in Npgsql and Marten as shown in this sample code:

// services is an IServiceCollection collection
services.AddMultiHostNpgsqlDataSource(ConnectionSource.ConnectionString);

services.AddMarten(x =>
    {
        // Will prefer standby nodes for querying.
        x.Advanced.MultiHostSettings.ReadSessionPreference = TargetSessionAttributes.PreferStandby;
    })
    .UseLightweightSessions()
    .UseNpgsqlDataSource();

Behind the scenes, if you are opting into this model, when you make a query with Marten like this:

       // theSession is an IDocumentSession 
       var users = await theSession
            .Query<User>()
            .Where(x => x.FirstName == "Sam")
            .ToListAsync();

Marten will be trying to connect to a PostgreSQL read replica to service that LINQ query.

Summary

I hope this is an important new tool for Marten users to achieve both high availability and scalability within systems with bigger data loads.

Linked Lists in Real Life

I’ve been occasionally writing posts about old design patterns or techniques that are still occasionally useful despite the decades long backlash to the old “Gang of Four” book:

The Lowly Strategy Pattern is Still Useful
The Occasionally Useful State Pattern
The Decorator Pattern is sometimes helpful
Linked Lists in Real Life (this post)

A linked list structure is a simple data structure where each element has a reference to the next element. At its simplest, it’s no more than this single linked list implementation:

public abstract class Item
{
    public Item Next { get; set; }
}

Now, I got into software development through Shadow IT rather than a formal Computer Science degree, so I can’t really get into what Big O notation stuff a linked list gives you for sorting or finding or whatnot (and don’t really care either). What I can tell you that I have occasionally used linked list structures to great effect, with at least two samples within the greater “Critter Stack” / JasperFx codebases.

For the first example, consider the complex SQL generation from this LINQ query in Marten using Marten’s custom “Include Related Documents” feature:

        // theSession is an IDocumentSession from Marten
        var holders = await theSession.Query<TargetHolder>()
            .Include<Target>(x => x.TargetId, list, t => t.Color == Colors.Green)
            .ToListAsync();

Behind the scenes, Marten is generating this pile of lovely SQL:

drop table if exists mt_temp_id_list1; 
create temp table mt_temp_id_list1 as (select d.id, d.data from public.mt_doc_targetholder as d);
 select d.id, d.data from public.mt_doc_target as d where (d.id in (select CAST(d.data ->> 'TargetId' as uuid) from mt_temp_id_list1 as d) and CAST(d.data ->> 'Color' as integer) = $1);
  $1: 2

If you squint really hard, you can notice that Marten is actually executing four different SQL statements in one logical query. Internally, Marten is using a linked list structure to “plan” and then generate the SQL from the raw LINQ Expression tree using the Statement type, partially shown below:

public abstract partial class Statement: ISqlFragment
{
    public Statement Next { get; set; }
    public Statement Previous { get; set; }

    public StatementMode Mode { get; set; } = StatementMode.Select;

    /// <summary>
    ///     For common table expressions
    /// </summary>
    public string ExportName { get; protected internal set; }

    public bool SingleValue { get; set; }
    public bool ReturnDefaultWhenEmpty { get; set; }
    public bool CanBeMultiples { get; set; }

    public void Apply(ICommandBuilder builder)
    {
        configure(builder);
        if (Next != null)
        {
            if (Mode == StatementMode.Select)
            {
                builder.StartNewCommand();
            }

            builder.Append(" ");
            Next.Apply(builder);
        }
    }

    public void InsertAfter(Statement descendent)
    {
        if (Next != null)
        {
            Next.Previous = descendent;
            descendent.Next = Next;
        }

        if (object.ReferenceEquals(this, descendent))
        {
            throw new InvalidOperationException(
                "Whoa pardner, you cannot set Next to yourself, that's a stack overflow!");
        }

        Next = descendent;
        descendent.Previous = this;
    }

    /// <summary>
    ///     Place the descendent at the very end
    /// </summary>
    /// <param name="descendent"></param>
    public void AddToEnd(Statement descendent)
    {
        if (Next != null)
        {
            Next.AddToEnd(descendent);
        }
        else
        {
            if (object.ReferenceEquals(this, descendent))
            {
                return;
            }

            Next = descendent;
            descendent.Previous = this;
        }
    }

    public void InsertBefore(Statement antecedent)
    {
        if (Previous != null)
        {
            Previous.Next = antecedent;
            antecedent.Previous = Previous;
        }

        antecedent.Next = this;
        Previous = antecedent;
    }

    public Statement Top()
    {
        return Previous == null ? this : Previous.Top();
    }

    // Find the selector statement at the very end of 
    // the linked list
    public SelectorStatement SelectorStatement()
    {
        return (Next == null ? this : Next.SelectorStatement()).As<SelectorStatement>();
    }

    // And a some other stuff...
}

The Statement model is a “double linked list,” meaning that each element is aware of both its direct ancestor (Previous) and the next descendent (Descendent). With the Statement model, I’d like to call out a couple wrinkles that made the linked list strategy a great fit for complex SQL generation.

First off, the SQL generation itself frequently requires multiple statements from the top level statement down to the very last statement, and you can see that happening in the Apply() method above that writes out the SQL for the current Statement, then calls Next.Apply() all the way down to the end of the chain — all while helping Marten’s (really Weasel’s) batch SQL generation “know” when it should start a new command (see NpgsqlBatch for a little more context on what I mean there).

Also note all the methods for inserting a new Statement directly before or after the current Statement. Linked lists are perfect for when you frequently need to insert new elements before or after or even at the very end of the chain rather than at a known index. Here’s an example from Marten that kicks in sometimes when a user uses the LINQ Distinct() operator:

For more context if you’ve lived a more charmed life and have never run across them, here’s an explanation of Common Table Expressions from PostgreSQL (this is commonly supported in other databases).

internal class DistinctSelectionStatement: Statement
{
    public DistinctSelectionStatement(SelectorStatement parent, ICountClause selectClause, IMartenSession session)
    {
        parent.ConvertToCommonTableExpression(session);

        ConvertToCommonTableExpression(session);

        parent.InsertAfter(this);

        selectClause.OverrideFromObject(this);
        var selector = new SelectorStatement { SelectClause = selectClause };
        InsertAfter(selector);
    }

    protected override void configure(ICommandBuilder sql)
    {
        startCommonTableExpression(sql);
        sql.Append("select distinct(data) from ");
        sql.Append(Previous.ExportName);
        sql.Append(" as d");
        endCommonTableExpression(sql);
    }
}

When Marten has to apply this DistinctSelectionStatement, it modifies its immediate parent statement that probably started out life as a simple select some_field from some_table query into a common table expression query, and appends the DistinctSelectionStatement behind its parent to do the actual SQL distinct mechanics.

In this particular case of the SQL generation, it’s been frequently necessary for a Statement to “know” about its immediate ancestor, as you can see in the sample code above where the DistinctSelectStatement picks off the ExportName (the CTE name of the preceding statement) to use in generating the right SQL in DistinctSelectStatement.Apply().

For another example, both Marten and Wolverine use a runtime code generation model to build a lot of their “glue” code between the framework and user’s application code (you can see an example of that runtime code generation in this post). One of the core conceptual abstractions in the shared code generation model is the abstract Frame class which roughly equates to logical step in the generated code — usually just a single line of code. During the code generation process, the Frame objects are assembled in a single linked list structure so each Frame.

When actually writing out the generated source code, a typical Frame will write its code, then tell the next Frame to write out its code and so on — as shown by this sample class:

// This is from Wolverine, and weaves in code
// to add selected tags from incoming messages to
// message handlers into the current Open Telemetry Activity
public class AuditToActivityFrame : SyncFrame
{
    private readonly Type _inputType;
    private readonly List<AuditedMember> _members;
    private Variable? _input;

    public AuditToActivityFrame(IChain chain)
    {
        _inputType = chain.InputType()!;
        _members = chain.AuditedMembers;
    }

    public override IEnumerable<Variable> FindVariables(IMethodVariables chain)
    {
        _input = chain.FindVariable(_inputType);
        yield return _input;
    }

    public override void GenerateCode(GeneratedMethod method, ISourceWriter writer)
    {
        writer.WriteComment("Application-specific Open Telemetry auditing");
        foreach (var member in _members)
            writer.WriteLine(
                $"{typeof(Activity).FullNameInCode()}.{nameof(Activity.Current)}?.{nameof(Activity.SetTag)}(\"{member.OpenTelemetryName}\", {_input!.Usage}.{member.Member.Name});");

        // Tell the next frame to write its code too!
        Next?.GenerateCode(method, writer);
    }
}

Where the linked list structure really comes into play with the source generation is when you need to wrap the inner Next Frame with some kind of coding construct like a using block or a try/finally block maybe. Here’s an example of doing just that where the following CatchStreamCollisionFrame places a try/catch block around the code generated by the Next frame (and all of the other frames after the Next frame as well):

// This is the actual middleware that's injecting some code
// into the runtime code generation
internal class CatchStreamCollisionFrame : AsyncFrame
{
    public override void GenerateCode(GeneratedMethod method, ISourceWriter writer)
    {
        writer.WriteComment("Catches any existing stream id collision exceptions");
        writer.Write("BLOCK:try");
        
        // Write the inner code here
        Next?.GenerateCode(method, writer);
        
        writer.FinishBlock();
        writer.Write($@"
BLOCK:catch({typeof(ExistingStreamIdCollisionException).FullNameInCode()} e)
await {typeof(StreamCollisionExceptionPolicy).FullNameInCode()}.{nameof(StreamCollisionExceptionPolicy.RespondWithProblemDetails)}(e, httpContext);
return;
END

");
    }

The ugly generated code above will catch a Marten exception named ExistingStreamIdCollisionException in the generated code for a Wolverine.HTTP endpoint and return a ProblemDetails result explaining the problem and an HTTP status code of 400 (Invalid) instead of letting the exception bubble up.

By having the linked list structure where each Frame is aware of the next Frame, it makes it relatively easy to generate code when you need to wrap the inner code in some kind of C# block structure.

Summary

I spent a lot of time as a kid helping my Dad on his construction crew and helping my grandfather on his farm (you can’t possibly imagine how often farming equipment breaks). Both of them obviously had pretty large toolboxes — but there’s some tools that don’t come out very often, but man you were glad you had them when you did need them. The average developer probably isn’t going to use linked lists by hand very often, but I’ve found them to be very helpful when you need to either model a problem as outer/inner handlers or ancestor/descendents. Linked lists are also great when you need to be able to easily insert items into the greater collection relative to another item.

Anyway, dunno if these examples are too involved or too specious, but there the only two times I’ve used a linked list in the past decade.

I hope it’s obvious, but the JasperFx Software logo is meant to be a tractor tire around a cattle brand. The company name and logo is a little tribute to my family heritage, such as it is:)