What Is Good Code?

This is the second part of a 3 or 4 part series where I’m formulating my thoughts about an ongoing initiative at MedeAnalytics. I started yesterday with a related post called On Giving Technical Guidance to Others that’s a synopsis of an impromptu lecture I game our architecture team about all the things I wish I’d known before becoming any kind of technical leader. I’ll follow this post up hopefully as soon as tomorrow with my reasoning about why prescriptive architectures are harmful and my own spin on the SOLID Principles.

I’m part of an architectural team that’s been charged with modernizing and improving our very large, existing systems. We have an initiative just getting off the ground to break off part of one of our monoliths into a separate system to begin a strangler application strategy to modernize the system over time. This gives us a chance to work out how we want our systems to be structured and built going forward in a smaller subset of work instead of trying to boil the ocean to update the entire monolith codebase at one time.

As part of that effort, I’m trying to put some stakes in the ground to:

  • Ban all usage of formal, prescriptive architectural styles like the Onion Architecture or Clean Architecture because I find that they do more harm than good. Rather, I’m pushing hard for vertical slice or feature folder code organization while still satisfying the need for separation of concerns and decent structuring of the code
  • Generally choose lower code ceremony approaches whenever possible because that promotes easier evolution of the code, and in the end, the only truly guaranteed path to good code is adaptation and evolution in the face of feedback about the code.
  • Be very cautious about how we abstract database access to avoid causing unnecessary complexity or poor performance, which means I probably want to ban any usage of naive IRepository<T> type abstractions
  • Put the SOLID Principles into a little bit of perspective as we do this work and make sure our developers and architects have a wider range of mental tools in their design toolbox than just an easy to remember but hard to interpret or apply acronym developed by C++ developers before many of our developers were even born

The rest of this post is just trying to support those opinions.

First, What is Good?

More on this in a later post as I give my take on SOLID, but Dan North made an attempt at describing “good code” that’s worth your read.

Let’s talk a little bit about the qualities you want in your code. Quite a few folks are going to say that the most important quality is that the code satisfies the business needs and delivers value to the business! If you’ll please get that bit of self righteousness out of your system, let’s move on to the kind of technical quality that’s necessary to continue to efficiently deliver business value over time.

  • You can understand what the code is doing, navigate within the codebase, and generally find code where you would expect it to be based on the evident and documented rules of the system architecture.
  • The code exhibits separation of concerns, meaning that you’re generally able to reason about and change one responsibility of the code at a time (data access, business logic, validation logic, data presentation, etc.). Cohesion and coupling are the alpha and omega of software design. I’m a very strong believer in evolutionary approaches to designing software as the only truly reliable method to arrive at good code, but that’s largely dependent upon the qualities of cohesion and coupling within your code.
  • Rapid feedback is vital to effective coding, so testability of the code is a major factor for me. This can mean that code is structured in a way that it’s easy to unit test in isolation (i.e., you can effectively test business rules without having to run the full stack application or in one hurtful extreme, be forced to use a tool like Selenium). This version of testability is very largely a restatement of cohesion and coupling. Alternatively, if the code depends on some kind of infrastructure that’s easy to deal with in integration testing (like Marten!) and the integration tests run “fast enough,” I say you can relax separation of concerns and jumble things together as long as the code is still easy to reason about.
  • I don’t know a pithy way to describe this, but the code needs to carefully expose the places where system state is changed or “mutated” to make the code’s behavior predictable and prevent bugs. Whether that’s adopting command query segregation, using elements of functional programming, or the uni-directional data flow in place of two way data binding in user interface development, system state changes are an awfully easy way to introduce bugs in code and should be dealt with consciously and with some care.

I think most of us would say that code should be “simple,” and I’d add that I personally want code to be written in a low ceremony way that reduces noise in the code. The problem with that whole statement is that it’s very subjective:

Which is just to say that saying the words “just write simple code!” isn’t necessarily all that helpful or descriptive. What’s helpful is to have some mental tools to help developers judge whether or not their code is “good” and move in the direction of more successful code. Bet yet, do that without introducing unnecessary complexity or code ceremony through well-intentioned prescriptive architectures like “Onion” or “Clean” that purposely try to force developers to write code “the right way.”

And next time on Jeremy tries to explain his twitter ranting…

This has inevitably taken longer than I wished to write, so I’m breaking things up. I will follow up tomorrow and Thursday with my analysis of SOLID, an explanation of why I think the Onion/Clean Architecture style of code organization is best avoided, and eventually some thoughts on database abstractions.

On Giving Technical Guidance to Others

I’m still working on my promised SOLID/Anti-Onion/Anti-Clean/Database Abstraction post, but it’s as usual taking longer than I’d like and I’m publishing this section separately.

Just as a quirk of circumstances, I pretty well went straight from being a self-taught “Shadow IT” developer to being a lead developer and de facto architect on a mission critical supply chain application for a then Fortune 500 company. The system was an undeniable success in the short term, but it came at a cost to me because as a first time lead I had zero ability to enable the other developers working with me to be productive. As such, I ended up writing the mass majority of the code and inevitably became the bottleneck on all subsequent production issues. That doesn’t scale.

The following year I had another chance to lead a follow up project and vowed to do a better job with the other developers (plus I was getting a lot of heat from various management types to do so). In a particular case that I remember to this day, I wrote up a detailed Word document for a coding assignment for another developer. I got all the way down to class and method names and even had some loose sample code I think. I handed that off, patted myself on the back for being a better lead, and went off on my merry way.

As you might have surmised, when I got his code back later it was unusable because he did exactly what I said to do — which turned out to be wrong based on factors I hadn’t anticipated. Worse, he only did exactly what I said to do and missed some concerns that I didn’t think needed to be explicitly called out. I’ve thought a lot about this over the years and come to some conclusions about how I should have tried to work differently with that developer. Before diving into that, let’s first talk about you for awhile!

Congratulations! You’ve made it to some kind of senior technical role in your company. You’ve attained enough skill and knowledge to be recognized for your individual contributions, and now your company is putting you in a position to positively influence other developers, determine technical strategies, and serve as a steward for your company’s systems.

Hopefully you’ll still be hands on in the coding and testing, but increasingly, your role is going to involve trying to create and evolve technical guidance for other developers within your systems. More and more, your success is going to be dependent on your ability to explain ideas, concepts, and approaches to other developers. Not that I’m the fount of all wisdom about this, but here’s some of the things I wish I’d understood before being put into technical leadership roles:

  • It’s crucial to provide the context, reasoning, and applicability behind any technical guidance. Explaining why or when are we doing this is just as important as the “what” or “how.”
  • Being too specific in the guidance or instructions to another developer can easily come with the unintended consequence of turning off their brains and will frequently lead to poor results. Expanding on my first point, it’s better to explain the goals, how their work fits into the larger system, and the qualities of the code you’re hoping to achieve rather than try to make them automatons just following directions. It’s quite possible that JIRA-driven development exacerbates this potential problem.
  • You need to provide some kind of off-ramp to developers to understand the limitations of the guidance. The last thing you want is for developers to blindly follow guidance that is inappropriate for a circumstance that wasn’t anticipated during the formulation of said guidance
  • Recommendations about technology usage probably needs to come as some kind of decision tree with multiple options to its applicability because there’s just about never a one size fits all tool
  • By all means, allow and encourage the actual developers to actively look for better approaches because they’re the ones closest to their code. Especially with talented younger developers, you never want to take away their sense of initiative or close them off from providing feedback, adjustments, or flat out innovation to the “official” guidance. At the very least, you as a senior technical person need to pay attention when a developer tells you that the current approach is confusing or laborious or feels too complicated.
  • Treat every possible recommendation or technical guidance as a theory that hasn’t yet been perfectly proven.

I’ve talked a lot about giving technical guidance, but you should never think that you or any other individual are responsible for doing all the thinking within a software ecosystem. What you might be responsible for is facilitating the sharing of learning and knowledge through the company. I was lucky enough early in my career to spend just a little bit of time working with Martin Fowler who effectively acts as a sort of industry wide, super bumble bee gathering useful knowledge from lots of different projects and cross-pollinating what he’s learned to other teams and other projects. Maybe you don’t impact the entire software development industry like he has, but you can at least facilitate that within your own team or maybe within your larger organization.

As an aside, a very helpful technique to use when trying to explain something in code to another developer is to ask them to explain it back to you in their own words — or conversely, I try to do this when I’m the one getting the explanation to make sure I’ve really understood what I’m being told. My wife is an educator and tells me this is a common technique for teachers as well.

Next time…

In my next post I’m going to cover a lot of ground about why I think prescriptive architectural styles like the “Onion” or “Clean” are harmful, alternatives, a discussion about what use is SOLID these days (more than none, but much less than the focus many people put on it is really worth), and a discussion about database abstractions I find to be harmful that tend to be side effects of prescriptive architectures.

Low Code Ceremony Sagas with Jasper & Marten

You’ll need at least Jasper v2.0.0-alpha-4 if you want to recreate the saga support in this post. All the sample code for this post is in an executable sample on GitHub. Jasper does support sagas with EF Core and Sql Server or Postgresql, but Marten is where most of the effort is going just at the moment.

The Saga pattern is a way to solve the issue of logical, long-running transactions that necessarily need to span over multiple operations. In the approaches I’ve encountered throughout my career, this has generally meant persisting a “saga state” of some sort in a database that is used within a message handling framework to “know” what steps have been completed, and what’s outstanding.

Jumping right into an example, consider a very simple order management service that will have steps to:

  1. Create a new order
  2. Complete the order
  3. Or alternatively, delete new orders if they have not been completed within 1 minute

For the moment, I’m going to ignore the underlying persistence and just focus on the Jasper message handlers to implement the order saga workflow with this simplistic saga code:

using Baseline.Dates;
using Jasper;

namespace OrderSagaSample;

public record StartOrder(string Id);

public record CompleteOrder(string Id);

public record OrderTimeout(string Id) : TimeoutMessage(1.Minutes());

public class Order : Saga
{
    public string? Id { get; set; }

    // By returning the OrderTimeout, we're triggering a "timeout"
    // condition that will process the OrderTimeout message at least
    // one minute after an order is started
    public OrderTimeout Start(StartOrder order, ILogger<Order> logger)
    {
        Id = order.Id; // defining the Saga Id.

        logger.LogInformation("Got a new order with id {Id}", order.Id);
        // creating a timeout message for the saga
        return new OrderTimeout(order.Id);
    }

    public void Handle(CompleteOrder complete, ILogger<Order> logger)
    {
        logger.LogInformation("Completing order {Id}", complete.Id);

        // That's it, we're done. This directs Jasper to delete the
        // persisted saga state after the message is done.
        MarkCompleted();
    }

    public void Handle(OrderTimeout timeout, ILogger<Order> logger)
    {
        logger.LogInformation("Applying timeout to order {Id}", timeout.Id);

        // That's it, we're done. Delete the saga state after the message is done.
        MarkCompleted();
    }
}

I’m just aiming for a quick sample rather than exhaustive documentation here, but a few notes:

  • Jasper leans a bit on type and naming conventions to discover message handlers and to “know” how to call these message handlers. Some folks will definitely not like the magic, but this approach leads to substantially less code and arguably complexity compared to existing .Net tools
  • Jasper supports the idea of scheduled messages, and the new TimeoutMessage base class up there is just a way to utilize that support for “saga timeout” conditions
  • Jasper generally tries to adapt to your application code rather than force a lot of mandatory framework artifacts into your message handler code

Now let’s move over to the service bootstrapping and add Marten in as our persistence mechanism in the Program file:

using Jasper;
using Jasper.Persistence.Marten;
using Marten;
using Oakton;
using Oakton.Resources;
using OrderSagaSample;

var builder = WebApplication.CreateBuilder(args);

// Not 100% necessary, but enables some extra command line diagnostics
builder.Host.ApplyOaktonExtensions();

// Adding Marten
builder.Services.AddMarten(opts =>
    {
        var connectionString = builder.Configuration.GetConnectionString("Marten");
        opts.Connection(connectionString);
        opts.DatabaseSchemaName = "orders";
    })

    // Adding the Jasper integration for Marten.
    .IntegrateWithJasper();


builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

// Do all necessary database setup on startup
builder.Services.AddResourceSetupOnStartup();

// The defaults are good enough here
builder.Host.UseJasper();

var app = builder.Build();

// Just delegating to Jasper's local command bus for all
app.MapPost("/start", (StartOrder start, ICommandBus bus) => bus.InvokeAsync(start));
app.MapPost("/complete", (CompleteOrder start, ICommandBus bus) => bus.InvokeAsync(start));
app.MapGet("/all", (IQuerySession session) => session.Query<Order>().ToListAsync());
app.MapGet("/", (HttpResponse response) =>
{
    response.Headers.Add("Location", "/swagger");
    response.StatusCode = 301;
});

app.UseSwagger();
app.UseSwaggerUI();

return await app.RunOaktonCommands(args);

Off screen, I’ve started up a docker container for Postgresql to get a blank database. With that running, I’ll start the application up with the usual dotnet run command and open up the Swagger page:

You’ll get a lot of SQL in your terminal on the first run as Marten sets up the database for you, that’s perfectly normal.

I’m going to first create a new order for “Shoes” and execute the /create endpoint:

And verify that it’s persisted by checking the /all endpoint:

If I’m quick enough, I’ll post {"Id": "Shoes"} to /complete, and then verify through the /all endpoint that the “Shoes” order has been completed.

Otherwise, if I’m too slow to complete the order, the timeout message will be applied to our order and you’ll see evidence of that in the logging output like so:

And that’s it, one working saga implementation with database backed persistence through Marten. The goal of Jasper is to make this kind of server side development as low ceremony and easy to use as possible, so any feedback about what you do or don’t like in this sample would be very helpful.

Related Posts

I’ve spit out quite a bit of blogging content the past several weeks on both Marten and Jasper:

(Re) Introducing Jasper as a Command Bus

EDIT 6/15/2022: The correct Nuget is “Jasper.Persistence.Marten”

I just released a second alpha of Jasper 2.0 to Nuget. You can find the project goals for Jasper 2.0 here, and an update from a couple weeks ago here. Be aware that the published documentation for Jasper is very, very far behind. I’m working on it:)

Jasper is a long running open source project with the goal of creating a low ceremony and highly productive framework for building systems in .Net that would benefit from either an in memory command bus or utilize asynchronous messaging. The big driver for Jasper right now is using it in combination with the event sourcing capabilities of Marten as a full stack CQRS architectural framework. Later this week I’ll preview the ongoing Marten + Jasper integration, but for today I’d like to just introduce Jasper itself a little bit.

For a simplistic sample application, let’s say that we’re building an online system for potential customers to make reservations at any number of participating restaurants. I’m going to start by laying down a brand new .Net 6 Web API project. I’m obviously going to choose Marten as my persistence tooling, so the next steps are to add a Nuget reference to the Jasper.Persistence.Marten Nuget which will bring transitive dependencies over for both Jasper and Marten.

Jasper also has some working integration with EF Core using either Sql Server or Postgresql as the backing store so far.

Let’s build an online reservation system for your favorite restaurants!

Let’s say that as a result of an event storming requirements session, we’ve determined that we want both a command message to confirm a reservation, and a corresponding event message out to the internal systems of the various restaurants. I’m going to eschew event sourcing to keep this simpler and just opt for a persistent Reservation document in Marten. All that being said, here’s our code to model everything I just described:

public record ConfirmReservation(Guid ReservationId);
public record ReservationConfirmed(Guid ReservationId);

public class Reservation
{
    public Guid Id { get; set; }
    public DateTimeOffset Time { get; set; }
    public string RestaurantName { get; set; }
    public bool IsConfirmed { get; set; }
}

Building Our First Message Handlers

In this contrived example, the ReservationConfirmed event message will be published separately because it spawns a call to an external system where I’d strongly prefer to have a separate “retry loop” around just that call. That being said, this is the first cut for a command handler for the ConfirmReservation message:

public class ConfirmReservationHandler
{
    public async Task Handle(ConfirmReservation command, IDocumentSession session, IExecutionContext publisher)
    {
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        // Watch out, this could be a race condition!!!!
        await publisher.PublishAsync(new ReservationConfirmed(reservation.Id));

        // We're coming back to this in a bit......
        await session.SaveChangesAsync();
    }
}

To be technical, Jasper uses an in memory outbox for all message processing even if there’s no durable message storage to at least guarantee that outgoing messages are only published when the original message is successfully handled. I just wanted to show the potential danger here.

So a couple things to note that are different from existing tools like NServiceBus or MassTransit:

  • Jasper locates message handlers strictly through naming conventions. Public methods named either Handle() or Consume() on public types that are suffixed by Handler or Consumer. There are no mandatory attributes or interfaces. Hell, there’s not even a mandatory method signature except that the first argument is always assumed to be the message type.
  • Jasper Handle() methods can happily support method injection, meaning that the IDocumentSession parameter above is pushed into the method from Jasper itself. In my experience, using method injection frequently simplifies the message handler code as opposed to the idiomatic C# approach of using constructor injection and relaying things through private fields.
  • Message types in Jasper are just concrete types, and there’s no necessary Event or Message base classes of any type — but that may be introduced later strictly for optional diagnostics.

Lastly, notice my comment about the race condition between publishing the outgoing ReservationConfirmed event message and committing the database work through IDocumentSession.SaveChangesAsync(). That’s obviously a problem waiting to bite us, so we’ll come back to that.

Next, let’s move on to the separate handler for ReservationConfirmed:

[LocalQueue("Notifications")]
[RetryNow(typeof(HttpRequestException), 50, 100, 250)]
public class ReservationConfirmedHandler
{
    public async Task Handle(ReservationConfirmed confirmed, IQuerySession session, IRestaurantProxy restaurant)
    {
        var reservation = await session.LoadAsync<Reservation>(confirmed.ReservationId);

        // Make a call to an external web service through a proxy
        await restaurant.NotifyRestaurant(reservation);
    }
}

All this handler does is look up the current state of the reservation and post that to an external system through a proxy interface (IRestaurantProxy).

As I said before, I strongly prefer that calls out to external systems be isolated to their own retry loops. In this case, the [RetryNow] attribute is setting up Jasper to retry a command through this handler on transient HttpException errors with a 50, 100, and then 250 millisecond cooldown period between attempts. Jasper’s error handling policies go much deeper than this, but hopefully you can already see what’s possible.

The usage of the [LocalQueue("Notifications")] attribute is directing Jasper to execute the ReservationConfirmed messages in a separate, local queue named “Notifications”. In effect, we’ve got a producer/consumer solution between the incoming ConfirmReservation command and ReservationConfirmed event messages. The local queueing is done with the TPL Dataflow library. Maybe Jasper will eventually move to using System.Threading.Channels, but for right now there’s just bigger issues to worry about.

Don’t fret if you don’t care for sprinkling attributes all over your code, all of the configuration I’ve done above with attributes can also be done with a fluent interface at bootstrapping time, or even within the message handler classes themselves.

Bootstrapping Jasper

Stepping into the Program.cs file for our new system, I’m going to add bootstrapping for both Marten and Jasper in the simplest possible way like so:

using CommandBusSamples;
using Jasper;
using Marten;
using Oakton;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    opts.Connection("Host=localhost;Port=5433;Database=postgres;Username=postgres;password=postgres");
});

// Adding Jasper as a straight up Command Bus with all
// its default configuration
builder.Host.UseJasper();

var app = builder.Build();

// This isn't *quite* the most efficient way to do this,
// but it's simple to understand, so please just let it go...
// I'm just delegating the HTTP request body as a command 
// directly to Jasper's in memory ICommandBus
app.MapPost("/reservations/confirm", (ConfirmReservation command, ICommandBus bus) => bus.InvokeAsync(command));


// This opts into using Oakton for extended command line options for this app
// Oakton is also a transitive dependency of Jasper itself
return await app.RunOaktonCommands(args);

Okay, so let’s talk about some of the things in that code up above:

  • Jasper tries to embrace the generic host building and core abstractions that came with .Net Core (IHostBuilder, ILogger, IHostedService etc.) wherever possible, so hence the integration happens with the UseJasper() call seen above.
  • The call to UseJasper() also quietly sets up Lamar as the underlying IoC container for your application. I won’t get into that much here, but there are optimizations in Jasper’s runtime model that require Lamar.
  • I used Oakton as the command line parser. That’s not 100% necessary, but there are a lot of development time utilities with Oakton for Jasper development. I’ll show some of that in later posts building on this one.

The one single HTTP route calls directly into the Jasper ICommandBus.InvokeAsync() method to immediately execute the message handler inline for the ConfirmReservation message. As someone who’s a skeptic of “mediator” tools in AspNetCore, I’m not sure this really adds much value as the handler for ConfirmReservation is currently written. However, we can add some transient error handling to our application’s bootstrapping that would apply to the ICommandBus.InvokeAsync() calls like so:

builder.Host.UseJasper(opts =>
{
    // Just setting up some retries on transient database connectivity errors
    opts.Handlers.OnException<NpgsqlException>().OrInner<NpgsqlException>()
        .RetryWithCooldown(50.Milliseconds(), 100.Milliseconds(), 250.Milliseconds());
});

We can also opt to make our ConfirmReservation commands be processed in background threads rather than inline through our web request by changing the Minimal API route to:

// This isn't *quite* the most efficient way to do this,
// but it's simple to understand, so please just let it go...
app.MapPost("/reservations/confirm", (ConfirmReservation command, ICommandBus bus) => bus.EnqueueAsync(command));

The EnqueueAsync() method above places the incoming command message into an in-memory queue.

What if the process dies mid-flight?!?

An experienced user of asynchronous messaging tools will have happily spotted several potential problems in the solution so far. One, there’s a potential race condition in the ConfirmReservationHandler code between database changes being committed and the outgoing message being processed. Two, what if the process dies? If we’re using all this in-memory queueing stuff, that all dies when the process dies, right?

Fortunately, Jasper with some significant help from Postgresql and Marten here, already has a robust inbox and outbox implementation we’ll add next for durable messaging.

For clarity, here’s the original handler code again for the ConfirmReservation message:

public class ConfirmReservationHandler
{
    public async Task Handle(ConfirmReservation command, IDocumentSession session, IExecutionContext publisher)
    {
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        // Watch out, this could be a race condition!!!!
        await publisher.PublishAsync(new ReservationConfirmed(reservation.Id));

        // We're coming back to this in a bit......
        await session.SaveChangesAsync();
    }
}

Please note the comment about the race condition in the code. What we need to do is to introduce Jasper’s outbox feature, then revisit this handler.

First though, I need to go back to the bootstrapping code in Program and add a little more code:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    opts.Connection("Host=localhost;Port=5433;Database=postgres;Username=postgres;password=postgres");
})
    // NEW! Adding Jasper outbox integration to Marten in the "messages"
    // database schema
    .IntegrateWithJasper("messages");

// Adding Jasper as a straight up Command Bus
builder.Host.UseJasper(opts =>
{
    // Just setting up some retries on transient database connectivity errors
    opts.Handlers.OnException<NpgsqlException>().OrInner<NpgsqlException>()
        .RetryWithCooldown(50.Milliseconds(), 100.Milliseconds(), 250.Milliseconds());

    // NEW! Apply the durable inbox/outbox functionality to the two in-memory queues
    opts.DefaultLocalQueue.DurablyPersistedLocally();
    opts.LocalQueue("Notifications").DurablyPersistedLocally();

    // And I just opened a GitHub issue to make this config easier...
});

var app = builder.Build();

// This isn't *quite* the most efficient way to do this,
// but it's simple to understand, so please just let it go...
app.MapPost("/reservations/confirm", (ConfirmReservation command, ICommandBus bus) => bus.EnqueueAsync(command));

// This opts into using Oakton for extended command line options for this app
// Oakton is also a transitive dependency of Jasper itself
return await app.RunOaktonCommands(args);

This time I chained a call on the Marten configuration through the UseWithJasper() extension method. I also added a couple lines of code within the UseJasper() block to mark the in memory queues as being enrolled in the inbox/outbox mechanics through the DurablyPersistedLocally() method.

Now, back to our handler code. Keeping things explicit for now, I’m going to add some necessary mechanics for opting into outbox sending:

public class ConfirmReservationHandler
{
    public async Task Handle(ConfirmReservation command, IDocumentSession session, IExecutionContext publisher)
    {
        // Enroll the execution context and Marten session in
        // outbox sending
        // This is an extension method in Jasper.Persistence.Marten
        await publisher.EnlistInOutboxAsync(session);
        
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        // No longer a race condition, I'll explain more below:)
        await publisher.PublishAsync(new ReservationConfirmed(reservation.Id));

        // Persist, and kick out the outgoing messages
        await session.SaveChangesAsync();
    }
}

I’m going to utilize Jasper/Marten middleware in the last section to greatly simplify the code above, so please read to the end:)

With this version of the handler code, thing are working a little differently:

  • The call to PublishAsync() does not immediately release the message to the in memory queue. Instead, it’s routed and held in memory by the IExecutionContext for later
  • When IDocumentSession.SaveChangesAsync() is called, the outgoing messages are persisted into the underlying database in the same database transaction as the change to the Reservation document. Using the Marten integration, the Jasper outbox can even do this within the exact same batched database command as a minor performance optimization
  • At the end of IDocumentSession.SaveChangesAsync() upon a successful transaction, the outgoing messages are kicked out into the outgoing message sending queues in Jasper.

The outbox usage here solves a couple issues. First, it eliminates the race condition between the outgoing messages and the database changes. Secondly, it prevents situations of system inconsistency where either the message succeeds and the database changes fail, or vice versa.

I’ll be writing up a follow up post later this week or next diving deeper into Jasper’s outbox implementation. For a quick preview, by taking a very different approach than existing messaging tools in .Net, Jasper’s outbox is already usable in more scenarios than other alternatives and I’ll try to back that assertion up next time. To answer the obvious question in the meantime, Jasper’s outbox gives you an at least once delivery guarantee even if the current process fails.

Streamline the Handler Mechanics

So the “register outbox, public messages, commit session transaction” dance could easily get repetitive in your code. Jasper’s philosophy is that repetitive code is wasteful, so let’s eliminate the cruft-y code we were writing strictly for Marten or Jasper in the ConfirmReservationHandler. The code below is the exact functional equivalent to the earlier handler — even down to enrolling in the outbox:

public static class ConfirmReservationHandler
{
    [Transactional]
    public static async Task<ReservationConfirmed> Handle(ConfirmReservation command, IDocumentSession session)
    {
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        session.Store(reservation);

        // Kicking out a "cascaded" message
        return new ReservationConfirmed(reservation.Id);
    }
}

The usage of the [Transactional] attribute opts only this handler into using Marten transactional middleware that handles all the outbox enrollment mechanics and calls IDocumentSession.SaveChangesAsync() for us after this method is called.

By returning Task<ReservationConfirmed>, I’m directing Jasper to publish the returned value as a message upon the successful completion of the incoming message. I personally like the cascading message pattern in Jasper as a way to make unit testing handler code easier. This was based on a much older custom service bus I helped build and run in production in the mid-10’s.

On the next episode of “please, please pay attention to my OSS projects!”

My very next post is a sequel to Marten just got better for CQRS architectures, but this time using some new Jasper functionality with Marten to further streamline out repetitive code.

Following behind that, I want to write follow ups doing a deeper dive on Jasper’s outbox implementation, using Rabbit MQ with Jasper, then a demonstration of all the copious command line utilities built into both Jasper and Marten.

As my mentor from Dell used to say, “thanks for listening, I feel better now”

Marten just got better for CQRS architectures

I’m assuming some prior knowledge of Event Sourcing as an architectural pattern here. I highly recommend Oskar Dudycz’s Introduction to Event Sourcing training kit or this video from Derek Comartin. While both Event Sourcing and the closely associated CQRS architectural style are both useful without the other, I’m still assuming here that you’re interested in using Marten for event sourcing within a larger CQRS architecture.

So you’re adopting an event sourcing style with Marten for your persistence within a larger CQRS architectural style. Crudely speaking, all “writes” to the system state involve sending a command message to your CQRS service with a workflow something like this:

In the course of handling the command message, our command handler (or HTTP endpoint) needs to:

  1. Fetch a “write model” that represents the state for the current workflow. This projected “write model” will be used by the command handler to validate the incoming command and also to…
  2. Decide what subsequent events should be published to update the state of the system based on the existing state and the incoming command
  3. Persist the new events to the ongoing Marten event store
  4. Possibly publish some or all of the new events to an outgoing transport to be acted upon asynchronously
  5. Deal with concurrency concerns, especially if there’s any significant chance that other related commands maybe coming in for the same logical workflow at the same time

Do note that as I shift to implementations that I’m going to mostly bypass any discussion of design patterns or what I personally consider to be useless cruft from common CQRS approaches in the .Net or JVM worlds. I.e., no repositories will be used in any of this code.

As an example system, let’s say that we’re building a new, online telehealth system that among other things will track how a medical provider spends their time during a shift helping patients during their workday. Using Marten’s “self-aggregate” support, a simplified version of the provider shift state is represented by this model:

public class ProviderShift
{
    public Guid Id { get; set; }
    
    // Pay attention to this, this will come into play
    // later
    public int Version { get; set; }
    
    public Guid BoardId { get; private set; }
    public Guid ProviderId { get; init; }
    public ProviderStatus Status { get; private set; }
    public string Name { get; init; }
    
    public Guid? AppointmentId { get; set; }
    
    public static async Task<ProviderShift> Create(ProviderJoined joined, IQuerySession session)
    {
        var provider = await session.LoadAsync<Provider>(joined.ProviderId);
        return new ProviderShift
        {
            Name = $"{provider.FirstName} {provider.LastName}",
            Status = ProviderStatus.Ready,
            ProviderId = joined.ProviderId,
            BoardId = joined.BoardId
        };
    }

    public void Apply(ProviderReady ready)
    {
        AppointmentId = null;
        Status = ProviderStatus.Ready;
    }

    public void Apply(ProviderAssigned assigned)
    {
        Status = ProviderStatus.Assigned;
        AppointmentId = assigned.AppointmentId;
    }
    
    public void Apply(ProviderPaused paused)
    {
        Status = ProviderStatus.Paused;
        AppointmentId = null;
    }

    // This is kind of a catch all for any paperwork the
    // provider has to do after an appointment has ended
    // for the just concluded appointment
    public void Apply(ChartingStarted charting) => Status = ProviderStatus.Charting;
}

Next up, let’s play the user story for a provider to make their “charting” activity complete after a patient appointment concludes. Looking at the sequence diagram and the bullet list of concerns for each command handler, we’ve got a few things to worry about. Never fear though, because Marten has you (mostly) covered today with a couple new features introduced in Marten v5.4 last week.

Starting with this simple command:

public record CompleteCharting(
    Guid ShiftId, 
    Guid AppointmentId, 
    int Version);

We’ll use Marten’s brand new IEventStore.FetchForWriting<T>() API to whip up the basic command handler (just a small ASP.Net Core Controller endpoint):

    public async Task CompleteCharting(
        [FromBody] CompleteCharting charting, 
        [FromServices] IDocumentSession session)
    {
        /* We've got options for concurrency here! */
        var stream = await session
            .Events.FetchForWriting<ProviderShift>(charting.ShiftId);

        // Validation on the ProviderShift aggregate
        if (stream.Aggregate.Status != ProviderStatus.Charting)
        {
            throw new Exception("The shift is not currently charting");
        }
        
        // We "decided" to emit one new event
        stream.AppendOne(new ChartingFinished(stream.Aggregate.AppointmentId.Value, stream.Aggregate.BoardId));

        await session.SaveChangesAsync();
    }

The FetchForWriting() method used above is doing a couple different things:

  1. Finding the current, persisted version of the event stream for the provider shift and loading that into the current document session to help with optimistic concurrency checks
  2. Fetching the current state of the ProviderShift aggregate for the shift id coming up on the command. Note that this API papers over whether or not the aggregate in question is a “live aggregate” that needs to be calculated on the fly from the raw events or previously persisted as just a Marten document by either an inline or asynchronous projection. I think I would strongly recommend that “write model” aggregates be either inline or live to avoid eventual consistency issues.

Concurrency?!?

Hey, the hard truth is that it’s easy for the command to be accidentally or incidentally dispatched to your service multiple times from messaging infrastructure, multiple users doing the same action in different sessions, or somebody clumsy like me just accidentally clicking a button too many times. One way or another, we may need to harden our command handler against concurrency concerns.

The usage of FetchForWriting<T>() will actually set you up for optimistic concurrency checks. If someone else manages to successfully process a command against the same provider shift between the call to FetchForWriting<T>() and IDocumentSession.SaveChangesAsync(), you’ll get a Marten ConcurrencyException thrown by SaveChangesAsync() that will abort and rollback the transaction.

Moving on though, let’s tighten up the optimistic version check by first telling Marten what the version of the provider shift was that our command thinks that the provider shift is at on the server. First though, we need to get the current version back to the client that’s collecting changes to our provider shift. If you scan back to the ProviderShift aggregate above, you’ll see this property:

    public int Version { get; set; }

With another new little feature in Marten v5.4, the Marten projection support will automatically set the value of a Version to the latest stream version for a single stream aggregate like the ProviderShift. Knowing that, and assuming that ProviderShift is updated inline, we could just deliver the whole ProviderShift to the client with this little web service endpoint (using Marten.AspNetCore extensions):

    [HttpGet("/shift/{shiftId}")]
    public Task GetProviderShift(Guid shiftId, [FromServices] IQuerySession session)
    {
        return session.Json.WriteById<ProviderShift>(shiftId, HttpContext);
    }

The Version property can be a field, scoped as internal, or read-only. Marten is using a dynamically generated Lambda that can happily bypass whatever scoping rules you have to set the version to the latest event for the stream represented by this aggregate. The Version naming convention can also be explicitly ignored, or redirected to a totally differently named member. Lastly, it can even be a .Net Int64 type too — but if you’re doing that, you probably have some severe modeling issues that should be addressed first!

Back to our command handler. If the client has what’s effectively the “expected starting version” of the ProviderShift and sends the CompleteCharting command with that version, we can change the first line of our handler method code to this:

        var stream = await session
                
            // Note: I'm passing in the expected, starting provider shift
            // version from the command
            .Events.FetchForWriting<ProviderShift>(charting.ShiftId, charting.Version);

This new version will throw a ConcurrencyException right off the bat if the expected, starting version is not the same as the last, persisted version in the database. After that, it’s the same optimistic concurrency check at the point of calling SaveChangesAsync() to commit the changes.

Lastly, since Marten is built upon a real database instead of trying to be its own specialized storage engine like many other event sourcing tools, we’ve got one last trick. Instead of putzing around with optimistic concurrency checks let’s go to a pessimistic, exclusive lock on the specific provider shift so that only one session at a time can ever be writing to that provider shift with this variation:

        var stream = await session
                
            // Note: This will try to "wait" to claim an exclusive lock for writing
            // on the provider shift event stream
            .Events.FetchForExclusiveWriting<ProviderShift>(charting.ShiftId);
        

As you can see, Marten has some new functionality to make it even easier to use Marten within CQRS architectures by eliminating some previously repetitive code in both queries on projected state and in command handlers where you need to use Marten’s concurrency control.

Wait, not so fast, you missed some things!

I missed a couple very big things in the sample code above. For one, we’d probably want to broadcast the new events through some kind of service bus to allow other systems or just our own system to asynchronously do other work (like trying to assign our provider to another ready patient appointment). To do that reliably so that the event capture and the outgoing events being published succeed or fail together in one atomic action, I really need an “outbox” of some sort integrated into Marten.

I also left out any kind of potential error handling or message retry capabilities around the concurrency exceptions. And lastly (that I can think of offhand), I completely left out any discussion of the instrumentation you’d want in any kind of grown up system.

Since we’re in the middle of the NBA playoffs, I’m reminded of a Shaquille O’Neal quote from when his backup was Alonzo Mourning, and Mourning had a great game off the bench: “sometimes Superman needs some help from the Incredible Hulk.” In this case, part of the future of Marten is to be combined with another project called Jasper that is going to add external messaging with a robust outbox implementation for Marten to create a full stack for CQRS architectures. Maybe as soon as late next week or at least in June, I’ll write a follow up showing the Marten + Jasper combination that deals with the big missing pieces of this post.

Update on Jasper v2 with an actual alpha

First off, my super power (stop laughing at me!) is having a much longer attention span than the average software developer. In positive ways, this has enabled me to tackle very complex problems. In negative ways, I’ve probably wasted a tremendous amount of time in my career working on systems or projects long after they had probably already failed and I just wouldn’t admit it.

So late last year I started working on a reboot of Jasper, my attempt at creating a “next generation” messaging framework for .Net. The goal of Jasper has changed quite a bit since I started jotting down notes for it in 2014, but the current vision is to be a highly productive command execution engine and asynchronous messaging tool for .Net with less code ceremony than the currently popular tools in this space.

I kicked out a Jasper v 2.0.0-alpha-1 release this week just barely in time for my talk at That Conference yesterday (but didn’t end up showing it at all). Right now the intermediate goals to get to a full Jasper 2.0 rebooted project is to:

  • Finish the baked in Open Telemetry support. It’s there, but there’s holes in what’s being captured
  • Get the interop with MassTransit via Rabbit MQ working for more scenarios. I’ve got a successful proof of concept of bi-directional interaction between Jasper and MassTransit services
  • Finish documentation for the new 2.0 version. I moved the docs to VitePress and started re-writing the docs from scratch, and that takes time

The first two bullet points are all about getting Jasper ready to be something I could dogfood at work.

While I absolutely intend both Jasper and Marten to be perfectly usable without the other, there’s also going to be some specific integration between Jasper and Marten to create a full blown, opinionated CQRS stack for .Net development (think Axon for .Net, but hopefully with much less code ceremony). For this combination, the Marten team is talking about adding messaging subscriptions for the Marten event store functionality, Jasper middleware to reduce repetitive CQRS handler code, and using the outbox functionality in Jasper to also integrate Marten with external messaging infrastructure.

I’ll kick out actual content about all this in the next couple weeks, but a couple folks have noticed the big uptick in Jasper work and asked what was going on, so here’s a little blog post on it:)

Improving the Development and Production Time Experience with Marten V5

Marten V5 dropped last week, with significant new features for multi-tenancy scenarios and enabling users to use multiple Marten document stores in one .Net application. A big chunk of the V5 work was mostly behind the scenes trying to address user feedback from the much larger V4 release late last year. As always, the Marten documentation is here.

First, why didn’t you just…

I’d advise developers and architects to largely eliminate the word “just” and any other lullabye language from their vocabulary when talking about technical problems and solutions.

That being said:

  • Why didn’t you just use source generators instead? Most of this was done before source generators were released, and source generators are limited to information that’s available at compile time. The dynamic code generation in Marten is potentially using information that is only available at run time
  • Why didn’t you just use IL generation instead? Because I despise working directly with IL and I think that would have dramatically curtailed what was easily possible. It’s also possible that we end up having to go there eventually.

Setting the Stage

Consider this simplistic code to start a new Marten DocumentStore against a blank database and persist a single User document:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

Hopefully that code is simple enough for new users to follow and immediately start being productive with Marten. The major advantage of document databases over the more traditional RDBMS with or without an ORM is the ability to just get stuff done without having to spend a lot of time configuring databases or object to database mappings or anywhere as much underlying code to just read and write data. To that end, there’s a lot of stuff going on behind the scenes of that code up above.

First off, there’s some automatic database schema management. In the default configuration used up above, Marten is quietly checking the underlying database on the first usage of the User document type to see if the database matches Marten’s configuration for the User document, and applies database migrations at runtime to change the database as necessary.

Secondly, there’s some runtime code generation happening to “bake in” the internal handling of how User documents are read from and written to the database. It’s not apparent here, but there’s a lot of knobs you can twist in Marten to change the behavior of how a document type is stored and retrieved from the database (soft deletes, turning on more metadata tracking, turning off default metadata tracking to be leaner, etc.). That behavior even varies between the lightweight session I used up above and the behavior of IDocumentStore.OpenSession() that adds identity map behavior to the session. To be more efficient over all, Marten generates the tightest possible C# code to handle each document type, then in the default mode, actually compiles that code in memory with Roslyn and uses the dynamically built assembly.

Cool, right? I’d argue that Marten can make teams be far more productive than they would be with the more typical EF Core or Dapper backed approach. Now let’s move on to the unfortunately very real downsides of Marten’s approach and what we’ve done to improve matters:

  • The dynamic Roslyn code generation can sometimes incur a major “cold start” issue on the very first usage. It’s definitely not consistent, as some people do not see any noticeable impact and other folks tell me they get a 9 second delay on the first usage. This cold start issue is especially problematic for folks using Marten in a Serverless architecture
  • The dynamically generated code can’t be used for any kind of potentially valuable AOT optimization
  • Roslyn usage sometimes causes a big ol’ memory leak no matter what we try. This isn’t consistent, so I don’t know why
  • The database change tracking does do some in memory locking, and that’s been prone to dead lock issues in some flavors of .Net (Blazor, WPF)
  • Some of you won’t want to give your application rights to modify a database at runtime
  • In Marten V4 there were a few too many places where Marten was executing the database change detection asynchronously, but from within synchronous calls using the dreaded .GetAwaiter().GetResult() approach. Occasional deadlock issues occurred, mostly in Marten usage within Blazor.

Database Migration Improvements

Alright, let’s tackle the database migration issues first. Marten has long had some command line support so that you could detect and apply any outstanding database changes from your application itself with this call:

dotnet run -- marten-apply

If you use the command line tooling for migrations, you can now optimize Marten to just turn off all runtime database migrations like so:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                opts.AutoCreateSchemaObjects = AutoCreate.None;
            });
    }).StartAsync();

Other folks won’t want to use the command line tooling, so there’s another option to just do all database migrations on database startup once, but otherwise completely eliminate all other potential locking in Marten V5, but this time I have to use the IHost integration:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                
                // Mild compromise, now I've got to tell
                // Marten about the User document
                opts.RegisterDocumentType<User>();
            })

            // This tells the app to do all database migrations
            // at application startup time
            .ApplyAllDatabaseChangesOnStartup();
    }).StartAsync();

In case you’re wondering, this option is safe to use even if you have multiple application nodes starting up simultaneously. The V5 version here relies on global locks in Postgresql itself to prevent simultaneous database changes that previously resulted in interestingly chaotic failure:(

Pre-building the Generated Types

Now, onto dealing with the dynamic codegen aspect of things. V4 created a “build types ahead” model where you can generate all the dynamic code with this command line call:

dotnet run -- codegen write

You can now completely dodge the runtime code generation issue by this sequence of events:

  1. In your deployment scripts, run dotnet run -- codegen write first
  2. Compile your application, which will embed the newly generated code right into your application’s entry assembly
  3. Use the below setting to completely disable all dynamic codegen:
using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // Turn off all dynamic code generation, but this
                // will blow up if the necessary type isn't compiled
                // into 
                opts.GeneratedCodeMode = TypeLoadMode.Static;
            });
    }).StartAsync();

Again though, this depends on you having all document types registered with Marten instead of depending on runtime discovery as we did in the very first sample in this post — and that’s a bit of friction. What we’ve found is that folks have found the origin pre-built generation model to be clumsy, so we went back to the drawing board for Marten V5 and came up with the…

“Auto” Generated Code Mode

For V5, we have the option shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // use pre-built code if it exists, or
                // generate code if it doesn't and "just work"
                opts.GeneratedCodeMode = TypeLoadMode.Auto;
            });
    }).StartAsync();

My thinking here is that you’d just keep this on all the time, and as long as you’re running the application locally or through your integration test suite (you have one of those, right?), you’d have the dynamic types written to your main project’s code automatically (in an /Internal/Generated folder). Unless you purposely add those to your source control’s ignore list, that code will also be checked in. Woohoo, right?

Now, finally let’s put this all together and bundle all of what I would recommend as Marten best practices into the new…

Optimized Artifact Workflow

New in Marten V5 is what I named the “optimized artifact workflow” (I say “I” because I don’t think other folks like the name:)) as shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
            })
            // This is the call you want!
            .OptimizeArtifactWorkflow(TypeLoadMode.Static)
            .ApplyAllDatabaseChangesOnStartup();
    })
    
    // In testing harnesses, or with AWS Lambda / Azure Functions,
    // you may have to help out .Net by explicitly setting
    // the main application assembly
    .UseApplicationProject(typeof(User).Assembly)
    
    .StartAsync();

With the OptimizeArtifaceWorkflow(TypeLoadMode.Static) usage above, Marten is running with automatic database management and “Auto” code generation if the host’s environment name is “Development” as it would typically be on a local developer box. In “Production” mode, Marten is running with all automatic database management disabled at runtime beside the initial database change application at startup. In “Production” mode, Marten is also turning off all dynamic code generation with the assumption that all necessary types can be found in the entry assembly.

The goal here was to have a quick setting that optimized Marten usage in both development and production time without having to add in a bunch of nested conditional logic for IHostEnvironment.IsDevelopment() throughout the IHost configuration code.

Exterminating Sync over Async Calls

Back to the very original sample code:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

In Marten V4, the first call to session.Store(user) would trigger the database schema detection, which behind the scenes would end up doing a .GetAwaiter().GetResult() trick to call asynchronous code within the synchronous Store() command (not gonna get into that here, but we eliminated all synchronous database schema detection functionality for unrelated reasons in V4).

In V5, we rewired a lot of the internal guts such that the database schema detection is happening instead in the call to IDocumentSession.SaveChangesAsync(), which is of course, asynchronous. That allowed us to eliminate usages of “sync over async” calls. Likewise, we made similar changes throughout other areas of Marten.

Summary

The hope here is that we can make our users be more successful with Marten, and side step the problems our users have had specifically with using Marten with AWS Lambda, Azure Functions, Blazor, and inside of WPF applications. I’m also hoping that the OptimizedArtifactWorkflow() usage greatly simplifies the usage of Marten “best practices.”

Working with Multiple Marten Databases in One Application

Marten V5 dropped last week. I covered the new “database per tenant” strategy for multi-tenancy in my previous blog post. Closely related to that feature is the ability to register and work with multiple Marten databases from a single .Net system, and that’s what I want to talk about today.

Let’s say that for whatever reason (but you know there’s some legacy in there somehow), our application is mostly persisted in its own Marten database, but also needs to interact with a completely separate “Invoicing” database on a different database server and having a completely different configuration. With Marten V5 we can register an additional Marten database by first writing a marker interface for that other database:

    // These marker interfaces *must* be public
    public interface IInvoicingStore : IDocumentStore
    {

    }

And now we can register and configure a completely separate Marten database in our .Net system with the AddMartenStore<T>() usage shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        // You can still use AddMarten() for the main document store
        // of this application
        services.AddMarten("some connection string");

        services.AddMartenStore<IInvoicingStore>(opts =>
            {
                // All the normal options are available here
                opts.Connection("different connection string");

                // more configuration
            })
            // Optionally apply all database schema
            // changes on startup
            .ApplyAllDatabaseChangesOnStartup()

            // Run the async daemon for this database
            .AddAsyncDaemon(DaemonMode.HotCold)

            // Use IInitialData
            .InitializeWith(new DefaultDataSet())

            // Use the V5 optimized artifact workflow
            // with the separate store as well
            .OptimizeArtifactWorkflow();
    }).StartAsync();

So here’s a few things to talk about from that admittedly busy code sample above:

  1. The IInvoicingStore will be registered in your underlying IoC container with singleton scoping. Marten is quietly making a concrete implementation of your interface for you, similar to how Refit works if you’re familiar with that library.
  2. We don’t yet have a way to register a matching IDocumentSession or IQuerySession type to go with the separate document store. I think my though on that is to wait until folks ask for that.
  3. The separate store could happily connect to the same database with a different database schema or connect to a completely different database server altogether
  4. You are able to separately apply all detected database changes on startup
  5. The async daemon can be enabled completely independently for the separate document store
  6. The IInitialData model can be used in isolation with the separate document store for baseline data
  7. The new V5 “optimized artifact workflow” model can be enabled explicitly on each separate document store. This will be the subject of my next Marten related blog post.
  8. It’s not shown up above, but if you really wanted to, you could make the separate document stores use a multi-tenancy strategy with multiple databases
  9. The Marten command line tooling is “multiple database aware,” meaning that it is able to apply changes or assert the configuration on all the known databases at one time or by selecting specific databases by name. This was the main reason the Marten core team did the separate document store story at the same time as the database per tenant strategy.

As I said earlier, we have a service registration for a fully functional DocumentStore implementing our IInvoicingStore that can be injected as a constructor dependency as shown in an internal service of our application:

public class InvoicingService
{
    private readonly IInvoicingStore _store;

    // IInvoicingStore can be injected like any other
    // service in your IoC container
    public InvoicingService(IInvoicingStore store)
    {
        _store = store;
    }

    public async Task DoSomethingWithInvoices()
    {
        // Important to dispose the session when you're done
        // with it
        await using var session = _store.LightweightSession();

        // do stuff with the session you just opened
    }
}

This feature and the multi-tenancy with a database per tenant have been frequent feature requests by Marten users, and it made a lot of sense to tackle them together in V5 because there was quite a bit of overlap in the database change management code to support both. I would very strongly state that a single database should be completely owned by one system, but I don’t know how I really feel about a single system working with multiple databases. Regardless, it comes up often enough that I’m glad we have something in Marten.

I worked with a client system some years back that was a big distributed monolith where the 7-8 separate windows services all talked to the same 4-5 Marten databases, and we hacked together something similar to the new formal support in Marten V5 to accommodate that. I do not recommend getting yourself into that situation though:-)

Multi-Tenancy with Marten

Multitenancy is a reference to the mode of operation of software where multiple independent instances of one or multiple applications operate in a shared environment. The instances (tenants) are logically isolated, but physically integrated.

Gartner Glossary

In this case, I’m referring to “multi-tenancy” in regards to Marten‘s ability to deploy one logical system where the data for each client, organization, or “tenant” is segregated such that users are only ever reading or writing to their own tenant’s data — even if that data is all stored in the same database.

In my research and experience, I’ve really only seen three main ways that folks handle multi-tenancy at the database layer (and this is going to be admittedly RDBMS-centric here):

  1. Use some kind of “tenant id” column in every single database table, then do something behind the scenes in the application layer to always be filtering on that column based on the current user. Marten has supported what I named the “Conjoined” model* since very early versions.
  2. Separate database schema per tenant within the same database. This model is very unlikely to ever be supported by Marten because Marten compiles database schema names into generated code in many, many places.
  3. Using a completely separate database per tenant with identical structures. This approach gives you the most complete separation of data between tenants, and could easily give your system much more scalability when the database is your throughput bottleneck. While you could — and many folks did — roll your own version of “tenant per database” with Marten, it wasn’t supported out of the box.

But, drum roll please, Marten V5 that dropped just last week adds out of the box support for doing multi-tenancy with Marten using a separate database for each tenant. Let’s just right into the simplest possible usage. Let’s say that we have a small system where all we want:

  1. Tenants “tenant1” and “tenant2” to be stored in a database named “database1”
  2. Tenant “tenant3” should be stored in a database named “tenant3”
  3. Tenant “tenant4” should be stored in a database named “tenant4”

And that’s that. Just three databases that are known at bootstrapping time. Jumping into the configuration code in a small .Net 6 web api projection gives us this code:

var builder = WebApplication.CreateBuilder(args);

var db1ConnectionString = builder.Configuration
    .GetConnectionString("database1");

var tenant3ConnectionString = builder.Configuration
    .GetConnectionString("tenant3");

var tenant4ConnectionString = builder.Configuration
    .GetConnectionString("tenant4");

builder.Services.AddMarten(opts =>
{
    opts.MultiTenantedDatabases(x =>
    {
        // Map multiple tenant ids to a single named database
        x.AddMultipleTenantDatabase(db1ConnectionString,"database1")
            .ForTenants("tenant1", "tenant2");

        // Map a single tenant id to a database, which uses the tenant id as well for the database identifier
        x.AddSingleTenantDatabase(tenant3ConnectionString, "tenant3");
        x.AddSingleTenantDatabase(tenant4ConnectionString,"tenant4");
    });

    // Register all the known document types just
    // to enable database schema management
    opts.Schema.For<User>()
        // This is *only* necessary if you want to put more
        // than one tenant in one database. Which we did.
        .MultiTenanted();
});

Now let’s see this in usage a bit. Knowing that the variable theStore in the test below is the IDocumentStore registered in our system with the configuration code above, this test shows off a bit of the multi-tenancy usage:

[Fact]
public async Task can_use_bulk_inserts()
{
    var targets3 = Target.GenerateRandomData(100).ToArray();
    var targets4 = Target.GenerateRandomData(50).ToArray();

    await theStore.Advanced.Clean.DeleteAllDocumentsAsync();

    // This will load new Target documents into the "tenant3" database
    await theStore.BulkInsertDocumentsAsync("tenant3", targets3);

    // This will load new Target documents into the "tenant4" database
    await theStore.BulkInsertDocumentsAsync("tenant4", targets4);

    // Open a query session for "tenant3". This QuerySession will
    // be connected to the "tenant3" database
    using (var query3 = theStore.QuerySession("tenant3"))
    {
        var ids = await query3.Query<Target>().Select(x => x.Id).ToListAsync();

        ids.OrderBy(x => x).ShouldHaveTheSameElementsAs(targets3.OrderBy(x => x.Id).Select(x => x.Id).ToList());
    }

    using (var query4 = theStore.QuerySession("tenant4"))
    {
        var ids = await query4.Query<Target>().Select(x => x.Id).ToListAsync();

        ids.OrderBy(x => x).ShouldHaveTheSameElementsAs(targets4.OrderBy(x => x.Id).Select(x => x.Id).ToList());
    }
}

So far, so good. There’s a little extra configuration in this case to express the mapping of tenants to database, but after that, the mechanics are identical to the previous “Conjoined” multi-tenancy model in Marten. However, as the next set of questions will show, there was a lot of thinking and new infrastructure code under the visible surface because Marten can no longer assume that there’s only one database in the system.

https://commons.wikimedia.org/wiki/File:Iceberg.jpg

To dive a little deeper, I’m going to try to anticipate the questions a user might have about this new functionality:

Is there a DocumentStore per database, or just one?

DocumentStore is a very expensive object to create because of the dynamic code compilation that happens within it. Fortunately, with this new feature set, there is only one DocumentStore. The one DocumentStore does store the database schema difference detection by database though.

How much can I customize the database configuration?

The out of the box options for “database per tenant” configuration are pretty limited, and we know that they won’t cover every possible need of our users. No worries though, because this is pluggable by writing your own implementation of our ITenancy interface, then setting that on StoreOptions.Tenancy as part of your Marten bootstrapping.

For more examples, here’s the StaticMultiTenancy model that underpins the example usage up above. There’s also the SingleServerMultiTenancy model that will dynamically create a named database on the same database server for each tenant id.

To apply your custom ITenancy model, set that on StoreOptions like so:

var store = DocumentStore.For(opts =>
{
    // NO NEED TO SET CONNECTION STRING WITH THE
    // Tenancy option below
    //opts.Connection("connection string");

    // Apply custom tenancy model
    opts.Tenancy = new MySpecialTenancy();
});

Is it possible to mix “Conjoined” multi-tenancy with multiple databases?

Yes, it is, and the example code above tried to show that. You’ll still have to mark document types as MultiTenanted() to opt into the conjoined multi-tenancy in that case. We supported that model thinking that this would be helpful for cases where the logical tenant of an application may have suborganizations. Whether or not this ends up being useful is yet to be proven.

What about the “Clean” functionality?

Marten has some built in functionality to reset or teardown database state on demand that is frequently used for test automation (think Respawn, but built into Marten itself). With the introduction of database per tenant multi-tenancy, the old IDocumentStore.Advanced.Clean functionality had to become multi-database aware. So when you run this code:

    // theStore is an IDocumentStore
    await theStore.Advanced.Clean.DeleteAllDocumentsAsync();

Marten is deleting all the document data in every known tenant database. To be more targeted, we can also “clean” a single database like so:

            // Find a specific database
            var database = await store.Storage.FindOrCreateDatabase("tenant1");

            // Delete all document data in just this database
            await database.DeleteAllDocumentsAsync();

What about database management?

Marten tries really hard to manage database schema changes for you behind the scenes so that your persistence code “just works.” Arguably the biggest task for per database multi-tenancy was enhancing the database migration code to support multiple databases.

If you’re using the Marten command line support for the system above, this will apply any outstanding database changes to each and every known tenant database:

dotnet run -- marten-apply

But to be more fine-grained, we can choose to apply changes to only the tenant database named “database1” like so:

dotnet run -- marten-apply --database database1

And lastly, you can interactively choose which databases to migrate like so:

dotnet run -- marten-apply -i

In code, you can direct Marten to detect and apply any outstanding database migrations (between how Marten is configured in code and what actually exists in the underlying database) across all tenant database upon application startup like so:

services.AddMarten(opts =>
{
    // Marten configuration...
}).ApplyAllDatabaseChangesOnStartup();

The migration code above runs in an IHostedService upon application startup. To avoid collisions between multiple nodes in your application starting up at the same time, Marten uses a Postgresql advisory lock so that only one node at a time can be trying to apply database migrations. Lesson learned:)

Or in your own code, assuming that you have a reference to an IDocumentStore object named theStore, you can use this syntax:

// Apply changes to all tenant databases
await theStore.Storage.ApplyAllConfiguredChangesToDatabaseAsync();

// Apply changes to only one database
var database = await theStore.Storage.FindOrCreateDatabase("database1");
await database.ApplyAllConfiguredChangesToDatabaseAsync();

Can I execute a transaction across databases in Marten?

Not natively with Marten, but I think you could pull that off with TransactionScope, and multiple Marten IDocumentSession objects for each database.

Does the async daemon work across the databases?

Yes! Using the IHost integration to set up the async daemon like so:

services.AddMarten(opts =>
{
    // Marten configuration...
})
    // Starts up the async daemon across all known
    // databases on one single node
    .AddAsyncDaemon(DaemonMode.HotCold);

Behind the scenes, Marten is just iterating over all the known tenant databases and actually starting up a separate object instance of the async daemon for each database.

We don’t yet have any way of distributing projection work across application nodes, but that is absolutely planned.

Can I rebuild a projection by database? Or by all databases at one time?

Oopsie. In the course of writing this blog post I realized that we don’t yet support “database per tenant” with the command line projections option. You can create a daemon instance programmatically for a single database like so:

// Rebuild the TripProjection on just the database named "database1"
using var daemon = await theStore.BuildProjectionDaemonAsync("database1");
await daemon.RebuildProjection<TripProjection>(CancellationToken.None);

*I chose the name “Conjoined,” but don’t exactly remember why. I’m going to claim that that was taken from the “Conjoiner” sect from Alistair Reynolds “Revelation Space” series.

Marten V5 is out!

The Marten team published Marten V5.0 today! It’s not as massive a leap as the Marten V4 release late last year was (and a much, much easier transition from 4 to 5 than 3 to 4 was:)), but I think this addresses a lot of the user issues folks have had with the V4 and makes Marten a much better tool in production and in development.

Some highlights:

  • The release notes and the 5.0 GitHub milestone issues.
  • The closed GitHub milestone just to prove we were busy
  • Fully supports .Net 6 and the latest version of Npgsql for folks who use Marten in combination with Dapper or *gasp* EF Core
  • Marten finally supports doing multi-tenancy through a “database per tenant” strategy with Marten fully able to handle schema migrations across all the known databases!
  • There were a lot of improvements to the database change management and the “pre-built code generation” model has a much easier to use alternative now. See the Development versus Production Usage. Also see the new AddMarten().ApplyAllDatabaseChangesOnStartup() option here.
  • I went through the Marten internals with a fine toothed comb to try and eliminate async within sync calls using .GetAwaiter().GetResult() to try to prevent deadlock issues that some users had reported with, shall we say, “alternative” Marten usages.
  • You can now add and resolve additional document stores in one .Net application.
  • There’s a new option for “custom aggregations” in the event sourcing support for advanced aggregations that fall outside of what was currently possible. This still allows for the performance optimizations we did for Marten V4 aggregates without having to roll your own infrastructure.

As always, thank you to Oskar Dudycz and Babu Annamalai for all their contributions as Marten is fortunately a team effort.

I’ll blog some later this and next week on the big additions. 5.0.1 will inevitably follow soon with who knows what bug fixes. And after that, I’m taking a break on Marten development for a bit:)