Marten and Friend’s (Hopefully) Big Future!

Marten was conceived and launched back in 2016 as an attempt to quickly improve the performance and stability of a mission-critical web application by utilizing PostgreSQL and its then-new JSON capabilities as a replacement for a third-party document database – and to do that in a hurry before the next busy season. My former colleagues and I succeeded in that endeavor, but more importantly for the longer run, Marten was also launched as an open source project on GitHub and quickly attracted attention from other developers. The addition of an originally small feature set for event sourcing dramatically increased interest and participation in Marten. 

Fast forward to today, and we have a vibrant community of engaged users and a core team of contributors who are constantly improving the tool and discussing ideas about how to make it even better. The giant V4 release last year brought an overhaul of almost all the library internals and plenty of new capabilities. V5 followed in early 2022, adding more multi-tenancy options and better tooling for development lifecycles and database management, informed by early issues with V4. 

At this point, I’d list the strong points of Marten that we’ve already achieved as:

  • A very useful document database option that provides the powerful developer productivity you expect from NoSQL solutions while also supporting a strong consistency model that’s usually missing from NoSQL databases. 
  • A wide range of viable hosting options by virtue of being on top of PostgreSQL. No cloud vendor lock-in with Marten!
  • Quite possibly the easiest way to build an application using Event Sourcing in .NET with both event storage and user defined view projections in the box
  • A great local development story through the simple ability to run PostgreSQL in a Docker container and Marten’s focus on an “it just works” style database schema management subsystem
  • The aforementioned core team and active user base make Marten a viable OSS tool for teams wanting some reassurance that Marten is going to be well supported in the future

Great! But now it’s time to talk about the next steps we’re planning in order to take Marten to even greater heights in the forthcoming Marten V6, which is being planned now. The overarching theme is to remove the most common reasons teams pass on Marten. By and large, I think the biggest areas of focus are:

  1. Scalability, so Marten can be used for much larger data sets. From user feedback, Marten is able to handle data sets of 10 million events today, but there are opportunities to go far, far larger than that.
  2. Improvements to operational support. Database migrations when documents change, rebuilding projections without downtime, usage metrics, and better support for using multiple databases for multi-tenancy
  3. Marten is in good shape as a pure storage option for Event Sourcing, but users very often ask for an array of subscription options to propagate events captured by Marten
  4. More powerful options for aggregating event data into more complex projected views
  5. Improving the LINQ and other querying support, which is a seemingly never-ending battle
  6. The lack of professional support for Marten. Obviously a lot of shops and teams are perfectly comfortable with using FOSS tools knowing that they may have to roll up their sleeves and pitch in with support, but other shops are not comfortable with this at all and will not allow FOSS usage for critical functions. More on this later.

First though, Marten is getting a new “critter” friend in the larger JasperFx project family:

Wolverine is a new/old OSS command bus and messaging tool for .NET. It’s what was formerly being developed as Jasper, but the Marten team decided to rebrand the tool as a natural partner with Marten (both animals, plus Weasel, are members of the Mustelidae family). While both Marten and Wolverine are happily usable without each other, we think the integration of these tools gives us the opportunity to build a full-fledged platform for building applications in .NET using a CQRS architecture with Event Sourcing. Moreover, we think there’s a significant gap in .NET for this kind of tooling, and we hope to fill it. 

So, onto future plans…

There are a couple of immediate ways we’re planning to improve the scalability of Marten in V6. The first idea is to utilize PostgreSQL table sharding in a couple of different ways. 

First, we can enable sharding on document tables based on user-defined criteria through Marten configuration. The big challenge there is to provide a good migration strategy, as this requires at least a three-step process of copying the existing table data off to the side before creating the new tables. 

The next idea is to shard the event storage tables as well, with the immediate plan being to shard on archived status to effectively create a “hot” storage of recent events and a “cold” storage of older events that are much less frequently accessed. This would allow Marten users to keep the active “hot” event storage much smaller and therefore greatly improve potential performance even as the database continues to grow.
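As a rough, purely illustrative sketch of what that “hot/cold” split might look like at the database level (the table and column names here are invented for the example, not Marten’s actual schema), PostgreSQL’s declarative partitioning can split a table on an archived flag:

```sql
-- Illustrative only: partition an events table on an archived flag
CREATE TABLE events (
    seq_id      bigint  NOT NULL,
    stream_id   uuid    NOT NULL,
    data        jsonb   NOT NULL,
    is_archived bool    NOT NULL DEFAULT FALSE
) PARTITION BY LIST (is_archived);

-- "Hot" partition: recent, frequently queried events
CREATE TABLE events_hot  PARTITION OF events FOR VALUES IN (FALSE);

-- "Cold" partition: archived events that are rarely touched
CREATE TABLE events_cold PARTITION OF events FOR VALUES IN (TRUE);
```

With a layout like this, queries that filter on `is_archived = FALSE` only ever scan the hot partition, which is the mechanism behind the performance win described above.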

We’re not done “sharding” yet, but this time we need to shift to the asynchronous projection support in Marten. The core team has some ideas to improve the throughput of the asynchronous projection code as it is, but today it’s limited to running on a single application node with “hot/cold” rollover support. With some help from Wolverine, we’re hoping to build a “sharded” asynchronous projection model that can split the processing of a single projection and distribute the work across potentially many nodes as shown in the following diagram:

The asynchronous projection sharding is going to be a big deal for Marten all by itself, but there’s some other potentially big wins for Marten V6 with better tooling for projection rebuilds and asynchronous projections in general:

  1. Some kind of user interface to monitor and manage the asynchronous projections
  2. Faster projection rebuilds
  3. Zero downtime projection rebuilds

Marten + Wolverine == “Critter Stack” 

Again, both Marten and Wolverine will be completely usable independently, but we think there’s some potential synergy in the combination. One of the potential advantages of combining the tools is using Wolverine’s messaging to give Marten a full-fledged subscription model for Marten events. All told, we’re planning three different mechanisms for propagating Marten events to the rest of your system:

  1. Through Wolverine’s transactional outbox right at the point of event capture when you care more about immediate delivery than strict ordering (this is already working)
  2. Through Marten’s asynchronous daemon when you do need strict ordering
  3. If this works out, through CDC event streaming straight from the database to Kafka/Pulsar/Kinesis

That brings me to the last topic I wanted to talk about in this post. Marten and Wolverine in their current form will remain FOSS under the MIT license, but it’s past time to make a real business out of these tools.

I don’t know exactly how this is going to work out yet, but the core Marten team is actively planning on building a business around Marten and now Wolverine. I’m not sure if this will be the front company, but I personally have formed a new company named “Jasper Fx Software” for my own activity – though that’s going to be limited to side work for at least a while. 

The general idea – so far – is to offer:

  • Support contracts for Marten 
  • Consulting services, especially for help modeling and maximizing the usage of the event sourcing support
  • Training workshops
  • Add on products that add the advanced features I described earlier in this post

Maybe success leads us to offering a SaaS model for Marten, but I see that as a long way down the road.

What think you, gentle reader? Does any of this sound attractive? Should we be focusing on something else altogether?

Effective Test Driven Development

I wrote a lot about Test Driven Development back in the days of the now defunct CodeBetter site. You can read a little of the precursor content in this old MSDN Magazine article I wrote in 2008. As time permits or my ambition level waxes and wanes, I’ll be resurrecting and rewriting some of my old “Shade Tree Developer” content on team dynamics, design fundamentals, and Agile software practices from those days. This is just a preface to a new blog series on my thinking about how to effectively do TDD in your daily coding work.

The series so far:

I’m giving an internal talk at work this week about applying Test Driven Development (TDD) within one of our largest systems. Our developers certainly build tests for new code today with a mix of unit tests and integration tests, but there’s room for improvement to help them do more effective unit testing with less effort and end up with more useful tests.

That being said, it’s not all that helpful to just yell at your developers and tell them they should “just” write more or better tests or say that they should “just do TDD.” So instead of yelling, let’s talk through some possible strategies and mental tools for applying TDD in real world code. But first, here’s a quick rundown of…

What we don’t want:

  • Tests that require a lot of setup code just to establish inputs. Not only does that keep developers from being productive when writing tests, it’s a clear sign that you may have harmful coupling problems within your code structure.
  • Tests that only duplicate the implementation of the code under test. This frequently happens from overusing mock objects. Tests written this way are often brittle when the actual code needs to be refactored, and can even serve to prevent developers from trying to make code improvements through refactoring. These tests are also commonly caused by attempts to “shut up the code coverage check” in CI with tests retrofitted onto existing code.
  • Tests that “blink,” meaning that they do not consistently pass or fail even if the actual functionality is correct. This is all too painfully common with integration tests that deal with asynchronous code. Selenium tests are notoriously bad for this.
  • Slow feedback cycles between writing code and knowing whether or not that code actually works
  • Developers needing to spend a lot of time in the debugger trying to trace down problems in the code.
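To make the “tests that only duplicate the implementation” point above concrete, here’s a small invented example (the names are mine, not from any real codebase) contrasting an interaction-heavy, mock-based test with a state-based one:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IPriceCatalog
{
    decimal PriceOf(string sku);
}

public class OrderCalculator
{
    private readonly IPriceCatalog _catalog;
    public OrderCalculator(IPriceCatalog catalog) => _catalog = catalog;

    // Sums the catalog price of every sku in the order
    public decimal Total(IEnumerable<string> skus)
        => skus.Sum(sku => _catalog.PriceOf(sku));
}

// A mock-heavy test would verify that PriceOf() was called exactly once
// per sku -- restating the implementation line by line, so any later
// refactoring (caching, batching) breaks the test even when the total
// is still correct. A state-based test instead uses a trivial
// hand-rolled stub and asserts only on the observable outcome:
public class FixedCatalog : IPriceCatalog
{
    public decimal PriceOf(string sku) => 10m;
}

// new OrderCalculator(new FixedCatalog()).Total(new[] { "a", "b" }) == 20m
```

The state-based version survives refactoring of `Total()` untouched, which is exactly the property we want from tests that support rather than obstruct change.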

Instead, let’s talk about…

What we do want:

  • Fast feedback cycles for development. It’s hard to overstate how important that is for developers to be productive.
  • Developers to be able to efficiently use their time while constantly switching between writing tests and the code to make those tests pass
  • The tests are fine-grained enough to allow our developers to find and remove problems in the code
  • The existing tests are useful for refactoring. Or at least not a significant cause of friction when trying to refactor code.
  • The tests clearly express the intent of the code and act as a form of documentation.
  • The code should generally exhibit useful qualities of cohesion and coupling between various pieces of code

And more than anything, I would like developers to be able to use TDD to help them think through their code as they build it. TDD is a couple of things, but the two most important to me are a heuristic for thinking through code structure and a rapid feedback cycle. Having the tests around later to facilitate safe refactoring in the codebase is important too, especially if you’re going to be working on a codebase for years that’s likely going to outgrow its original purpose.

So what’s next?

I’ve already started working on the actual content of how to do TDD with examples mostly pulled from my open source projects. Right now, I’m thinking about writing over the next couple months about:

  • Using responsibility driven design as a way to structure code in a way that’s conducive to easier unit testing
  • Some real world examples of building open source features with TDD
  • My old “Jeremy’s Rules of TDD” which really just amount to some heuristics for improving the properties of cohesion or coupling in your code based on testability. I’m going to supplement that by stealing from Jim Shore’s excellent book on Testing without Mocks
  • A discussion of state-based vs interaction based testing and when you would choose either
  • Switching between top down code construction or bottom up coding using TDD
  • What code deserves a test, and what could you let slide without?
  • Choosing between solitary unit tests, sociable unit tests, or pulling in infrastructure to write integration tests on a case by case basis
  • Dealing with data intensive testing. Kind of a big deal working for a company whose raison d’être is data analytics
  • Not really TDD per se, but I think I’d like to also revisit my old article about succeeding with automated testing
  • And lastly, what the hell, let’s talk about judicious usage of mock objects and other fakes because that never seems to ever stop being a real problem

I’m happy to take requests, especially from colleagues. But I absolutely will not promise prompt publishing of said requests:)

What Is Good Code?

This is the second part of a 3 or 4 part series where I’m formulating my thoughts about an ongoing initiative at MedeAnalytics. I started yesterday with a related post called On Giving Technical Guidance to Others that’s a synopsis of an impromptu lecture I gave our architecture team about all the things I wish I’d known before becoming any kind of technical leader. I’ll follow this post up, hopefully as soon as tomorrow, with my reasoning about why prescriptive architectures are harmful and my own spin on the SOLID Principles.

I’m part of an architectural team that’s been charged with modernizing and improving our very large, existing systems. We have an initiative just getting off the ground to break off part of one of our monoliths into a separate system to begin a strangler application strategy to modernize the system over time. This gives us a chance to work out how we want our systems to be structured and built going forward in a smaller subset of work instead of trying to boil the ocean to update the entire monolith codebase at one time.

As part of that effort, I’m trying to put some stakes in the ground to:

  • Ban all usage of formal, prescriptive architectural styles like the Onion Architecture or Clean Architecture because I find that they do more harm than good. Rather, I’m pushing hard for vertical slice or feature folder code organization while still satisfying the need for separation of concerns and decent structuring of the code
  • Generally choose lower code ceremony approaches whenever possible because that promotes easier evolution of the code, and in the end, the only truly guaranteed path to good code is adaptation and evolution in the face of feedback about the code.
  • Be very cautious about how we abstract database access to avoid causing unnecessary complexity or poor performance, which means I probably want to ban any usage of naive IRepository<T> type abstractions
  • Put the SOLID Principles into a little bit of perspective as we do this work and make sure our developers and architects have a wider range of mental tools in their design toolbox than just an easy to remember but hard to interpret or apply acronym developed by C++ developers before many of our developers were even born
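As a hypothetical sketch of the repository concern in that list (the types and names here are invented for illustration, not a real codebase), a naive generic repository tends to hide the query capabilities of the underlying tool, forcing filtering into application memory:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// A naive generic repository can only hand back whole collections,
// so there is no way to push a filter down to the database as SQL
public interface IRepository<T>
{
    Task<T?> Load(Guid id);
    Task<IReadOnlyList<T>> All();
}

// Invented document type for the example
public record Invoice(Guid Id, bool IsPaid);

public static class InvoiceQueries
{
    // With the naive abstraction, finding unpaid invoices means
    // loading *every* invoice into memory first, then filtering --
    // whereas querying directly against a LINQ-capable session could
    // translate the same predicate into a SQL WHERE clause
    public static async Task<List<Invoice>> UnpaidViaRepository(IRepository<Invoice> repo)
        => (await repo.All()).Where(x => !x.IsPaid).ToList();
}
```

That in-memory filtering is the kind of hidden performance trap that makes me wary of wrapping document or event storage behind a lowest-common-denominator interface.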

The rest of this post is just trying to support those opinions.

First, What is Good?

More on this in a later post as I give my take on SOLID, but Dan North made an attempt at describing “good code” that’s worth your read.

Let’s talk a little bit about the qualities you want in your code. Quite a few folks are going to say that the most important quality is that the code satisfies the business needs and delivers value to the business! If you’ll please get that bit of self righteousness out of your system, let’s move on to the kind of technical quality that’s necessary to continue to efficiently deliver business value over time.

  • You can understand what the code is doing, navigate within the codebase, and generally find code where you would expect it to be based on the evident and documented rules of the system architecture.
  • The code exhibits separation of concerns, meaning that you’re generally able to reason about and change one responsibility of the code at a time (data access, business logic, validation logic, data presentation, etc.). Cohesion and coupling are the alpha and omega of software design. I’m a very strong believer in evolutionary approaches to designing software as the only truly reliable method to arrive at good code, but that’s largely dependent upon the qualities of cohesion and coupling within your code.
  • Rapid feedback is vital to effective coding, so testability of the code is a major factor for me. This can mean that code is structured in a way that it’s easy to unit test in isolation (i.e., you can effectively test business rules without having to run the full stack application or in one hurtful extreme, be forced to use a tool like Selenium). This version of testability is very largely a restatement of cohesion and coupling. Alternatively, if the code depends on some kind of infrastructure that’s easy to deal with in integration testing (like Marten!) and the integration tests run “fast enough,” I say you can relax separation of concerns and jumble things together as long as the code is still easy to reason about.
  • I don’t know a pithy way to describe this, but the code needs to carefully expose the places where system state is changed or “mutated” to make the code’s behavior predictable and prevent bugs. Whether that’s adopting command query segregation, using elements of functional programming, or the uni-directional data flow in place of two way data binding in user interface development, system state changes are an awfully easy way to introduce bugs in code and should be dealt with consciously and with some care.
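As a tiny, invented sketch of that last point, command query separation keeps every state mutation in one obvious place:

```csharp
using System.Collections.Generic;
using System.Linq;

// Invented example of command query separation: queries return values
// and never mutate state; commands mutate state and return nothing.
// Every state change in the class is then easy to find and reason about.
public class ShoppingCart
{
    private readonly List<decimal> _prices = new();

    // Query: no side effects, safe to call any number of times
    public decimal TotalPrice() => _prices.Sum();

    // Command: the single place where the cart's state changes
    public void AddItem(decimal price) => _prices.Add(price);
}
```

It’s a toy, but the same discipline scales up: when readers know that only commands mutate, the predictable behavior described above falls out almost for free.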

I think most of us would say that code should be “simple,” and I’d add that I personally want code to be written in a low ceremony way that reduces noise in the code. The problem with that whole statement is that it’s very subjective:

Which is just to say that saying the words “just write simple code!” isn’t necessarily all that helpful or descriptive. What’s helpful is to have some mental tools to help developers judge whether or not their code is “good” and move in the direction of more successful code. Better yet, do that without introducing unnecessary complexity or code ceremony through well-intentioned prescriptive architectures like “Onion” or “Clean” that purposely try to force developers to write code “the right way.”

And next time on Jeremy tries to explain his twitter ranting…

This has inevitably taken longer than I wished to write, so I’m breaking things up. I will follow up tomorrow and Thursday with my analysis of SOLID, an explanation of why I think the Onion/Clean Architecture style of code organization is best avoided, and eventually some thoughts on database abstractions.

On Giving Technical Guidance to Others

I’m still working on my promised SOLID/Anti-Onion/Anti-Clean/Database Abstraction post, but it’s as usual taking longer than I’d like and I’m publishing this section separately.

So far I’ve followed up with:

Just as a quirk of circumstances, I pretty well went straight from being a self-taught “Shadow IT” developer to being a lead developer and de facto architect on a mission-critical supply chain application for a then Fortune 500 company. The system was an undeniable success in the short term, but it came at a cost to me, because as a first-time lead I had zero ability to enable the other developers working with me to be productive. As such, I ended up writing the vast majority of the code and inevitably became the bottleneck on all subsequent production issues. That doesn’t scale.

The following year I had another chance to lead a follow up project and vowed to do a better job with the other developers (plus I was getting a lot of heat from various management types to do so). In a particular case that I remember to this day, I wrote up a detailed Word document for a coding assignment for another developer. I got all the way down to class and method names and even had some loose sample code I think. I handed that off, patted myself on the back for being a better lead, and went off on my merry way.

As you might have surmised, when I got his code back later it was unusable because he did exactly what I said to do — which turned out to be wrong based on factors I hadn’t anticipated. Worse, he only did exactly what I said to do and missed some concerns that I didn’t think needed to be explicitly called out. I’ve thought a lot about this over the years and come to some conclusions about how I should have tried to work differently with that developer. Before diving into that, let’s first talk about you for awhile!

Congratulations! You’ve made it to some kind of senior technical role in your company. You’ve attained enough skill and knowledge to be recognized for your individual contributions, and now your company is putting you in a position to positively influence other developers, determine technical strategies, and serve as a steward for your company’s systems.

Hopefully you’ll still be hands on in the coding and testing, but increasingly, your role is going to involve trying to create and evolve technical guidance for other developers within your systems. More and more, your success is going to be dependent on your ability to explain ideas, concepts, and approaches to other developers. Not that I’m the fount of all wisdom about this, but here’s some of the things I wish I’d understood before being put into technical leadership roles:

  • It’s crucial to provide the context, reasoning, and applicability behind any technical guidance. Explaining the “why” and “when” is just as important as the “what” or “how.”
  • Being too specific in the guidance or instructions to another developer can easily come with the unintended consequence of turning off their brains and will frequently lead to poor results. Expanding on my first point, it’s better to explain the goals, how their work fits into the larger system, and the qualities of the code you’re hoping to achieve rather than try to make them automatons just following directions. It’s quite possible that JIRA-driven development exacerbates this potential problem.
  • You need to provide some kind of off-ramp to developers to understand the limitations of the guidance. The last thing you want is for developers to blindly follow guidance that is inappropriate for a circumstance that wasn’t anticipated during the formulation of said guidance
  • Recommendations about technology usage probably need to come as some kind of decision tree with multiple options and guidance about applicability, because there’s just about never a one-size-fits-all tool
  • By all means, allow and encourage the actual developers to actively look for better approaches because they’re the ones closest to their code. Especially with talented younger developers, you never want to take away their sense of initiative or close them off from providing feedback, adjustments, or flat out innovation to the “official” guidance. At the very least, you as a senior technical person need to pay attention when a developer tells you that the current approach is confusing or laborious or feels too complicated.
  • Treat every possible recommendation or technical guidance as a theory that hasn’t yet been perfectly proven.

I’ve talked a lot about giving technical guidance, but you should never think that you or any other individual are responsible for doing all the thinking within a software ecosystem. What you might be responsible for is facilitating the sharing of learning and knowledge through the company. I was lucky enough early in my career to spend just a little bit of time working with Martin Fowler who effectively acts as a sort of industry wide, super bumble bee gathering useful knowledge from lots of different projects and cross-pollinating what he’s learned to other teams and other projects. Maybe you don’t impact the entire software development industry like he has, but you can at least facilitate that within your own team or maybe within your larger organization.

As an aside, a very helpful technique to use when trying to explain something in code to another developer is to ask them to explain it back to you in their own words — or conversely, I try to do this when I’m the one getting the explanation to make sure I’ve really understood what I’m being told. My wife is an educator and tells me this is a common technique for teachers as well.

Next time…

In my next post I’m going to cover a lot of ground about why I think prescriptive architectural styles like the “Onion” or “Clean” are harmful, alternatives, a discussion about what use is SOLID these days (more than none, but much less than the focus many people put on it is really worth), and a discussion about database abstractions I find to be harmful that tend to be side effects of prescriptive architectures.

Low Code Ceremony Sagas with Jasper & Marten

You’ll need at least Jasper v2.0.0-alpha-4 if you want to recreate the saga support in this post. All the sample code for this post is in an executable sample on GitHub. Jasper does support sagas with EF Core and either SQL Server or PostgreSQL, but Marten is where most of the effort is going at the moment.

The Saga pattern is a way to solve the issue of logical, long-running transactions that necessarily need to span over multiple operations. In the approaches I’ve encountered throughout my career, this has generally meant persisting a “saga state” of some sort in a database that is used within a message handling framework to “know” what steps have been completed, and what’s outstanding.

Jumping right into an example, consider a very simple order management service that will have steps to:

  1. Create a new order
  2. Complete the order
  3. Or alternatively, delete new orders if they have not been completed within 1 minute

For the moment, I’m going to ignore the underlying persistence and just focus on the Jasper message handlers to implement the order saga workflow with this simplistic saga code:

using Baseline.Dates;
using Jasper;
using Microsoft.Extensions.Logging;

namespace OrderSagaSample;

public record StartOrder(string Id);

public record CompleteOrder(string Id);

public record OrderTimeout(string Id) : TimeoutMessage(1.Minutes());

public class Order : Saga
{
    public string? Id { get; set; }

    // By returning the OrderTimeout, we're triggering a "timeout"
    // condition that will process the OrderTimeout message at least
    // one minute after an order is started
    public OrderTimeout Start(StartOrder order, ILogger<Order> logger)
    {
        Id = order.Id; // defining the Saga Id.

        logger.LogInformation("Got a new order with id {Id}", order.Id);
        // creating a timeout message for the saga
        return new OrderTimeout(order.Id);
    }

    public void Handle(CompleteOrder complete, ILogger<Order> logger)
    {
        logger.LogInformation("Completing order {Id}", complete.Id);

        // That's it, we're done. This directs Jasper to delete the
        // persisted saga state after the message is done.
        MarkCompleted();
    }

    public void Handle(OrderTimeout timeout, ILogger<Order> logger)
    {
        logger.LogInformation("Applying timeout to order {Id}", timeout.Id);

        // That's it, we're done. Delete the saga state after the message is done.
        MarkCompleted();
    }
}

I’m just aiming for a quick sample rather than exhaustive documentation here, but a few notes:

  • Jasper leans a bit on type and naming conventions to discover message handlers and to “know” how to call them. Some folks will definitely not like the magic, but this approach leads to substantially less code and arguably less complexity compared to existing .NET tools
  • Jasper supports the idea of scheduled messages, and the new TimeoutMessage base class up there is just a way to utilize that support for “saga timeout” conditions
  • Jasper generally tries to adapt to your application code rather than force a lot of mandatory framework artifacts into your message handler code

Now let’s move over to the service bootstrapping and add Marten in as our persistence mechanism in the Program file:

using Jasper;
using Jasper.Persistence.Marten;
using Marten;
using Oakton;
using Oakton.Resources;
using OrderSagaSample;

var builder = WebApplication.CreateBuilder(args);

// Not 100% necessary, but enables some extra command line diagnostics
builder.Host.ApplyOaktonExtensions();

// Adding Marten
builder.Services.AddMarten(opts =>
    {
        var connectionString = builder.Configuration.GetConnectionString("Marten");
        opts.Connection(connectionString);
        opts.DatabaseSchemaName = "orders";
    })

    // Adding the Jasper integration for Marten.
    .IntegrateWithJasper();


builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

// Do all necessary database setup on startup
builder.Services.AddResourceSetupOnStartup();

// The defaults are good enough here
builder.Host.UseJasper();

var app = builder.Build();

// Just delegating to Jasper's local command bus for all
app.MapPost("/start", (StartOrder start, ICommandBus bus) => bus.InvokeAsync(start));
app.MapPost("/complete", (CompleteOrder start, ICommandBus bus) => bus.InvokeAsync(start));
app.MapGet("/all", (IQuerySession session) => session.Query<Order>().ToListAsync());
app.MapGet("/", (HttpResponse response) =>
{
    response.Headers.Add("Location", "/swagger");
    response.StatusCode = 301;
});

app.UseSwagger();
app.UseSwaggerUI();

return await app.RunOaktonCommands(args);

Off screen, I’ve started up a Docker container for PostgreSQL to get a blank database. With that running, I’ll start the application with the usual dotnet run command and open up the Swagger page:

You’ll get a lot of SQL in your terminal on the first run as Marten sets up the database for you, that’s perfectly normal.

I’m going to first create a new order for “Shoes” by executing the /start endpoint:

And verify that it’s persisted by checking the /all endpoint:

If I’m quick enough, I’ll post {"Id": "Shoes"} to /complete, and then verify through the /all endpoint that the “Shoes” order has been completed.

Otherwise, if I’m too slow to complete the order, the timeout message will be applied to our order and you’ll see evidence of that in the logging output like so:

And that’s it, one working saga implementation with database backed persistence through Marten. The goal of Jasper is to make this kind of server side development as low ceremony and easy to use as possible, so any feedback about what you do or don’t like in this sample would be very helpful.

Related Posts

I’ve spit out quite a bit of blogging content the past several weeks on both Marten and Jasper:

(Re) Introducing Jasper as a Command Bus

EDIT 6/15/2022: The correct Nuget is “Jasper.Persistence.Marten”

I just released a second alpha of Jasper 2.0 to Nuget. You can find the project goals for Jasper 2.0 here, and an update from a couple weeks ago here. Be aware that the published documentation for Jasper is very, very far behind. I’m working on it:)

Jasper is a long-running open source project with the goal of creating a low ceremony and highly productive framework for building systems in .NET that would benefit from either an in-memory command bus or asynchronous messaging. The big driver for Jasper right now is using it in combination with the event sourcing capabilities of Marten as a full stack CQRS architectural framework. Later this week I’ll preview the ongoing Marten + Jasper integration, but for today I’d like to just introduce Jasper itself a little bit.

For a simplistic sample application, let’s say that we’re building an online system for potential customers to make reservations at any number of participating restaurants. I’m going to start by laying down a brand new .Net 6 Web API project. I’m obviously going to choose Marten as my persistence tooling, so the next step is to add a reference to the Jasper.Persistence.Marten Nuget, which brings transitive dependencies over for both Jasper and Marten.

Jasper also has some working integration with EF Core using either Sql Server or Postgresql as the backing store so far.

Let’s build an online reservation system for your favorite restaurants!

Let’s say that as a result of an event storming requirements session, we’ve determined that we want both a command message to confirm a reservation, and a corresponding event message out to the internal systems of the various restaurants. I’m going to eschew event sourcing to keep this simpler and just opt for a persistent Reservation document in Marten. All that being said, here’s our code to model everything I just described:

public record ConfirmReservation(Guid ReservationId);
public record ReservationConfirmed(Guid ReservationId);

public class Reservation
{
    public Guid Id { get; set; }
    public DateTimeOffset Time { get; set; }
    public string RestaurantName { get; set; }
    public bool IsConfirmed { get; set; }
}

Building Our First Message Handlers

In this contrived example, the ReservationConfirmed event message will be published separately because it spawns a call to an external system where I’d strongly prefer to have a separate “retry loop” around just that call. That being said, this is the first cut for a command handler for the ConfirmReservation message:

public class ConfirmReservationHandler
{
    public async Task Handle(ConfirmReservation command, IDocumentSession session, IExecutionContext publisher)
    {
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        // Watch out, this could be a race condition!!!!
        await publisher.PublishAsync(new ReservationConfirmed(reservation.Id));

        // We're coming back to this in a bit......
        await session.SaveChangesAsync();
    }
}

To be technically accurate, Jasper uses an in memory outbox for all message processing, even when there’s no durable message storage, to at least guarantee that outgoing messages are only published when the original message is successfully handled. I just wanted to show the potential danger here.

So a couple things to note that are different from existing tools like NServiceBus or MassTransit:

  • Jasper locates message handlers strictly through naming conventions: public methods named either Handle() or Consume() on public types suffixed with Handler or Consumer. There are no mandatory attributes or interfaces. Hell, there’s not even a mandatory method signature except that the first argument is always assumed to be the message type.
  • Jasper Handle() methods can happily support method injection, meaning that the IDocumentSession parameter above is pushed into the method from Jasper itself. In my experience, using method injection frequently simplifies the message handler code as opposed to the idiomatic C# approach of using constructor injection and relaying things through private fields.
  • Message types in Jasper are just concrete types, and there’s no necessary Event or Message base classes of any type — but that may be introduced later strictly for optional diagnostics.
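
To make those conventions concrete, here’s a rough sketch of how convention-based handler discovery can work. The types and helper below are my own illustration, not Jasper’s internals:

```csharp
using System;
using System.Linq;
using System.Reflection;

public record PlaceOrder(Guid OrderId);

// Found purely by convention: a public type suffixed with "Handler"
// exposing a public method named Handle() whose first argument is
// assumed to be the message type
public class PlaceOrderHandler
{
    public void Handle(PlaceOrder command) { /* real work would happen here */ }
}

public static class HandlerDiscovery
{
    // Return (handler type, method, message type) matches by naming convention
    public static (Type Handler, MethodInfo Method, Type Message)[] FindHandlers(params Type[] candidates)
    {
        return candidates
            .Where(t => t.Name.EndsWith("Handler") || t.Name.EndsWith("Consumer"))
            .SelectMany(t => t
                .GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
                .Where(m => (m.Name == "Handle" || m.Name == "Consume") && m.GetParameters().Length >= 1)
                .Select(m => (t, m, m.GetParameters()[0].ParameterType)))
            .ToArray();
    }
}
```

The payoff of this style is that adding a new message type is just a new class and a new method, with no registration code anywhere.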

Lastly, notice my comment about the race condition between publishing the outgoing ReservationConfirmed event message and committing the database work through IDocumentSession.SaveChangesAsync(). That’s obviously a problem waiting to bite us, so we’ll come back to that.

Next, let’s move on to the separate handler for ReservationConfirmed:

[LocalQueue("Notifications")]
[RetryNow(typeof(HttpRequestException), 50, 100, 250)]
public class ReservationConfirmedHandler
{
    public async Task Handle(ReservationConfirmed confirmed, IQuerySession session, IRestaurantProxy restaurant)
    {
        var reservation = await session.LoadAsync<Reservation>(confirmed.ReservationId);

        // Make a call to an external web service through a proxy
        await restaurant.NotifyRestaurant(reservation);
    }
}

All this handler does is look up the current state of the reservation and post that to an external system through a proxy interface (IRestaurantProxy).

As I said before, I strongly prefer that calls out to external systems be isolated to their own retry loops. In this case, the [RetryNow] attribute is setting up Jasper to retry a command through this handler on transient HttpRequestException errors with a 50, 100, and then 250 millisecond cooldown period between attempts. Jasper’s error handling policies go much deeper than this, but hopefully you can already see what’s possible.

The usage of the [LocalQueue("Notifications")] attribute is directing Jasper to execute the ReservationConfirmed messages in a separate, local queue named “Notifications”. In effect, we’ve got a producer/consumer solution between the incoming ConfirmReservation command and the ReservationConfirmed event messages. The local queueing is done with the TPL Dataflow library. Maybe Jasper will eventually move to using System.Threading.Channels, but for right now there are just bigger issues to worry about.
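
To see the producer/consumer idea in isolation, here’s a tiny self-contained sketch using System.Threading.Channels. This is my own stand-in for the concept, not Jasper’s TPL Dataflow based queue:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public record ReservationConfirmed(Guid ReservationId);

public static class LocalQueue
{
    // Run one consumer loop over an unbounded in-memory channel and
    // report how many messages it processed
    public static async Task<int> RunAsync(ReservationConfirmed[] messages)
    {
        var channel = Channel.CreateUnbounded<ReservationConfirmed>();
        var handled = 0;

        // Consumer: drain the channel until the producer signals completion
        var consumer = Task.Run(async () =>
        {
            await foreach (var message in channel.Reader.ReadAllAsync())
            {
                handled++; // a stand-in for invoking the real message handler
            }
        });

        // Producer: enqueue each message, then mark the channel complete
        foreach (var m in messages)
        {
            await channel.Writer.WriteAsync(m);
        }
        channel.Writer.Complete();

        await consumer;
        return handled;
    }
}
```

The producing side (the ConfirmReservation handler in our sample) never blocks waiting on the consuming side, which is exactly the decoupling the local queue buys us.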

Don’t fret if you don’t care for sprinkling attributes all over your code; all of the configuration I’ve done above with attributes can also be done with a fluent interface at bootstrapping time, or even within the message handler classes themselves.

Bootstrapping Jasper

Stepping into the Program.cs file for our new system, I’m going to add bootstrapping for both Marten and Jasper in the simplest possible way like so:

using CommandBusSamples;
using Jasper;
using Marten;
using Oakton;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    opts.Connection("Host=localhost;Port=5433;Database=postgres;Username=postgres;password=postgres");
});

// Adding Jasper as a straight up Command Bus with all
// its default configuration
builder.Host.UseJasper();

var app = builder.Build();

// This isn't *quite* the most efficient way to do this,
// but it's simple to understand, so please just let it go...
// I'm just delegating the HTTP request body as a command 
// directly to Jasper's in memory ICommandBus
app.MapPost("/reservations/confirm", (ConfirmReservation command, ICommandBus bus) => bus.InvokeAsync(command));


// This opts into using Oakton for extended command line options for this app
// Oakton is also a transitive dependency of Jasper itself
return await app.RunOaktonCommands(args);

Okay, so let’s talk about some of the things in that code up above:

  • Jasper tries to embrace the generic host builder and core abstractions that came with .Net Core (IHostBuilder, ILogger, IHostedService, etc.) wherever possible, hence the integration happens with the UseJasper() call seen above.
  • The call to UseJasper() also quietly sets up Lamar as the underlying IoC container for your application. I won’t get into that much here, but there are optimizations in Jasper’s runtime model that require Lamar.
  • I used Oakton as the command line parser. That’s not 100% necessary, but there are a lot of development time utilities with Oakton for Jasper development. I’ll show some of that in later posts building on this one.

The one single HTTP route calls directly into the Jasper ICommandBus.InvokeAsync() method to immediately execute the message handler inline for the ConfirmReservation message. As someone who’s a skeptic of “mediator” tools in AspNetCore, I’m not sure this really adds much value as the handler for ConfirmReservation is currently written. However, we can add some transient error handling to our application’s bootstrapping that would apply to the ICommandBus.InvokeAsync() calls like so:

builder.Host.UseJasper(opts =>
{
    // Just setting up some retries on transient database connectivity errors
    opts.Handlers.OnException<NpgsqlException>().OrInner<NpgsqlException>()
        .RetryWithCooldown(50.Milliseconds(), 100.Milliseconds(), 250.Milliseconds());
});

We can also opt to make our ConfirmReservation commands be processed in background threads rather than inline through our web request by changing the Minimal API route to:

// This isn't *quite* the most efficient way to do this,
// but it's simple to understand, so please just let it go...
app.MapPost("/reservations/confirm", (ConfirmReservation command, ICommandBus bus) => bus.EnqueueAsync(command));

The EnqueueAsync() method above places the incoming command message into an in-memory queue.

What if the process dies mid-flight?!?

An experienced user of asynchronous messaging tools will have happily spotted several potential problems in the solution so far. One, there’s a potential race condition in the ConfirmReservationHandler code between database changes being committed and the outgoing message being processed. Two, what if the process dies? If we’re using all this in-memory queueing stuff, that all dies when the process dies, right?

Fortunately, Jasper, with some significant help from Postgresql and Marten, already has a robust inbox and outbox implementation that we’ll add next for durable messaging.

For clarity, here’s the original handler code again for the ConfirmReservation message:

public class ConfirmReservationHandler
{
    public async Task Handle(ConfirmReservation command, IDocumentSession session, IExecutionContext publisher)
    {
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        // Watch out, this could be a race condition!!!!
        await publisher.PublishAsync(new ReservationConfirmed(reservation.Id));

        // We're coming back to this in a bit......
        await session.SaveChangesAsync();
    }
}

Please note the comment about the race condition in the code. What we need to do is to introduce Jasper’s outbox feature, then revisit this handler.

First though, I need to go back to the bootstrapping code in Program and add a little more code:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    opts.Connection("Host=localhost;Port=5433;Database=postgres;Username=postgres;password=postgres");
})
    // NEW! Adding Jasper outbox integration to Marten in the "messages"
    // database schema
    .IntegrateWithJasper("messages");

// Adding Jasper as a straight up Command Bus
builder.Host.UseJasper(opts =>
{
    // Just setting up some retries on transient database connectivity errors
    opts.Handlers.OnException<NpgsqlException>().OrInner<NpgsqlException>()
        .RetryWithCooldown(50.Milliseconds(), 100.Milliseconds(), 250.Milliseconds());

    // NEW! Apply the durable inbox/outbox functionality to the two in-memory queues
    opts.DefaultLocalQueue.DurablyPersistedLocally();
    opts.LocalQueue("Notifications").DurablyPersistedLocally();

    // And I just opened a GitHub issue to make this config easier...
});

var app = builder.Build();

// This isn't *quite* the most efficient way to do this,
// but it's simple to understand, so please just let it go...
app.MapPost("/reservations/confirm", (ConfirmReservation command, ICommandBus bus) => bus.EnqueueAsync(command));

// This opts into using Oakton for extended command line options for this app
// Oakton is also a transitive dependency of Jasper itself
return await app.RunOaktonCommands(args);

This time I chained a call on the Marten configuration through the IntegrateWithJasper() extension method. I also added a couple lines of code within the UseJasper() block to mark the in memory queues as being enrolled in the inbox/outbox mechanics through the DurablyPersistedLocally() method.

Now, back to our handler code. Keeping things explicit for now, I’m going to add some necessary mechanics for opting into outbox sending:

public class ConfirmReservationHandler
{
    public async Task Handle(ConfirmReservation command, IDocumentSession session, IExecutionContext publisher)
    {
        // Enroll the execution context and Marten session in
        // outbox sending
        // This is an extension method in Jasper.Persistence.Marten
        await publisher.EnlistInOutboxAsync(session);
        
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        // No longer a race condition, I'll explain more below:)
        await publisher.PublishAsync(new ReservationConfirmed(reservation.Id));

        // Persist, and kick out the outgoing messages
        await session.SaveChangesAsync();
    }
}

I’m going to utilize Jasper/Marten middleware in the last section to greatly simplify the code above, so please read to the end:)

With this version of the handler code, things work a little differently:

  • The call to PublishAsync() does not immediately release the message to the in memory queue. Instead, it’s routed and held in memory by the IExecutionContext for later
  • When IDocumentSession.SaveChangesAsync() is called, the outgoing messages are persisted into the underlying database in the same database transaction as the change to the Reservation document. Using the Marten integration, the Jasper outbox can even do this within the exact same batched database command as a minor performance optimization
  • At the end of IDocumentSession.SaveChangesAsync() upon a successful transaction, the outgoing messages are kicked out into the outgoing message sending queues in Jasper.

The outbox usage here solves a couple issues. First, it eliminates the race condition between the outgoing messages and the database changes. Secondly, it prevents situations of system inconsistency where either the message succeeds and the database changes fail, or vice versa.
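
The pattern itself is small enough to sketch without any real infrastructure. The class below is a toy illustration of the “hold messages until the transaction commits” mechanics, not Jasper’s actual outbox:

```csharp
using System;
using System.Collections.Generic;

public class OutboxSession
{
    private readonly List<object> _pending = new();
    private readonly Action<object> _send;

    public OutboxSession(Action<object> send) => _send = send;

    // The PublishAsync() equivalent: just hold the message in memory
    public void Publish(object message) => _pending.Add(message);

    // The SaveChangesAsync() equivalent: "commit" the database work first,
    // then release the held messages to the real sender
    public void SaveChanges(Action commitDatabaseWork)
    {
        commitDatabaseWork(); // if this throws, no message is ever sent
        foreach (var message in _pending) _send(message);
        _pending.Clear();
    }
}
```

Note how a failed commit means the pending messages simply never go out, which is the whole point: the database change and the outgoing messages succeed or fail together.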

I’ll be writing up a follow up post later this week or next diving deeper into Jasper’s outbox implementation. For a quick preview, by taking a very different approach than existing messaging tools in .Net, Jasper’s outbox is already usable in more scenarios than other alternatives and I’ll try to back that assertion up next time. To answer the obvious question in the meantime, Jasper’s outbox gives you an at least once delivery guarantee even if the current process fails.

Streamline the Handler Mechanics

So the “register outbox, publish messages, commit session transaction” dance could easily get repetitive in your code. Jasper’s philosophy is that repetitive code is wasteful, so let’s eliminate the crufty code we were writing strictly for Marten or Jasper in the ConfirmReservationHandler. The code below is the exact functional equivalent to the earlier handler — even down to enrolling in the outbox:

public static class ConfirmReservationHandler
{
    [Transactional]
    public static async Task<ReservationConfirmed> Handle(ConfirmReservation command, IDocumentSession session)
    {
        var reservation = await session.LoadAsync<Reservation>(command.ReservationId);

        reservation!.IsConfirmed = true;

        session.Store(reservation);

        // Kicking out a "cascaded" message
        return new ReservationConfirmed(reservation.Id);
    }
}

The usage of the [Transactional] attribute opts just this handler into Marten’s transactional middleware, which handles all the outbox enrollment mechanics and calls IDocumentSession.SaveChangesAsync() for us after this method is called.

By returning Task<ReservationConfirmed>, I’m directing Jasper to publish the returned value as a message upon the successful completion of the incoming message. I personally like the cascading message pattern in Jasper as a way to make unit testing handler code easier. This was based on a much older custom service bus I helped build and run in production in the mid-10’s.
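
To show why cascading messages help with testing, here’s a hypothetical variant where the decision logic is pulled into a pure method. No Marten session or message bus is needed to exercise it:

```csharp
using System;

public record ReservationConfirmed(Guid ReservationId);

public class Reservation
{
    public Guid Id { get; set; }
    public bool IsConfirmed { get; set; }
}

public static class ConfirmReservationLogic
{
    // Pure decision logic: mutate the already-loaded document and
    // return the event message to be cascaded by the framework
    public static ReservationConfirmed Handle(Reservation reservation)
    {
        reservation.IsConfirmed = true;
        return new ReservationConfirmed(reservation.Id);
    }
}
```

A unit test can simply call the method with an in-memory Reservation and assert on the returned event, with no mocking of the bus or the database session.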

On the next episode of “please, please pay attention to my OSS projects!”

My very next post is a sequel to Marten just got better for CQRS architectures, but this time using some new Jasper functionality with Marten to strip out even more repetitive code.

Following behind that, I want to write follow ups doing a deeper dive on Jasper’s outbox implementation, using Rabbit MQ with Jasper, then a demonstration of all the copious command line utilities built into both Jasper and Marten.

As my mentor from Dell used to say, “thanks for listening, I feel better now”

Marten just got better for CQRS architectures

I’m assuming some prior knowledge of Event Sourcing as an architectural pattern here. I highly recommend Oskar Dudycz’s Introduction to Event Sourcing training kit or this video from Derek Comartin. While Event Sourcing and the closely associated CQRS architectural style are each useful without the other, I’m still assuming here that you’re interested in using Marten for event sourcing within a larger CQRS architecture.

So you’re adopting an event sourcing style with Marten for your persistence within a larger CQRS architectural style. Crudely speaking, all “writes” to the system state involve sending a command message to your CQRS service with a workflow something like this:

In the course of handling the command message, our command handler (or HTTP endpoint) needs to:

  1. Fetch a “write model” that represents the state for the current workflow. This projected “write model” will be used by the command handler to validate the incoming command and also to…
  2. Decide what subsequent events should be published to update the state of the system based on the existing state and the incoming command
  3. Persist the new events to the ongoing Marten event store
  4. Possibly publish some or all of the new events to an outgoing transport to be acted upon asynchronously
  5. Deal with concurrency concerns, especially if there’s any significant chance that other related commands may be coming in for the same logical workflow at the same time

Do note that as I shift to implementations, I’m going to mostly bypass any discussion of design patterns or of what I personally consider to be useless cruft from common CQRS approaches in the .Net or JVM worlds. I.e., no repositories will be used in any of this code.

As an example system, let’s say that we’re building a new, online telehealth system that among other things will track how a medical provider spends their time during a shift helping patients during their workday. Using Marten’s “self-aggregate” support, a simplified version of the provider shift state is represented by this model:

public class ProviderShift
{
    public Guid Id { get; set; }
    
    // Pay attention to this, this will come into play
    // later
    public int Version { get; set; }
    
    public Guid BoardId { get; private set; }
    public Guid ProviderId { get; init; }
    public ProviderStatus Status { get; private set; }
    public string Name { get; init; }
    
    public Guid? AppointmentId { get; set; }
    
    public static async Task<ProviderShift> Create(ProviderJoined joined, IQuerySession session)
    {
        var provider = await session.LoadAsync<Provider>(joined.ProviderId);
        return new ProviderShift
        {
            Name = $"{provider.FirstName} {provider.LastName}",
            Status = ProviderStatus.Ready,
            ProviderId = joined.ProviderId,
            BoardId = joined.BoardId
        };
    }

    public void Apply(ProviderReady ready)
    {
        AppointmentId = null;
        Status = ProviderStatus.Ready;
    }

    public void Apply(ProviderAssigned assigned)
    {
        Status = ProviderStatus.Assigned;
        AppointmentId = assigned.AppointmentId;
    }
    
    public void Apply(ProviderPaused paused)
    {
        Status = ProviderStatus.Paused;
        AppointmentId = null;
    }

    // This is kind of a catch all for any paperwork the
    // provider has to do after an appointment has ended
    // for the just concluded appointment
    public void Apply(ChartingStarted charting) => Status = ProviderStatus.Charting;
}

Next up, let’s play out the user story for a provider to complete their “charting” activity after a patient appointment concludes. Looking at the sequence diagram and the bullet list of concerns for each command handler, we’ve got a few things to worry about. Never fear though, because Marten has you (mostly) covered today with a couple new features introduced in Marten v5.4 last week.

Starting with this simple command:

public record CompleteCharting(
    Guid ShiftId, 
    Guid AppointmentId, 
    int Version);

We’ll use Marten’s brand new IEventStore.FetchForWriting<T>() API to whip up the basic command handler (just a small ASP.Net Core Controller endpoint):

    public async Task CompleteCharting(
        [FromBody] CompleteCharting charting, 
        [FromServices] IDocumentSession session)
    {
        /* We've got options for concurrency here! */
        var stream = await session
            .Events.FetchForWriting<ProviderShift>(charting.ShiftId);

        // Validation on the ProviderShift aggregate
        if (stream.Aggregate.Status != ProviderStatus.Charting)
        {
            throw new Exception("The shift is not currently charting");
        }
        
        // We "decided" to emit one new event
        stream.AppendOne(new ChartingFinished(stream.Aggregate.AppointmentId.Value, stream.Aggregate.BoardId));

        await session.SaveChangesAsync();
    }

The FetchForWriting() method used above is doing a couple different things:

  1. Finding the current, persisted version of the event stream for the provider shift and loading that into the current document session to help with optimistic concurrency checks
  2. Fetching the current state of the ProviderShift aggregate for the shift id coming in on the command. Note that this API papers over whether the aggregate in question is a “live aggregate” that needs to be calculated on the fly from the raw events, or was previously persisted as just a Marten document by either an inline or asynchronous projection. I’d strongly recommend that “write model” aggregates be either inline or live to avoid eventual consistency issues.

Concurrency?!?

Hey, the hard truth is that it’s easy for the command to be accidentally or incidentally dispatched to your service multiple times from messaging infrastructure, multiple users doing the same action in different sessions, or somebody clumsy like me just accidentally clicking a button too many times. One way or another, we may need to harden our command handler against concurrency concerns.

The usage of FetchForWriting<T>() will actually set you up for optimistic concurrency checks. If someone else manages to successfully process a command against the same provider shift between the call to FetchForWriting<T>() and IDocumentSession.SaveChangesAsync(), you’ll get a Marten ConcurrencyException thrown by SaveChangesAsync() that will abort and rollback the transaction.
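
Stripped of the Marten machinery, the optimistic check boils down to comparing the expected version against the current one before appending. This is a toy in-memory illustration of the idea, not Marten’s implementation:

```csharp
using System;
using System.Collections.Generic;

public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

public class EventStream
{
    private readonly List<object> _events = new();

    // The stream version is just the count of committed events
    public int Version => _events.Count;

    // Appending only succeeds if nobody else has written to the
    // stream since this caller loaded it
    public void Append(object @event, int expectedVersion)
    {
        if (expectedVersion != Version)
            throw new ConcurrencyException(
                $"Expected version {expectedVersion} but the stream is at {Version}");

        _events.Add(@event);
    }
}
```

In the real thing the comparison happens against the persisted stream table inside the database transaction, so two competing sessions can’t both win.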

Moving on though, let’s tighten up the optimistic version check by first telling Marten what version our command thinks the provider shift is at on the server. First though, we need to get the current version back to the client that’s collecting changes to our provider shift. If you scan back to the ProviderShift aggregate above, you’ll see this property:

    public int Version { get; set; }

With another new little feature in Marten v5.4, the Marten projection support will automatically set the value of a Version member to the latest stream version for a single stream aggregate like ProviderShift. Knowing that, and assuming that ProviderShift is updated inline, we could just deliver the whole ProviderShift to the client with this little web service endpoint (using the Marten.AspNetCore extensions):

    [HttpGet("/shift/{shiftId}")]
    public Task GetProviderShift(Guid shiftId, [FromServices] IQuerySession session)
    {
        return session.Json.WriteById<ProviderShift>(shiftId, HttpContext);
    }

The Version property can be a field, scoped as internal, or read-only. Marten is using a dynamically generated Lambda that can happily bypass whatever scoping rules you have to set the version to the latest event for the stream represented by this aggregate. The Version naming convention can also be explicitly ignored, or redirected to a totally differently named member. Lastly, it can even be a .Net Int64 type too — but if you’re doing that, you probably have some severe modeling issues that should be addressed first!
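
The “bypass whatever scoping rules you have” trick can be illustrated with ordinary reflection. This cut-down ProviderShift and helper are my own sketch of the idea; Marten builds and compiles a Lambda instead so it doesn’t pay the reflection cost on every call:

```csharp
using System;
using System.Reflection;

public class ProviderShift
{
    public Guid Id { get; set; }

    // The setter is private, but infrastructure code can still write to it
    public int Version { get; private set; }
}

public static class VersionSetter
{
    // Find the conventional "Version" property and invoke its setter,
    // even when that setter is private
    public static void SetVersion(object aggregate, int version)
    {
        var property = aggregate.GetType().GetProperty(
            "Version", BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);

        property?.GetSetMethod(nonPublic: true)?.Invoke(aggregate, new object[] { version });
    }
}
```

This is why your domain model gets to keep its encapsulation while the projection engine still stamps the latest stream version onto the aggregate.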

Back to our command handler. If the client has what’s effectively the “expected starting version” of the ProviderShift and sends the CompleteCharting command with that version, we can change the first line of our handler method code to this:

        var stream = await session
                
            // Note: I'm passing in the expected, starting provider shift
            // version from the command
            .Events.FetchForWriting<ProviderShift>(charting.ShiftId, charting.Version);

This new version will throw a ConcurrencyException right off the bat if the expected, starting version is not the same as the last, persisted version in the database. After that, it’s the same optimistic concurrency check at the point of calling SaveChangesAsync() to commit the changes.

Lastly, since Marten is built upon a real database instead of trying to be its own specialized storage engine like many other event sourcing tools, we’ve got one last trick. Instead of putzing around with optimistic concurrency checks, let’s take a pessimistic, exclusive lock on the specific provider shift so that only one session at a time can ever be writing to that provider shift, with this variation:

        var stream = await session
                
            // Note: This will try to "wait" to claim an exclusive lock for writing
            // on the provider shift event stream
            .Events.FetchForExclusiveWriting<ProviderShift>(charting.ShiftId);
        

As you can see, Marten has some new functionality to make it even easier to use Marten within CQRS architectures by eliminating some previously repetitive code in both queries on projected state and in command handlers where you need to use Marten’s concurrency control.

Wait, not so fast, you missed some things!

I missed a couple very big things in the sample code above. For one, we’d probably want to broadcast the new events through some kind of service bus to allow other systems or just our own system to asynchronously do other work (like trying to assign our provider to another ready patient appointment). To do that reliably so that the event capture and the outgoing events being published succeed or fail together in one atomic action, I really need an “outbox” of some sort integrated into Marten.

I also left out any kind of potential error handling or message retry capabilities around the concurrency exceptions. And lastly (that I can think of offhand), I completely left out any discussion of the instrumentation you’d want in any kind of grown up system.

Since we’re in the middle of the NBA playoffs, I’m reminded of a Shaquille O’Neal quote from when his backup was Alonzo Mourning, and Mourning had a great game off the bench: “sometimes Superman needs some help from the Incredible Hulk.” In this case, part of the future of Marten is to be combined with another project called Jasper that is going to add external messaging with a robust outbox implementation for Marten to create a full stack for CQRS architectures. Maybe as soon as late next week or at least in June, I’ll write a follow up showing the Marten + Jasper combination that deals with the big missing pieces of this post.

Update on Jasper v2 with an actual alpha

First off, my super power (stop laughing at me!) is having a much longer attention span than the average software developer. In positive ways, this has enabled me to tackle very complex problems. In negative ways, I’ve probably wasted a tremendous amount of time in my career working on systems or projects long after they had probably already failed and I just wouldn’t admit it.

So late last year I started working on a reboot of Jasper, my attempt at creating a “next generation” messaging framework for .Net. The goal of Jasper has changed quite a bit since I started jotting down notes for it in 2014, but the current vision is to be a highly productive command execution engine and asynchronous messaging tool for .Net with less code ceremony than the currently popular tools in this space.

I kicked out a Jasper v2.0.0-alpha-1 release this week, just barely in time for my talk at That Conference yesterday (but didn’t end up showing it at all). Right now the intermediate goals to get to a full Jasper 2.0 rebooted project are to:

  • Finish the baked in Open Telemetry support. It’s there, but there are holes in what’s being captured
  • Get the interop with MassTransit via Rabbit MQ working for more scenarios. I’ve got a successful proof of concept of bi-directional interaction between Jasper and MassTransit services
  • Finish documentation for the new 2.0 version. I moved the docs to VitePress and started re-writing the docs from scratch, and that takes time

The first two bullet points are all about getting Jasper ready to be something I could dogfood at work.

While I absolutely intend both Jasper and Marten to be perfectly usable without the other, there’s also going to be some specific integration between Jasper and Marten to create a full blown, opinionated CQRS stack for .Net development (think Axon for .Net, but hopefully with much less code ceremony). For this combination, the Marten team is talking about adding messaging subscriptions for the Marten event store functionality, Jasper middleware to reduce repetitive CQRS handler code, and using the outbox functionality in Jasper to also integrate Marten with external messaging infrastructure.

I’ll kick out actual content about all this in the next couple weeks, but a couple folks have noticed the big uptick in Jasper work and asked what was going on, so here’s a little blog post on it:)

Improving the Development and Production Time Experience with Marten V5

Marten V5 dropped last week, with significant new features for multi-tenancy scenarios and enabling users to use multiple Marten document stores in one .Net application. A big chunk of the V5 work was mostly behind the scenes trying to address user feedback from the much larger V4 release late last year. As always, the Marten documentation is here.

First, why didn’t you just…

I’d advise developers and architects to largely eliminate the word “just” and any other lullaby language from their vocabulary when talking about technical problems and solutions.

That being said:

  • Why didn’t you just use source generators instead? Most of this was done before source generators were released, and source generators are limited to information that’s available at compile time. The dynamic code generation in Marten is potentially using information that is only available at run time
  • Why didn’t you just use IL generation instead? Because I despise working directly with IL and I think that would have dramatically curtailed what was easily possible. It’s also possible that we end up having to go there eventually.

Setting the Stage

Consider this simplistic code to start a new Marten DocumentStore against a blank database and persist a single User document:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

Hopefully that code is simple enough for new users to follow and immediately start being productive with Marten. The major advantage of document databases over the more traditional RDBMS, with or without an ORM, is the ability to just get stuff done without spending a lot of time configuring databases or object-to-database mappings, and without anywhere near as much underlying code just to read and write data. To that end, there’s a lot of stuff going on behind the scenes of that code up above.

First off, there’s some automatic database schema management. In the default configuration used up above, Marten is quietly checking the underlying database on the first usage of the User document type to see if the database matches Marten’s configuration for the User document, and applies database migrations at runtime to change the database as necessary.

Secondly, there’s some runtime code generation happening to “bake in” the internal handling of how User documents are read from and written to the database. It’s not apparent here, but there are a lot of knobs you can twist in Marten to change how a document type is stored in and retrieved from the database (soft deletes, turning on more metadata tracking, turning off default metadata tracking to be leaner, etc.). That behavior even varies between the lightweight session I used up above and the behavior of IDocumentStore.OpenSession() that adds identity map behavior to the session. To be more efficient overall, Marten generates the tightest possible C# code to handle each document type, then in the default mode, actually compiles that code in memory with Roslyn and uses the dynamically built assembly.
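To make that lightweight vs. identity map difference concrete, here’s a quick sketch (assuming a User document with a Guid Id property, which isn’t shown in the sample above, and an already built DocumentStore named store):

```csharp
// Sketch only: assumes "store" is a configured DocumentStore
// and "userId" is the Guid Id of an existing User document

// A lightweight session does no identity map tracking, so two
// loads of the same id return two separate User instances
await using (var lightweight = store.LightweightSession())
{
    var first = await lightweight.LoadAsync<User>(userId);
    var second = await lightweight.LoadAsync<User>(userId);
    // first and second are different object instances
}

// OpenSession() adds identity map behavior, so repeated loads
// of the same id within one session return the same instance
await using (var tracked = store.OpenSession())
{
    var first = await tracked.LoadAsync<User>(userId);
    var second = await tracked.LoadAsync<User>(userId);
    // ReferenceEquals(first, second) is true inside this session
}
```

The lightweight session is leaner precisely because it skips that per-document bookkeeping, which is why the generated code differs between the two session types.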

Cool, right? I’d argue that Marten can make teams be far more productive than they would be with the more typical EF Core or Dapper backed approach. Now let’s move on to the unfortunately very real downsides of Marten’s approach and what we’ve done to improve matters:

  • The dynamic Roslyn code generation can sometimes incur a major “cold start” issue on the very first usage. It’s definitely not consistent, as some people do not see any noticeable impact and other folks tell me they get a 9 second delay on the first usage. This cold start issue is especially problematic for folks using Marten in a Serverless architecture
  • The dynamically generated code can’t be used for any kind of potentially valuable AOT optimization
  • Roslyn usage sometimes causes a big ol’ memory leak no matter what we try. This isn’t consistent, so I don’t know why
  • The database change tracking does some in-memory locking, and that’s been prone to deadlock issues in some flavors of .Net (Blazor, WPF)
  • Some of you won’t want to give your application rights to modify a database at runtime
  • In Marten V4 there were a few too many places where Marten was executing the database change detection asynchronously, but from within synchronous calls using the dreaded .GetAwaiter().GetResult() approach. Occasional deadlock issues occurred, mostly in Marten usage within Blazor.

Database Migration Improvements

Alright, let’s tackle the database migration issues first. Marten has long had some command line support so that you could detect and apply any outstanding database changes from your application itself with this call:

dotnet run -- marten-apply
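As a quick aside, the same command line integration also exposes a matching assertion command, which is handy in CI pipelines to fail fast when the database has drifted from the configuration (these command names assume the standard Marten command line setup shown above):

```shell
# Apply any outstanding database changes
dotnet run -- marten-apply

# Or just assert that the database already matches the Marten
# configuration, failing the process if it does not
dotnet run -- marten-assert
```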

If you use the command line tooling for migrations, you can now optimize Marten to just turn off all runtime database migrations like so:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                opts.AutoCreateSchemaObjects = AutoCreate.None;
            });
    }).StartAsync();

Other folks won’t want to use the command line tooling, so Marten V5 adds another option to apply all database migrations once at application startup and otherwise completely eliminate any other potential locking, but this time I have to use the IHost integration:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                
                // Mild compromise, now I've got to tell
                // Marten about the User document
                opts.RegisterDocumentType<User>();
            })

            // This tells the app to do all database migrations
            // at application startup time
            .ApplyAllDatabaseChangesOnStartup();
    }).StartAsync();

In case you’re wondering, this option is safe to use even if you have multiple application nodes starting up simultaneously. The V5 version here relies on global locks in Postgresql itself to prevent simultaneous database changes that previously resulted in interestingly chaotic failure:(

Pre-building the Generated Types

Now, onto dealing with the dynamic codegen aspect of things. V4 created a “build types ahead” model where you can generate all the dynamic code with this command line call:

dotnet run -- codegen write

You can now completely dodge the runtime code generation issue with this sequence of steps:

  1. In your deployment scripts, run dotnet run -- codegen write first
  2. Compile your application, which will embed the newly generated code right into your application’s entry assembly
  3. Use the below setting to completely disable all dynamic codegen:
using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // Turn off all dynamic code generation, but this
                // will blow up if the necessary types aren't compiled
                // into the application's entry assembly
                opts.GeneratedCodeMode = TypeLoadMode.Static;
            });
    }).StartAsync();
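Put together, the deployment sequence above might look something like this in a build script (the project path here is hypothetical):

```shell
# 1. Generate the static code ahead of time
dotnet run --project src/MyApp -- codegen write

# 2. Compile the application, which embeds the newly generated
#    code into the application's entry assembly
dotnet publish src/MyApp -c Release -o ./publish
```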

Again though, this depends on you having all document types registered with Marten instead of depending on runtime discovery as we did in the very first sample in this post — and that’s a bit of friction. What we’ve found is that folks find the original pre-built generation model to be clumsy, so we went back to the drawing board for Marten V5 and came up with the…

“Auto” Generated Code Mode

For V5, we have the option shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // use pre-built code if it exists, or
                // generate code if it doesn't and "just work"
                opts.GeneratedCodeMode = TypeLoadMode.Auto;
            });
    }).StartAsync();

My thinking here is that you’d just keep this on all the time, and as long as you’re running the application locally or through your integration test suite (you have one of those, right?), you’d have the dynamic types written to your main project’s code automatically (in an /Internal/Generated folder). Unless you purposely add those to your source control’s ignore list, that code will also be checked in. Woohoo, right?

Now, finally let’s put this all together and bundle all of what I would recommend as Marten best practices into the new…

Optimized Artifact Workflow

New in Marten V5 is what I named the “optimized artifact workflow” (I say “I” because I don’t think other folks like the name:)) as shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
            })
            // This is the call you want!
            .OptimizeArtifactWorkflow(TypeLoadMode.Static)
            .ApplyAllDatabaseChangesOnStartup();
    })
    
    // In testing harnesses, or with AWS Lambda / Azure Functions,
    // you may have to help out .Net by explicitly setting
    // the main application assembly
    .UseApplicationProject(typeof(User).Assembly)
    
    .StartAsync();

With the OptimizeArtifactWorkflow(TypeLoadMode.Static) usage above, Marten runs with automatic database management and “Auto” code generation if the host’s environment name is “Development,” as it typically would be on a local developer box. In “Production” mode, Marten runs with all automatic database management disabled at runtime besides the initial database change application at startup. In “Production” mode, Marten also turns off all dynamic code generation with the assumption that all necessary types can be found in the entry assembly.

The goal here was to have a quick setting that optimized Marten usage in both development and production time without having to add in a bunch of nested conditional logic for IHostEnvironment.IsDevelopment() throughout the IHost configuration code.
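For comparison, here’s roughly the kind of hand-rolled conditional configuration that this setting is meant to replace. This is a sketch only, built from the individual settings already shown earlier in this post:

```csharp
// Sketch of the conditional configuration that
// OptimizeArtifactWorkflow() is meant to eliminate
using var host = Host.CreateDefaultBuilder()
    .ConfigureServices((context, services) =>
    {
        services.AddMarten(opts =>
        {
            opts.Connection("connection string");

            if (context.HostingEnvironment.IsDevelopment())
            {
                // Local development: migrate the database and
                // generate code on the fly as needed
                opts.AutoCreateSchemaObjects = AutoCreate.CreateOrUpdate;
                opts.GeneratedCodeMode = TypeLoadMode.Auto;
            }
            else
            {
                // Production: no runtime migrations, and only use
                // types pre-compiled into the entry assembly
                opts.AutoCreateSchemaObjects = AutoCreate.None;
                opts.GeneratedCodeMode = TypeLoadMode.Static;
            }
        });
    }).StartAsync();
```

Multiply that if/else block by every environment-sensitive option across a larger system and you can see why a single opt-in call is preferable.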

Exterminating Sync over Async Calls

Back to the very original sample code:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

In Marten V4, the first call to session.Store(user) would trigger the database schema detection, which behind the scenes would end up doing a .GetAwaiter().GetResult() trick to call asynchronous code within the synchronous Store() command (not gonna get into that here, but we eliminated all synchronous database schema detection functionality for unrelated reasons in V4).

In V5, we rewired a lot of the internal guts so that the database schema detection happens instead in the call to IDocumentSession.SaveChangesAsync(), which is, of course, asynchronous. That allowed us to eliminate usages of “sync over async” calls. Likewise, we made similar changes throughout other areas of Marten.

Summary

The hope here is that we can help our users be more successful with Marten, and sidestep the problems folks have had specifically with using Marten with AWS Lambda, Azure Functions, Blazor, and inside of WPF applications. I’m also hoping that the OptimizeArtifactWorkflow() usage greatly simplifies the adoption of Marten “best practices.”

Working with Multiple Marten Databases in One Application

Marten V5 dropped last week. I covered the new “database per tenant” strategy for multi-tenancy in my previous blog post. Closely related to that feature is the ability to register and work with multiple Marten databases from a single .Net system, and that’s what I want to talk about today.

Let’s say that for whatever reason (but you know there’s some legacy in there somehow), our application is mostly persisted in its own Marten database, but also needs to interact with a completely separate “Invoicing” database on a different database server, with a completely different configuration. With Marten V5 we can register an additional Marten database by first writing a marker interface for that other database:

    // These marker interfaces *must* be public
    public interface IInvoicingStore : IDocumentStore
    {

    }

And now we can register and configure a completely separate Marten database in our .Net system with the AddMartenStore<T>() usage shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        // You can still use AddMarten() for the main document store
        // of this application
        services.AddMarten("some connection string");

        services.AddMartenStore<IInvoicingStore>(opts =>
            {
                // All the normal options are available here
                opts.Connection("different connection string");

                // more configuration
            })
            // Optionally apply all database schema
            // changes on startup
            .ApplyAllDatabaseChangesOnStartup()

            // Run the async daemon for this database
            .AddAsyncDaemon(DaemonMode.HotCold)

            // Use IInitialData
            .InitializeWith(new DefaultDataSet())

            // Use the V5 optimized artifact workflow
            // with the separate store as well
            .OptimizeArtifactWorkflow();
    }).StartAsync();

So here’s a few things to talk about from that admittedly busy code sample above:

  1. The IInvoicingStore will be registered in your underlying IoC container with singleton scoping. Marten is quietly making a concrete implementation of your interface for you, similar to how Refit works if you’re familiar with that library.
  2. We don’t yet have a way to register a matching IDocumentSession or IQuerySession type to go with the separate document store. My thought on that is to wait until folks ask for it.
  3. The separate store could happily connect to the same database with a different database schema or connect to a completely different database server altogether
  4. You are able to separately apply all detected database changes on startup
  5. The async daemon can be enabled completely independently for the separate document store
  6. The IInitialData model can be used in isolation with the separate document store for baseline data
  7. The new V5 “optimized artifact workflow” model can be enabled explicitly on each separate document store. This will be the subject of my next Marten related blog post.
  8. It’s not shown up above, but if you really wanted to, you could make the separate document stores use a multi-tenancy strategy with multiple databases
  9. The Marten command line tooling is “multiple database aware,” meaning that it is able to apply changes or assert the configuration on all the known databases at one time or by selecting specific databases by name. This was the main reason the Marten core team did the separate document store story at the same time as the database per tenant strategy.

As I said earlier, we have a service registration for a fully functional DocumentStore implementing our IInvoicingStore that can be injected as a constructor dependency as shown in an internal service of our application:

public class InvoicingService
{
    private readonly IInvoicingStore _store;

    // IInvoicingStore can be injected like any other
    // service in your IoC container
    public InvoicingService(IInvoicingStore store)
    {
        _store = store;
    }

    public async Task DoSomethingWithInvoices()
    {
        // Important to dispose the session when you're done
        // with it
        await using var session = _store.LightweightSession();

        // do stuff with the session you just opened
    }
}

This feature and the multi-tenancy with a database per tenant have been frequent feature requests by Marten users, and it made a lot of sense to tackle them together in V5 because there was quite a bit of overlap in the database change management code to support both. I would very strongly state that a single database should be completely owned by one system, but I don’t know how I really feel about a single system working with multiple databases. Regardless, it comes up often enough that I’m glad we have something in Marten.

I worked with a client system some years back that was a big distributed monolith where 7-8 separate Windows services all talked to the same 4-5 Marten databases, and we hacked together something similar to the new formal support in Marten V5 to accommodate that. I do not recommend getting yourself into that situation though:-)