
Marten, the Generic Host Builder in .Net Core, and why this could be the golden age for OSS in .Net

Before I dive into some newly improved mechanics for quickly integrating Marten into just about any possible kind of .Net 5+ application, I want to take a trip through yesteryear and appreciate just how much better the .Net platform is today than it’s ever been. If you wanna skip down to the positive part of the post, just look for the subheading “The better future for .Net is already here!”

I’ve been a .Net developer since the very ugly beginning, when we were all just glad to be off VB6/COM and onto a “modern” programming language that felt like a better Java. We were not yet completely horrified at how disastrously bad WebForms was going to turn out to be, or disappointed at how poorly all our new .Net stuff worked out when we tried to adopt Agile development practices like Test Driven Development.

I’ve also been very deeply invested in developing open source tools and frameworks on the .Net framework, starting with StructureMap, the very first production capable Inversion of Control library on .Net that first went into a production system in the summer of 2004.

As the maintainer of StructureMap throughout, I had a front row seat to the absolute explosion of adapter libraries for seemingly every possible permutation of framework, IoC tool, and logging library. And this happened because every framework of any note in the .Net world had a completely different way to abstract away the actual IoC container or allowed you to use different logging tools through the specific framework’s special adapter mechanisms.

I’m also old enough to remember when every tool took a direct dependency on log4net around the time they lost their public strong naming key and broke compatibility. That was loads of fun.

There also wasn’t a standard way to deal with application lifecycle events across different types of .Net applications or application frameworks (ASP.Net’s Global.ASAX being an exception I guess). Every framework at that time had a different way of being bootstrapped at application start up time, so developers had to do a lot of plumbing code as they pieced all this junk together with the myriad adapter libraries.

To recap the “old”, pre-Core .Net world:

  • Application framework developers had to create their own IoC and logging abstractions and support a plethora of adapter libraries as well as possibly having to create all new application lifecycle management facilities as part of their framework
  • Library developers had to support users through their usage of problematic framework integrations where the library in question was often used in non-optimal ways (my personal pet peeve, and I edited this post before publishing to eliminate some harsh criticism of certain well known .Net application frameworks).
  • Developers of actual .Net applications were rightfully intimidated by the complexity necessary to integrate open source tools, or simply struggled to do so as there was no standardization of how all those tools went about their integration into .Net applications.

I would argue that the situation in .Net as it was before .Net Core seriously impeded the development and adoption of open source tooling in our community — which also had the effect of slowing down the rate of innovation throughout the .Net community.

You might notice that I used the past tense quite a bit in the previous paragraphs, and that’s because I think…

The better future for .Net is already here!

It’s hard to overstate this, but the generic host builder (IHostBuilder and IHost) in .Net Core / .Net 5 and the related abstractions are hugely advantageous for the .Net ecosystem. I tried to explain why I think so in a recent episode of .Net Rocks, but it’s probably easier to demonstrate this by integrating Marten into a new .Net 5 web api project.

Assuming that you’re already familiar with the usage of the Startup class for configuring ASP.Net Core applications, I’m going to add Marten to the new service with the following code:

public void ConfigureServices(IServiceCollection services)
{
    // This is the absolute, simplest way to integrate Marten into your
    // .Net Core application with Marten's default configuration
    services.AddMarten(options =>
    {
        // Establish the connection string to your Marten database
        options.Connection(Configuration.GetConnectionString("Marten"));

        // If we're running in development mode, let Marten just take care
        // of all necessary schema building and patching behind the scenes
        if (Environment.IsDevelopment())
        {
            options.AutoCreateSchemaObjects = AutoCreate.All;
        }

        if (Environment.IsProduction())
        {
            // This is a new V4 feature I'll blog about soon...
            options.GeneratedCodeMode = TypeLoadMode.LoadFromPreBuiltAssembly;
        }

        // Turn on the async projection daemon in only one node
        options.Projections.AsyncMode = DaemonMode.HotCold;

        // Configure other elements of the event store or document types...
    });
}

Even if you’re a developer who is completely unfamiliar with Marten itself, the integration through AddMarten() follows the recent .Net idiom and would be conceptually similar to many other tools you probably already use. That by itself is a very good thing as there’s less novelty for new Marten developers (there are of course matching UseMarten() signatures that hang off of IHostBuilder that are useful for other kinds of .Net projects).

I purposely used a few bells and whistles in Marten to demonstrate more capabilities of the generic host in .Net Core. To go into more detail, the code above:

  • Registers all the necessary Marten services with the proper service lifecycles in the underlying system container. And that works for any IoC container that supports the standard DI abstractions in the Microsoft.Extensions.DependencyInjection namespace. There aren’t any Marten.Lamar or Marten.DryIoC or Marten.WhateverIoCToolYouHappenToLike adapter libraries.
  • I’ve also turned on Marten’s asynchronous projection process which will run as a long running task in the application by utilizing the standard .Net Core IHostedService mechanism whose lifecycle will be completely managed by the generic host in .Net.
  • We were able to do a little bit of environment specific configuration to let Marten do all necessary schema management automatically at development time, but not allow Marten to change database schemas in production. I also opted into a new Marten V4 optimization for production cold starts when the application is running in “Production” mode. That’s easy to do now using .Net’s standard IHostEnvironment mechanism at bootstrapping time.
  • It’s not obvious from that code, but starting in V4.0, Marten will happily log to the standard ILogger interface from .Net Core. Marten itself doesn’t care where or how you actually capture log information, but because .Net itself now specifies a single set of logging abstractions to rule them all, Marten can happily log to any commonly used logging mechanism (NLog, Serilog, the console, Azure logging, whatever) without any other kind of adapter.
  • And oh yeah, I had to add the connection string to a Postgresql database from .Net Core’s IConfiguration interface, which itself could be pulling information in any number of ways, but our code doesn’t have to care about anything else but that one standard interface.

As part of the team behind Marten, I feel like our jobs are easier now than they were just prior to .Net Core and the generic host builder because we now have:

  • .Net standard abstractions for logging, configuration, and IoC service registration
  • The IHostedService mechanism, an easy, standard way in .Net to run background processes or even just to hook into application start up and tear down events with custom code
  • The IServiceCollection.Add[Tool Name]() and IHostBuilder.Use[Tool Name]() idioms, which give us a way to integrate Marten into basically any kind of .Net Core / .Net 5/6 application through a mechanism that is likely to feel familiar to experienced .Net developers. That hopefully reduces the barrier to adoption for Marten
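To make the IHostedService point a little more concrete, here is a minimal sketch of hooking custom code into application start up and tear down. The StartupAnnouncer class and its console messages are hypothetical and are not part of Marten; only the IHostedService interface itself comes from Microsoft.Extensions.Hosting.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

// Hypothetical example service; the name and the console output are
// purely illustrative and have nothing to do with Marten's internals
public class StartupAnnouncer : IHostedService
{
    public bool IsRunning { get; private set; }

    // Called by the generic host when the application starts up
    public Task StartAsync(CancellationToken cancellationToken)
    {
        IsRunning = true;
        Console.WriteLine("Application is starting");
        return Task.CompletedTask;
    }

    // Called by the generic host during a graceful shutdown
    public Task StopAsync(CancellationToken cancellationToken)
    {
        IsRunning = false;
        Console.WriteLine("Application is shutting down");
        return Task.CompletedTask;
    }
}
```

Assuming a standard Startup class, a service like this would typically be registered with services.AddHostedService<StartupAnnouncer>() inside ConfigureServices(), and the generic host takes care of calling it at the right times.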

As some of you reading this may know, I was very heavily involved in building an alternative web development framework for .Net called FubuMVC well before .Net Core came on to the scene. When we built FubuMVC, we had to create from scratch a lot of the exact same functionality that’s in the generic host today to manage application lifecycle events, bootstrapping the application’s IoC container through its own IoC abstractions, and hooks for extension with other .Net libraries. I wouldn’t say that was the primary reason that FubuMVC failed as a project, but it certainly didn’t help.

Flash forward to today, and I’ve worked off and on on a complete replacement for FubuMVC on .Net Core called Jasper for asynchronous messaging and an alternative for HTTP services. That project isn’t going anywhere at the moment, but Jasper’s internals are much, much smaller than FubuMVC’s were for similar functionality because Jasper can rely on the functionality and standard abstractions of the generic host in .Net Core. That makes tools like Jasper much easier for .Net developers to write.

Unsurprisingly, there are plenty of design and implementation decisions that Microsoft made with the generic host builder that I don’t agree with. I can point to a few things that I believe FubuMVC did much better for composition a decade ago than ASP.Net Core does today (and vice versa too of course). I deeply resent how the ASP.Net Core team wrote the IoC compliance rules that were rammed down the throats of all us IoC library maintainers. You can also point out that there are cultural problems within the .Net community in regards to OSS and that Microsoft itself severely retards open source innovation in .Net by stomping on community projects at will.

All of that may be true, but from a strictly technical perspective, I think that it is now much easier for the .Net community to build and adopt innovative open source tools than it ever has been. I largely credit the work that’s been done with the generic host, the common idioms for tool integration, and the standard interfaces that came in the .Net Core wave for potentially opening up a new golden age for .Net OSS development.

Brief Update on Marten V4

The latest, greatest Marten V4 release candidate is up on Nuget this morning. Marten V4 has turned out to be a massive undertaking for an OSS project with a small team, but we’re not that far off what’s turned into our “Duke Nukem Forever” release. We’re effectively “code complete” other than occasional bugs coming in from early users — which is fortunately a good thing as it’s helped us make some things more robust without impacting the larger community.

At this point it’s mostly a matter of completing the documentation updates for V4, and I’m almost refusing to work on new features until that’s complete. The Marten documentation website is moving to VitePress going forward, so it’ll be re-skinned with better search capabilities. We’re also taking this opportunity to reorganize the documentation with the hopes of making that site more useful for users.

Honestly, the biggest thing holding us back right now in my opinion is that I’ve been a little burned out and slow working on the documentation. Right now, I’m hoping we can pull the trigger on V4 in the next month or two.

Using Alba to Test ASP.Net Services

TL;DR: Alba is a handy library for integration tests against ASP.Net web services, and the rapid feedback that enables is very useful

One of our projects at Calavista right now is helping a client modernize and optimize a large .Net application, with the end goal being everything running on .Net 5 and an order of magnitude improvement in system throughput. As part of the effort to upgrade the web services, I took on a task to replace this system’s usage of IdentityServer3 with IdentityServer4, but still use the existing Marten-backed data storage for user membership information.

Great, but there’s just one problem. I’ve never used IdentityServer4 before and it changed somewhat between the IdentityServer3 code I was trying to reverse engineer and its current model. I ended up getting through that work just fine. A key element of doing that was using the Alba library to create a test harness so I could iterate through configuration changes quickly by rerunning tests on the new IdentityServer4 project. It didn’t start out this way, but Alba is essentially a wrapper around the ASP.Net TestServer and just acts as a utility to make it easier to write automated tests around the HTTP services in your web service projects.

I ended up starting two new .Net projects:

  1. A new web service that hosts IdentityServer4 and is configured to use user membership information from our client’s existing Marten/Postgresql database
  2. A new xUnit.Net project to hold integration tests against the new IdentityServer4 web service.

Let’s dive right into how I set up Alba and xUnit.Net as an automated test harness for our new IdentityServer4 service. If you start a new ASP.Net project with one of the built in project templates, you’ll get a Program file that’s the main entry point for the application and a Startup class that has most of the system’s bootstrapping configuration. The templates will generate this method that’s used to configure the IHostBuilder for the application:

public static IHostBuilder CreateHostBuilder(string[] args)
{
    return Host.CreateDefaultBuilder(args)
        .ConfigureWebHostDefaults(webBuilder =>
        {
            webBuilder.UseStartup<Startup>();
        });
}

For more information on what the role of the IHostBuilder is within your application, see .NET Generic Host in ASP.NET Core.

That’s important, because that gives us the ability to stand up the application exactly as it’s configured in an automated test harness. Switching to the new xUnit.Net test project, I referenced my new web service project that will host IdentityServer4. Because spinning up your ASP.Net system can be relatively expensive, I only want to do that once and share the IHost between tests. That’s a perfect usage for xUnit.Net’s shared context support.

First I make what will be the shared test fixture context class for the integration tests shown below:

public class AppFixture : IDisposable
{
    public AppFixture()
    {
        // This is calling into the actual application's
        // configuration to stand up the application in 
        // memory
        var builder = Program.CreateHostBuilder(new string[0]);

        // This is the Alba "SystemUnderTest" wrapper
        System = new SystemUnderTest(builder);
    }
    
    public SystemUnderTest System { get; }

    public void Dispose()
    {
        System?.Dispose();
    }
}

The Alba SystemUnderTest wrapper is responsible for building the actual IHost object for your system, and does so using the in memory TestServer in place of Kestrel.

Just as a convenience, I like to create a base class for integration tests I tend to call IntegrationContext:

    public abstract class IntegrationContext 
        // This is some xUnit.Net mechanics
        : IClassFixture<AppFixture>
    {
        public IntegrationContext(AppFixture fixture)
        {
            System = fixture.System;
            
            // Just casting the IServiceProvider of the underlying
            // IHost to a Lamar IContainer as a convenience
            Container = (IContainer)fixture.System.Services;
        }
        
        public IContainer Container { get; }
        
        // This gives the tests access to run
        // Alba "Scenarios"
        public ISystemUnderTest System { get; }
    }

We’re using Lamar as the underlying IoC container in this application, and I wanted to use Lamar-specific IoC diagnostics in the tests, so I expose the main Lamar container off of the base class as just a convenience.

To finally turn to the tests, the very first thing to try with IdentityServer4 was to hit the descriptive discovery endpoint to see if the application was bootstrapping correctly and IdentityServer4 was functional *at all*. I started a new test class with this declaration:

    public class EndToEndTests : IntegrationContext
    {
        private readonly ITestOutputHelper _output;
        private IDocumentStore theStore;

        public EndToEndTests(AppFixture fixture, ITestOutputHelper output) : base(fixture)
        {
            _output = output;

            // I'm grabbing the Marten document store for the app
            // to set up user information later
            theStore = Container.GetInstance<IDocumentStore>();
        }

And then a new test just to exercise the discovery endpoint:

[Fact]
public async Task can_get_the_discovery_document()
{
    var result = await System.Scenario(x =>
    {
        x.Get.Url("/.well-known/openid-configuration");
        x.StatusCodeShouldBeOk();
    });
    
    // Just checking out what's returned
    _output.WriteLine(result.ResponseBody.ReadAsText());
}

The test above is pretty crude. All it does is try to hit the `/.well-known/openid-configuration` url in the application and see that it returns a 200 OK HTTP status code.

I tend to run tests while I’m coding by using keyboard shortcuts. Most IDEs support some kind of “re-run the last test” keyboard shortcut. Using that, my preferred workflow is to run the test once, then assuming that the test is failing the first time, work in a tight cycle of making changes and constantly re-running the test(s). This turned out to be invaluable as it took me a couple iterations of code changes to correctly re-create the old IdentityServer3 configuration into the new IdentityServer4 configuration.

Moving on to doing a simple authentication, I wrote a test like this one to exercise the system with known credentials:

[Fact]
public async Task get_a_token()
{
    // All I'm doing here is wiping out the existing database state
    // with Marten and using a little helper method to add a new
    // user to the database as part of the test setup
    // This would obviously be different in your system
    theStore.Advanced.Clean.DeleteAllDocuments();
    await theStore
        .CreateUser("aquaman@justiceleague.com", "dolphin", "System Administrator");
    
    
    var body =
        "client_id=something&client_secret=something&grant_type=password&scope=ourscope%20profile%20openid&username=aquaman@justiceleague.com&password=dolphin";

    // The test would fail here if the status
    // code was NOT 200
    var result = await System.Scenario(x =>
    {
        x.Post
            .Text(body)
            .ContentType("application/x-www-form-urlencoded")
            .ToUrl("/connect/token");
    });

    // As a convenience, I made a new class called ConnectData
    // just to deserialize the IdentityServer4 response into 
    // a .Net object
    var token = result.ResponseBody.ReadAsJson<ConnectData>();
    token.access_token.Should().NotBeNull();
    token.token_type.Should().Be("Bearer");
}

public class ConnectData
{
    public string access_token { get; set; }
    public int expires_in { get; set; }
    public string token_type { get; set; }
    public string scope { get; set; }
    
}

Now, this test took me several iterations to work through until I found exactly the right way to configure IdentityServer4 and adjusted our custom Marten backing identity store (IResourceOwnerPasswordValidator and IProfileService in IdentityServer4 world) until the tests passed. I found it extremely valuable to be able to debug right into the failing tests as I worked, and even needed to take advantage of JetBrains Rider’s capability to debug through external code to understand how IdentityServer4 itself worked. I’m very sure that I was able to get through this work much faster by iterating through tests as opposed to just trying to run the application and driving it through something like Postman or through the connected user interface.

Improvements to Event Sourcing in Marten V4

Marten V4 is still in flight, but everything I’m showing in this post is in the latest alpha release (4.0.0-alpha.6) on Nuget.

TL;DR: Marten V4 has some significant improvements to its event sourcing support that will help developers deal with potential concurrency issues. The related event projection support in Marten V4 is also significantly more robust than previous versions.

A Sample Project Management Event Store

Imagine if you will, a simplistic application that uses Marten’s event sourcing functionality to do project management tracking with a conceptual CQRS architecture. The domain will include tracking each project task within the greater project. In this case, I’m choosing to model the activity of a single project and its related tasks as a stream of events like:

  1. ProjectStarted
  2. TaskRecorded
  3. TaskStarted
  4. TaskFinished
  5. ProjectCompleted

Starting a new Event Stream

Using event sourcing as our persistence strategy, the real system state is the raw events that model state changes or transitions of our project. As an example, let’s say that our system records and initializes a “stream” of events for a new project with this command handler:

    public class NewProjectCommand
    {
        public string Name { get; set; }
        public string[] Tasks { get; set; }
        public Guid ProjectId { get; set; } = Guid.NewGuid();
    }

    public class NewProjectHandler
    {
        private readonly IDocumentSession _session;

        public NewProjectHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(NewProjectCommand command)
        {
            var timestamp = DateTimeOffset.Now;
            var started = new ProjectStarted  
            {
                Name = command.Name, 
                Timestamp = timestamp
            };

            var tasks = command.Tasks
                .Select(name => new TaskRecorded {
                    Timestamp = timestamp, 
                    Title = name
                });

            // This tells Marten that it should create a new project
            // stream
            _session.Events.StartStream(command.ProjectId, started);

            // This quietly appends events to the event stream
            // created in the line of code above
            _session.Events.Append(command.ProjectId, tasks);

            // The actual persistence to Postgresql happens here
            return _session.SaveChangesAsync();
        }
    }

The call to StartStream() up above tells Marten that it should create a new event stream with the supplied project id. As a long requested improvement in Marten for V4, StartStream() guarantees that the transaction cannot succeed if another project event stream with that project id already exists. This helps the system prevent users from accidentally duplicating projects — at least by its project id anyway.

Assuming that there is no existing stream with the project id, Marten will create a new record to track the new project stream and individual records in Postgresql to persist the raw event data along with metadata like the stream version, time stamps, and the event type.

But we need a “read side” projection of the Projects!

So great, we can persist the raw events, and there’s plenty of valuable things we can do later with those events. However, our application sooner or later is going to need to know the current state of an ongoing project for user screens or validation logic, so we need some way to compile the events for a single project into some kind of “here’s the current state of the Project” data structure. That’s where Marten’s support for projections comes into play.

To model a “projected” view of a project from its raw events, I’m creating a single Project class to model the full state of a single, ongoing project. To reduce the number of moving parts, I’m going to make Project be “self-aggregating” such that it’s responsible for mutating itself based on incoming events:

    public class Project
    {
        // List<T> rather than IList<T> because the Tasks setter calls AddRange()
        private readonly List<ProjectTask> _tasks = new List<ProjectTask>();

        public Project(ProjectStarted started)
        {
            Version = 1;
            Name = started.Name;
            StartedTime = started.Timestamp;
        }

        // This gets set by Marten
        public string Id { get; set; }

        public long Version { get; set; }
        public DateTimeOffset StartedTime { get; private set; }
        public DateTimeOffset? CompletedTime { get; private set; }

        public string Name { get; private set; }

        public ProjectTask[] Tasks
        {
            get
            {
                return _tasks.ToArray();
            }
            set
            {
                _tasks.Clear();
                _tasks.AddRange(value);
            }
        }

        public void Apply(TaskRecorded recorded, IEvent e)
        {
            Version = e.Version;
            var task = new ProjectTask
            {
                Title = recorded.Title,
                // Max() throws on an empty sequence, so start numbering at 1
                Number = _tasks.Count == 0 ? 1 : _tasks.Max(x => x.Number) + 1,
                Recorded = recorded.Timestamp
            };

            _tasks.Add(task);
        }

        public void Apply(TaskStarted started, IEvent e)
        {
            Version = e.Version;
            var task = _tasks.FirstOrDefault(x => x.Number == started.Number);

            // Remember this isn't production code:)
            if (task != null) task.Started = started.Timestamp;
        }

        public void Apply(TaskFinished finished, IEvent e)
        {
            Version = e.Version;
            var task = _tasks.FirstOrDefault(x => x.Number == finished.Number);

            // Remember this isn't production code:)
            if (task != null) task.Finished = finished.Timestamp;
        }

        public void Apply(ProjectCompleted completed, IEvent e)
        {
            Version = e.Version;
            CompletedTime = completed.Timestamp;
        }
    }

I didn’t choose to use that here, but I want to point out that you can use immutable aggregates with Marten V4. In that case you’d simply have the Apply() methods return a new Project object (as far as Marten is concerned anyway). Functional programming enthusiasts will cheer the built in support for immutability, some of you will point out that that leads to less efficient code by increasing the number of object allocations and more code ceremony, and I will happily say that Marten V4 lets you make that decision to suit your own needs and preferences;-)
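As a rough illustration of the immutable style, here is a minimal sketch in plain C# 9 using records. The ImmutableProject, TaskRecordedEvent, and ProjectCompletedEvent types are simplified, hypothetical stand-ins for the types in this post, and I’m only showing the general shape of Apply() methods that return a new object rather than the exact conventions Marten looks for.

```csharp
using System;

// Simplified, hypothetical stand-ins for the events used in this post
public record TaskRecordedEvent(string Title, DateTimeOffset Timestamp);
public record ProjectCompletedEvent(DateTimeOffset Timestamp);

// An immutable aggregate: every Apply() leaves "this" untouched and
// returns a modified copy instead
public record ImmutableProject(
    string Name,
    int TaskCount = 0,
    DateTimeOffset? CompletedTime = null)
{
    public ImmutableProject Apply(TaskRecordedEvent recorded) =>
        this with { TaskCount = TaskCount + 1 };

    public ImmutableProject Apply(ProjectCompletedEvent completed) =>
        this with { CompletedTime = completed.Timestamp };
}
```

The trade off mentioned above is visible here: each event allocates a new ImmutableProject, but no caller can ever observe a half-updated aggregate.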

Also see Event Sourcing with Marten V4: Aggregated Projections for more information on other alternatives for expressing aggregated projections in Marten.

Working in our project management domain, I’d like the Project aggregate document to be updated every time events related to a project are captured. I’ll add the Project document as a self-aggregating, inline projection in my Marten system like this:

    public class Startup
    {
        public void ConfigureServices(IServiceCollection services)
        {
            services.AddMarten(opts =>
            {
                opts.Connection("some connection");

                // Direct Marten to update the Project aggregate
                // inline as new events are captured
                opts.Events
                    .Projections
                    .SelfAggregate<Project>(ProjectionLifecycle.Inline);

            });
        }
    }

As an example of how inline projections come into play, let’s look at another sample command handler for adding a new task to an ongoing project with a new task title:

    public class CreateTaskCommand
    {
        public string ProjectId { get; set; }
        public string Title { get; set; }
    }

    public class CreateTaskHandler
    {
        private readonly IDocumentSession _session;

        public CreateTaskHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CreateTaskCommand command)
        {
            var recorded = new TaskRecorded
            {
                Timestamp = DateTimeOffset.UtcNow, 
                Title = command.Title
            };

            _session.Events.Append(command.ProjectId, recorded);
            return _session.SaveChangesAsync();
        }
    }

When SaveChangesAsync() is called above, Marten will issue a single database commit that captures a new TaskRecorded event. In that same commit Marten will fetch the persisted Project document for that project event stream, apply the changes from the new event, and persist the Project document that reflects the information in the events.

Let’s say that we have a user interface in our project management system that allows users to review and edit projects and tasks. Independent of the event streams, the “query” side of our application can happily retrieve the latest data for the Project documents by querying with Marten’s built in document database functionality like this simple MVC controller endpoint:

    public class ProjectController: ControllerBase
    {
        private readonly IQuerySession _session;

        public ProjectController(IQuerySession session)
        {
            _session = session;
        }

        [HttpGet("/api/project/{projectId}")]
        public Task<Project> Get(string projectId)
        {
            return _session.LoadAsync<Project>(projectId);
        }
    }

For a future blog post, I’ll show you a new Marten V4 feature for ultra efficient “read side” web services by streaming JSON data directly from Postgresql straight down the HTTP response body without ever wasting any time with JSON serialization.

Concurrency Problems Everywhere!!!

Don’t name a software project “Genesis”. Hubristic project names lead to massive project failures.

The “inline” projection strategy is easy to use in Marten, but it’s vulnerable to oddball concurrency problems if you’re not careful in your system design. Let’s say that we have two coworkers named Jack and Jill using our project management system. Jack pulls up the data for the “Genesis Project” in the UI client, then gets pulled into a hallway conversation (remember when we used to have those in the before times?). While he’s distracted, Jill makes some changes to the “Genesis Project” to record some tasks she and Jack had talked about earlier and saves the changes to the server. Jack finally gets back to his machine and makes basically the same edits to “Genesis” that Jill already did and saves the data. If we build our project management system naively, we’ve now allowed Jack and Jill to duplicate work and the “Genesis Project” task management is all messed up.

One way to prevent that concurrent change issue is to detect that the project has been changed by Jill when Jack tries to persist his duplicate changes and give him the chance to update his screen with the latest data before pushing new project task changes.

Going back to the Project aggregate document, as Marten appends events to a stream, it assigns an incrementing numeric version number to each event within that stream. Let’s say in our project management system that we always want to know what the latest version of the project is. Marten V4 finally allows you to use event metadata like the event version number within inline projections (this has been a long running request from some of our users). You can see that in action in the method below that updates Project based on a TaskStarted event. You might notice that I also pass in the IEvent object that would let us access the event metadata:

public void Apply(TaskStarted started, IEvent e)
{
    // Update the Project document based on the event version
    Version = e.Version;
    var task = _tasks.FirstOrDefault(x => x.Number == started.Number);

    // Remember this isn't production code:)
    if (task != null) task.Started = started.Timestamp;
}

Now, having the stream version number on the Project document turns out to be very helpful for concurrency issues. Since we’re worried about a Project event stream getting out of sync if the system receives concurrent updates, you can pass the current version that’s conveniently kept up to date on the Project read side document down to the user interface, and have the user interface send back what it thinks is the current version of the project when it tries to make updates. If the underlying project event stream has changed since the user interface fetched the original data, we can make Marten reject the additional events. This is a form of an offline optimistic concurrency check, as we’re going to assume that everything is hunky dory, and just let the infrastructure reject the changes as an exceptional case.

To put that into motion, let’s say that our project management user interface posts this command up to the server to close a single project task:

    public class CompleteTaskCommand
    {
        public Guid ProjectId { get; set; }
        public int TaskNumber { get; set; }
        
        // This is the version of the project data
        // that was being edited in the user interface
        public long ExpectedVersion { get; set; }
    }

In the command handler, we can direct Marten to reject the new events being appended if the stream has been changed in between the user interface having fetched the project data and submitting its updates to the server:

    public class CompleteTaskHandler
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            _session.Events.Append(
                command.ProjectId,

                // Using this overload will make Marten do
                // an optimistic concurrency check against
                // the existing version of the project event
                // stream as it commits
                command.ExpectedVersion,

                @event);
            return _session.SaveChangesAsync();
        }
    }

In the CompleteTaskHandler above, the call to SaveChangesAsync() will throw an EventStreamUnexpectedMaxEventIdException exception if the project event stream has advanced past the existing version assumed by the originator of the command in command.ExpectedVersion. To make our system more resilient, we would need to catch the Marten exception and deal with the proper application workflow.
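To see the bare mechanics of that check without any database involved, here's a minimal in-memory sketch. Everything in it (InMemoryStream, VersionConflictException, TryAppend) is a made-up stand-in for illustration and is not Marten's API. It also assumes the simplest possible rule, that the caller passes the version it last saw; check Marten's own documentation for the exact meaning of the expected version argument on its Append() overload.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception type for this sketch only (not a Marten type)
public class VersionConflictException : Exception
{
    public VersionConflictException(long expected, long actual)
        : base($"Expected stream version {expected}, but the stream is at version {actual}")
    {
    }
}

// Hypothetical in-memory stand-in for a single event stream
public class InMemoryStream
{
    private readonly List<object> _events = new List<object>();

    // In this sketch the version is just the number of events appended so far
    public long Version => _events.Count;

    public void Append(long expectedVersion, params object[] events)
    {
        // Reject the append if somebody else has moved the stream
        // forward since the caller last read it
        if (expectedVersion != Version)
        {
            throw new VersionConflictException(expectedVersion, Version);
        }

        _events.AddRange(events);
    }
}

public static class OptimisticCheckSketch
{
    // Returns true if the event was accepted, false if the caller
    // should refresh its data and try again
    public static bool TryAppend(InMemoryStream stream, long expectedVersion, object @event)
    {
        try
        {
            stream.Append(expectedVersion, @event);
            return true;
        }
        catch (VersionConflictException)
        {
            return false;
        }
    }
}
```

In the Jack and Jill story, both users read the stream at the same version. Jill's append succeeds and bumps the version, so Jack's later append with the stale version returns false and the UI can prompt him to reload.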

What I showed above is pretty draconian in terms of what edits it allows to go through. In other cases you may get by with a simple workflow that just tries to guarantee that a single Project aggregate is only being updated by a single process at any time. Marten V4 comes through with a couple new ways to append events with more control over concurrent updates to a single event stream.

Let’s stick to optimistic checks for now and look at the new AppendOptimistic() method for appending events to an existing event stream with a rewritten version of our CompleteTaskCommand handling:

    public class CompleteTaskHandler2
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler2(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            // If some other process magically zips
            // in and updates this project event stream
            // between the call to AppendOptimistic()
            // and SaveChangesAsync(), Marten will detect
            // that and reject the transaction
            _session.Events.AppendOptimistic(
                command.ProjectId,
                @event);

            return _session.SaveChangesAsync();
        }
    }

We would still need to catch the Marten exceptions and handle those somehow. I myself would typically try to handle that with a messaging or command execution framework like MassTransit or my own Jasper project that comes with robust exception handling and retry capabilities.
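If you aren't using a framework that does retries for you, the idea can be sketched by hand. This is a rough illustration only: ConcurrencyConflictException is a hypothetical placeholder for whichever exception your persistence tooling actually throws, and ExecuteWithRetriesAsync is a name I made up for this sketch. Retrying only makes sense when the action re-reads the current state on each attempt, as the AppendOptimistic() style handler does.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical placeholder for a real concurrency exception type
public class ConcurrencyConflictException : Exception
{
}

public static class RetrySketch
{
    // Runs the action, retrying on concurrency conflicts up to maxAttempts times.
    // Returns the number of attempts it took to succeed. If the final attempt
    // also conflicts, the exception is allowed to propagate to the caller.
    public static async Task<int> ExecuteWithRetriesAsync(Func<Task> action, int maxAttempts)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await action();
                return attempt;
            }
            catch (ConcurrencyConflictException) when (attempt < maxAttempts)
            {
                // Back off a little longer after each failed attempt
                await Task.Delay(TimeSpan.FromMilliseconds(10 * attempt));
            }
        }
    }
}
```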

At this point everything I’ve shown for concurrency control involves optimistic locks that result in exceptions being thrown and transactions being rejected. For another alternative, Marten V4 leverages Postgresql’s robust row locking support with this version of our CompleteTaskCommand handler that uses the new V4 AppendExclusive() method:

    public class CompleteTaskHandler3
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler3(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            // This tries to acquire an exclusive
            // lock on the stream identified by
            // command.ProjectId in the database
            // so that only one process at a time
            // can update this event stream
            _session.Events.AppendExclusive(
                command.ProjectId,
                @event);
            return _session.SaveChangesAsync();
        }
    }

This is heavier in terms of its impact on the database, but simpler to implement as there is no flow control by exceptions to worry about.

I should also note that both the AppendOptimistic() and AppendExclusive() methods will verify that the stream already exists and throw an exception if the stream does not.

Event Sourcing with Marten V4: Aggregated Projections

All the code samples in this post are from alpha code, and may be subject to change based on user feedback. At a minimum, I’d expect the configuration code to change as we write more documentation and sample code and try to sand down anything that’s awkward, confusing, or not discoverable. I’m planning on making this the first in a series of blog posts. Please, please, please share any feedback or questions you might have about the Marten usage here.

The Marten team kicked out a new alpha this week (4.0.0-alpha.5) that among other things, includes most of our planned improvements to Marten’s event sourcing support.

Before I dive into the Marten V4 improvements, let’s rewind and talk about what event sourcing is, starting with some quotes:

The fundamental idea of Event Sourcing is that of ensuring every change to the state of an application is captured in an event object, and that these event objects are themselves stored in the sequence they were applied for the same lifetime as the application state itself.

Event Sourcing by Martin Fowler (no relation to Marten:))

And,

Event Sourcing is an alternative way to persist data. In contrast with state-oriented persistence that only keeps the latest version of the entity state, Event Sourcing stores each state mutation as a separate record called an event.

What is Event Sourcing? by Alexey Zimarev 

We just finished a client project at Calavista that used event sourcing that we generally felt to be a success. In this case, the business problem was very workflow centric and lent itself well to being modeled as a series of events reflecting user actions captured by the system or determined by business rules in background processes. Moreover, the project had significant requirements to track metrics and auditing compliance and we found event sourcing to be a very effective way to knock out the auditing requirements as well as set our client up to be able to support whatever metrics they wished to develop in the future by ingesting the raw events.

We did need to know the current state of the active workflows going on within the system for many of the business rules, so we kept a live “projected” view of the raw events in memory in a background process. That strategy certainly won’t work for every system, but one way or another, your system built with event sourcing is likely going to need some way to derive the system state from the raw events — and this is where I finally want to switch to talking about the work we’ve been doing in Marten V4 to improve Marten’s read-side projection support.

If you’re wondering, we didn’t use Marten because the project in question was written on top of Node.js. If it had been a .Net project, I absolutely believe that Marten would have been a very good fit.

Marten’s Event Sourcing Support

The value of Marten as an event store is that in one library, you get:

  1. The ability to capture events in durable, sequential storage
  2. Opt in multi-tenancy support
  3. User-defined “Projections” that compile the derived state of the system based on the raw events, including the ability to store the projected views as just regular old Marten documents. Those “projected” views can be built on the fly, updated inline when new, related events are being captured, or built asynchronously by background processes provided by Marten.
  4. Plenty of functionality to query and retrieve event data

And all of that runs using the solid, fully transactional Postgresql database engine which is well supported on every major cloud hosting platform. I’m going to argue that Marten gives .Net developers the easiest path to a full fledged event sourcing persistence subsystem within your application architecture because it’s self-contained.

A Sample Domain Model

Let’s say we’re building a system to track our users’ road trips throughout the U.S. In that domain, we’re tracking and capturing these events for each active trip starting from the beginning of the system:

  1. TripStarted: a new trip started on a certain day
  2. TripEnded: a trip in progress ended at its final destination
  3. TripAborted: a trip in progress was ended before it was completed
  4. Departure and Arrival: a trip party left or reached a U.S. state
  5. Travel: on a single day and within a single state, a trip party drove a series of movements, each in a single cardinal direction like it’s a 1980’s Atari video game (cut me some slack, I needed a simple domain to test the projections here:))
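For reference, here's one way those event types might be shaped. This is a sketch inferred from how the events are used in the samples further down (Travel.Movements, Travel.TotalDistance(), Movement.Direction, Movement.Distance, and so on). The exact properties, and the Day property on Arrival in particular, are assumptions rather than the real sample code, and the TripDomainSketch namespace is just there to keep the sketch self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

namespace TripDomainSketch
{
    public enum Direction { North, East, South, West }

    // One leg of travel in a single cardinal direction
    public class Movement
    {
        public Direction Direction { get; set; }
        public double Distance { get; set; }
    }

    public class TripStarted
    {
        public int Day { get; set; }
    }

    public class TripEnded
    {
        public int Day { get; set; }
    }

    // Marker event with no data in this sketch
    public class TripAborted
    {
    }

    public class Departure
    {
        public int Day { get; set; }
        public string State { get; set; }
    }

    public class Arrival
    {
        public int Day { get; set; }
        public string State { get; set; }
    }

    public class Travel
    {
        public int Day { get; set; }
        public List<Movement> Movements { get; set; } = new List<Movement>();

        // Total miles driven across all the movements in this event
        public double TotalDistance() => Movements.Sum(x => x.Distance);
    }
}
```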

Getting Started with Marten

Marten completely embraces the HostBuilder concept introduced in later versions of .Net Core for easy integration into .Net applications. Starting from the “worker” template to start a new .Net project, I add a reference to the Marten Nuget package and add this call to the AddMarten() extension method like so:

public static IHostBuilder CreateHostBuilder(string[] args)
{
    return Host.CreateDefaultBuilder(args)
        .ConfigureServices((hostContext, services) =>
        {
            var configuration = hostContext.Configuration;
            var environment = hostContext.HostingEnvironment;

            services.AddHostedService<Worker>();

            // This is the absolute, simplest way to integrate Marten into your
            // .Net Core application with Marten's default configuration
            services.AddMarten(options =>
            {
                // Establish the connection string to your Marten database
                options.Connection(configuration.GetConnectionString("Marten"));

                // If we're running in development mode, let Marten just take care
                // of all necessary schema building and patching behind the scenes
                if (environment.IsDevelopment())
                {
                    options.AutoCreateSchemaObjects = AutoCreate.All;
                }
            });
        });
}

I’ll add some new V4 options later in the post, but the basics of what I did up there are described in our documentation.

Now to capture some events. Assume that a new trip is started when our system receives this command message from another system or our user interface:

public class CreateNewTrip
{
    public int Day { get; set; }
    public string State { get; set; }
    public Movement[] Movements { get; set; }
}

Next, let’s say our message handler looks like this (it varies by messaging framework, but this is what it could look like with Jasper):

public class NewTripHandler
{
    private readonly IDocumentSession _session;

    public NewTripHandler(IDocumentSession session)
    {
        _session = session;
    }

    public async Task Handle(CreateNewTrip trip)
    {
        var started = new TripStarted
        {
            Day = trip.Day
        };

        var departure = new Departure
        {
            Day = trip.Day,
            State = trip.State
        };

        var travel = new Travel
        {
            Day = trip.Day,
            Movements = new List<Movement>(trip.Movements)
        };

        // This will create a new event stream and
        // append the three events to that new stream
        // when the IDocumentSession is saved
        var action = _session.Events
            .StartStream(started, departure, travel);

        // You can also use strings as the identifier
        // for streams
        var tripId = action.Id;

        // Commit the events to the new event
        // stream for this trip
        await _session.SaveChangesAsync();
    }
}

In Marten nomenclature, a “stream” is a related set of events in the event storage. In this system, we’ll use a stream for every distinct trip. The code above takes in the CreateNewTrip command message, and creates three events to record the initial progress of the new trip, and persists the new events.

So now that we’ve captured events, let’s move on to projections in Marten V4, because that’s been both a major area of effort and the source of the biggest usage changes in this forthcoming release.

Aggregate by Stream

Projections support in Marten comes in a couple different flavors, but I’m guessing that the most common is projecting a single document view of a single stream of events. In this case, we’ll create a projected Trip view like this:

public class Trip
{
    public Guid Id { get; set; }

    // the day the trip ended
    public int EndedOn { get; set; }

    // total mileage of the trip to date
    public double Traveled { get; set; }

    // what state is the trip party in
    // presently
    public string State { get; set; }

    // is the trip ongoing?
    public bool Active { get; set; }

    // what day did the trip start?
    public int StartedOn { get; set; }
}

To configure an aggregated projection for a Trip stream, we’ll subclass the new AggregateProjection<T> class like so:

public class TripAggregation: AggregateProjection<Trip>
{
    public TripAggregation()
    {
        // Delete the Trip document for this
        // stream if this event is encountered
        DeleteEvent<TripAborted>();

        ProjectionName = "Trip";

        // We'll change this later
        Lifecycle = ProjectionLifecycle.Live;
    }

    public void Apply(Arrival e, Trip trip) => trip.State = e.State;
    public void Apply(Travel e, Trip trip) => trip.Traveled += e.TotalDistance();
    public void Apply(TripEnded e, Trip trip)
    {
        trip.Active = false;
        trip.EndedOn = e.Day;
    }

    public Trip Create(TripStarted started)
    {
        return new Trip {StartedOn = started.Day, Active = true};
    }
}

A couple notes here to explain the code:

  • Marten is depending on naming conventions to know what to do with a certain kind of event type. So the Create() method is used to create the Trip aggregate from an event object of type TripStarted.
  • The Apply() methods are used to make updates to an existing aggregate document based on an event of a certain type
  • For the moment, the TripAggregation is only used for “live” aggregations that are done on the fly. We’ll change that later

Now, let’s put this new projection to use. In our call to AddMarten() up above, I’m going to add one line of code to register our new projection:

options.Events.Projections.Add(new TripAggregation());

Let’s say that in our system we write events for trips very frequently, but very rarely need to see the current state of the trip (don’t know why that would be so, but just go with it). In that case we can just lean on Marten’s ability to aggregate the projected view on the fly as shown below with the AggregateStreamAsync() method:

public class EndTrip
{
    public Guid TripId { get; set; }
    public bool Successful { get; set; }
    public string State { get; set; }
    public int Day { get; set; }
}

public class EndTripHandler
{
    private readonly IDocumentSession _session;

    public EndTripHandler(IDocumentSession session)
    {
        _session = session;
    }

    public async Task Handle(EndTrip end)
    {
        // we need to first see the current
        // state of the Trip to decide how
        // to proceed
        var trip = await _session
            .Events
            .AggregateStreamAsync<Trip>(end.TripId);
        
        // finish processing the EndTrip command...
    }
}

If instead, we’d like to keep the matching Trip document up to date and persisted in the database as new events come in, we can switch the TripAggregation to using the “inline” lifecycle by setting the Lifecycle property in the TripAggregation constructor like so:

public TripAggregation()
{
    DeleteEvent<TripAborted>();

    ProjectionName = "Trip";

    // Now let's change the lifecycle to inline
    Lifecycle = ProjectionLifecycle.Inline;
}

“Inline” projections are updated when events are appended, and in the same transaction as the append event database changes. This gives you true ACID transaction integrity between the events and the projected views. This can set you up for possible concurrency issues if multiple application threads are trying to update the same stream of events simultaneously, so exercise some caution with using inline projections.

So with the Trip documents being updated inline with new events coming in, our EndTripHandler becomes:

public class EndTripHandler
{
    private readonly IDocumentSession _session;

    public EndTripHandler(IDocumentSession session)
    {
        _session = session;
    }

    public async Task Handle(EndTrip end)
    {
        // we need to first see the current
        // state of the Trip to decide how
        // to proceed, so load the pre-built
        // projected document from the database
        var trip = await _session
            .LoadAsync<Trip>(end.TripId);

        // finish processing the EndTrip command...
    }
}

You can also run projections asynchronously in a background thread, but I’m going to leave that for a subsequent post.

Some other points of interest about AggregateProjection<T>:

  • There’s some wiggle room in the signature of the conventionally named Apply() and Create() methods. These method signatures can be asynchronous. They can also take in parameters for IQuerySession to load other information from Marten as they work or the IEvent / IEvent<T> data for the event to get access to metadata about the event or the event’s stream
  • The Apply() methods happily support immutable aggregate types. You’d simply return the new aggregate document created in an Apply() method, or a Task<T> where the T is the aggregate document type. That might not be very efficient because of the extra object allocations, but hey, some folks really want that.
  • I didn’t show it up above, but if you dislike the conventional “magic” I used above, that’s okay, there are methods you can use to define how to update or create the aggregate document based on specific event types through inline Lambda functions.
  • You can also conditionally delete the aggregated document if the logical workflow represented by an event stream is completed based on user defined logic.
  • All of the conventional method signatures as well as the inline Lambda usages for defining how an aggregate would be updated can also accept interface or abstract class types. This was a user request to enable folks to use event types from external assemblies in extensibility scenarios.
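As an illustration of the immutable style mentioned in the list above, the Apply() methods would return a new aggregate instead of mutating the one passed in. The sketch below uses C# 9 records and `with` expressions. The types are simplified and hypothetical, and the class isn't wired up to Marten here, so treat it as the shape of the idea rather than a tested projection.

```csharp
namespace ImmutableAggregateSketch
{
    // Simplified, hypothetical event and aggregate shapes
    public record Arrival(string State);
    public record Travel(double Distance);
    public record Trip(int StartedOn, bool Active, string State, double Traveled);

    public class ImmutableTripApplier
    {
        // Each Apply() returns a new Trip and leaves the original untouched
        public Trip Apply(Arrival e, Trip trip) =>
            trip with { State = e.State };

        public Trip Apply(Travel e, Trip trip) =>
            trip with { Traveled = trip.Traveled + e.Distance };
    }
}
```

The cost is one extra allocation per applied event, which is the efficiency trade-off called out in the list.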

Aggregate a Projected Document Across Event Streams

I thought you said to never cross the streams! – had to be said:)

This is admittedly contrived, but let’s say we want a projected view from our raw trip events that tells us for each day the system is active:

  • How many trips started?
  • How many trips ended?
  • How many miles did all the active trips drive in each direction?

And by “day” in this system, I just mean the day number since the system went online. That aggregate might look like this:

public class Day
{
    public int Id { get; set; }

    // how many trips started on this day?
    public int Started { get; set; }

    // how many trips ended on this day?
    public int Ended { get; set; }

    // how many miles did the active trips
    // drive in which direction on this day?
    public double North { get; set; }
    public double East { get; set; }
    public double West { get; set; }
    public double South { get; set; }
}

So what we want to do is to group the TripStarted, TripEnded, and Travel events by the day, and create an aggregated Day document to reflect all the events that happened on the same day. The first step is to tell Marten how to know what events are going to be associated with a specific day, and the easiest way in Marten V4 is to have the events implement a common interface like this one:

public interface IDayEvent
{
    int Day { get; }
}

And then have the relevant events implement that interface like the TripStarted event:

public class TripStarted : IDayEvent
{
    public int Day { get; set; }
}

Now, to make the Marten projection for the Day document type, we’ll use the new V4 version of ViewProjection as a subclass like so:

// The 2nd generic parameter is the identity type of
// the document type. In this case the Day document
// is identified by an integer representing the number
// of days since the system went online
public class DayProjection: ViewProjection<Day, int>
{
    public DayProjection()
    {
        // Tell the projection how to group the events
        // by Day document
        Identity<IDayEvent>(x => x.Day);
        
        // This just lets the projection work independently
        // on each Movement child of the Travel event
        // as if it were its own event
        FanOut<Travel, Movement>(x => x.Movements);
        
        ProjectionName = "Day";
    }

    public void Apply(Day day, TripStarted e) => day.Started++;
    public void Apply(Day day, TripEnded e) => day.Ended++;

    public void Apply(Day day, Movement e)
    {
        switch (e.Direction)
        {
            case Direction.East:
                day.East += e.Distance;
                break;
            case Direction.North:
                day.North += e.Distance;
                break;
            case Direction.South:
                day.South += e.Distance;
                break;
            case Direction.West:
                day.West += e.Distance;
                break;

            default:
                throw new ArgumentOutOfRangeException();
        }
    }
}

The new ViewProjection is a subclass of AggregateProjection, and therefore has all the same capabilities as its parent type.
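Conceptually, the grouping and fan out in DayProjection amount to something like the LINQ below: bucket every event by its Day, then treat each Movement inside a Travel event as its own little event. This is a hand-rolled sketch with hypothetical, simplified types and a made-up DayGrouper helper. It says nothing about how Marten actually executes the projection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace DayGroupingSketch
{
    public enum Direction { North, East, South, West }

    public record Movement(Direction Direction, double Distance);

    public interface IDayEvent
    {
        int Day { get; }
    }

    // Simplified, hypothetical event shapes
    public record TripStarted(int Day) : IDayEvent;
    public record TripEnded(int Day) : IDayEvent;
    public record Travel(int Day, IReadOnlyList<Movement> Movements) : IDayEvent;

    public class Day
    {
        public int Id { get; set; }
        public int Started { get; set; }
        public int Ended { get; set; }
        public double North { get; set; }
        public double East { get; set; }
        public double West { get; set; }
        public double South { get; set; }
    }

    public static class DayGrouper
    {
        public static IReadOnlyDictionary<int, Day> Aggregate(IEnumerable<IDayEvent> events)
        {
            // Same idea as Identity<IDayEvent>(x => x.Day)
            return events
                .GroupBy(e => e.Day)
                .ToDictionary(group => group.Key, group =>
                {
                    var day = new Day { Id = group.Key };
                    foreach (var e in group)
                    {
                        switch (e)
                        {
                            case TripStarted _:
                                day.Started++;
                                break;
                            case TripEnded _:
                                day.Ended++;
                                break;
                            case Travel travel:
                                // Same idea as FanOut<Travel, Movement>(x => x.Movements)
                                foreach (var movement in travel.Movements)
                                {
                                    Apply(day, movement);
                                }
                                break;
                        }
                    }

                    return day;
                });
        }

        private static void Apply(Day day, Movement movement)
        {
            switch (movement.Direction)
            {
                case Direction.North: day.North += movement.Distance; break;
                case Direction.East: day.East += movement.Distance; break;
                case Direction.South: day.South += movement.Distance; break;
                case Direction.West: day.West += movement.Distance; break;
                default: throw new ArgumentOutOfRangeException(nameof(movement));
            }
        }
    }
}
```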

You’d most likely need to run a projection that crosses streams in the asynchronous projection lifecycle where the projection is executed in a background process, but I’m leaving that to another blog post.

Summary & What’s Next?

I focused strictly on aggregation projections in this post, but there are plenty of use cases that don’t fit into that mold. In subsequent posts I’ll explore the other options for projections in Marten V4. The asynchronous projection support also got a full rewrite in Marten V4, so I’ll share plenty more about that.

In other posts, I’ll discuss some other improvements in the event capture process for reliability and concurrency issues. Hopefully for the next alpha release, we’ll be able to utilize native Postgresql database sharding to allow for far more scalability in Marten’s event sourcing support.

What would it take for you to adopt Marten?

If you’re stumbling in here without any knowledge of the Marten project (or me), Marten is an open source .Net library that developers can adopt in their project to use the rock solid Postgresql database as a pretty full featured document database and event store. If you’re unfamiliar with Marten, I think I’d say its feature set makes it similar to MongoDb (but the usage is significantly different), RavenDb, or Cosmos Db. On the event sourcing side of things, I think the only comparison in .Net world is GetEventStore itself, but you can certainly piece together an event store by combining other OSS libraries and database engines.

The Marten community is working very hard on our forthcoming (and long delayed) V4.0 release. We’ve already made some big strides on the document database side of things, and now we’re deep into some significant event store improvements (this link looks best in VS Code w/ the Mermaid plugin active). At Calavista, we’re considering if and how we can build a development practice around Marten for existing and potential clients. I’ve obviously got a lot of skin in the game here as the original creator of Marten. Nothing would make me happier than Marten being even more successful and that I get to help Calavista clients use Marten in real life systems as part of my day job.

I’d really like to hear from other folks what it would really take for them to seriously consider adopting Marten. What is Marten lacking now that you would need, or what kind of community or company support options would be necessary for your shop to use Marten in projects? I’m happy to hear any and all feedback or suggestions from as many people as I can get to respond.

I’m happy to take comments here, or the discussion for this topic is also on GitHub.

Existing Strengths

  • Marten is only a library, and at least for the document database features it’s very unobtrusive in your application code compared to many other persistence options
  • The Marten community is active and I hope you’d say that we’re welcoming to newcomers
  • By building on top of Postgresql, Marten comes with good cloud support from all the major cloud providers and plenty of existing monitoring options
  • Marten comes with many of the very real productivity advantages of a NoSQL solution, but has very strong transactional support from Postgresql itself
  • Marten’s event sourcing functionality comes “in the box” and there’s less work to do to fully incorporate event sourcing — including the all important “read side projection” support — into a .Net architecture than many other alternatives
  • Marten is part of the .Net Foundation
  • If you need commercial support for Marten, you can engage with Calavista Software.

Does any of that resonate with you? If you’ve used Marten before, is there anything missing from that list? And feel free to tell me you’re dubious about anything I’m claiming in the list above.

What’s already done or in flight

  • We made a lot of improvements to Marten’s Linq provider support. Not just in terms of expanding the querying scenarios we support, but also in improving the performance of the library across the board. I know this has been a source of trouble for many users in the past, and I’m excited about the improvements we’ve made in V4.
  • The event store functionality will get a lot more documentation — including sample applications — for V4
  • An important part of many event sourcing architectures is a background process to continuously build “projected” views of the raw events coming in. The current version of Marten has this capability, but it requires the user to do a lot of heavy architectural lifting to use it in any kind of clustered application. In V4, we’ll have an in the box recipe that will be used to do leader election and work distribution through an application cluster in “real server applications.” The asynchronous projection support in V4 will also support multi-tenancy (finally) and we have some ideas to greatly optimize projection rebuilds without system downtime
  • Using native Postgresql sharding for scalability, especially for the event store
  • Allowing users to specify event archival rules to keep the event store tables smaller and more performant
  • Adding more integration with .Net’s generic HostBuilder and standard logging abstractions for easier integration into .Net applications
  • Improving multi-tenancy usage based on user feedback
  • Document and event store metadata capabilities like you’d need for Marten to take part in end to end Open Telemetry tracing within your architecture.
  • More sample applications. To be honest, I’m hoping to find published reference applications built with Entity Framework Core and shift them to Marten. This might be part of an effort to show Jasper as a replacement for MediatR or NServiceBus/MassTransit as well.

And again, does any of that address whatever concerns you might have about adopting Marten? Or that you’d already had in the past?

Other Ideas?

Here are some other ideas that have been kicked around for improving Marten usage, but these ideas would probably need to come through some sort of Marten commercialization or professional support.

  • Cloud hosting recipes. Hopefully through Calavista projects, I’d like to develop some pre-built guidance and quick recipes for standing up scalable and maintainable Marten/Postgresql environments on both Azure and AWS. This would include schema migrations, monitoring, dynamic scaling, and any necessary kind of database provisioning. I think this might get into Terraform/Pulumi infrastructure as well.
  • Cloud hosting models for parallelizing and distributing work with asynchronous event projections. Maybe even getting into dynamic scaling.
  • Multi-tenancy through separate databases for each client tenant. You can pull this off today yourself, but there’s a lot of things to manage. Here I’m proposing more cloud hosting recipes for Marten/Postgresql that would include schema migrations and distributed work strategies for processing asynchronous event projections across the tenant databases.
  • Some kind of management user interface? I honestly don’t know what we’d do with that yet, but other folks have asked for something.
  • Event streaming Marten events through Kafka, Pulsar, AWS Kinesis, or Azure Event Hubs
  • Marten Outbox and Inbox approaches with messaging tools. I’ve already got this built and working with Jasper, but we could extend this to MassTransit or NServiceBus as well.

Planned Event Store Improvements for Marten V4, Daft Punk Edition

There’s a new podcast about Marten on the .Net Core Show that posted last week.

Marten V4 development has been heavily underway this year. To date, the work has mostly focused on the document store functionality (Linq, general performance improvements, and document metadata).

While I certainly hope the other improvements to Marten V4 will make a positive difference to our users, the big leap forward in capability is going to be on the event sourcing side of Marten. We’ve gathered a lot of user feedback on this feature set in the past couple years, but there’s always room for more discussion as things are taking shape.

First though, to set the mood:

The master issue for V4 event sourcing improvements is on GitHub here.

Scalability

We know there’s plenty of concern about how well Marten’s event store will scale over time. Beyond the performance improvements I’ll try to outline in the sections below, we’re planning to introduce support for:

Event Metadata

Similar to the document storage, the event storage in V4 will allow users to capture additional metadata with each event. There will be support in the event store Linq provider to query against this metadata, and this metadata will be available to the projections. Right now, the plan is to have opt in, additional fields for:

  • Correlation Id
  • Causation Id
  • User name

Additionally, the plan is to also have a “headers” field for user defined data that does not fall into the fields listed above. Marten will capture the metadata at the session level, with the thinking being that you could opt into custom Marten session creation that would automatically apply metadata for the current HTTP request or service bus message or logical unit of work.

There’ll be a follow up post on this soon.
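
To make the session-level idea above a little more concrete, here is a minimal sketch in plain C#. Every type and member name in it (EventMetadata, StoredEvent, MetadataSession) is hypothetical and made up for illustration; none of it is Marten's API, and the real V4 design may look quite different. It only shows the shape of "set the metadata once per session, stamp it on every appended event".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; these names do not come from Marten
public record EventMetadata(
    string CorrelationId,
    string CausationId,
    string UserName,
    IReadOnlyDictionary<string, object> Headers);

public record StoredEvent(Guid StreamId, object Data, EventMetadata Metadata);

// Stands in for a session created once per HTTP request, message, or unit of work
public class MetadataSession
{
    private readonly EventMetadata _metadata;
    private readonly List<StoredEvent> _pending = new();

    public MetadataSession(EventMetadata metadata) => _metadata = metadata;

    public IReadOnlyList<StoredEvent> Pending => _pending;

    // Every event appended through this session carries the same metadata
    public void Append(Guid streamId, params object[] events)
    {
        foreach (var e in events)
        {
            _pending.Add(new StoredEvent(streamId, e, _metadata));
        }
    }
}
```

The design point is that application code appending events never touches the metadata directly; whatever creates the session for the current request supplies it once.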

Event Capture Improvements

When events are appended to event streams, we’re planning some small improvements for V4:

Projections, Projections, Projections!

This work is heavily in flight, so please shoot any feedback you might have our (Marten team’s) way.

Building your own event store is actually pretty easy — until the time you want to actually do something with the events you’ve captured or keep a “read-side” view of the status up to date with the incoming events. Based on a couple years of user feedback, all of that is exactly where Marten needs to grow up the most.

The master issue tracking the projection improvements is here. The Marten community (mostly me to be honest) has gone back and forth quite a bit on the shape of the new projection work and nothing I say here is set in stone. The main goals are to:

  • Significantly improve performance and throughput. We’re doing this partially by reducing in memory object allocations, but mostly by introducing much, much more parallelization of the projection work in the async daemon.
  • Simplify the usage of immutable data structures as the projected documents (note that we have plenty of F# users, and now C# record types make that a lot easier too).
  • Introduce snapshotting
  • Supplement the existing ViewProjection mechanism with conventional methods similar to the .Net Startup class
  • Completely gut the existing ViewProjection to improve its performance while hopefully avoiding breaking API compatibility

There is some thought about breaking the projection support into its own project or making the event sourcing support be storage-agnostic, but I’m not sure about that making it to V4. My personal focus is on performance and scalability, and way too many of the possible optimizations seem to require coupling to details of Marten’s existing storage.

“Async Daemon”

The Async Daemon is an under-documented Marten subsystem we use to process asynchronously built event projections and do projection rebuilds. While it’s “functional” today, it has a lot of shortcomings (it can only run in one node at a time, and we don’t have any kind of leader election or failover) that prevent most folks from adopting it.

The master issue for the Async Daemon V4 is here, but the tl;dr is:

  • Make sure there’s adequate documentation (duh.)
  • Should be easy to integrate in your application
  • Has to be able to run in an application cluster in such a way that it guarantees that every projected view (or slice of a projected view) is being updated on exactly one node at a time
  • Improved performance and throughput of normal projection building
  • No downtime projection rebuilds
  • Way, way faster projection rebuilds

Now, to the changes coming in V4. Let’s assume that you’re doing “serious” work and need to host your Marten-using .Net Core application across multiple nodes via some sort of cloud hosting. With minimal configuration, you’d like to have the asynchronous projection building “just work” across your cluster.

Here’s a visual representation of my personal “vision” for the async daemon in V4:

In V4 the async daemon will become a .Net Core BackgroundService that will be registered by the AddMarten() integration with HostBuilder. That mechanism will allow us to run background work inside of your .Net Core application.

Inside that background process the async daemon is going to have to elect a single “leader/distributor” agent that can only run on one node. That leader/distributor agent will be responsible for assigning work to the async daemon running inside all the active nodes in the application. What we’re hoping to do is to distribute and parallelize the projection building across running nodes. And oh yeah, do this without needing any other kind of infrastructure besides the Postgresql database.

Within a single node, we’re adding a lot more parallelization to the projection building instead of treating everything as a dumb “left fold” single threaded queue problem. I’m optimistic that that’s going to make a huge difference for throughput. On top of that, I’m hoping that the new async daemon will be able to split work between different nodes without the nodes stepping on each other.
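
As a rough illustration of why the single-node work can be parallelized at all, here is a small sketch using only the .Net base class library. The DepositMade and AccountBalance types and the ProjectionSketch class are hypothetical, and this is not how the async daemon is implemented. The assumption it demonstrates is that events from different streams are independent: each stream is still folded in version order, but separate streams can be folded on separate threads.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event and view types for illustration only
public record DepositMade(Guid StreamId, long Version, decimal Amount);

public record AccountBalance(Guid StreamId, decimal Total);

public static class ProjectionSketch
{
    // The classic single-threaded "left fold" over the events of one stream
    public static AccountBalance Fold(Guid streamId, IEnumerable<DepositMade> events) =>
        events
            .OrderBy(e => e.Version)
            .Aggregate(
                new AccountBalance(streamId, 0m),
                (view, e) => view with { Total = view.Total + e.Amount });

    // Streams do not depend on each other, so PLINQ can fold them concurrently
    // while the events inside each stream are still applied in version order
    public static IReadOnlyDictionary<Guid, AccountBalance> BuildAll(IEnumerable<DepositMade> events) =>
        events
            .GroupBy(e => e.StreamId)
            .AsParallel()
            .Select(stream => Fold(stream.Key, stream))
            .ToDictionary(view => view.StreamId);
}
```

Projections that aggregate across streams would need a different way to slice the work, which is part of why the details are still being worked out.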

There’s still plenty of details to work out, and this post is just meant to be a window into some of the work that is happening within Marten for our big V4 release sometime in 2021.

Marten V4 Preview: Command Line Administration

TL;DR — It’s going to be much simpler in V4 to incorporate Marten’s command line administration tools into your .Net Core application.

In my last post I started to lay out some of the improvements in the forthcoming Marten V4 release with our first alpha Nuget release. In this post, I’m going to show the improvements to Marten’s command line package that can be used for some important database administration and schema migrations.

Unlike ORM tools like Entity Framework (it’s a huge pet peeve of mine when people describe Marten as an ORM), Marten by and large tries to allow you to be as productive as possible by keeping your focus on your application code instead of having to spend much energy and time on the details of your database schema. At development time you can just have Marten use its AutoCreate.All mode and it’ll quietly do anything it needs to do with your Postgresql database to make the document storage work at runtime.

For real production though, it’s likely that you’ll want to explicitly control when database schema changes happen. It’s also likely that you won’t want your application to have permissions to change the underlying database schema on the fly. To that end, Marten has quite a bit of functionality to export database schema updates for formal database migrations.
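
A minimal sketch of that split between development and production might look like the following. The AddMartenForEnvironment helper and its isDevelopment parameter are hypothetical names invented here. The sketch assumes the Marten v3.12 era API, where the AutoCreate enum lives in the Marten namespace; the enum's namespace may differ in other versions.

```csharp
using Marten;
using Microsoft.Extensions.DependencyInjection;

public static class MartenRegistration
{
    // Hypothetical helper; isDevelopment would come from your hosting environment
    public static void AddMartenForEnvironment(
        this IServiceCollection services,
        string connectionString,
        bool isDevelopment)
    {
        services.AddMarten(opts =>
        {
            opts.Connection(connectionString);

            // Let Marten create or change schema objects on the fly only in development.
            // Everywhere else, schema changes go through exported migration scripts.
            opts.AutoCreateSchemaObjects = isDevelopment
                ? AutoCreate.All
                : AutoCreate.None;
        });
    }
}
```

With AutoCreate.None the application never attempts schema changes at runtime, so the database user it runs as does not need those permissions.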

We’ve long supported an add on package called Marten.CommandLine that lets you build your own command line tool to help manage these schema updates, but to date it’s required you to build a separate console application parallel to your application and has probably not been that useful to most folks.

In V4 though, we’re exploiting the Oakton.AspNetCore library that allows you to embed command line utilities directly into your .Net Core application. Let’s make that concrete with a small sample application in Marten’s GitHub repository.

Before I dive into that code, Marten v3.12 added a built in integration for Marten into the .Net Core generic HostBuilder that we’re going to depend on here. Using the HostBuilder for configuring and bootstrapping Marten into your application allows you to use the exact same Marten configuration and application configuration in the Marten command utilities without any additional work.

This sample application was built with the standard dotnet new webapi template. On top of that, I added a reference to the Marten.CommandLine library.

.Net Core applications tend to be configured and bootstrapped by a combination of a Program.Main() method and a Startup class. First, here’s the Program.Main() method from the sample application:

public class Program
{
    // It's actually important to return Task<int>
    // so that the application commands can communicate
    // success or failure
    public static Task<int> Main(string[] args)
    {
        return CreateHostBuilder(args)

            // This line replaces Build().Run()
            // in most dotnet new templates
            .RunOaktonCommands(args);
    }

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureWebHostDefaults(webBuilder =>
            {
                webBuilder.UseStartup<Startup>();
            });
}

Note the signature of the Main() method and how it uses the RunOaktonCommands() method to intercept the command line arguments and execute named commands (with the default being to just run the application like normal).

Now, the Startup.ConfigureServices() method with Marten added in is this:

public void ConfigureServices(IServiceCollection services)
{
    // This is the absolute, simplest way to integrate Marten into your
    // .Net Core application with Marten's default configuration
    services.AddMarten(Configuration.GetConnectionString("Marten"));
}

Now, to the actual command line. As long as the Marten.CommandLine assembly is referenced by your application, you should see the additional Marten commands. From your project’s root directory, run dotnet run -- help and we see there’s some additional Marten-related options:

Oakton command line options with Marten.CommandLine in play

And that’s it. Now you can use dotnet run -- dump to export out all the SQL to recreate the Marten database schema, or maybe dotnet run -- patch upgrade_staging.sql --e Staging to create a SQL patch file that would make any necessary changes to upgrade your staging database to reflect the current Marten configuration (assuming that you’ve got an appsettings.Staging.json file with the right connection string pointing to your staging Postgresql server).

Check out the Marten.CommandLine documentation for more information on what it can do, but expect some V4 improvements to that as well.

Marten V4 Preview: Linq and Performance

Marten is an open source library for .Net that allows developers to treat the robust Postgresql database as a full featured and transactional document database (NoSQL) as well as supporting the event sourcing pattern of application persistence.

After a false start last summer, development on the long awaited and delayed Marten V4.0 release is heavily in flight and we’re making a lot of progress. The major focus of the remaining work is improving the event store functionality (that I’ll try to blog about later in the week if I can). We posted the first Marten V4 alpha on Friday for early adopters — or folks that need Linq provider fixes ASAP! — to pull down and start trying out. So far the limited feedback has been a nearly seamless upgrade.

You can track the work and direction of things through the GitHub issues that are already done and the ones that are still planned.

For today though, I’d like to focus on what’s been done so far in V4 in terms of making Marten simply better and faster at its existing feature set.

Being Faster by Doing Less

One of the challenging things about Marten’s feature set is the unique permutations of what exactly happens when you store, delete, or load a document to and from the database. For example, some documents may or may not be:

On top of that, Marten supports a couple different flavors of document sessions:

  • Query-only sessions that are strictly read only querying
  • The normal session that supports an internal identity map functionality that caches previously loaded documents
  • Automatic dirty checking sessions that are the heaviest Marten sessions
  • “Lightweight” sessions that don’t use any kind of identity map caching or automatic dirty checking for faster performance and better memory usage — at the cost of a little more developer written code.

The point here is that there’s a lot of variability in what exactly happens when you save, load, or delete a document with Marten. In the current version, Marten uses a combination of runtime if/then logic, some “Nullo” classes, and a little bit of “Expression to Lambda” runtime compilation.

For V4, I completely re-wired the internals to use C# code generated and compiled at runtime using Roslyn’s runtime compilation capabilities. Marten is using the LamarCompiler and LamarCodeGeneration libraries as helpers. You can see these two libraries and this technique in action in a talk I gave at NDC London in 2019.

The end result of all this work is that we can generate the tightest possible C# handling code and the tightest possible SQL for the exact permutation of document storage characteristics and session type. Along the way, we’ve striven to reduce the number of dictionary lookups, runtime branching logic, empty Nullo objects, and generally the number of computer instructions that would have to be executed by the underlying processor just to save, load, or delete a document.

So far, so good. It’s hard to say exactly how much this is going to impact any given Marten-using application, but the existing test suite clearly runs faster now and I’m not seeing any noticeable issue with the “cold start” of the initial, one time code generation and compilation (that was a big issue in early Roslyn to the point where we ripped that out of pre 1.0 Marten, but seems to be solved now).

If anyone is curious, I’d be happy to write a blog post diving into the guts of how that works. And why the new .Net source generator feature wouldn’t work in this case if anyone wants to know about that too.

Linq Provider Almost-Rewrite

To be honest, I think Marten’s existing Linq provider (pre-V4) is pretty well stuck at the original proof of concept stage thrown together 4-5 years ago. The number of open issues where folks had hit limitations in the Linq provider support built up — especially with anything involving child collections on document types.

For V4, we’ve heavily restructured the Linq parsing and SQL generation code to address the previous shortcomings. There’s a little bit of improvement in the performance of Linq parsing and also a little bit of optimization of the SQL generated by avoiding unnecessary CASTs. Most of the improvement has been toward addressing previously unsupported scenarios. A potential improvement that we haven’t yet exploited much is to make the SQL generation and Linq parsing more able to support custom value types and F#-isms like discriminated unions through a new extensibility mechanism that teaches Marten about how these types are represented in the serialized JSON storage.

Querying Descendent Collections

Marten pre-V4 didn’t handle querying through child collections very well and that’s been a common source of user issues. With V4, we’re heavily using the Common Table Expression query support in Postgresql behind the scenes to make Linq queries like this one shown below possible:

var results = theSession.Query<Top>()
    .Where(x => x.Middles.Any(b => b.Bottoms.Any()))
    .ToList();

I think that at this point Marten can handle any combination of querying through child collections through any number of levels with all possible query operators (Any() / Count()) and any supported Where() fragment within the child collection.

Multi-Document Includes

Marten has long had some functionality for fetching related documents together in one database round trip for more efficient document reading. A long time limitation in Marten is that this Include() capability was only usable for logical “one to one” or “many to one” document relationships. In V4, you can now use Include() querying for “one to many” relationships as shown below:

[Fact]
public void include_many_to_list()
{
    var user1 = new User { };
    var user2 = new User { };
    var user3 = new User { };
    var user4 = new User { };
    var user5 = new User { };
    var user6 = new User { };
    var user7 = new User { };

    theStore.BulkInsert(new User[]{user1, user2, user3, user4, user5, user6, user7});

    var group1 = new Group
    {
        Name = "Odds",
        Users = new []{user1.Id, user3.Id, user5.Id, user7.Id}
    };

    var group2 = new Group {Name = "Evens", Users = new[] {user2.Id, user4.Id, user6.Id}};

    using (var session = theStore.OpenSession())
    {
        session.Store(group1, group2);
        session.SaveChanges();
    }

    using (var query = theStore.QuerySession())
    {
        var list = new List<User>();

        query.Query<Group>()
            .Include(x => x.Users, list)
            .Where(x => x.Name == "Odds")
            .ToList()
            .Single()
            .Name.ShouldBe("Odds");

        list.Count.ShouldBe(4);
        list.Any(x => x.Id == user1.Id).ShouldBeTrue();
        list.Any(x => x.Id == user3.Id).ShouldBeTrue();
        list.Any(x => x.Id == user5.Id).ShouldBeTrue();
        list.Any(x => x.Id == user7.Id).ShouldBeTrue();
    }
}

This was a longstanding request from users, and to be honest, we had to completely rewrite the Include() internals to add this support. Again, we used Common Table Expression SQL statements in combination with per session temporary tables to pull this off.

Compiled Queries Actually Work

I think the Compiled Query feature is unique in Marten. It’s probably easiest and best to think of it as a “stored procedure” for Linq queries in Marten. The value of a compiled query in Marten is:

  1. It potentially cleans up the application code that has to interact with Marten queries, especially for more complex queries
  2. It’s potentially some reuse for commonly executed queries
  3. Mostly though, it’s a significant performance improvement because it allows Marten to “remember” the Linq query plan.

While compiled queries have been supported since Marten 1.0, there’s been a lot of gap between what works in Marten’s Linq support and what functions correctly inside of compiled queries. With the advent of V4, the compiled query planning was rewritten with a new strategy that so far seems to support all of the Linq capabilities of Marten. We think this will make the compiled query feature much more useful going forward.

Here’s an example compiled query that was not possible before V4:

public class FunnyTargetQuery : ICompiledListQuery<Target>
{
    public Expression<Func<IMartenQueryable<Target>, IEnumerable<Target>>> QueryIs()
    {
        return q => q
            .Where(x => x.Flag && x.NumberArray.Contains(Number));
    }

    public int Number { get; set; }
}

And in usage:

var actuals = session.Query(new FunnyTargetQuery{Number = 5}).ToArray();
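
To illustrate what “remembering” a query plan buys you, here is a deliberately simplified sketch in plain C#. QueryPlan and PlanCache are hypothetical names and this is not Marten's implementation. The assumption is only that the expensive parsing step can be done once per compiled query type, with later executions reusing the stored plan and supplying fresh parameter values such as Number.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for the result of parsing a Linq expression once
public record QueryPlan(string Sql);

public class PlanCache
{
    private readonly ConcurrentDictionary<Type, QueryPlan> _plans = new();
    private int _plansBuilt;

    // How many times the expensive parse step actually ran
    public int PlansBuilt => _plansBuilt;

    // Looks up the plan by compiled query type and only parses on a cache miss
    public QueryPlan PlanFor(Type queryType, Func<Type, QueryPlan> parse) =>
        _plans.GetOrAdd(queryType, type =>
        {
            Interlocked.Increment(ref _plansBuilt);
            return parse(type);
        });
}
```

Calling PlanFor() repeatedly with the same query type only runs the parse delegate on the first call in the single-threaded case. Under contention ConcurrentDictionary may invoke the factory more than once, but only one plan is ever stored.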

Multi-Level SelectMany because why not?

Marten has long supported the SelectMany() operator in the Linq provider support, but in V4 it’s much more robust with the ability to chain SelectMany() clauses n-deep and do that in combination with any kind of Count() / Distinct() / Where() / OrderBy() Linq clauses. Here’s an example:

[Fact]
public void select_many_2_deep()
{
    var group1 = new TargetGroup
    {
        Targets = Target.GenerateRandomData(25).ToArray()
    };

    var group2 = new TargetGroup
    {
        Targets = Target.GenerateRandomData(25).ToArray()
    };

    var group3 = new TargetGroup
    {
        Targets = Target.GenerateRandomData(25).ToArray()
    };

    var groups = new[] {group1, group2, group3};

    using (var session = theStore.LightweightSession())
    {
        session.Store(groups);
        session.SaveChanges();
    }

    using var query = theStore.QuerySession();

    var loaded = query.Query<TargetGroup>()
        .SelectMany(x => x.Targets)
        .Where(x => x.Color == Colors.Blue)
        .SelectMany(x => x.Children)
        .OrderBy(x => x.Number)
        .ToArray()
        .Select(x => x.Id).ToArray();

    var expected = groups
        .SelectMany(x => x.Targets)
        .Where(x => x.Color == Colors.Blue)
        .SelectMany(x => x.Children)
        .OrderBy(x => x.Number)
        .ToArray()
        .Select(x => x.Id).ToArray();

    loaded.ShouldBe(expected);
}

Again, we pulled that off with Common Table Expression statements.

Calling Generic Methods from Non-Generic Code in .Net

Somewhat often (or at least it feels that way this week) I’ll run into the need to call a method with a generic type argument from code that isn’t generic. To make that concrete, here’s an example from Marten. The main IDocumentSession service has a method called Store() that directs Marten to persist one or more documents of the same type. That method has this signature:

void Store<T>(params T[] entities);

That method would typically be used like this:

using (var session = store.OpenSession())
{
    // The generic type argument "Team" is inferred from the usage
    session.Store(new Team { Name = "Warriors" });
    session.Store(new Team { Name = "Spurs" });
    session.Store(new Team { Name = "Thunder" });

    session.SaveChanges();
}

Great, and easy enough (I hope), but Marten also has this method where folks can add a heterogeneous mix of any kind of document types all at once:

void StoreObjects(IEnumerable<object> documents);

Internally, that method groups the documents by type, then delegates to the proper Store<T>() method for each document type, and that’s where this post comes into play.

(Re-)Introducing Baseline

Baseline is a library available on Nuget that provides oodles of little helper extension methods on common .Net types and very basic utilities that I use in almost all my projects, both OSS and at work. Baseline is an improved subset of what was long ago FubuCore (FubuCore was huge, and it also spawned Oakton), but somewhat adapted to .Net Core.

I wanted to call this library “spackle” because it fills in usability gaps in the .Net base class library, but Jason Bock beat me to it with his Spackle library of extension methods. Since I expected this library to be used as a foundational piece from within basically all the projects in the JasperFx suite, I chose the name “Baseline” which I thought conveniently enough described its purpose and also because there’s an important throughway near the titular Jasper called “Baseline”. I don’t know for sure that it’s the basis for the name, but the Battle of Carthage in the very early days of the US Civil War started where this road is today.

Crossing the Non-Generic to Generic Divide with Baseline

Back to the problem of Marten’s StoreObjects(IEnumerable<object>) calling Store<T>(T[]). Baseline has a helper extension method called CloseAndBuildAs<T>() that I frequently use to solve this problem. It’s unfortunately a little tedious, but first design a non-generic interface that will wrap the calls to Store<T>() like this:

internal interface IHandler
{
    void Store(IDocumentSession session, IEnumerable<object> objects);
}

And a concrete, open generic type that implements IHandler:

internal class Handler<T>: IHandler
{
    public void Store(IDocumentSession session, IEnumerable<object> objects)
    {
        // Delegate to the Store<T>() method
        session.Store(objects.OfType<T>().ToArray());
    }
}

Now, the StoreObjects() method looks like this:

public void StoreObjects(IEnumerable<object> documents)
{
    assertNotDisposed();

    var documentsGroupedByType = documents
        .Where(x => x != null)
        .GroupBy(x => x.GetType());

    foreach (var group in documentsGroupedByType)
    {
        // Build the right handler for the group type
        var handler = typeof(Handler<>).CloseAndBuildAs<IHandler>(group.Key);
        handler.Store(this, group);
    }
}

The CloseAndBuildAs<T>() method above does a couple things behind the scenes:

  1. It creates a closed type for the proper Handler<T> based on the type arguments passed into the method
  2. Uses Activator.CreateInstance() to build the concrete type
  3. Casts that object to the interface supplied as a generic argument to the CloseAndBuildAs<T>() method

The method shown above is here in GitHub. It’s not shown, but there are some extra overloads to also pass in constructor arguments to the concrete types being built.
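
For readers who do not want to click through, the three steps above can be approximated with nothing but the base class library. This is a sketch of the technique and not Baseline's actual source. The method is deliberately given a different, hypothetical name (CloseAndCreate) so it is not mistaken for the real thing, and the IDescriber and Describer<T> types exist only to exercise it.

```csharp
using System;

public static class GenericTypeSketch
{
    // Hypothetical re-creation of the idea behind CloseAndBuildAs<T>()
    public static T CloseAndCreate<T>(this Type openType, params Type[] typeArguments)
    {
        // 1. Close the open generic type, e.g. Describer<> + string => Describer<string>
        var closedType = openType.MakeGenericType(typeArguments);

        // 2. Build the closed type through its public parameterless constructor
        var instance = Activator.CreateInstance(closedType);

        // 3. Cast to the (usually non-generic) type the caller asked for
        return (T)instance!;
    }
}

// Tiny stand-ins for IHandler and Handler<T>
public interface IDescriber
{
    string Describe();
}

public class Describer<T> : IDescriber
{
    public string Describe() => typeof(T).Name;
}
```

With those types in place, typeof(Describer<>).CloseAndCreate<IDescriber>(typeof(string)) hands back a Describer<string> behind the non-generic IDescriber interface. Reflection is only paid when the object is built; every call after that is an ordinary interface call.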