
Improvements to Event Sourcing in Marten V4

Marten V4 is still in flight, but everything I’m showing in this post is in the latest alpha release (4.0.0-alpha.6) published to NuGet.

TL;DR: Marten V4 has some significant improvements to its event sourcing support that will help developers deal with potential concurrency issues. The related event projection support in Marten V4 is also significantly more robust than previous versions.

A Sample Project Management Event Store

Imagine, if you will, a simplistic application that uses Marten’s event sourcing functionality for project management tracking with a conceptual CQRS architecture. The domain will include tracking each project task within the greater project. In this case, I’m choosing to model the activity of a single project and its related tasks as a stream of events like:

  1. ProjectStarted
  2. TaskRecorded
  3. TaskStarted
  4. TaskFinished
  5. ProjectCompleted

Starting a new Event Stream

Using event sourcing as our persistence strategy, the real system state is the raw events that model state changes or transitions of our project. As an example, let’s say that our system records and initializes a “stream” of events for a new project with this command handler:

    public class NewProjectCommand
    {
        public string Name { get; set; }
        public string[] Tasks { get; set; }
        public Guid ProjectId { get; set; } = Guid.NewGuid();
    }

    public class NewProjectHandler
    {
        private readonly IDocumentSession _session;

        public NewProjectHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(NewProjectCommand command)
        {
            var timestamp = DateTimeOffset.Now;
            var started = new ProjectStarted  
            {
                Name = command.Name, 
                Timestamp = timestamp
            };

            var tasks = command.Tasks
                .Select(name => new TaskRecorded {
                    Timestamp = timestamp, 
                    Title = name
                });

            // This tells Marten that it should create a new project
            // stream
            _session.Events.StartStream(command.ProjectId, started);

            // This quietly appends events to the event stream
            // created in the line of code above
            _session.Events.Append(command.ProjectId, tasks);

            // The actual persistence to Postgresql happens here
            return _session.SaveChangesAsync();
        }
    }

The call to StartStream() up above tells Marten that it should create a new event stream with the supplied project id. As a long-requested improvement, StartStream() in V4 guarantees that the transaction cannot succeed if another project event stream with that project id already exists. This helps the system prevent users from accidentally duplicating projects — at least by project id anyway.

Assuming that there is no existing stream with the project id, Marten will create a new record to track the new project stream and individual records in Postgresql to persist the raw event data along with metadata like the stream version, time stamps, and the event type.
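
If you do try to start a stream with an id that’s already in use, the commit fails. Here’s a quick sketch of handling that rejection in a handler. I’m assuming the ExistingStreamIdCollisionException type from the V4 alphas here, so verify against the current API before relying on it:

    // A sketch of guarding against duplicate project ids. The
    // ExistingStreamIdCollisionException type is my assumption from
    // the V4 alphas; check the current Marten API before copying this.
    public async Task<bool> TryStartProject(NewProjectCommand command, IDocumentSession session)
    {
        var started = new ProjectStarted
        {
            Name = command.Name,
            Timestamp = DateTimeOffset.Now
        };

        session.Events.StartStream(command.ProjectId, started);

        try
        {
            await session.SaveChangesAsync();
            return true;
        }
        catch (ExistingStreamIdCollisionException)
        {
            // Another project stream with this id already exists,
            // so report the command as rejected
            return false;
        }
    }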

But we need a “read side” projection of the Projects!

So great, we can persist the raw events, and there’s plenty of valuable things we can do later with those events. However, sooner or later our application is going to need to know the current state of an ongoing project for user screens or validation logic, so we need some way to compile the events for a single project into some kind of “here’s the current state of the Project” data structure — and that’s where Marten’s support for projections comes into play.

To model a “projected” view of a project from its raw events, I’m creating a single Project class to model the full state of a single, ongoing project. To reduce the number of moving parts, I’m going to make Project be “self-aggregating” such that it’s responsible for mutating itself based on incoming events:

    public class Project
    {
        private readonly List<ProjectTask> _tasks = new List<ProjectTask>();

        public Project(ProjectStarted started)
        {
            Version = 1;
            Name = started.Name;
            StartedTime = started.Timestamp;
        }

        // This gets set by Marten
        public Guid Id { get; set; }

        public long Version { get; set; }
        public DateTimeOffset StartedTime { get; private set; }
        public DateTimeOffset? CompletedTime { get; private set; }

        public string Name { get; private set; }

        public ProjectTask[] Tasks
        {
            get
            {
                return _tasks.ToArray();
            }
            set
            {
                _tasks.Clear();
                _tasks.AddRange(value);
            }
        }

        public void Apply(TaskRecorded recorded, IEvent e)
        {
            Version = e.Version;
            var task = new ProjectTask
            {
                Title = recorded.Title,
                Number = _tasks.Any() ? _tasks.Max(x => x.Number) + 1 : 1,
                Recorded = recorded.Timestamp
            };

            _tasks.Add(task);
        }

        public void Apply(TaskStarted started, IEvent e)
        {
            Version = e.Version;
            var task = _tasks.FirstOrDefault(x => x.Number == started.Number);

            // Remember this isn't production code:)
            if (task != null) task.Started = started.Timestamp;
        }

        public void Apply(TaskFinished finished, IEvent e)
        {
            Version = e.Version;
            var task = _tasks.FirstOrDefault(x => x.Number == finished.Number);

            // Remember this isn't production code:)
            if (task != null) task.Finished = finished.Timestamp;
        }

        public void Apply(ProjectCompleted completed, IEvent e)
        {
            Version = e.Version;
            CompletedTime = completed.Timestamp;
        }
    }

I didn’t choose to use that here, but I want to point out that you can use immutable aggregates with Marten V4. In that case you’d simply have the Apply() methods return a new Project object (as far as Marten is concerned anyway). Functional programming enthusiasts will cheer the built-in support for immutability, some of you will point out that it leads to less efficient code through extra object allocations and more ceremony, and I will happily say that Marten V4 lets you make that decision to suit your own needs and preferences;-)
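
To illustrate, here’s a minimal, hypothetical sketch of an immutable self-aggregate. The ProjectSummary type below is mine for illustration only; it isn’t part of the sample domain:

    // A hypothetical immutable aggregate: each Apply() returns a brand
    // new instance instead of mutating the existing one, and Marten
    // persists whatever object comes back.
    public class ProjectSummary
    {
        public ProjectSummary(ProjectStarted started)
        {
            Name = started.Name;
            TaskCount = 0;
        }

        private ProjectSummary(Guid id, string name, int taskCount)
        {
            Id = id;
            Name = name;
            TaskCount = taskCount;
        }

        // Marten still needs a settable identity
        public Guid Id { get; set; }

        public string Name { get; }
        public int TaskCount { get; }

        // Returns a new object rather than mutating this one
        public ProjectSummary Apply(TaskRecorded recorded)
        {
            return new ProjectSummary(Id, Name, TaskCount + 1);
        }
    }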

Also see Event Sourcing with Marten V4: Aggregated Projections for more information on other alternatives for expressing aggregated projections in Marten.

Working in our project management domain, I’d like the Project aggregate document to be updated every time an event related to a project is captured. I’ll add the Project document as a self-aggregating, inline projection in my Marten system like this:

    public class Startup
    {
        public void ConfigureServices(IServiceCollection services)
        {
            services.AddMarten(opts =>
            {
                opts.Connection("some connection");

                // Direct Marten to update the Project aggregate
                // inline as new events are captured
                opts.Events
                    .Projections
                    .SelfAggregate<Project>(ProjectionLifecycle.Inline);

            });
        }
    }

As an example of how inline projections come into play, let’s look at another sample command handler for adding a new task with a new task title to an ongoing project:

    public class CreateTaskCommand
    {
        public Guid ProjectId { get; set; }
        public string Title { get; set; }
    }

    public class CreateTaskHandler
    {
        private readonly IDocumentSession _session;

        public CreateTaskHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CreateTaskCommand command)
        {
            var recorded = new TaskRecorded
            {
                Timestamp = DateTimeOffset.UtcNow, 
                Title = command.Title
            };

            _session.Events.Append(command.ProjectId, recorded);
            return _session.SaveChangesAsync();
        }
    }

When SaveChangesAsync() is called above, Marten will issue a single database commit that captures a new TaskRecorded event. In that same commit Marten will fetch the persisted Project document for that project event stream, apply the changes from the new event, and persist the Project document that reflects the information in the events.

Let’s say that we have a user interface in our project management system that allows users to review and edit projects and tasks. Independent of the event streams, the “query” side of our application can happily retrieve the latest data for the Project documents by querying with Marten’s built in document database functionality like this simple MVC controller endpoint:

    public class ProjectController: ControllerBase
    {
        private readonly IQuerySession _session;

        public ProjectController(IQuerySession session)
        {
            _session = session;
        }

        [HttpGet("/api/project/{projectId}")]
        public Task<Project> Get(Guid projectId)
        {
            return _session.LoadAsync<Project>(projectId);
        }
    }

In a future blog post, I’ll show you a new Marten V4 feature for ultra efficient “read side” web services that streams JSON data directly from Postgresql straight down the HTTP response body without ever wasting any time on JSON serialization.

Concurrency Problems Everywhere!!!

Don’t name a software project “Genesis”. Hubristic project names lead to massive project failures.

The “inline” projection strategy is easy to use in Marten, but it’s vulnerable to oddball concurrency problems if you’re not careful in your system design. Let’s say that we have two coworkers named Jack and Jill using our project management system. Jack pulls up the data for the “Genesis Project” in the UI client, then gets pulled into a hallway conversation (remember when we used to have those in the before times?). While he’s distracted, Jill makes some changes to the “Genesis Project” to record some tasks she and Jack had talked about earlier and saves the changes to the server. Jack finally gets back to his machine and makes basically the same edits to “Genesis” that Jill already did and saves the data. If we build our project management system naively, we’ve now allowed Jack and Jill to duplicate work and the “Genesis Project” task management is all messed up.

One way to prevent that concurrent change issue is to detect that the project has been changed by Jill when Jack tries to persist his duplicate changes and give him the chance to update his screen with the latest data before pushing new project task changes.

Going back to the Project aggregate document: as Marten appends events to a stream, it assigns an incrementing numeric version number to each event within the stream. Let’s say in our project management system that we always want to know what the latest version of the project is. Marten V4 finally allows you to use event metadata like the event version number within inline projections (this has been a long running request from some of our users). You can see that in action in the method below that updates Project based on a TaskStarted event. You might notice that I also pass in the IEvent object that lets us access the event metadata:

public void Apply(TaskStarted started, IEvent e)
{
    // Update the Project document based on the event version
    Version = e.Version;
    var task = _tasks.FirstOrDefault(x => x.Number == started.Number);

    // Remember this isn't production code:)
    if (task != null) task.Started = started.Timestamp;
}

Now, having the stream version number on the Project document turns out to be very helpful for concurrency issues. Since we’re worried about a Project event stream getting out of sync when the system receives concurrent updates, you can pass the current version (conveniently kept up to date on the Project read-side document) down to the user interface, and have the user interface send back what it thinks is the current version of the project when it tries to make updates. If the underlying project event stream has changed since the user interface fetched the original data, we can make Marten reject the additional events. This is a form of offline optimistic concurrency check: we assume that everything is hunky dory, and just let the infrastructure reject the changes as an exceptional case.

To put that into motion, let’s say that our project management user interface posts this command up to the server to close a single project task:

    public class CompleteTaskCommand
    {
        public Guid ProjectId { get; set; }
        public int TaskNumber { get; set; }
        
        // This is the version of the project data
        // that was being edited in the user interface
        public long ExpectedVersion { get; set; }
    }

In the command handler, we can direct Marten to reject the new events being appended if the stream has been changed in between the user interface having fetched the project data and submitting its updates to the server:

    public class CompleteTaskHandler
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            _session.Events.Append(
                command.ProjectId,

                // Using this overload will make Marten do
                // an optimistic concurrency check against
                // the existing version of the project event
                // stream as it commits
                command.ExpectedVersion,

                @event);
            return _session.SaveChangesAsync();
        }
    }

In the CompleteTaskHandler above, the call to SaveChangesAsync() will throw an EventStreamUnexpectedMaxEventIdException if the project event stream has advanced past the version assumed by the originator of the command in command.ExpectedVersion. To make our system more resilient, we would need to catch the Marten exception and handle the proper application workflow.
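
Here’s a rough sketch of what that workflow could look like. The CommandResult type is hypothetical, just a stand-in for however your application reports failures back to the caller:

    // A sketch of catching the optimistic concurrency failure.
    // CommandResult is a hypothetical application type, not anything
    // from Marten.
    public async Task<CommandResult> Handle(CompleteTaskCommand command)
    {
        var @event = new TaskFinished
        {
            Number = command.TaskNumber,
            Timestamp = DateTimeOffset.UtcNow
        };

        _session.Events.Append(command.ProjectId, command.ExpectedVersion, @event);

        try
        {
            await _session.SaveChangesAsync();
            return CommandResult.Success();
        }
        catch (EventStreamUnexpectedMaxEventIdException)
        {
            // Somebody else changed the project stream first, so tell
            // the originator to refresh its data and try again
            return CommandResult.StaleVersion();
        }
    }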

What I showed above is pretty draconian in terms of what edits it allows to go through. In other cases you may get by with a simple workflow that just tries to guarantee that a single Project aggregate is only being updated by a single process at any time. Marten V4 comes through with a couple new ways to append events with more control over concurrent updates to a single event stream.

Let’s stick to optimistic checks for now and look at the new AppendOptimistic() method for appending events to an existing event stream with a rewritten version of our CompleteTaskCommand handling:

    public class CompleteTaskHandler2
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler2(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            // If some other process magically zips
            // in and updates this project event stream
            // between the call to AppendOptimistic()
            // and SaveChangesAsync(), Marten will detect
            // that and reject the transaction
            _session.Events.AppendOptimistic(
                command.ProjectId,
                @event);

            return _session.SaveChangesAsync();
        }
    }

We would still need to catch the Marten exceptions and handle those somehow. I myself would typically try to handle that with a messaging or command execution framework like MassTransit or my own Jasper project that comes with robust exception handling and retry capabilities.

At this point everything I’ve shown for concurrency control involves optimistic locks that result in exceptions being thrown and transactions being rejected. For another alternative, Marten V4 leverages Postgresql’s robust row locking support with this version of our CompleteTaskCommand handler that uses the new V4 AppendExclusive() method:

    public class CompleteTaskHandler3
    {
        private readonly IDocumentSession _session;

        public CompleteTaskHandler3(IDocumentSession session)
        {
            _session = session;
        }

        public Task Handle(CompleteTaskCommand command)
        {
            var @event = new TaskFinished
            {
                Number = command.TaskNumber,
                Timestamp = DateTimeOffset.UtcNow
            };

            // This tries to acquire an exclusive
            // lock on the stream identified by
            // command.ProjectId in the database
            // so that only one process at a time
            // can update this event stream
            _session.Events.AppendExclusive(
                command.ProjectId,
                @event);
            return _session.SaveChangesAsync();
        }
    }

This is heavier in terms of its impact on the database, but simpler to implement as there is no flow control by exceptions to worry about.

I should also note that both the AppendOptimistic() and AppendExclusive() methods will verify that the stream already exists and throw an exception if the stream does not.
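
If you need to guard for that case, the sketch below is one option. I believe the exception type is NonExistentStreamException in the V4 alphas, but treat that as an assumption and verify against the current API:

    public async Task Handle(CompleteTaskCommand command)
    {
        var @event = new TaskFinished
        {
            Number = command.TaskNumber,
            Timestamp = DateTimeOffset.UtcNow
        };

        try
        {
            _session.Events.AppendOptimistic(command.ProjectId, @event);
            await _session.SaveChangesAsync();
        }
        catch (NonExistentStreamException)
        {
            // The project stream was never started, so this is most
            // likely a bad project id from the client. Surface a
            // validation error in whatever way fits your application.
            throw;
        }
    }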

Oakton v3 supercharges the .Net Core/5 command line, and helps Lamar deliver uniquely useful IoC diagnostics

With the Great Texas Ice and Snow Storm of 2021 last week playing havoc with basically everything, I switched to easier work on Oakton and Lamar in the windows of time when we had power. The result of that is the planned v3 release that merged the former Oakton.AspNetCore library into Oakton itself, and took a dependency on Spectre.Console to make all the command output a whole lot more usable. As a logical follow up to the Oakton improvements, I mostly rewrote the Lamar.Diagnostics package that uses Oakton to add Lamar-related troubleshooting commands to your application. I used Spectre.Console to improve how the Lamar configuration was rendered to enhance its usability. I wanted the improved Lamar diagnostics for an ongoing Calavista client project, and that’ll be in heavy usage this very week.

Late last week I released version 3.0.1 of Oakton, a library for building rich command line functionality into .Net Core / .Net 5.0 applications. While competent command line parsing tools are a dime a dozen in the .Net space by this point, Oakton stands out by integrating directly into the generic HostBuilder and extending the command line operations of your currently configured .Net system (Oakton can still be used without HostBuilder as well, see Using the CommandExecutor for more information).

Jumping right into the integration effort, let’s assume that you have a .Net Core 3.1 or .Net 5 application generated by one of the built in project templates. If so, your application’s entry point probably looks like this Program.Main() method:

public static void Main(string[] args)
{
    CreateHostBuilder(args).Build().Run();
}

To integrate Oakton commands, add a reference to the Oakton Nuget, then change Program.Main() to this:

public static Task<int> Main(string[] args)
{
    return CreateHostBuilder(args)
        .RunOaktonCommands(args);
}

It’s important to change the return type to Task<int> because that’s used to communicate exit codes to the operating system to denote success or failure for commands that do some kind of system verification.

When you use dotnet run, you can pass arguments or flags to the dotnet run command itself, and also arguments and flags to your actual application. The way that works is that arguments are delimited by a double dash, so that in this usage: dotnet run --framework net5.0 -- check-env, the --framework flag is passed to dotnet run itself to modify its behavior, and the check-env argument is passed to your application.

Out of the box, Oakton adds a couple named commands to your application along with a default “run my application” command. You can access the built in Oakton command line help with dotnet run -- help. The output of that is shown below:

   ---------------------------------------------------------------------
     Available commands:
   ---------------------------------------------------------------------
     check-env -> Execute all environment checks against the application
      describe -> Writes out a description of your running application to either the console or a file
          help -> list all the available commands
           run -> Start and run this .Net application
   --------------------------------------------------------------------- 

Slightly Enhanced “Run”

Once Oakton is in place, your basic dotnet run command has a couple new flag options.

  1. dotnet run still starts up your application just like it did before you applied Oakton
  2. dotnet run -- -e Staging using the -e flag starts the application with the host environment name overridden to the value of the flag. In this example, I started the application running as “Staging.”
  3. dotnet run -- --check or dotnet run -- -c will run your application, but first perform any registered environment checks before starting the full application. More on this below.

Environment Checks

Oakton has a simple mechanism to embed environment checks directly into your application. The point of environment checks is to let your .Net application verify its own configuration during deployment in order to “fail fast” if external dependencies like databases are unavailable, required files are missing, or necessary configuration items are invalid.

To use environment checks in your application with Oakton, just run this command:

dotnet run -- check-env

The command will fail if any of the environment checks fail, and you’ll get a report of every failure.
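
Registering your own checks happens against the application’s IServiceCollection. The sketch below uses the Oakton extension methods as I remember them from the documentation, so treat the exact names and signatures as assumptions to verify:

// A sketch of registering Oakton environment checks. The extension
// method names below are my recollection of the Oakton docs; verify
// the exact signatures before copying this.
public void ConfigureServices(IServiceCollection services)
{
    // Fail check-env if this file is missing from the content root
    services.CheckFileExists("appsettings.json");

    // Fail check-env if IDocumentStore was never registered
    services.CheckServiceIsRegistered<IDocumentStore>();

    // A custom check fails by throwing an exception
    services.CheckEnvironment("Database is reachable", provider =>
    {
        var store = provider.GetRequiredService<IDocumentStore>();
        // Try to touch the database here; any exception thrown
        // fails the check-env command
    });
}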

I wrote about this a couple years ago, and Oakton v3 just makes the environment check output a little prettier. If you’re using Lamar as your application’s IoC container, the Lamar.Diagnostics package adds some easier recipes for exposing environment checks in your code in combination with Oakton’s check-env command. I’ll show that later in this post.

The environment check functionality can be absolutely invaluable when your application is being deployed to an environment that isn’t perfectly reliable or has dependencies outside the control of your team. Ask me how I know this to be the case…

Describing the Application

A couple of the OSS projects I support are very sensitive to application configuration, and it’s sometimes challenging to get the right information from users who are having problems with my OSS libraries. As a possible way to ameliorate that issue, there’s Oakton’s extensible describe command:

dotnet run -- describe

Out of the box it just lists basic information about the application and referenced assemblies, but it comes with a simple extensibility mechanism I’m hoping to exploit for embedding Jasper, Marten, and Lamar diagnostics in applications.

Here’s the sample output from a simplistic application built with the worker template:

── About WorkerService ─────────────────────────────────────────
          Entry Assembly: WorkerService
                 Version: 1.0.0.0
        Application Name: WorkerService
             Environment: Development
       Content Root Path: the content root folder
AppContext.BaseDirectory: the application's base directory


── Referenced Assemblies ───────────────────────────────────────
┌───────────────────────────────────────────────────────┬─────────┐
│ Assembly Name                                         │ Version │
├───────────────────────────────────────────────────────┼─────────┤
│ System.Runtime                                        │ 5.0.0.0 │
│ Microsoft.Extensions.Configuration.UserSecrets        │ 5.0.0.0 │
│ Microsoft.Extensions.Hosting.Abstractions             │ 5.0.0.0 │
│ Microsoft.Extensions.DependencyInjection.Abstractions │ 5.0.0.0 │
│ Microsoft.Extensions.Logging.Abstractions             │ 5.0.0.0 │
│ Oakton                                                │ 3.0.2.0 │
│ Microsoft.Extensions.Hosting                          │ 5.0.0.0 │
└───────────────────────────────────────────────────────┴─────────┘

Lamar Diagnostics

As an add-on to Oakton, I updated the Lamar.Diagnostics package that uses Oakton to add Lamar diagnostics to a .Net application. If this package is added to a .Net project that uses Oakton for command line parsing and Lamar as its IoC container, you’ll see new commands light up from dotnet run -- help:

    lamar-scanning -> Runs Lamar's type scanning diagnostics
    lamar-services -> List all the registered Lamar services
    lamar-validate -> Runs all the Lamar container validations

The lamar-scanning command just gives you access to Lamar’s type scanning diagnostics report (something my team & I needed post haste for an ongoing Calavista client project).

The lamar-validate command runs the AssertConfigurationIsValid() method on your application’s configured container to validate that every service registration is valid and can be built, and also exercises Lamar’s own built-in environment check system. You can read more about what this method does in the Lamar documentation.
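
You can also call that same method yourself, say from an integration test. A minimal sketch, where MyAppRegistry is a hypothetical ServiceRegistry holding your application’s registrations:

// A sketch of validating a Lamar container directly in a test.
// MyAppRegistry is hypothetical; substitute your own registry.
var container = new Container(x =>
{
    x.IncludeRegistry<MyAppRegistry>();
});

// Throws with a detailed report if any registration can't be
// built, and also runs the [ValidationMethod] checks shown below
container.AssertConfigurationIsValid();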

You can also add the Lamar container validation step into the Oakton environment check system with this service registration:

// This adds a Container validation
// to the Oakton "check-env" command
// "services" would be an IServiceCollection object
// either in StartUp.ConfigureServices() or in a Lamar
// ServiceRegistry
services.CheckLamarConfiguration();

As I said earlier, Lamar still supports a very old StructureMap feature to embed environment checks right into your application’s services with the [ValidationMethod] attribute. Let’s say that you have a class in your system called DatabaseUsingService that requires a valid connection string from configuration, and DatabaseUsingService is registered in your Lamar container. You could add a method on that class just to check if the database is reachable like the Validate() method below:

    public class DatabaseUsingService
    {
        private readonly DatabaseSettings _settings;

        public DatabaseUsingService(DatabaseSettings settings)
        {
            _settings = settings;
        }

        [ValidationMethod]
        public void Validate()
        {
            // For *now*, Lamar requires validate methods be synchronous
            using (var conn = new SqlConnection(_settings.ConnectionString))
            {
                // If this blows up, the environment check fails:)
                conn.Open();
            }
        }
    }

When Lamar validates the container, it will see the methods marked with [ValidationMethod], pull the service from the configured container, and try to call this method and record any exceptions.

Lamar Services Preview

The big addition in the latest Lamar.Diagnostics package is a much improved rendering of the registered services report compared to the old, built-in WhatDoIHave() textual report.

Starting from a pretty basic ASP.Net Core application, the dotnet run -- lamar-services command renders a nicely formatted table of every service registration in the container (screenshot omitted).

If you want to dig into the dependency tree and see exactly how Lamar is going to build a certain registration, you can use the dotnet run -- lamar-services --build-plans flag to render each registration’s build plan (screenshot omitted).

Hat tip to Spectre.Console for making the layout a lot prettier and easier to read.

Where does Oakton go next?

I’m somewhat hopeful that other folks will pick up Oakton’s command line model and add more utilities, application describer parts, and environment checks in either Oakton itself or in extension libraries. For my part, Marten v4’s command line support will use Oakton to add schema management and event store utilities directly to your application. I think it’d also be helpful to have an ASP.Net specific set of application describer parts to at least preview routes.

Event Sourcing with Marten V4: Aggregated Projections

All the code samples in this post are from alpha code, and may be subject to change based on user feedback. At a minimum, I’d expect the configuration code to change as we write more documentation and sample code and try to sand down anything that’s awkward, confusing, or not discoverable. I’m planning on making this the first in a series of blog posts. Please, please, please share any feedback or questions you might have about the Marten usage here.

The Marten team kicked out a new alpha this week (4.0.0-alpha.5) that among other things, includes most of our planned improvements to Marten’s event sourcing support.

Before I dive into the Marten V4 improvements, let’s rewind and talk about what event sourcing is, starting with some quotes:

The fundamental idea of Event Sourcing is that of ensuring every change to the state of an application is captured in an event object, and that these event objects are themselves stored in the sequence they were applied for the same lifetime as the application state itself.

Event Sourcing by Martin Fowler (no relation to Marten:))

And,

Event Sourcing is an alternative way to persist data. In contrast with state-oriented persistence that only keeps the latest version of the entity state, Event Sourcing stores each state mutation as a separate record called an event.

What is Event Sourcing? by Alexey Zimarev 

We just finished a client project at Calavista that used event sourcing, and we generally felt it to be a success. In this case, the business problem was very workflow centric and lent itself well to being modeled as a series of events reflecting user actions captured by the system or determined by business rules in background processes. Moreover, the project had significant requirements for metrics tracking and auditing compliance, and we found event sourcing to be a very effective way to knock out the auditing requirements as well as set our client up to support whatever metrics they wish to develop in the future by ingesting the raw events.

We did need to know the current state of the active workflows going on within the system for many of the business rules, so we kept a live “projected” view of the raw events in memory in a background process. That strategy certainly won’t work for every system, but one way or another, your system built with event sourcing is likely going to need some way to derive the system state from the raw events — and this is where I finally want to switch to talking about the work we’ve been doing in Marten V4 to improve Marten’s read-side projection support.

If you’re wondering, we didn’t use Marten because the project in question was written on top of Node.js. If it had been a .Net project, I absolutely believe that Marten would have been a very good fit.

Marten’s Event Sourcing Support

The value of Marten as an event store is that in one library, you get:

  1. The ability to capture events in durable, sequential storage
  2. Opt in multi-tenancy support
  3. User-defined “Projections” that compile the derived state of the system based on the raw events, including the ability to store the projected views as just regular old Marten documents. Those “projected” views can be built on the fly, updated inline when new, related events are being captured, or built asynchronously by background processes provided by Marten.
  4. Plenty of functionality to query and retrieve event data

And all of that runs using the solid, fully transactional Postgresql database engine which is well supported on every major cloud hosting platform. I’m going to argue that Marten gives .Net developers the easiest path to a full fledged event sourcing persistence subsystem within your application architecture because it’s self-contained.

A Sample Domain Model

Let’s say we’re building a system to track our user’s road trips throughout the U.S. In that domain, we’re tracking and capturing these events for each active trip starting from the beginning of the system:

  1. TripStarted — a new trip started on a certain day
  2. TripEnded — a trip in progress ended at its final destination
  3. TripAborted — a trip in progress was ended before it was completed
  4. Departure and Arrival — a trip party reached or left a U.S. state
  5. Travel — a trip party drove within a single state on a single day, recorded as a series of movements, each in a single cardinal direction like it’s a 1980’s Atari video game (cut me some slack, I needed a simple domain to test the projections here:))

Getting Started with Marten

Marten completely embraces the HostBuilder concept introduced in later versions of .Net Core for easy integration into .Net applications. Starting from the “worker” template for a new .Net project, I add a reference to the Marten Nuget package and this call to the AddMarten() extension method like so:

public static IHostBuilder CreateHostBuilder(string[] args)
{
    return Host.CreateDefaultBuilder(args)
        .ConfigureServices((hostContext, services) =>
        {
            var configuration = hostContext.Configuration;
            var environment = hostContext.HostingEnvironment;

            services.AddHostedService<Worker>();

            // This is the absolute, simplest way to integrate Marten into your
            // .Net Core application with Marten's default configuration
            services.AddMarten(options =>
            {
                // Establish the connection string to your Marten database
                options.Connection(configuration.GetConnectionString("Marten"));

                // If we're running in development mode, let Marten just take care
                // of all necessary schema building and patching behind the scenes
                if (environment.IsDevelopment())
                {
                    options.AutoCreateSchemaObjects = AutoCreate.All;
                }
            });
        });
}

I’ll add some new V4 options later in the post, but the basics of what I did up there are described in our documentation.

Now to capture some events. Assume that a new trip is started when our system receives this command message from another system or our user interface:

public class CreateNewTrip
{
    public int Day { get; set; }
    public string State { get; set; }
    public Movement[] Movements { get; set; }
}

Next, let’s say our message handler looks like this (it varies by messaging framework, but this is what it could look like with Jasper):

public class NewTripHandler
{
    private readonly IDocumentSession _session;

    public NewTripHandler(IDocumentSession session)
    {
        _session = session;
    }

    public async Task Handle(CreateNewTrip trip)
    {
        var started = new TripStarted
        {
            Day = trip.Day
        };

        var departure = new Departure
        {
            Day = trip.Day,
            State = trip.State
        };

        var travel = new Travel
        {
            Day = trip.Day,
            Movements = new List<Movement>(trip.Movements)
        };

        // This will create a new event stream and
        // append the three events to that new stream
        // when the IDocumentSession is saved
        var action = _session.Events
            .StartStream(started, departure, travel);

        // You can also use strings as the identifier
        // for streams
        var tripId = action.Id;

        // Commit the events to the new event
        // stream for this trip
        await _session.SaveChangesAsync();
    }
}

In Marten nomenclature, a “stream” is a related set of events in the event storage. In this system, we’ll use a stream for every distinct trip. The code above takes in the CreateNewTrip command message, creates three events to record the initial progress of the new trip, and persists those events.

So now that we’ve captured events, let’s move on to projections in Marten V4, because that’s been both a major area of effort and also the biggest changes for usage in this forthcoming release.

Aggregate by Stream

Projections support in Marten comes in a couple different flavors, but I’m guessing that the most common is projecting a single document view of a single stream of events. In this case, we’ll create a projected Trip view like this:

public class Trip
{
    public Guid Id { get; set; }

    // the day the trip ended
    public int EndedOn { get; set; }

    // total mileage of the trip to date
    public double Traveled { get; set; }

    // what state is the trip party in
    // presently
    public string State { get; set; }

    // is the trip ongoing?
    public bool Active { get; set; }

    // what day did the trip start?
    public int StartedOn { get; set; }
}

To configure an aggregated projection for a Trip stream, we’ll subclass the new AggregateProjection<T> class like so:

public class TripAggregation: AggregateProjection<Trip>
{
    public TripAggregation()
    {
        // Delete the Trip document for this
        // stream if this event is encountered
        DeleteEvent<TripAborted>();

        ProjectionName = "Trip";

        // We'll change this later
        Lifecycle = ProjectionLifecycle.Live;
    }

    public void Apply(Arrival e, Trip trip) => trip.State = e.State;
    public void Apply(Travel e, Trip trip) => trip.Traveled += e.TotalDistance();
    public void Apply(TripEnded e, Trip trip)
    {
        trip.Active = false;
        trip.EndedOn = e.Day;
    }

    public Trip Create(TripStarted started)
    {
        return new Trip {StartedOn = started.Day, Active = true};
    }
}

A couple notes here to explain the code:

  • Marten is depending on naming conventions to know what to do with a certain kind of event type. So the Create() method is used to create the Trip aggregate from an event object of type TripStarted.
  • The Apply() methods are used to make updates to an existing aggregate document based on an event of a certain type
  • For the moment, the TripAggregation is only used for “live” aggregations that are done on the fly. We’ll change that later

Now, let’s put this new projection to use. In our call to AddMarten() up above, I’m going to add one line of code to register our new projection:

options.Events.Projections.Add(new TripAggregation());

Let’s say that in our system we write events for trips very frequently, but very rarely need to see the current state of the trip (don’t know why that would be so, but just go with it). In that case we can just lean on Marten’s ability to aggregate the projected view on the fly as shown below with the AggregateStreamAsync() method:

public class EndTrip
{
    public Guid TripId { get; set; }
    public bool Successful { get; set; }
    public string State { get; set; }
    public int Day { get; set; }
}

public class EndTripHandler
{
    private readonly IDocumentSession _session;

    public EndTripHandler(IDocumentSession session)
    {
        _session = session;
    }

    public async Task Handle(EndTrip end)
    {
        // we need to first see the current
        // state of the Trip to decide how
        // to proceed
        var trip = await _session
            .Events
            .AggregateStreamAsync<Trip>(end.TripId);
        
        // finish processing the EndTrip command...
    }
}

If instead we’d like to keep the matching Trip document up to date and persisted in the database as new events come in, we can switch the TripAggregation to the “inline” lifecycle by setting the Lifecycle property in its constructor:

public TripAggregation()
{
    DeleteEvent<TripAborted>();

    ProjectionName = "Trip";

    // Now let's change the lifecycle to inline
    Lifecycle = ProjectionLifecycle.Inline;
}

“Inline” projections are updated as events are appended, within the same database transaction as the event capture. This gives you true ACID transaction integrity between the events and the projected views. It can also set you up for possible concurrency issues if multiple application threads are trying to update the same stream of events simultaneously, so exercise some caution with inline projections.

So with the Trip documents being updated inline with new events coming in, our EndTripHandler becomes:

public class EndTripHandler
{
    private readonly IDocumentSession _session;

    public EndTripHandler(IDocumentSession session)
    {
        _session = session;
    }

    public async Task Handle(EndTrip end)
    {
        // we need to first see the current
        // state of the Trip to decide how
        // to proceed, so load the pre-built
        // projected document from the database
        var trip = await _session
            .LoadAsync<Trip>(end.TripId);

        // finish processing the EndTrip command...
    }
}

You can also run projections asynchronously in a background thread, but I’m going to leave that for a subsequent post.

Some other points of interest about AggregateProjection<T>:

  • There’s some wiggle room in the signature of the conventionally named Apply() and Create() methods. These method signatures can be asynchronous, and they can also take in an IQuerySession parameter to load other information from Marten as they work, or the IEvent / IEvent<T> object to get access to metadata about the event or the event’s stream
  • The Apply() methods happily support immutable aggregate types. You’d simply return the new aggregate document created in an Apply() method, or Task<T> where T is the aggregate document type. That might not be very efficient because of the extra object allocations, but hey, some folks really want that.
  • I didn’t show it up above, but if you dislike the conventional “magic” I used above, that’s okay: there are methods you can use to define how to update or create the aggregate document for specific event types through inline lambda functions, as sketched just after this list.
  • You can also conditionally delete the aggregated document if the logical workflow represented by an event stream is completed based on user defined logic.
  • All of the conventional method signatures as well as the inline Lambda usages for defining how an aggregate would be updated can also accept interface or abstract class types as well. This was a user request to enable folks to use event types from external assemblies in extensibility scenarios.
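
To sketch that lambda-based style, here’s an alternative version of the Trip aggregation from earlier that skips the naming conventions entirely. The CreateEvent() and ProjectEvent() method names reflect the V4 alphas, so double check them against the current API:

// A sketch of explicit, lambda-based projection configuration as an
// alternative to the Apply()/Create() conventions. Method names are
// from the V4 alphas and may shift before release.
public class ExplicitTripAggregation: AggregateProjection<Trip>
{
    public ExplicitTripAggregation()
    {
        DeleteEvent<TripAborted>();

        CreateEvent<TripStarted>(started => new Trip
        {
            StartedOn = started.Day,
            Active = true
        });

        ProjectEvent<Arrival>((trip, e) => trip.State = e.State);
        ProjectEvent<Travel>((trip, e) => trip.Traveled += e.TotalDistance());
        ProjectEvent<TripEnded>((trip, e) =>
        {
            trip.Active = false;
            trip.EndedOn = e.Day;
        });
    }
}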

Aggregate a Projected Document Across Event Streams

I thought you said to never cross the streams! – had to be said:)

This is admittedly contrived, but let’s say we want a projected view from our raw trip events that tells us for each day the system is active:

  • How many trips started?
  • How many trips ended?
  • How many miles did all the active trips drive in each direction?

And by “day” in this system, I just mean the day number since the system went online. That aggregate might look like this:

public class Day
{
    public int Id { get; set; }

    // how many trips started on this day?
    public int Started { get; set; }

    // how many trips ended on this day?
    public int Ended { get; set; }

    // how many miles did the active trips
    // drive in which direction on this day?
    public double North { get; set; }
    public double East { get; set; }
    public double West { get; set; }
    public double South { get; set; }
}

So what we want to do is to group the TripStarted, TripEnded, and Travel events by the day, and create an aggregated Day document to reflect all the events that happened on the same day. The first step is to tell Marten how to know what events are going to be associated with a specific day, and the easiest way in Marten V4 is to have the events implement a common interface like this one:

public interface IDayEvent
{
    int Day { get; }
}

And then have the relevant events implement that interface like the TripStarted event:

public class TripStarted : IDayEvent
{
    public int Day { get; set; }
}

Now, to make the Marten projection for the Day document type, we’ll use the new V4 version of ViewProjection as a subclass like so:

// The 2nd generic parameter is the identity type of
// the document type. In this case the Day document
// is identified by an integer representing the number
// of days since the system went online
public class DayProjection: ViewProjection<Day, int>
{
    public DayProjection()
    {
        // Tell the projection how to group the events
        // by Day document
        Identity<IDayEvent>(x => x.Day);
        
        // This just lets the projection work independently
        // on each Movement child of the Travel event
        // as if it were its own event
        FanOut<Travel, Movement>(x => x.Movements);
        
        ProjectionName = "Day";
    }

    public void Apply(Day day, TripStarted e) => day.Started++;
    public void Apply(Day day, TripEnded e) => day.Ended++;

    public void Apply(Day day, Movement e)
    {
        switch (e.Direction)
        {
            case Direction.East:
                day.East += e.Distance;
                break;
            case Direction.North:
                day.North += e.Distance;
                break;
            case Direction.South:
                day.South += e.Distance;
                break;
            case Direction.West:
                day.West += e.Distance;
                break;

            default:
                throw new ArgumentOutOfRangeException();
        }
    }
}

The new ViewProjection is a subclass of AggregateProjection, and therefore has all the same capabilities as its parent type.
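
Registering the Day projection presumably works the same way as the Trip aggregation earlier, inside the AddMarten() configuration:

options.Events.Projections.Add(new DayProjection());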

You’d most likely need to run a projection that crosses streams in the asynchronous projection lifecycle where the projection is executed in a background process, but I’m leaving that to another blog post.

Summary & What’s Next?

This post focused strictly on aggregation projections, but there are plenty of use cases that don’t fit that mold. In subsequent posts I’ll explore the other options for projections in Marten V4. The asynchronous projection support also got a full rewrite in Marten V4, so I’ll share plenty more about that.

In other posts, I’ll discuss some other improvements in the event capture process for reliability and concurrency issues. Hopefully by the next alpha release, we’ll be able to utilize native Postgresql database sharding to allow for far more scalability in Marten’s event sourcing support.

Re-evaluating the “*DD’s” of Software Development: Test Driven Development

EDIT 1/22/2021: Changed the title a little bit to make sure y’all weren’t getting NFSW ads when you pull this up. Mea culpa:)

A bazillion years ago (2007) I wrote a throwaway post on my old CodeBetter blog titled BDD, TDD, and the other Double D’s. At some point last summer we were having a conversation at an all hands meeting at Calavista where the subjects of Test Driven Development (TDD) and Behavior Driven Development (BDD) came up. We were looking for new content to post on the company website, so I volunteered to modernize my old blog post above to explain the two techniques, the differences between them, and how we utilize both TDD and BDD on our Calavista projects. And I said that I’d have it ready to go in a couple days. Flash forward 6-8 months, and here’s the first part of that blog post, strictly focused on TDD:)

I’ll be mildly editing this and re-publishing in a more “professional” voice for the Calavista blog page soon.

Test Driven Development (TDD) and Behavior Driven Development (BDD) as software techniques have both been around for years, but confusion still abounds in the software industry. In the case of TDD, there’s also been widespread backlash from the very beginning. In this new series of blog posts I want to dive into what both TDD and BDD are, how they’re different (and you may say they aren’t), how we use these techniques on Calavista projects, and some thoughts about making their usage more successful. Along the way, I’ll also talk about some other complementary “double D’s” in software development like Domain Driven Development (DDD) and Responsibility Driven Development.

Test Driven Development

Test Driven Development (TDD) is a development practice where developers author code by first describing the intended functionality in small automated tests, then writing the necessary code to make that test pass. TDD came out of the Extreme Programming (XP) process and movement in the late 90’s and early 00’s that sought to maximize rapid feedback mechanisms in the software development process.

As I hinted at in the introduction, the usage and effectiveness of Test Driven Development is extremely controversial. With just a bit of googling you’ll find both passionate advocates and equally passionate detractors. While I will not dispute that some folks will have had negative experiences or impressions of TDD, I still recommend using TDD. Moreover, we use TDD as a standard practice on our Calavista client engagements and I do as well in my personal open source development work.

As many folks have noted over the years, the word “Test” might be an unfortunate term, because TDD at heart is a software design technique (BDD arose partially as a way to adjust the terminology of early TDD, moving away from the word “Test” to focus more on the underlying goals). I would urge you to approach TDD as a way to write better code, and also as a way to continue to make your code better over time through refactoring (as I’ll discuss below).

Succeeding in software development is often a matter of having effective feedback mechanisms to let the team know what is and is not working. When used effectively, TDD can be very beneficial inside of a team’s larger software process first as a very rapid feedback cycle. Using TDD, developers continuously flow between testing and coding and get constant feedback about how their code is behaving as they work. It’s always valuable to start any task with the end in mind, and a TDD workflow makes a developer think about what successful completion of any coding task is before they implement that code.

Done well with adequately fine-grained tests, TDD can drastically reduce the amount of time developers have to spend debugging code. So yes, it can be time consuming to write all those unit tests, but spending a lot of time hunting around in a debugger trying to troubleshoot code defects is pretty time consuming as well. In my experience, I’ve been better off writing unit tests against individual bits of a complex feature first before trying to troubleshoot problems in the entire subsystem.

Secondly, TDD is not efficient or effective without the type of code modularity that is also frequently helpful for code maintainability in general. Because of that, TDD is a forcing function to make developers focus and think through the modularity of their code upfront. Code that is modular provides developers more opportunities to constantly shift between writing focused unit tests and the code necessary to make those new tests pass. Code that isn’t modular will be very evident to a developer because it causes significant friction in their TDD workflow. At a bare minimum, adopting TDD should at least spur developers to closely consider decoupling business logic, rules, and workflow from infrastructural concerns like databases or web servers that are intrinsically harder to work with in automated unit tests. More on this in a later post on Domain Driven Development.

Lastly, when combined with the process of refactoring, TDD allows developers to incrementally evolve their code and learn as they go by creating a safety net of quickly running tests that preserve the intended functionality. This is important, because it’s just not always obvious upfront what the best way is to code a feature. Even if you really could code a feature with a perfect structure the first time through, there’s inevitably going to be some kind of requirements change or performance need that sooner or later will force you to change the structure of that “perfect” code.

Even if you do know the “perfect” way to structure the code, maybe you decide to use a simpler, but less performant way to code a feature in order to deliver that all-important Minimum Viable Product (MVP) release. In the longer term, you may need to change your system’s original, simple internals to increase the performance and scalability. Having used TDD upfront, you might be able to do that optimization work with much less risk of introducing regression defects when backed up by the kind of fine-grained automated test coverage that TDD leaves behind. Moreover, the emphasis that TDD forces you to have on code modularity may also be beneficial in code optimization by allowing you to focus on discrete parts of the code.

Too much, or the wrong sort of modularity can of course be a complete disaster for performance, so don’t think that I’m trying to say that modularity is any kind of silver bullet.

As a design technique, TDD is mostly focused on fine grained details of the code and is complementary to other software design tools or techniques. By no means would TDD ever be the only software design technique or tool you’d use on a non-trivial software project. I’ve written a great deal about designing with and for testability over the years myself, but if you’re interested in learning more about strategies for designing testable code, I highly recommend Jim Shore’s Testing without Mocks paper for a good start.

To clear up a common misconception, TDD is a continuous workflow, meaning that developers would be constantly switching between writing a single or just a few tests and writing the “real” code. TDD does not — or at least should not — mean that you have to specify all possible tests first, then write all the code. Combined with refactoring, TDD should help developers learn about and think through the code as they’re writing code.

So now let’s talk about the problems with TDD and the barriers that keep many developers and development teams from adopting or succeeding with TDD:

  1. There can be a steep learning curve. Unit testing tools aren’t particularly hard to learn, but developers have to be very mindful about how their code is going to be structured and organized to really make TDD work.
  2. TDD requires a fair amount of discipline in your moment to moment approach, and it’s very easy to lose that under schedule pressure — and developers are pretty much always under some sort of schedule pressure.
  3. The requirement for modularity in code can be problematic for some otherwise effective developers who aren’t used to coding in a series of discrete steps
  4. A common trap for development teams is writing the unit tests in such a way that the tests are tightly coupled to the implementation of the code. Unit testing that relies too heavily on mock objects is a common culprit behind this problem. In this all too common case, you’ll hear developers complain that the tests break too easily when they try to change the code. In that case, the tests are possibly doing more harm than good. The followup post on BDD will try to address this issue.
  5. Some development technologies or languages aren’t conducive to a TDD workflow. I purposely choose programming tools, libraries, and techniques with TDD usage in mind, but we rarely have complete control over our development environment.

You might ask, what about test coverage metrics? I’m personally not that concerned about test coverage numbers, I don’t have any magic number you need to hit, and I think the right number is subjective based on what kind of technology or code you’re writing. My main thought is that test coverage metrics are only somewhat informative: they can tell you when you may have problems, but they can never tell you that the actual test coverage is effective. That being said, it’s relatively easy with current development tooling to collect and publish test coverage metrics in your Continuous Integration builds, so there’s no reason not to track code coverage. In the end I think it’s more important for the development team to internalize the discipline of maintaining effective test coverage on each and every push to source control than it is to have some kind of automated watchdog yelling at them. Lastly, as with all metrics, test coverage numbers are useless if the development team is knowingly gaming them with worthless tests.

Does TDD have to be practiced in its pure "test first" form? Is it really any better than just writing the tests later? I wouldn't say that you absolutely have to do pure TDD all the time. I frequently rough in code first, then write the tests immediately after once I have a clear idea of what I'm going to do. The issue with a "test after" approach is that the test coverage is rarely as good as you'd get from a test-first approach, and you don't get as much of the design benefit of TDD. Without some upfront thought about how the code is going to be tested, my experience over the years is that you'll often see much less modularity and worse code structure. For teams new to TDD, I'd advise working "pure" test first for a while, and then relaxing that standard later.

At the end of this, do I still believe in TDD after years of using it and years of development community backlash? I do, yes. My experience has been that code written in a TDD style is generally better structured and the codebase is more likely to be maintainable over time. I’ve also used TDD long enough to be well past the admittedly rough learning curve.

My personal approach has changed quite a bit over the years of course, with the biggest change being much more reliance on intermediate level integration tests and deemphasizing mock or stub objects, but that’s a longer conversation.

In my next post, I’ll finally talk about Behavior Driven Development, how it’s an evolution and I think a complement to TDD, and how we’ve been able to use BDD successfully at Calavista.

What would it take for you to adopt Marten?

If you’re stumbling in here without any knowledge of the Marten project (or me), Marten is an open source .Net library that developers can adopt in their project to use the rock solid Postgresql database as a pretty full featured document database and event store. If you’re unfamiliar with Marten, I think I’d say its feature set makes it similar to MongoDb (but the usage is significantly different), RavenDb, or Cosmos Db. On the event sourcing side of things, I think the only comparison in .Net world is GetEventStore itself, but you can certainly piece together an event store by combining other OSS libraries and database engines.

The Marten community is working very hard on our forthcoming (and long delayed) V4.0 release. We've already made some big strides on the document database side of things, and now we're deep into some significant event store improvements (this link looks best in VS Code w/ the Mermaid plugin active). At Calavista, we're considering if and how we can build a development practice around Marten for existing and potential clients. I've obviously got a lot of skin in the game here as the original creator of Marten. Nothing would make me happier than for Marten to be even more successful and for me to get to help Calavista clients use Marten in real life systems as part of my day job.

I’d really like to hear from other folks what it would really take for them to seriously consider adopting Marten. What is Marten lacking now that you would need, or what kind of community or company support options would be necessary for your shop to use Marten in projects? I’m happy to hear any and all feedback or suggestions from as many people as I can get to respond.

I’m happy to take comments here, or the discussion for this topic is also on GitHub.

Existing Strengths

  • Marten is only a library, and at least for the document database features it's very unobtrusive in your application code compared to many other persistence options
  • The Marten community is active and I hope you’d say that we’re welcoming to newcomers
  • By building on top of Postgresql, Marten comes with good cloud support from all the major cloud providers and plenty of existing monitoring options
  • Marten comes with many of the very real productivity advantages of a NoSQL solution, but has very strong transactional support from Postgresql itself
  • Marten’s event sourcing functionality comes “in the box” and there’s less work to do to fully incorporate event sourcing — including the all important “read side projection” support — into a .Net architecture than many other alternatives
  • Marten is part of the .Net Foundation
  • If you need commercial support for Marten, you can engage with Calavista Software.

Does any of that resonate with you? If you’ve used Marten before, is there anything missing from that list? And feel free to tell me you’re dubious about anything I’m claiming in the list above.

What’s already done or in flight

  • We made a lot of improvements to Marten’s Linq provider support. Not just in terms of expanding the querying scenarios we support, but also in improving the performance of the library across the board. I know this has been a source of trouble for many users in the past, and I’m excited about the improvements we’ve made in V4.
  • The event store functionality will get a lot more documentation — including sample applications — for V4
  • An important part of many event sourcing architectures is a background process to continuously build “projected” views of the raw events coming in. The current version of Marten has this capability, but it requires the user to do a lot of heavy architectural lifting to use it in any kind of clustered application. In V4, we’ll have an in the box recipe that will be used to do leader election and work distribution through an application cluster in “real server applications.” The asynchronous projection support in V4 will also support multi-tenancy (finally) and we have some ideas to greatly optimize projection rebuilds without system downtime
  • Using native Postgresql sharding for scalability, especially for the event store
  • Allowing users to specify event archival rules to keep the event store tables smaller and more performant
  • Adding more integration with .Net’s generic HostBuilder and standard logging abstractions for easier integration into .Net applications
  • Improving multi-tenancy usage based on user feedback
  • Document and event store metadata capabilities like you’d need for Marten to take part in end to end Open Telemetry tracing within your architecture.
  • More sample applications. To be honest, I’m hoping to find published reference applications built with Entity Framework Core and shift them to Marten. This might be part of an effort to show Jasper as a replacement for MediatR or NServiceBus/MassTransit as well.

And again, does any of that address whatever concerns you might have about adopting Marten? Or that you’d already had in the past?

Other Ideas?

Here are some other ideas that have been kicked around for improving Marten usage, but these ideas would probably need to come through some sort of Marten commercialization or professional support.

  • Cloud hosting recipes. Hopefully through Calavista projects, I’d like to develop some pre-built guidance and quick recipes for standing up scalable and maintainable Marten/Postgresql environments on both Azure and AWS. This would include schema migrations, monitoring, dynamic scaling, and any necessary kind of database provisioning. I think this might get into Terraform/Pulumi infrastructure as well.
  • Cloud hosting models for parallelizing and distributing work with asynchronous event projections. Maybe even getting into dynamic scaling.
  • Multi-tenancy through separate databases for each client tenant. You can pull this off today yourself, but there’s a lot of things to manage. Here I’m proposing more cloud hosting recipes for Marten/Postgresql that would include schema migrations and distributed work strategies for processing asynchronous event projections across the tenant databases.
  • Some kind of management user interface? I honestly don’t know what we’d do with that yet, but other folks have asked for something.
  • Event streaming Marten events through Kafka, Pulsar, AWS Kinesis, or Azure Event Hubs
  • Marten Outbox and Inbox approaches with messaging tools. I’ve already got this built and working with Jasper, but we could extend this to MassTransit or NServiceBus as well.

My 2021 OSS Plans (Marten, Jasper, Storyteller, and more)

I don’t know about you, but my 2020 didn’t quite go the way I planned, Among other things, my grand OSS plans really didn’t go the way I hoped. Besides the obvious issues caused by the pandemic, I was extremely busy at work most of the year on projects unrelated to any of my OSS projects and just didn’t have the energy or time to do much outside of work.

Coming into the new year though, my workload has leveled out and I’m re-charged from the holidays. Moreover, I’m going to get to use some of my OSS tools for at least one client next year and that’s helping my enthusiasm level. At the end of the day though, I still enjoy the creative aspect of my OSS work and I’m ready to get things moving again.

Here’s what I’m hoping to accomplish in 2021:

Marten V4.0 is already heavily underway with huge improvements ongoing for its Event Sourcing support. We've also had some significant success improving the Linq querying support and performance all around through an almost complete rewrite of Marten's internals. There's a lot more to do yet, but I'm hopeful that Marten V4.0 will be released in the 1st quarter of 2021.

In the slightly longer term, the Marten core team is talking about ways to possibly monetize Marten through either add on products or a services model of some sort. I'm also talking with my Calavista colleagues about how we might create service offerings around Marten (scalable cloud hosting for Marten, DevOps guidance, migration projects?).

Regardless, Marten is getting the lion’s share of my attention for the time being and I’m excited about the work we have in flight.

Jasper is a toolkit for common messaging scenarios between .Net applications with a robust in process command runner that can be used either with or without the messaging. 

After having worked on it for over half a decade, I actually did release Jasper V1.0 last year! But it was during the first awful wave of Covid-19 and it just got lost in the shuffle of everything else going on. I also didn’t promote it very much.

I’m going to change that this year and make a big push to blog about it and promote it. I think there’s a lot of possible synergy between Jasper and Marten to build out CQRS architectures on .Net.

Development wise, I’m hoping to:

  • Add end to end open telemetry tracing support
  • AsyncAPI standard support (roughly Swagger for messaging-based architectures, if I'm understanding things correctly)
  • Kafka & Pulsar support has been basically done for 6 months through Jarrod‘s hard work.
  • Performance optimizations
  • A circuit breaker error handling option at the transport layer similar to what MassTransit just added here: https://masstransit-project.com/advanced/middleware/killswitch.html. This would have been an extremely useful feature to have had last year for a client, and I’ve wanted it in Jasper ever since
  • Jasper has an unpublished add on for building HTTP services in ASP.Net Core with a very lightweight “Endpoint” model that I’d like to finish, document, and release. It’s more or less the old FubuMVC style of HTTP handlers, but completely built on an ASP.Net Core foundation rather than its own framework.

Storyteller was completely dormant as a project last year, but I know I've got a project coming up at work next year where it could be a great fit. I had started a lot of work a couple years ago on a big V6 overhaul of Storyteller. If and when I'm able, I'd like to dust off those plans and revamp Storyteller this year, but with a twist.

Instead of fighting the overwhelming tide, I think Storyteller will finally embrace the Gherkin specification language. I think this is probably a no-brainer decision to just opt into something that lots of people already understand and common development tools like JetBrains Rider or VS Code already have first class support for Gherkin.

I still think there’s value in having the Storyteller user interface even with the Gherkin support, so I’ll be looking at an all new client that tries to take the things that worked with the huge Storyteller 3.0 re-write a few years ago and puts that in a more modern shell. The current client is a hodgepodge of very early React.js and Redux, and I’d honestly want to tackle a re-write mostly to update my own UI/UX skillset. I’m still leaning toward using the very latest React.js, but I’ve at least looked at Blazor and sort of following MAUI.

I’ve mostly been just keeping up with bugs, pull requests, and new .Net versions for Lamar. At some point, Lamar needs support for IAsyncDisposable. I also get plenty of questions about how to override Lamar service registrations during integration testing scenarios, which is tricky just because of the weird gyrations that go on with HostBuilder bootstrapping and external IoC containers. There is some existing functionality in Lamar that could be useful for this, but I need to document it.

I might think about cutting the existing LamarCodeGeneration and LamarCompiler projects to their own first class library status because they’re developing a life of their own independent from Lamar. LamarCodeGeneration might be helpful for authoring source generators in .Net.

(Photo caption: The farm road you see at the edge of this picture is almost perfectly flat with almost no traffic, and you can see for miles. I may or may not have used that to see how fast I could get my first car to go as a teenager. Let's not tell my parents about that:)

There was just a wee bit of work to move Oakton and Oakton.AspNetCore to .Net 5.0. In the new year I think I’d like to just merge those two projects into one single library, and look at using Spectre Console to make the output of the built in environment test commands look a helluva lot spiffier and easier to read.

Alba

Alba just got .Net 5.0 support. I’ll get a chance to use Alba on a client project this year to do HTTP API testing, and we’ll see if that leads to any new work.

StructureMap

I’ll occasionally answer StructureMap questions as they come in, but that’s it. I’ll be helping one of Calavista’s clients migrate from StructureMap to Lamar in 2021, so I’ll be using it for work at least.

FubuMVC

It’s still dead as a door knob. There are plenty of bits of FubuMVC floating around in Oakton, Alba, Jasper, and Baseline though.

Planned Event Store Improvements for Marten V4, Daft Punk Edition

There’s a new podcast about Marten on the .Net Core Show that posted last week.

Marten V4 development has been heavily underway this year. To date, the work has mostly focused on the document store functionality (Linq, general performance improvements, and document metadata).

While I certainly hope the other improvements to Marten V4 will make a positive difference to our users, the big leap forward in capability is going to be on the event sourcing side of Marten. We’ve gathered a lot of user feedback on this feature set in the past couple years, but there’s always room for more discussion as things are taking shape.

First though, to set the mood: (Video: Daft Punk)

The master issue for V4 event sourcing improvements is on GitHub here.

Scalability

We know there’s plenty of concern about how well Marten’s event store will scale over time. Beyond the performance improvements I’ll try to outline in following sections below, we’re planning to introduce support for:

Event Metadata

Similar to the document storage, the event storage in V4 will allow users to capture additional metadata with each event. There will be support in the event store Linq provider to query against this metadata, and the metadata will be available to the projections. Right now, the plan is to have opt in, additional fields for:

  • Correlation Id
  • Causation Id
  • User name

Additionally, the plan is to have a "headers" field for user defined data that does not fall into the fields listed above. Marten will capture the metadata at the session level, the thinking being that you could opt into custom Marten session creation that automatically applies metadata for the current HTTP request, service bus message, or logical unit of work.
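
As a purely speculative sketch, the session-level opt in might end up looking something like this. The CorrelationId/CausationId properties and SetHeader() method shown here are placeholder names for the planned fields, not a finalized API:

using System;
using System.Threading.Tasks;
using Marten;

public class MetadataSample
{
    public class SomethingHappened { }

    public async Task Capture(IDocumentStore store, Guid streamId)
    {
        using var session = store.LightweightSession();

        // Hypothetical: metadata captured once at the session level,
        // maybe from the current HTTP request or service bus message
        session.CorrelationId = "http-request-12345";
        session.CausationId = "command-67890";
        session.SetHeader("user-name", "jeremy");

        // The metadata would then be applied to every event
        // appended through this session
        session.Events.Append(streamId, new SomethingHappened());
        await session.SaveChangesAsync();
    }
}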

There’ll be a follow up post on this soon.

Event Capture Improvements

When events are appended to event streams, we're planning some small improvements for V4.

Projections, Projections, Projections!

This work is heavily in flight, so please shoot any feedback you might have our (Marten team’s) way.

Building your own event store is actually pretty easy — until the time you want to actually do something with the events you’ve captured or keep a “read-side” view of the status up to date with the incoming events. Based on a couple years of user feedback, all of that is exactly where Marten needs to grow up the most.

The master issue tracking the projection improvements is here. The Marten community (mostly me to be honest) has gone back and forth quite a bit on the shape of the new projection work and nothing I say here is set in stone. The main goals are to:

  • Significantly improve performance and throughput. We’re doing this partially by reducing in memory object allocations, but mostly by introducing much, much more parallelization of the projection work in the async daemon.
  • Simplify the usage of immutable data structures as the projected documents (note that we have plenty of F# users, and now C# record types make that a lot easier too).
  • Introduce snapshotting
  • Supplement the existing ViewProjection mechanism with conventional methods similar to the .Net StartUp class (see the sketch after this list)
  • Completely gut the existing ViewProjection to improve its performance while hopefully avoiding breaking API compatibility
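
To give a flavor of the conventional method idea, here's a rough sketch of the direction we're leaning. The Create()/Apply() conventions and the Invoice domain below are illustrative only, not a finalized V4 API:

using System;

// Hypothetical events, purely for illustration
public class InvoiceOpened
{
    public Guid InvoiceId { get; set; }
}

public class PaymentReceived
{
    public decimal Amount { get; set; }
}

// The projected "read side" document. Rather than implementing a Marten
// interface, the idea is that Marten would discover the Create()/Apply()
// methods by convention, much like ASP.Net Core finds methods on StartUp
public class Invoice
{
    public Guid Id { get; set; }
    public decimal Balance { get; set; }

    public static Invoice Create(InvoiceOpened opened)
        => new Invoice { Id = opened.InvoiceId };

    public void Apply(PaymentReceived payment)
        => Balance -= payment.Amount;
}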

There is some thought about breaking the projection support into its own project or making the event sourcing support be storage-agnostic, but I’m not sure about that making it to V4. My personal focus is on performance and scalability, and way too many of the possible optimizations seem to require coupling to details of Marten’s existing storage.

“Async Daemon”

The Async Daemon is an under-documented Marten subsystem we use to process asynchronously built event projections and do projection rebuilds. While it's "functional" today, it has a lot of shortcomings (it can only run in one node at a time, and we don't have any kind of leader election or failover) that prevent most folks from adopting it.

The master issue for the Async Daemon V4 is here, but the tl;dr is:

  • Make sure there’s adequate documentation (duh.)
  • Should be easy to integrate in your application
  • Has to be able to run in an application cluster in such a way that it guarantees that every projected view (or slice of a projected view) is being updated on exactly one node at a time
  • Improved performance and throughput of normal projection building
  • No downtime projection rebuilds
  • Way, way faster projection rebuilds

Now, to the changes coming in V4. Let's assume that you're doing "serious" work and need to host your Marten-using .Net Core application across multiple nodes via some sort of cloud hosting. With minimal configuration, you'd like the asynchronous projection building to "just work" across your cluster.

Here’s a visual representation of my personal “vision” for the async daemon in V4:

In V4 the async daemon will become a .Net Core BackgroundService that will be registered by the AddMarten() integration with HostBuilder. That mechanism will allow us to run background work inside of your .Net Core application.
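
If that integration shakes out the way I hope, opting into the daemon would be a one liner at bootstrapping time. This is speculative, and the AddAsyncDaemon() method name below is a placeholder:

using Marten;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

public class Program
{
    public static void Main(string[] args)
    {
        Host.CreateDefaultBuilder(args)
            .ConfigureServices(services =>
            {
                services.AddMarten(opts =>
                {
                    opts.Connection("Host=localhost;Database=marten;Username=postgres");
                })
                // Hypothetical opt-in that would register the async daemon
                // as a BackgroundService inside this application's host
                .AddAsyncDaemon();
            })
            .Build()
            .Run();
    }
}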

Inside that background process, the async daemon is going to have to elect a single "leader/distributor" agent that runs on only one node. That leader/distributor agent will be responsible for assigning work to the async daemons running inside all the active nodes in the application. What we're hoping to do is to distribute and parallelize the projection building across the running nodes. And oh yeah, do this without needing any other kind of infrastructure besides the Postgresql database.

Within a single node, we’re adding a lot more parallelization to the projection building instead of treating everything as a dumb “left fold” single threaded queue problem. I’m optimistic that that’s going to make a huge difference for throughput. On top of that, I’m hoping that the new async daemon will be able to split work between different nodes without the nodes stepping on each other.

There’s still plenty of details to work out, and this post is just meant to be a window into some of the work that is happening within Marten for our big V4 release sometime in 2021.

Marten V4 Preview: Command Line Administration

TL;DR — It’s going to be much simpler in V4 to incorporate Marten’s command line administration tools into your .Net Core application.

In my last post I started to lay out some of the improvements in the forthcoming Marten V4 release with our first alpha Nuget release. In this post, I’m going to show the improvements to Marten’s command line package that can be used for some important database administration and schema migrations.

Unlike ORM tools like Entity Framework (it's a huge pet peeve of mine when people describe Marten as an ORM), Marten by and large tries to keep your focus on your application code rather than making you spend much energy and time on the details of your database schema. At development time you can just have Marten use its AutoCreate.All mode and it'll quietly do anything it needs to do to your Postgresql database to make the document storage work at runtime.
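
For instance, a development-time configuration might look like this minimal sketch. AutoCreateSchemaObjects is the existing Marten option; the connection string name is just an example:

using Marten;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public IConfiguration Configuration { get; set; }

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMarten(opts =>
        {
            opts.Connection(Configuration.GetConnectionString("Marten"));

            // At development time, let Marten quietly create or update
            // any tables, functions, and indexes it needs at runtime
            opts.AutoCreateSchemaObjects = AutoCreate.All;
        });
    }
}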

For real production though, it’s likely that you’ll want to explicitly control when database schema changes happen. It’s also likely that you won’t want your application to have permissions to change the underlying database schema on the fly. To that end, Marten has quite a bit of functionality to export database schema updates for formal database migrations.

We’ve long supported an add on package called Marten.CommandLine that let’s you build your own command line tool to help manage these schema updates, but to date it’s required you to build a separate console application parallel to your application and has probably not been that useful to most folks.

In V4 though, we’re exploiting the Oakton.AspNetCore library that allows you to embed command line utilities directly into your .Net Core application. Let’s make that concrete with a small sample application in Marten’s GitHub repository.

Before diving into that code, it's worth noting that Marten v3.12 added a built in integration for Marten into the .Net Core generic HostBuilder, which we're going to depend on here. Using the HostBuilder to configure and bootstrap Marten into your application lets you use the exact same Marten configuration and application configuration in the Marten command utilities without any additional work.

This sample application was built with the standard dotnet new webapi template. On top of that, I added a reference to the Marten.CommandLine library.

.Net Core applications tend to be configured and bootstrapped by a combination of a Program.Main() method and a StartUp class. First, here’s the Program.Main() method from the sample application:

public class Program
{
    // It's actually important to return Task<int>
    // so that the application commands can communicate
    // success or failure
    public static Task<int> Main(string[] args)
    {
        return CreateHostBuilder(args)

            // This line replaces Build().Start()
            // in most dotnet new templates
            .RunOaktonCommands(args);
    }

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureWebHostDefaults(webBuilder =>
            {
                webBuilder.UseStartup<Startup>();
            });
}

Note the signature of the Main() method and how it uses the RunOaktonCommands() method to intercept the command line arguments and execute named commands (with the default being to just run the application like normal).

Now, the Startup.ConfigureServices() method with Marten added in is this:

public void ConfigureServices(IServiceCollection services)
{
    // This is the absolute, simplest way to integrate Marten into your
    // .Net Core application with Marten's default configuration
    services.AddMarten(Configuration.GetConnectionString("Marten"));
}

Now, to the actual command line. As long as the Marten.CommandLine assembly is referenced by your application, you should see the additional Marten commands. From your project's root directory, run dotnet run -- help and you'll see there are some additional Marten-related options:

(Screenshot: Oakton command line options with Marten.CommandLine in play)

And that’s it. Now you can use dotnet run -- dump to export out all the SQL to recreate the Marten database schema, or maybe dotnet run -- patch upgrade_staging.sql --e Staging to create a SQL patch file that would make any necessary changes to upgrade your staging database to reflect the current Marten configuration (assuming that you’ve got an appsettings.Staging.json file with the right connection string pointing to your staging Postgresql server).

Check out the Marten.CommandLine documentation for more information on what it can do, but expect some V4 improvements to that as well.

Marten V4 Preview: Linq and Performance

Marten is an open source library for .Net that allows developers to treat the robust Postgresql database as a full featured and transactional document database (NoSQL) as well as supporting the event sourcing pattern of application persistence.

After a false start last summer, development on the long awaited and delayed Marten V4.0 release is heavily in flight and we're making a lot of progress. The major focus of the remaining work is improving the event store functionality (which I'll try to blog about later in the week if I can). We posted the first Marten V4 alpha on Friday for early adopters — or folks that need Linq provider fixes ASAP! — to pull down and start trying out. So far the limited feedback points to a nearly seamless upgrade.

You can track the work and direction of things through the GitHub issues that are already done and the ones that are still planned.

For today though, I’d like to focus on what’s been done so far in V4 in terms of making Marten simply better and faster at its existing feature set.

Being Faster by Doing Less

One of the challenging things about Marten's feature set is the unique permutations of what exactly happens when you store, delete, or load documents to and from the database. For example, depending on how it's configured, a document type may or may not be soft-deleted, multi-tenanted, or tracked with optimistic versioning.

On top of that, Marten supports a couple different flavors of document sessions (shown in the sketch after this list):

  • Query-only sessions that are strictly read only querying
  • The normal session that supports an internal identity map functionality that caches previously loaded documents
  • Automatic dirty checking sessions that are the heaviest Marten sessions
  • “Lightweight” sessions that don’t use any kind of identity map caching or automatic dirty checking for faster performance and better memory usage — at the cost of a little more developer written code.
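
Here's a quick sketch of how those four flavors look in code today (the connection string is a placeholder):

using Marten;

public static class SessionFlavors
{
    public static void Demo()
    {
        using var store = DocumentStore.For("Host=localhost;Database=marten;Username=postgres");

        // 1. Query-only session: strictly read only querying
        using var query = store.QuerySession();

        // 2. The "normal" session with an identity map that caches
        //    previously loaded documents
        using var normal = store.OpenSession();

        // 3. Automatic dirty checking session: the heaviest option
        using var tracked = store.DirtyTrackedSession();

        // 4. Lightweight session: no identity map or dirty checking
        using var lightweight = store.LightweightSession();
    }
}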

The point here is that there's a lot of variability in what exactly happens when you save, load, or delete a document with Marten. In the current version, Marten handles that variability with a combination of runtime if/then logic, some "Nullo" classes, and a little bit of "Expression to Lambda" runtime compilation.

For V4, I completely re-wired the internals to use C# code generated and compiled at runtime using Roslyn’s runtime compilation capabilities. Marten is using the LamarCompiler and LamarCodeGeneration libraries as helpers. You can see these two libraries and this technique in action in a talk I gave at NDC London in 2019.
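
This isn't Marten's actual internal code, but here's a minimal, self-contained sketch of the underlying technique: generate C# as a string, compile it in memory with Roslyn (the Microsoft.CodeAnalysis.CSharp package), load the assembly, and execute the generated code:

using System;
using System.IO;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class RuntimeCodegenDemo
{
    public static void Main()
    {
        // The "generated" code. Marten builds up source roughly like this
        // for each permutation of document storage and session type
        var source = @"
public static class Greeter
{
    public static string Greet(string name) => ""Hello, "" + name;
}";

        var syntaxTree = CSharpSyntaxTree.ParseText(source);

        var references = new[]
        {
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
            MetadataReference.CreateFromFile(Assembly.Load("System.Runtime").Location)
        };

        var compilation = CSharpCompilation.Create(
            "GeneratedAssembly",
            new[] { syntaxTree },
            references,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // Compile to an in-memory assembly and load it
        using var stream = new MemoryStream();
        var result = compilation.Emit(stream);
        if (!result.Success) throw new InvalidOperationException("Compilation failed");

        var assembly = Assembly.Load(stream.ToArray());
        var greet = assembly.GetType("Greeter").GetMethod("Greet");

        // Prints "Hello, Marten"
        Console.WriteLine(greet.Invoke(null, new object[] { "Marten" }));
    }
}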

The end result of all this work is that we can generate the tightest possible C# handling code and the tightest possible SQL for the exact permutation of document storage characteristics and session type. Along the way, we've striven to reduce the number of dictionary lookups, runtime branching logic, empty Nullo objects, and generally the number of instructions the underlying processor has to execute just to save, load, or delete a document.

So far, so good. It's hard to say exactly how much this will impact any given Marten-using application, but the existing test suite clearly runs faster now and I'm not seeing any noticeable issue with the "cold start" of the initial, one time code generation and compilation (that was a big enough issue in early Roslyn that we ripped the approach out of pre 1.0 Marten, but it seems to be solved now).

If anyone is curious, I'd be happy to write a blog post diving into the guts of how that works — and why the new .Net source generator feature wouldn't work in this case, if anyone wants to know about that too.

Linq Provider Almost-Rewrite

To be honest, I think Marten’s existing Linq provider (pre-V4) is pretty well stuck at the original proof of concept stage thrown together 4-5 years ago. The number of open issues where folks had hit limitations in the Linq provider support built up — especially with anything involving child collections on document types.

For V4, we’ve heavily restructured the Linq parsing and SQL generation code to address the previous shortcomings. There’s a little bit of improvement in the performance of Linq parsing and also a little bit of optimization of the SQL generated by avoiding unnecessary CASTs. Most of the improvement has been toward addressing previously unsupported scenarios. A potential improvement that we haven’t yet exploited much is to make the SQL generation and Linq parsing more able to support custom value types and F#-isms like discriminated unions through a new extensibility mechanism that teaches Marten about how these types are represented in the serialized JSON storage.

Querying Descendent Collections

Marten pre-V4 didn’t handle querying through child collections very well and that’s been a common source of user issues. With V4, we’re heavily using the Common Table Expression query support in Postgresql behind the scenes to make Linq queries like this one shown below possible:

var results = theSession.Query<Top>()
    .Where(x => x.Middles.Any(b => b.Bottoms.Any()))
    .ToList();

I think that at this point Marten can handle any combination of querying through child collections through any number of levels with all possible query operators (Any() / Count()) and any supported Where() fragment within the child collection.

Multi-Document Includes

Marten has long had some functionality for fetching related documents together in one database round trip for more efficient document reading. A long time limitation in Marten is that this Include() capability was only usable for logical “one to one” or “many to one” document relationships. In V4, you can now use Include() querying for “one to many” relationships as shown below:

[Fact]
public void include_many_to_list()
{
    var user1 = new User { };
    var user2 = new User { };
    var user3 = new User { };
    var user4 = new User { };
    var user5 = new User { };
    var user6 = new User { };
    var user7 = new User { };

    theStore.BulkInsert(new User[]{user1, user2, user3, user4, user5, user6, user7});

    var group1 = new Group
    {
        Name = "Odds",
        Users = new []{user1.Id, user3.Id, user5.Id, user7.Id}
    };

    var group2 = new Group {Name = "Evens", Users = new[] {user2.Id, user4.Id, user6.Id}};

    using (var session = theStore.OpenSession())
    {
        session.Store(group1, group2);
        session.SaveChanges();
    }

    using (var query = theStore.QuerySession())
    {
        var list = new List<User>();

        query.Query<Group>()
            .Include(x => x.Users, list)
            .Where(x => x.Name == "Odds")
            .ToList()
            .Single()
            .Name.ShouldBe("Odds");

        list.Count.ShouldBe(4);
        list.Any(x => x.Id == user1.Id).ShouldBeTrue();
        list.Any(x => x.Id == user3.Id).ShouldBeTrue();
        list.Any(x => x.Id == user5.Id).ShouldBeTrue();
        list.Any(x => x.Id == user7.Id).ShouldBeTrue();
    }
}

This was a longstanding request from users, and to be honest, we had to completely rewrite the Include() internals to add this support. Again, we used Common Table Expression SQL statements in combination with per session temporary tables to pull this off.

Compiled Queries Actually Work

I think the Compiled Query feature is unique to Marten. It's probably easiest and best to think of it as a "stored procedure" for Linq queries in Marten. The value of a compiled query in Marten is:

  1. It potentially cleans up the application code that has to interact with Marten queries, especially for more complex queries
  2. It’s potentially some reuse for commonly executed queries
  3. Mostly though, it’s a significant performance improvement because it allows Marten to “remember” the Linq query plan.

While compiled queries have been supported since Marten 1.0, there's been a sizable gap between what works in Marten's Linq support and what functions correctly inside of compiled queries. With the advent of V4, the compiled query planning was rewritten with a new strategy that so far seems to support all of Marten's Linq capabilities. We think this will make the compiled query feature much more useful going forward.

Here’s an example compiled query that was not possible before V4:

public class FunnyTargetQuery : ICompiledListQuery<Target>
{
    public Expression<Func<IMartenQueryable<Target>, IEnumerable<Target>>> QueryIs()
    {
        return q => q
            .Where(x => x.Flag && x.NumberArray.Contains(Number));
    }

    public int Number { get; set; }
}

And in usage:

var actuals = session.Query(new FunnyTargetQuery{Number = 5}).ToArray();

Multi-Level SelectMany because why not?

Marten has long supported the SelectMany() keyword in the Linq provider support, but in V4 it’s much more robust with the ability to chain SelectMany() clauses n-deep and do that in combination with any kind of Count() / Distinct() / Where() / OrderBy() Linq clauses. Here’s an example:

[Fact]
public void select_many_2_deep()
{
    var group1 = new TargetGroup
    {
        Targets = Target.GenerateRandomData(25).ToArray()
    };

    var group2 = new TargetGroup
    {
        Targets = Target.GenerateRandomData(25).ToArray()
    };

    var group3 = new TargetGroup
    {
        Targets = Target.GenerateRandomData(25).ToArray()
    };

    var groups = new[] {group1, group2, group3};

    using (var session = theStore.LightweightSession())
    {
        session.Store(groups);
        session.SaveChanges();
    }

    using var query = theStore.QuerySession();

    var loaded = query.Query<TargetGroup>()
        .SelectMany(x => x.Targets)
        .Where(x => x.Color == Colors.Blue)
        .SelectMany(x => x.Children)
        .OrderBy(x => x.Number)
        .ToArray()
        .Select(x => x.Id).ToArray();

    var expected = groups
        .SelectMany(x => x.Targets)
        .Where(x => x.Color == Colors.Blue)
        .SelectMany(x => x.Children)
        .OrderBy(x => x.Number)
        .ToArray()
        .Select(x => x.Id).ToArray();

    loaded.ShouldBe(expected);
}

Again, we pulled that off with Common Table Expression statements.

In tepid defense of…

Hey all, I’ve been swamped at work and haven’t had any bandwidth or energy for blogging, but I’ve actually been working up ideas for a new blog series. I’m going to call it “In tepid defense of [XYZ]”, where XYZ is some kind of software development tool or technique that’s:

  • Gotten a bad name from folks overusing it, or using it in some kind of dogmatic way that isn’t useful
  • Is disparaged by a certain type of elitist, hipster developer
  • Might still have some significant value if used judiciously

My list of topics so far is:

  • IoC Containers — I’m going to focus on where, when, and how they’re still useful — but with a huge dose of what I think are the keys to using them successfully in real projects. Which is more or less gonna amount to using them very simply and not making them do too much weird runtime switcheroo.
  • S.O.L.I.D. — Talking about the principles as a heuristic to think through designing code internals, but most definitely not throwing this out there as any kind of hard and fast programming laws. This will be completely divorced from any discussion about you know who.
  • UML — I’m honestly using UML more now than I had been for years and it’s worth reevaluating UML diagramming after years of the backlash to silly things like “Executable UML”
  • Don’t Repeat Yourself (DRY) — I think folks bash this instead of thinking more about when and how they eliminate duplication in their code without going into some kind of really harmful architecture astronaut mode

I probably don’t have the energy or guts to tackle OOP in general or design patterns in specific, but we’ll see.

Anything interesting to anybody?