How I Prioritize OSS Bugs

I just got back from a week-long vacation with the family, and I was as rested and relaxed as I can ever be, at least before I picked up a head cold on the last day. Today though, it’s time to start catching up on the OSS bug reports that have come in over the past 10 days or so. I thought it might be fun to dash off my personal prioritization for the bugs that come into Marten, Wolverine, or related projects.

Roughly, here’s an unscientific ranking of the factors that get bugs fixed sooner rather than later:

  1. Any issue that is blocking or harming a JasperFx Software client’s system
  2. Issues that already have user-supplied pull requests to fix them. You never want to leave a pull request open too long if someone has taken the time to contribute. It still happens for a variety of reasons, but you do still try.
  3. Bugs that I find embarrassing
  4. Any problem that would likely give a new user a poor first impression of the tools
  5. Problems that I think would potentially impact many users
  6. Any other issue for a JasperFx client
  7. Issues reported by significant contributors, and I’m pretty loose with what I think of as “significant”
  8. Easy fixes just to help keep the open GitHub issue counts as low as possible because that’s something I do care about
  9. Open issues in whatever project I happen to be preparing a release for while that project has my attention
  10. Any issue that will require breaking API changes in the tool, but this subset will sometimes be prioritized to the top whenever we’re making a full point release
  11. Any issue that is going to require significant changes to the internals, but this is somewhat similar to the previous line
  12. Issues that aren’t likely to impact many users
  13. Issues reported by people being kind of a jerk that aren’t likely to impact many users

For older versions of any of the tools, like Marten 7.*, the list is much shorter:

  • For a JasperFx Software client who cannot upgrade soon, we’ll of course make fixes to the older branch and forward that fix to the current version
  • For everybody else, eh, probably not unless it’s really bad or they’ve just asked very nicely

Critter Stack Futures for the rest of 2025

It’s the halfway point of 2025 somehow, and we’ve now gotten past the big Marten 8.0 and Wolverine 4.0 releases. Right before I go on vacation next week, I thought it would be a good time to jot down some thoughts about where the Critter Stack might go for the rest of 2025 and probably into 2026.

Critter Watch

The big ticket item is our ongoing work on “Critter Watch”, which will be a commercial management and observability add-on for Wolverine, Marten, and any future new Critter tools. The top line pitch for Critter Watch is that it will help you know what your applications are, how they interact with each other, whether they’re healthy in production, and provide features to help heal the inevitable production problems when they happen.

The general idea is to have a standalone application deployed that acts as a management console for one or more Wolverine applications in our users’ environments.

Upfront for the Critter Watch MVP (and requests from a client), we’re focused on:

  • Visualizing the systems being monitored, their Wolverine and Marten configuration, and the capabilities of the systems. We’re currently researching AsyncAPI publishing and visualization as well. The whole point of this is to help teams understand how the messages in your system are handled, published, and routed.
  • Event Sourcing management, but this is mostly about managing the execution of asynchronous projections and subscriptions at runtime and being able to understand the ongoing performance or any ongoing problems
  • Dead letter queue management for Wolverine

I have less clarity over development time tooling, but we’re at least interested in having some of Critter Watch usable as an embedded tool during development.

After years of talking about this and quite a bit of envisioning, development started in earnest over the past 6 weeks with a stretch goal of having a pilot usage by the end of July for a JasperFx Software client.

I do not have any hard pricing numbers yet, but we are very interested in talking to anyone who would be interested in Critter Watch.

Concurrency, Concurrency, Concurrency!

I think that systems built with Event Sourcing are a little more sensitive to concurrent data reads and writes, or maybe it’s just that those problems are there all the time but more readily observable with Event Sourcing and Event Driven Architectures. In my work with JasperFx Software clients, concurrency is probably the most common subject of questions.

Mostly today you deal with this either by building in selective retry capabilities based on version conflict detection, or get fancier with queueing and message routing to eliminate the concurrent access as much as possible. Or both of course.
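To make the “selective retry” idea concrete, here is a minimal, library-free sketch. It is not Wolverine’s error handling API. VersionConflictException and SelectiveRetry are hypothetical names standing in for whatever your persistence tooling throws on an optimistic version conflict and for whatever retry policy your messaging tool provides. The point is only that version conflicts are retried after a short cooldown while every other exception propagates immediately.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical exception type standing in for whatever your persistence
// tooling throws when an optimistic version check fails
public class VersionConflictException : Exception
{
    public VersionConflictException(string message) : base(message) { }
}

public static class SelectiveRetry
{
    // Runs the action, retrying only on version conflicts. Each cooldown
    // allows one more retry, so the action runs at most cooldowns.Length + 1 times.
    // Returns the number of attempts that were needed.
    public static async Task<int> ExecuteAsync(Func<Task> action, params TimeSpan[] cooldowns)
    {
        var attempts = 0;
        while (true)
        {
            attempts++;
            try
            {
                await action();
                return attempts;
            }
            catch (VersionConflictException) when (attempts <= cooldowns.Length)
            {
                // Back off a little so the competing writer can finish first
                await Task.Delay(cooldowns[attempts - 1]);
            }
        }
    }
}
```

Once the cooldowns are used up, the last VersionConflictException escapes to the caller, which is where you would hand the message off to a dead letter queue or similar.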

A great way to sidestep the concurrent access while not sacrificing throughput through parallelization is to use Wolverine’s support for Azure Service Bus Session Identifiers and FIFO Queues.

Which is great, but what if you’re not using Azure Service Bus? What if you’re only using local queueing? And wouldn’t it be nice if the existing Azure Service Bus FIFO support was a little less cumbersome to use in your code?

I don’t have a ton of detail, but there’s a range of internal proposals to create some new recipes for Wolverine usage to enable teams to more easily “shard” logical work between queues and within the local workers listening to queues to improve Wolverine’s handling of concurrent access without sacrificing parallel work and throughput or requiring repetitive code. Some of this is being done in collaboration with JasperFx clients.
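Since the real design is still in flux, treat the following as nothing more than an illustration of what “sharding” logical work could mean, not as the proposed Wolverine feature. MessageSharding, SlotFor, QueueNameFor, and the “incidents-N” queue naming are all made up for this sketch. The idea is that every message carrying the same group id (say, an event stream id) always lands in the same slot, so it is processed sequentially there, while different groups still spread across slots and run in parallel.

```csharp
using System;

public static class MessageSharding
{
    // Maps a group id (for example an event stream id) to one of slotCount slots.
    // Uses an FNV-1a style hash over the UTF-16 characters rather than
    // string.GetHashCode(), which is randomized per process in modern .NET and
    // so would not agree between nodes.
    public static int SlotFor(string groupId, int slotCount)
    {
        if (groupId == null) throw new ArgumentNullException(nameof(groupId));
        if (slotCount <= 0) throw new ArgumentOutOfRangeException(nameof(slotCount));

        unchecked
        {
            uint hash = 2166136261;
            foreach (var ch in groupId)
            {
                hash ^= ch;
                hash *= 16777619;
            }

            return (int)(hash % (uint)slotCount);
        }
    }

    // Hypothetical naming convention for the sharded queues
    public static string QueueNameFor(string groupId, int slotCount)
        => $"incidents-{SlotFor(groupId, slotCount)}";
}
```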

Improving Wolverine’s Declarative Data Access

For lack of a better description, Wolverine has a feature set I’m henceforth calling “declarative data access” with the [Entity] attribute that triggers code generation within message handlers or HTTP endpoints to load requested data from Marten, EF Core, or RavenDb. And of course, there’s also what we call the “aggregate handler workflow” recipe for using the Decider pattern with Wolverine and Marten that I think is the simplest way to express business logic when using Event Sourcing in the .NET ecosystem.

To take these productivity features even further, I think we’ll add:

  1. More control over what action to take if an entity is missing. Today, the HTTP endpoints will just return a 404 status code if required entities can’t be found. In future versions, we’ll let you customize log or ProblemDetails messages and have more control over how Wolverine generates the “if missing” path
  2. At least for Marten, opt into Marten’s batch querying support if you are using more than one of any combination of the existing [Aggregate], [ReadAggregate], [Entity], or [Document] attributes to load data within a single HTTP endpoint or message handler as a way of improving performance by reducing network round trips to the database. And don’t sneeze at that, chattiness is a common performance killer in enterprise applications. Especially when the code is unnecessarily complicated by typical usages of Clean or Onion Architectural approaches.

If you follow Event Sourcing related topics online, you’ll hear quite a bit of buzz from some of the commercial tools about “Dynamic Consistency Boundaries” (DCB). We get asked about this with Marten occasionally, but the Marten core team’s position is that Marten doesn’t require this feature because you can already do “read” and “write” operations across multiple event streams with transactional integrity as is.

What the batch querying I just described will do for Marten though is make the full “Critter Stack” usage be more performant when you need to potentially work with more than one event stream at a time with all the transactional support and strong consistency that Marten (really PostgreSQL) already provides.

For Marten users, this is essentially making Marten’s FetchForWriting() API able to enroll in batch querying for more efficient data querying when working across streams. That work is actually well underway.

But if you prefer to use the fancier and more novel DCB approaches that aren’t even officially released yet, feel free to pay out some big bucks to use one of the commercial tools.

Smaller, But Still Important Work!

  • Partially for Critter Watch, Wolverine should support connecting to multiple brokers in a single application for each transport type. Some of this is already done, with Kafka being next up, but we need to add this to every transport
  • Improved interoperability support for Wolverine talking to non-Wolverine applications. There’s an existing pull request that goes quite a ways for this, but this might end up being more a documentation effort than anything else
  • More options in Wolverine with Marten or just Marten for streaming Marten data as JSON directly to HTTP. We have some support already of course, but there are more opportunities for expanding that
  • Exposing an MCP server off of Marten event data, but I have very little detail about what that would be. I would be very interested in partnering with a company who wanted to do this, and a JasperFx client might be working with us later this year on AI with Marten
  • Improving throughput in Marten’s event projections and subscriptions. We’ve done a lot the past couple years, but there are still some other ideas in the backlog we haven’t played yet
  • Expanding Wolverine support for more database engines, with CosmosDb the most likely contender this year. This might be contingent upon client work of course.

What about the SQL Server backed Event Store?

Yeah, I don’t know. We did a ton of work in Marten 8 to pull what will be common code out in a way that it could be reused in the SQL Server backed event store. I do not know when we might work on this as CritterWatch will take priority for now.

And finally….

And on that note I’m essentially on vacation for a week and I’ll catch up with folks in late July.

OSS Project Lessons Learned with David Giard

I got to talk to David Giard on his podcast last week about some of the lessons I’ve learned the hard way across several large OSS projects. For a little background, I got to follow through on a 15 to 20 year dream of mine to found a company called JasperFx Software LLC to build services and product offerings around the “Critter Stack” family of open source tools (Marten and Wolverine) in the .NET ecosystem. The two main tools are doing well right now, with Marten being the most used Event Sourcing tool for .NET projects and Wolverine gaining traction as an alternative messaging tool and HTTP endpoint framework with its focus on reduced code ceremony and testable code.

The relative success of these tools came after I was the technical leader of a very large, ambitious project called FubuMVC (and FubuTransportation) that fizzled out after I probably sunk 2-3 man years of effort into it over a half decade. As David did helpfully point out, some of the current success of Marten and Wolverine was absolutely predicated on lessons learned, both positive (mostly technical) and negative (community engagement, documentation, samples), from the earlier FubuMVC experience.

Without further ado, here’s David & I:

Wire Up XUnit Logging for Crazy Integration Testing

I worked a little bit this weekend on a small new feature in Wolverine that we’ll need as part of our forthcoming “CritterWatch” tooling. What I was doing isn’t that interesting, but the killer problem was that it required me to write an integration test that would:

  1. Spin up multiple IHost instances for the same testing application
  2. Verify that Wolverine was correctly assigning running tasks to only the leader node
  3. Stop the leader node, see leadership and that same task shift to the newly elected leader
  4. Make sure that task was really only ever running on the single leader node

Needless to say, it’s a long running test and it turned out to be non-trivial to get both the test harness and the necessary code exactly right. Honestly, I didn’t get this done until I stopped and integrated application logging directly into the xUnit.Net test harness (plus integrating a Wolverine specific event observer too) so I could see what the heck was going on inside all of these application instances.

So without further ado, here’s the recipe we’re using (and copy/pasting around) in Wolverine to do that. First off, we need an ILogger and ILoggerProvider implementation that will pipe logging to xUnit’s ITestOutputHelper like so:

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.Logging;
using Xunit.Abstractions; // ITestOutputHelper (xUnit v2 namespace)

public class XUnitLogger : ILogger
{
    private readonly string _categoryName;

    // Messages containing any of these strings are dropped as noise
    private readonly List<string> _ignoredStrings = new()
    {
        "Declared",
        "Successfully processed message"
    };

    private readonly ITestOutputHelper _testOutputHelper;

    public XUnitLogger(ITestOutputHelper testOutputHelper, string categoryName)
    {
        _testOutputHelper = testOutputHelper;
        _categoryName = categoryName;
    }

    public bool IsEnabled(LogLevel logLevel)
    {
        return logLevel != LogLevel.None;
    }

    public IDisposable BeginScope<TState>(TState state)
    {
        return new Disposable();
    }

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception,
        Func<TState, Exception, string> formatter)
    {
        // Exceptions the tests throw on purpose, so don't report them
        if (exception is DivideByZeroException)
        {
            return;
        }

        if (exception is BadImageFormatException)
        {
            return;
        }

        // Obviously this is crude and you would do something different here...
        if (_categoryName == "Wolverine.Transports.Sending.BufferedSendingAgent" &&
            logLevel == LogLevel.Information) return;
        if (_categoryName == "Wolverine.Runtime.WolverineRuntime" &&
            logLevel == LogLevel.Information) return;
        if (_categoryName == "Microsoft.Hosting.Lifetime" &&
            logLevel == LogLevel.Information) return;
        if (_categoryName == "Wolverine.Transports.ListeningAgent" &&
            logLevel == LogLevel.Information) return;
        if (_categoryName == "JasperFx.Resources.ResourceSetupHostService" &&
            logLevel == LogLevel.Information) return;
        if (_categoryName == "Wolverine.Configuration.HandlerDiscovery" &&
            logLevel == LogLevel.Information) return;

        var text = formatter(state, exception);
        if (_ignoredStrings.Any(x => text.Contains(x))) return;

        _testOutputHelper.WriteLine($"{_categoryName}/{logLevel}: {text}");

        if (exception != null)
        {
            _testOutputHelper.WriteLine(exception.ToString());
        }
    }

    // No-op scope, because this logger does not track scopes
    public class Disposable : IDisposable
    {
        public void Dispose()
        {
        }
    }
}

public class OutputLoggerProvider : ILoggerProvider
{
    private readonly ITestOutputHelper _output;

    public OutputLoggerProvider(ITestOutputHelper output)
    {
        _output = output;
    }

    public void Dispose()
    {
    }

    public ILogger CreateLogger(string categoryName)
    {
        return new XUnitLogger(_output, categoryName);
    }
}

And register it inside the test harness like so:

public class leader_pinned_listener : IAsyncDisposable
{
    private readonly ITestOutputHelper _output;

    public leader_pinned_listener(ITestOutputHelper output)
    {
        _output = output;
    }

    private async Task<IHost> startHost()
    {
        await dropSchemaAsync();
        
        var host = await Host.CreateDefaultBuilder()
            .UseWolverine(opts =>
            {
                // This is where I'm adding in the custom ILoggerProvider
                opts.Services.AddSingleton<ILoggerProvider>(new OutputLoggerProvider(_output));

                // More configuration that isn't germane...
            }).StartAsync();

        return host;
    }

    // The tests themselves and DisposeAsync() are omitted here
}

Hey, it’s crude, but the point here was that this kind of gnarly integration testing, and especially with a lot of asynchronous behavior, is a lot easier to get through when you have more insight into how the code you’re testing is actually behaving.

Low Ceremony Railway Programming with Wolverine

Railway Programming is an idea that came out of the F# community as a way to develop for “sad path” exception cases without having to resort to throwing .NET Exceptions as a way of doing flow control. Railway Programming works by chaining together functions with a standardized response in such a way that it’s relatively easy to abort workflows as preliminary steps are found to be invalid while still passing the results of the preceding function as the input into the next function.
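For readers who haven’t seen the classic form, here’s a minimal sketch of that chaining in plain C#. This is the textbook Result-type flavor, not what Wolverine does, and every name in it (Result, Then, OrderRailway) is invented for the illustration. Each step returns a Result, and Then() only calls the next step if the previous one succeeded, so the first failure rides the “failure track” all the way to the end.

```csharp
#nullable enable
using System;

// Hypothetical, minimal Result type: either a value or an error message
public readonly record struct Result<T>(T? Value, string? Error)
{
    public bool IsSuccess => Error is null;

    public static Result<T> Ok(T value) => new(value, null);
    public static Result<T> Fail(string error) => new(default, error);

    // Run the next step only if this step succeeded, otherwise pass the error along
    public Result<TNext> Then<TNext>(Func<T, Result<TNext>> next)
        => IsSuccess ? next(Value!) : Result<TNext>.Fail(Error!);
}

public static class OrderRailway
{
    public static Result<int> Parse(string raw)
        => int.TryParse(raw, out var quantity)
            ? Result<int>.Ok(quantity)
            : Result<int>.Fail("not a number");

    public static Result<int> EnsurePositive(int quantity)
        => quantity > 0
            ? Result<int>.Ok(quantity)
            : Result<int>.Fail("quantity must be positive");

    public static Result<string> Describe(int quantity)
        => Result<string>.Ok($"shipping {quantity} item(s)");

    // The "railway": three steps chained, aborting at the first failure
    public static Result<string> Run(string raw)
        => Parse(raw)
            .Then(quantity => EnsurePositive(quantity))
            .Then(quantity => Describe(quantity));
}
```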

Wolverine has some direct support for a quasi-Railway Programming approach by moving validation or data loading steps prior to the main message handler or HTTP endpoint logic. Let’s jump into a quick sample that works with either message handlers or HTTP endpoints using the built in HandlerContinuation enum:

public static class ShipOrderHandler
{
    // This would be called first
    public static async Task<(HandlerContinuation, Order?, Customer?)> LoadAsync(ShipOrder command, IDocumentSession session)
    {
        var order = await session.LoadAsync<Order>(command.OrderId);
        if (order == null)
        {
            return (HandlerContinuation.Stop, null, null);
        }

        var customer = await session.LoadAsync<Customer>(command.CustomerId);

        return (HandlerContinuation.Continue, order, customer);
    }

    // The main method becomes the "happy path", which also helps simplify it
    public static IEnumerable<object> Handle(ShipOrder command, Order order, Customer customer)
    {
        // use the command data, plus the related Order & Customer data to
        // "decide" what action to take next

        yield return new MailOvernight(order.Id);
    }
}

By naming convention (but you can override the method naming with attributes as you see fit), Wolverine will try to generate code that will call methods named Before/Validate/Load(Async) before the main message handler method or the HTTP endpoint method. You can use this compound handler approach to do set up work like loading data required by business logic in the main method or, in this case, as validation logic that can stop further processing based on failed validation, data requirements, or system state. Some Wolverine users like using these methods to keep the main methods relatively simple and focused on the “happy path”, with the business logic in pure functions that are easier to unit test in isolation.

By returning a HandlerContinuation value either by itself or as part of a tuple returned by a Before, Validate, or LoadAsync method, you can direct Wolverine to stop all other processing.

You have more specialized ways of doing that in HTTP endpoints by using the ProblemDetails specification to stop processing like this example that uses a Validate() method to potentially stop processing with a descriptive 400 and error message:

public record CategoriseIncident(
    IncidentCategory Category,
    Guid CategorisedBy,
    int Version
);

public static class CategoriseIncidentEndpoint
{
    // This is Wolverine's form of "Railway Programming"
    // Wolverine will execute this before the main endpoint,
    // and stop all processing if the ProblemDetails is *not*
    // "NoProblems"
    public static ProblemDetails Validate(Incident incident)
    {
        return incident.Status == IncidentStatus.Closed 
            ? new ProblemDetails { Detail = "Incident is already closed" } 
            
            // All good, keep going!
            : WolverineContinue.NoProblems;
    }
    
    // This tells Wolverine that the first "return value" is NOT the response
    // body
    [EmptyResponse]
    [WolverinePost("/api/incidents/{incidentId:guid}/category")]
    public static IncidentCategorised Post(
        // the actual command
        CategoriseIncident command, 
        
        // Wolverine is generating code to look up the Incident aggregate
        // data for the event stream with this id
        [Aggregate("incidentId")] Incident incident)
    {
        // This is a simple case where we're just appending a single event to
        // the stream.
        return new IncidentCategorised(incident.Id, command.Category, command.CategorisedBy);
    }
}

The value WolverineContinue.NoProblems tells Wolverine that everything is good, full speed ahead. Anything else will write the ProblemDetails value out to the response, return a 400 status code (or whatever you decide to use), and stop processing. Returning a ProblemDetails object hopefully makes these filter methods easy to unit test themselves.
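To show why that’s easy to test, here’s a self-contained sketch of the sentinel idea. So that it compiles without ASP.NET Core or Wolverine, it uses stand-in types. The ProblemDetails class, the Continue.NoProblems sentinel, the simplified Incident record, and CategoriseIncidentGuard below are all hypothetical substitutes for the real types in the sample above, and I’m assuming the “keep going” value is a single shared instance that can be compared by reference.

```csharp
#nullable enable
using System;

// Stand-in for Microsoft.AspNetCore.Mvc.ProblemDetails, reduced to two properties
public class ProblemDetails
{
    public string? Detail { get; init; }
    public int? Status { get; init; }
}

// Stand-in for the "everything is fine" sentinel value
public static class Continue
{
    public static readonly ProblemDetails NoProblems = new();
}

public enum IncidentStatus { Open, Closed }

// Simplified stand-in for the Incident aggregate
public record Incident(Guid Id, IncidentStatus Status);

public static class CategoriseIncidentGuard
{
    // A pure function: no HTTP context, no database, just data in and data out
    public static ProblemDetails Validate(Incident incident)
    {
        return incident.Status == IncidentStatus.Closed
            ? new ProblemDetails { Detail = "Incident is already closed", Status = 400 }
            : Continue.NoProblems;
    }
}
```

A unit test then boils down to building an Incident in the state you care about, calling Validate(), and checking either for the sentinel or for the expected Detail text.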

You can also use the AspNetCore IResult as another formally supported “result” type in these filter methods, as shown below:

public static class ExamineFirstHandler
{
    public static bool DidContinue { get; set; }
    
    public static IResult Before([Entity] Todo2 todo)
    {
        return todo != null ? WolverineContinue.Result() : Results.Empty;
    }

    [WolverinePost("/api/todo/examinefirst")]
    public static void Handle(ExamineFirst command) => DidContinue = true;
}

In this case, the “special” value WolverineContinue.Result() tells Wolverine to keep going, otherwise, Wolverine will execute the IResult returned from one of these filter methods and stop all other processing for the HTTP request.

It’s maybe a shameful approach for folks who are more in line with a Functional Programming philosophy, but you could also use a signature like:

[WolverineBefore]
public static UnauthorizedHttpResult? Authorize(SomeCommand command, ClaimsPrincipal user)

In the case above, Wolverine will do nothing if the return value is null, but if a value is returned it will execute the UnauthorizedHttpResult response and stop any further processing. There is *some* minor value to expressing the actual IResult type above because that can be used to help generate OpenAPI metadata.

Lastly, let’s think about the very common need to write an HTTP endpoint where you want to return a 404 status code if the requested data doesn’t exist. In many cases the API user is supplying the identity value for an entity, and your HTTP endpoint will first query for that data, and if it doesn’t exist, abort the processing with the 404 status code. Wolverine has some built in help for this tedious task through its unique persistence helpers as shown in this sample HTTP endpoint below:

    [WolverineGet("/orders/{id}")]
    public static Order GetOrder([Entity] Order order) => order;

Note the presence of the [Entity] attribute for the Order argument to this HTTP endpoint route. That’s telling Wolverine that that data should be loaded using the “id” route argument as the Order key from whatever persistence mechanism in your application deals with the Order entity, which could be Marten of course, an EF Core DbContext that has a mapping for Order, or Wolverine’s RavenDb integration. Unless we purposely mark [Entity(Required = false)], Wolverine.HTTP will return a 404 status code if the Order entity does not exist. The simplistic sample from Wolverine’s test suite above doesn’t do any kind of mapping from the raw Order to a view model, but the mechanics of the [Entity] loading would work equally if you also mapped the raw Order to some kind of OrderViewModel maybe.

Last Thoughts

I’m pushing Wolverine users and JasperFx clients to utilize Wolverine’s quasi Railway Programming capabilities as guard clauses to better separate out validation or error condition handling into easily spotted, atomic operations while reducing the core HTTP request or message handler to being a “happy path” operation. Especially in HTTP services where the ProblemDetails specification and integration with Wolverine fits well with this pattern and where I’d expect many HTTP client tools to already know how to work with problem details responses.

There have been a few attempts to adapt Railway Programming to C# that I’m aware of, inevitably using some kind of custom Result type that denotes success or failure along with the actual results for the next function. I’ve seen some folks and OSS tools try to chain functions together with nested lambda functions within a fluent interface. I’m not a fan of any of this because I think the custom Result types just add code noise and extra mechanical work, and the fluent interface approach can easily be nasty to debug while hurting readability with even more code noise. But anyway, read a lot more about this in Andrew Lock’s Series: Working with the result pattern and make up your own mind.

I’ve also seen an approach where folks used MediatR handlers for each individual step in the “railway” where each handler had to return a custom Result type with the inputs for the next handler in the series. I beg you, please don’t do this in your own system because that leads to way too much complexity, code that’s much harder to reason about because of the extra hoops and indirection, and potentially poor system performance because again, you can’t see what the code is doing and you can easily end up making unnecessarily duplicate database round trips or just being way too “chatty” to the database. And no, replacing MediatR handlers with Wolverine handlers is not going to help because the pattern was the problem and not MediatR itself.

As always, the Wolverine philosophy is that the path to long term success in enterprise-y software systems is by relentlessly eliminating code ceremony so that developers can better reason about how the system’s logic and behavior works. To a large degree, Wolverine is a reaction to the very high ceremony Clean/Onion Architecture/iDesign architectural approaches of the past 15-20 years and how hard those systems can be to deal with over time.

And as happens with just about any halfway good thing in programming, some folks overused the Railway Programming idea and there’s a little bit of pushback or backlash to the technique. I can’t find the quote to give it the real attribution, but something I’ve heard Martin Fowler say is that “we don’t know how useful an idea really can be until we push it too far, then pull back a little bit.”

Making Event Sourcing with Marten Go Faster

You’re about to start a new system with Event Sourcing using Marten, and you’re expecting your system to be hugely successful such that it’s going to handle a huge amount of data. On top of that, you’re already starting with pretty ambitious non-functional requirements for the system to be highly performant and for all the screens or exposed APIs to be snappy.

Basically, what you want to do is go as fast as Marten and PostgreSQL will allow. Fortunately, Marten has a series of switches and dials that can be configured to squeeze out more performance, but for a variety of historical reasons and possible drawbacks, are not the defaults for a barebones Marten configuration as shown below:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));
});

Cut me some slack in my car choice for the analogy here. I’m not only an American, but I’m an American from a rural area who grew up dreaming about having my own Mustang or Camaro because that’s as far out as I could possibly imagine back then.

At this point, what we have is the equivalent of a street legal passenger car, maybe the equivalent of an off the shelf Mustang.

Which probably easily goes fast enough for everyday usage for the vast majority of us most of the time. But we really need a fully tricked out Mustang GTD that’s absurdly optimized to just flat out go fast.

Let’s start trimming weight off our street legal Marten setup to go faster with…

Opt into Lightweight Sessions by Default

Starting from a new system so we don’t care about breaking existing code by changing behavior, let’s opt for lightweight sessions by default:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));
})
    
// Jettison some "Identity Map" weight by going lighter weight    
.UseLightweightSessions();

By default, the instances of IDocumentSession you get out of an IoC container would utilize the Identity Map feature to track loaded entities by id so that if you happened to try to load the same entity from the same session, you would get the exact same object. As I’m sure you can imagine, that means that every entity fetched by a session is stuffed into a dictionary internally (Marten uses the highly performant ImTools ImHashMap everywhere, but still), and the session also has to bounce through the dictionary before loading data as well. It’s just a little bit of overhead we can omit by opting for “Lightweight Sessions” if we don’t need that behavior by default.

We’ve always been afraid to change the default behavior here to the more efficient approach because it can absolutely lead to breaking existing code that depends on the Identity Map behavior. On the flip side, I think you should not need Identity Map mechanics if you can keep the call stacks within your code short enough that you can actually “see” where you might be trying to load the same data twice or more in the same parent operation.
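If the Identity Map idea is new to you, here’s a deliberately tiny sketch of the difference in plain C#. This is not Marten’s implementation. FakeOrderStore, IdentityMapSession, and LightweightSession are hypothetical classes that exist only to show the trade-off: the identity map remembers every loaded entity in a dictionary and hands back the same object on a repeated load, while the lightweight version skips that bookkeeping and goes back to the store every time.

```csharp
using System;
using System.Collections.Generic;

public record Order(Guid Id, string Status);

// Hypothetical stand-in for the database: every read builds a brand new object
public class FakeOrderStore
{
    public int Reads { get; private set; }

    public Order Load(Guid id)
    {
        Reads++;
        return new Order(id, "Open");
    }
}

// Tracks loaded entities by id, so a repeated load returns the same instance
public class IdentityMapSession
{
    private readonly Dictionary<Guid, Order> _loaded = new();
    private readonly FakeOrderStore _store;

    public IdentityMapSession(FakeOrderStore store) => _store = store;

    public Order Load(Guid id)
    {
        // The extra dictionary lookup (and storage) is the "weight" being discussed
        if (_loaded.TryGetValue(id, out var existing)) return existing;

        var order = _store.Load(id);
        _loaded[id] = order;
        return order;
    }
}

// No tracking at all: less overhead, but every load goes back to the store
public class LightweightSession
{
    private readonly FakeOrderStore _store;

    public LightweightSession(FakeOrderStore store) => _store = store;

    public Order Load(Guid id) => _store.Load(id);
}
```

Code that quietly relies on getting the same object back from two loads is exactly the kind of code that breaks when you switch to lightweight sessions.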

On to the next thing…

Make Writes Faster with Quick Append

Next, since we again don’t have any existing code that can be broken here, let’s opt for “Quick Append” writes like so:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));

    // Make event writing faster, like 2X faster in our testing
    opts.Events.AppendMode = EventAppendMode.Quick;
})
    
// Jettison some "Identity Map" weight by going lighter weight    
.UseLightweightSessions();

This will help the system be able to append new events much faster, but at the cost of not being able to use some event metadata like event versions, sequence, or timestamp information within “Inline” projections.

Again, even though this option has been clocked as being much faster, we have not wanted to make this the default because it could break existing systems for people who depend on having the rich metadata during the Inline application of projections that forces Marten to do a kind of two step process to append events. This “Quick Append” option also helps reduce concurrent access problems writing to streams and generally makes the “Async Daemon” subsystem processing asynchronous projections and subscriptions run much smoother.

We’re not out of tricks yet by any means, so let’s go on…

Use the Identity Map for Inline Aggregates

Wait, I thought you told me not to cross the streams! Yeah, about the Identity Map thing, there’s one exception where we actually do want that behavior within CQRS command handlers like this one using Wolverine and its “Aggregate Handler Workflow” integration with Marten:

    // This tells Wolverine that the first "return value" is NOT the response
    // body
    [EmptyResponse]
    [WolverinePost("/api/incidents/{incidentId:guid}/category")]
    public static IncidentCategorised Post(
        // the actual command
        CategoriseIncident command, 
        
        // Wolverine is generating code to look up the Incident aggregate
        // data for the event stream with this id
        [Aggregate("incidentId")] Incident incident)
    {
        // This is a simple case where we're just appending a single event to
        // the stream.
        return new IncidentCategorised(incident.Id, command.Category, command.CategorisedBy);
    }

In the case above, the Incident model is a projected document that’s first used by the command handler to “decide” what new events to emit. If we’re updating the Incident model with an Inline projection that tries to update the Incident model in the database at the same time it wants to append events, then it’s an advantage for performance to “just” reuse the original Incident model we loaded initially, roll its state forward with the new events, and persist the result right then and there. We can opt into this optimization even for the lightweight sessions we earlier wanted to use by setting one more flag, UseIdentityMapForAggregates:

builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));

    // Make event writing faster, like 2X faster in our testing
    opts.Events.AppendMode = EventAppendMode.Quick;

    // This can cut down on the number of database round trips
    // Marten has to do during CQRS command handler execution
    opts.Events.UseIdentityMapForAggregates = true;
})
    
// Jettison some "Identity Map" weight by going lighter weight    
.UseLightweightSessions();

Note, this optimization can easily break code for folks who use some sort of stateful “Aggregate Root” approach where the state of the projected aggregate object might be mutated during the course of executing the command. As this has traditionally been a popular approach in Event Sourcing circles, we can’t make this a default option. If you instead either make projected aggregates like Incident immutable or treat them as dumb data inputs to your command handlers with a more Functional Programming “Decider” function approach, you can get away with the performance optimization.

Also, I strongly prefer and recommend the FP “Decider” approach to JasperFx Software clients anyway, and I think that folks using the older “Aggregate Root” approach tend to have more runtime bugs.
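To make the “Decider” idea a little more concrete, here is a minimal sketch in plain C# with no Marten or Wolverine involved. The Incident, CategoriseIncident, and IncidentCategorised shapes and the IncidentCategory enum below are assumptions made up for illustration, not the real types from the sample above. The point is only that the aggregate is an immutable input, and new state is built from events rather than mutated in place:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shapes for illustration only; the real types may differ
public enum IncidentCategory { Unknown, Software, Hardware }

// Immutable snapshot of the projected aggregate
public record Incident(Guid Id, IncidentCategory Category);

public record CategoriseIncident(IncidentCategory Category, Guid CategorisedBy);

public record IncidentCategorised(Guid IncidentId, IncidentCategory Category, Guid CategorisedBy);

public static class IncidentDecider
{
    // "Decide": a pure function from (current state, command) to new events.
    // It never mutates the incoming Incident.
    public static IReadOnlyList<object> Decide(Incident incident, CategoriseIncident command)
    {
        // Nothing to record if the category is unchanged
        if (incident.Category == command.Category) return Array.Empty<object>();

        return new object[]
        {
            new IncidentCategorised(incident.Id, command.Category, command.CategorisedBy)
        };
    }

    // "Evolve": builds a new state from an event instead of editing in place
    public static Incident Evolve(Incident incident, IncidentCategorised e)
        => incident with { Category = e.Category };
}
```

Because Decide() only reads the Incident it was handed, reusing that same object to roll the projection forward afterward is safe, which is the assumption the identity map optimization leans on.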

Moving on, let’s keep our database smaller…

Event Stream Archiving

By and large, you can improve system performance in almost any situation by trying to keep your database from growing too large by archiving or retiring obsolete information. Marten has first class support for “Archiving Event Streams” where you effectively just move event streams that only represent historical information and are not really active into an archived state.

Moreover, we can divide our underlying PostgreSQL storage for events into “hot” and “cold” storage by utilizing PostgreSQL’s table partitioning support like this:

builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));

    // Make event writing faster, like 2X faster in our testing
    opts.Events.AppendMode = EventAppendMode.Quick;

    // This can cut down on the number of database round trips
    // Marten has to do during CQRS command handler execution
    opts.Events.UseIdentityMapForAggregates = true;

    // Let's leverage PostgreSQL table partitioning
    // to our advantage
    opts.Events.UseArchivedStreamPartitioning = true;
})
    
// Jettison some "Identity Map" weight by going lighter weight    
.UseLightweightSessions();

If you’re aggressive with marking event streams as Archived, the PostgreSQL table partitioning can move archived event streams off into a different table partition than our active event data. This essentially keeps the “active” event table storage relatively stable in size, and most operations will execute against this smaller table partition while you can still access the archived data if you explicitly opt into including it.

We added this feature in a minor point 7.* release, so it had to be opt in, and I think I was too hesitant to make this a default in 8.0, so it’s still “opt in”.

Stream Compacting

Beyond archiving event streams, maybe you just want to “compact” a longer event stream so you technically retain all the existing state, but further reduce the size of your active database storage. To that end, Marten 8.0 added Stream Compacting.

Distributing Asynchronous Projections

I had been mostly talking about using projections running Inline such that the projections are updated at the same time as the events are captured. That’s sometimes applicable or desirable, but other times you’ll want to optimize the “write” operations by moving the updating of projected data to an Async projection running in the background. But now let’s say that we have quite a few asynchronous projections and several subscriptions as well. In early versions of Marten, we had to run everything in a “Hot/Cold” mode where every known projection or subscription had to run on one single “leader” node. So even if you were running your application across a dozen or more nodes, only one could be executing all of the asynchronous projections and subscriptions.

That’s obviously a potential bottleneck, so Marten 7.0 by itself introduced some ability to spread projections and subscriptions over multiple nodes. If we introduce Wolverine into the mix though, we can do quite a bit better by letting Wolverine distribute the asynchronous Marten projections and subscriptions across our entire cluster with the UseWolverineManagedEventSubscriptionDistribution option in the WolverineFx.Marten Nuget:

builder.Services.AddMarten(opts =>
{
    opts.Connection(builder.Configuration.GetConnectionString("marten"));

    // Make event writing faster, like 2X faster in our testing
    opts.Events.AppendMode = EventAppendMode.Quick;

    // This can cut down on the number of database round trips
    // Marten has to do during CQRS command handler execution
    opts.Events.UseIdentityMapForAggregates = true;

    // Let's leverage PostgreSQL table partitioning
    // to our advantage
    opts.Events.UseArchivedStreamPartitioning = true;
})
    
// Jettison some "Identity Map" weight by going lighter weight    
.UseLightweightSessions()

.IntegrateWithWolverine(opts =>
{
    opts.UseWolverineManagedEventSubscriptionDistribution = true;
});

Is there anything else for the future?

It never ends, and yes, there are still quite a few ideas in our product backlog to potentially improve performance and scalability of Marten’s Event Sourcing. Offhand, that includes looking at alternative, higher performance serializers and more options to parallelize asynchronous projections to squeeze out more throughput by sharing some data access across projections.

Summary

There are quite a few “opt in” features in Marten that will help your system perform better, but these features are “opt in” because they can be harmful if you’re not building around the assumptions these features make about how your code works. The good news though is that you’ll be able to better utilize these features if you follow the Critter Stack’s recommended practices by striving for shorter code stacks (i.e., how many jumps between methods and classes does your code make when receiving a system input like a message or HTTP request) so your code is easier to reason about anyway, and avoiding mutating projected aggregate data outside of Marten.

Marten 8.0, Wolverine 4.0, and even Lamar 15.0 are out!

It’s a pretty big “Critter Stack” community release day today, as:

  1. Marten has its 8.0 release
  2. Wolverine got a 4.0 release
  3. Lamar, the spiritual successor to StructureMap, had a corresponding 15.0 release
  4. And underneath those tools, the new JasperFx & JasperFx.Events library went 1.0 and the supporting Weasel library that provides some low level functionality went 8.0

Before getting into the highlights, let me start by thanking the Critter Stack Core team for all their support, contributions to both the code and documentation, and for being a constant sounding board for me and source of ideas and advice:

Next, I’d like to thank our Critter Stack community for all the interest and the continuous help we get with suggestions, pull requests that improve the tools, and especially for the folks who take the time to create actionable bug reports because that’s half the battle of getting problems fixed. And while there are plenty of days when I wish there wasn’t a veritable pack of raptors prowling around the projects and probing them for weaknesses, I cannot overstate how important it is for an OSS project to have user and community feedback.

Alright, on to some highlights.

The big changes are that we consolidated several smaller shared libraries into one bigger shared JasperFx library and also combined some smaller libraries like Marten.CommandLine, Weasel.CommandLine, and Lamar.Diagnostics into Marten, Weasel, and Lamar respectively. That’s hopefully going to help folks get to command line utilities quicker and easier, and the Critter Stack tools do get some value out of those command line utilities.

We’ve now got a shared model to configure behavioral differences at “Development” vs “Production” time for both Marten and Wolverine all at one time like this:

// These settings would apply to *both* Marten and Wolverine
// if you happen to be using both
builder.Services.CritterStackDefaults(x =>
{
    x.ServiceName = "MyService";
    x.TenantIdStyle = TenantIdStyle.ForceLowerCase;
    
    // You probably won't have to configure this often,
    // but if you do, this applies to both tools
    x.ApplicationAssembly = typeof(Program).Assembly;
    
    x.Production.GeneratedCodeMode = TypeLoadMode.Static;
    x.Production.ResourceAutoCreate = AutoCreate.None;

    // These are defaults, but showing for completeness
    x.Development.GeneratedCodeMode = TypeLoadMode.Dynamic;
    x.Development.ResourceAutoCreate = AutoCreate.CreateOrUpdate;
});

It might be a while before this pays off for us, but everything from the last couple paragraphs is also meant to speed up the development of additional Event Sourcing “Critter” tools to expand beyond PostgreSQL. Not that we’re even slightly backing off our investment in the do-everything PostgreSQL database!

For Marten 8.0, we’ve done a lot to make projections easier to use with explicit code, and added a new Stream Compacting feature for yet more scalability.

For Wolverine 4.0, we’ve improved Wolverine’s ability to support modular monolith architectures that might utilize multiple Marten stores or EF Core DbContext services targeting the same database or even different databases. More on this soon.

Wolverine 4.0 also gets some big improvements for EF Core users with a new Multi-Tenancy with EF Core feature.

Both Wolverine and Marten got some streamlined Open Telemetry span naming changes that were suggested by Pascal Senn of ChiliCream who collaborates with JasperFx for a mutual client.

For both Wolverine and Lamar 15, we added fuller support for the [FromKeyedServices] attribute and “keyed services” in the .NET Core DI abstractions, like this for a Wolverine handler:

    // From a test, just showing that you *can* do this
    // *Not* saying you *should* do that very often
    public static void Handle(UseMultipleThings command, 
        [FromKeyedServices("Green")] IThing green,
        [FromKeyedServices("Red")] IThing red)
    {
        green.ShouldBeOfType<GreenThing>();
        red.ShouldBeOfType<RedThing>();
    }

And inside of Lamar itself, the same attribute works on any constructor dependency, like this:

// Lamar will inject the IThing w/ the key "Red" here
public record ThingUser([FromKeyedServices("Red")] IThing Thing);

Granted, Lamar already had its own version of keyed services and even an equivalent to the [FromKeyedServices] attribute long before this was added to the .NET DI abstractions and ServiceProvider conforming container, but .NET is Microsoft’s world and lowly OSS projects pretty well have to conform to their abstractions sometimes.

Just for the record, StructureMap had an equivalent to keyed services in its first production release way back in 2004 back when David Fowler was probably in middle school making googly eyes at Rihanna.

What’s Next for the Critter Stack?

Honestly, I had to cut some corners on documentation to get the releases out for a JasperFx Software client, so I’ll be focused on that for most of this week. And of course, plenty of open issues and some outstanding pull requests didn’t make the release, so those hopefully get addressed in the next couple minor releases.

For the bigger picture, I think the rest of this year is:

  1. “CritterWatch”, our long planned, not moving fast enough for my taste, management and observability console for both Marten and Wolverine.
  2. Improvements to Marten’s performance and scalability for Event Sourcing. We did a lot in that regard last year throughout Marten 7.*, but there’s another series of ideas to increase the throughput even further.
  3. Wolverine is getting a lot of user contributions right now, and I expect that especially the asynchronous messaging support will continue to grow. I would like to see us add CosmosDb support to Wolverine by the end of the year. By and large, I would like to increase Wolverine’s community usage overall by trying to grow the tool beyond just folks already using Marten, but the Marten + Wolverine combination will hopefully continue to improve.
  4. More Critters? We’re still talking about a SQL Server backed Event Store, with CosmosDb being a later alternative

Wrapping Up

As for the wisdom of ever again making a release cycle where the entire Critter Stack has a major release at the exact same time, this:

Finally, a lot of things didn’t make the release that folks wanted, heck that I wanted, but at some point it becomes expensive for a project to have a long running branch for “vNext” and you have to make the release. I’m hopeful that even though these major releases didn’t add a ton of new functionality that they set us up with the right foundation for where the tools go next.

I also know that folks will have plenty of questions and probably even inevitably run into problems or confusion with the new releases — especially until we can catch up on documentation — but I stole time from the family to get this stuff out this weekend and I’ll probably not be able to respond to anyone but JasperFx customers on Monday. Finally, in the meantime, right after every big push, I promise to start responding to whatever problems folks will have, but:

Symbolically Important Wolverine 3.13.4 Release

We were able to publish the Wolverine 3.13.4 release this morning with a handful of important fixes for error retries in a modular monolith architecture, recovering from RabbitMQ connection interruptions, and Azure Service Bus, Kafka, and Amazon SQS fixes.

The awesome part of this release was how much of it, including a huge fix from Hamed Sabzian, came from the community (i.e. “not me”). Even one of the issues I addressed only came with some significant help from users building reproduction projects. Another issue was reported by a JasperFx Software customer who we’re working with for some new multi-tenancy functionality.

Beyond just the symbolic show of community engagement and involvement with Wolverine, this release hopefully marks the end of new development with Wolverine 3.*. There’s now a maintenance branch for 3.0, but Wolverine’s main branch is now the forthcoming 4.0 release that should hit by Monday next week.

Thank you to all the contributors to this release and recent releases, and that absolutely includes folks who took the time to open actionable issues and create reproduction steps for those issues.

Stream Compacting in Marten 8.0

One of the earliest lessons I learned designing software systems is that reining in unchecked growth of databases through judicious pruning and archiving can do wonders for system performance over time. As yet another tool in the toolbox for scaling Marten and in collaboration with a JasperFx Software customer, we’re adding an important feature in Marten 8.0 called “Stream Compacting” that can be used to judiciously shrink Marten’s event storage to keep the database a little more limber as old data is no longer relevant.

Let’s say that you failed to be omniscient in your event stream modeling and ended up with a longer stream of events than you’d ideally like and that is bloating your database size and maybe impacting performance. Maybe you’re going to be in a spot where you don’t really care about all the old events, but really just want to maintain the current projected state and more recent events. And maybe you’d like to throw the old events in some kind of “cold” storage like an S3 bucket or [something to be determined later].

Enter the new “Stream Compacting” feature that will come with Marten 8.0 next week like so:

public static async Task compact(IDocumentSession session, Guid equipmentId, IEventsArchiver archiver)
{
    // Maybe we have ceased to care about old movements of a piece of equipment
    // But we want to retain an accurate positioning over the past year
    // Yes, maybe we should have done a "closing the books" pattern, but we didn't
    // So instead, let's just "compact" the stream

    await session.Events.CompactStreamAsync<Equipment>(equipmentId, x =>
    {
        // We could say "compact" all events for this stream
        // from version 1000 and below
        x.Version = 1000;

        // Or instead say, "compact all events older than 30 days ago":
        x.Timestamp = DateTimeOffset.UtcNow.Subtract(30.Days());

        // Carry out some kind of user defined archiving process to
        // "move" the about to be archived events to something like an S3 bucket
        // or an Azure Blob or even just to another table
        x.Archiver = archiver;

        // Pass in a cancellation token because this might take a bit...
        x.CancellationToken = CancellationToken.None;
    });
}

What this “compacting” does is effectively create a snapshot of the stream state (the Equipment type in the example above) and replace the existing events that are archived in the database with a single Compacted<Equipment> event with this shape:

// Right now we're just "compacting" in place, but there's some
// thought to extending this to what one of our contributors
// calls "re-streaming" in their system where they write out an
// all new stream that just starts with a summary
public record Compacted<T>(T Snapshot, Guid PreviousStreamId, string PreviousStreamKey);

The latest, greatest Marten projection bits are always able to restart any SingleStreamProjection with the Snapshot data of a Compacted<T> event, with no additional coding on your part.
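As a mental model only, and definitely not Marten’s actual internals, restarting from a Compacted<T> event can be thought of as a fold over the stream where a compacted event simply replaces whatever state has been built so far. The EquipmentMoved event, the Equipment shape, and the EquipmentReplay class below are hypothetical names invented for this sketch:

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical event and aggregate types for illustration only
public record EquipmentMoved(string Location);
public record Equipment(string Location, int Moves);

// Same shape as the Compacted<T> record shown above
public record Compacted<T>(T Snapshot, Guid PreviousStreamId, string PreviousStreamKey);

public static class EquipmentReplay
{
    // Folds events into the current state. A Compacted<Equipment> event
    // replaces whatever state was built so far with its snapshot.
    public static Equipment? Replay(IEnumerable<object> events)
    {
        Equipment? state = null;

        foreach (var e in events)
        {
            state = e switch
            {
                // Restart from the snapshot carried by the compacted event
                Compacted<Equipment> c => c.Snapshot,

                // Ordinary events build on top of the current state
                EquipmentMoved m => new Equipment(m.Location, (state?.Moves ?? 0) + 1),

                // Ignore anything this sketch does not understand
                _ => state
            };
        }

        return state;
    }
}
```

In this sketch it does not matter whether the Compacted<Equipment> event is the first event in the stream or shows up in the middle, the replay just picks up from the snapshot and keeps applying whatever comes after it.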

And now, to answer a few questions that my client (Carsten, this one’s for you, sorry I was slow today:)) asked me about this today:

  • Is there going to be a default archiver? Not yet, but I’m all ears on what that could or should be. It’ll always be pluggable of course because I’d expect a wide range of usages
  • How about async projections? This will not impact asynchronous projections that are already in flight. The underlying mechanism is not using any persisted, projected document state but is instead fetching the raw events and effectively doing a live aggregation to come back to the compacted version of the projected document.
  • Can you compact a single stream multiple times? Yes. I’m thinking folks could use a projection “side effect” to emit a request message to compact a stream every 1,000 events or some other number.
  • What happens in case the async daemon moves beyond (i.e., new events were saved while the compacting is ongoing) – will the compacting aggregation overwrite the projection updates done by the async daemon – basically the same for inline projections? The compacting will be done underneath the async daemon, but will not impact the daemon functionality. The projections are “smart enough” to restart the snapshot state from any Compacted<T> event found in the middle of the current events anyway.
  • How does rewind and replay work if a stream is compacted? Um, you would only be able to replay at or after the point of compacting. But we can talk about making this able to recover old events from archiving in a next phase!
  • Any other limitations? Yeah, same problem we ran into with the “optimized rebuild” feature from Marten 7.0. This will not play well if there is more than one single stream projection view for the same type of stream. Not insurmountable, but definitely not convenient. I think you’d have to explicitly handle a Compacted<T1> event in the projection for T2 if both T1 and T2 are separate views of the same stream type.
  • Why do I care? You probably don’t upfront, but this might easily be a way to improve the performance and scalability of a busy system over time as the database grows.
  • Is this a replacement or alternative to the event archival partitioning from Marten 7? You know, I’m not entirely sure, and I think your usage may vary. But if your database is likely to grow massively large over time and you can benefit from shrinking the size of the “hot” part of the database of events you no longer care about, use at least one of these options, or both!
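For the “compact every 1,000 events” idea in the list above, the trigger check itself could be as small as the sketch below. CompactStreamRequested and CompactionTrigger are hypothetical names made up for illustration and are not part of Marten or Wolverine. How the resulting message gets published from a projection side effect is left out on purpose:

```csharp
#nullable enable
using System;

// Hypothetical request message; not part of Marten or Wolverine
public record CompactStreamRequested(Guid StreamId, long UpToVersion);

public static class CompactionTrigger
{
    // Returns a request when the stream version lands exactly on a
    // multiple of the interval, otherwise null
    public static CompactStreamRequested? Check(Guid streamId, long version, int interval = 1000)
    {
        if (interval <= 0) throw new ArgumentOutOfRangeException(nameof(interval));
        if (version <= 0 || version % interval != 0) return null;

        return new CompactStreamRequested(streamId, version);
    }
}
```

One caveat with this naive version: if several events are appended at once, the stream version can jump right past a multiple of the interval, so a real implementation would probably want to track the last compacted version instead of relying on an exact match.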

Summary

The widespread advice from event sourcing experts is to “keep your streams short”, but I also partially suspect this is driven by technical limitations of some of the commonly used, early commercial event store tools. I also believe that Marten is less impacted by long stream sizes than many other event store tools, but still, smaller databases will probably outperform bigger ones in most cases.

Critter Stack Release Plans

Time for an update on Critter Stack release plans, and a follow up on my previous Critter Stack Work in Progress post from March. The current plan is to release Marten 8.0, Weasel 8.0, and Wolverine 4.0 on June 1st. It’s not going to be a huge release in terms of new functionality, but there are some important structural changes that will help us build some future features, and we needed to jettison older .NET versions while getting onto the latest Npgsql. “CritterWatch” is still very much planned and a little bit in progress, but we’ve got to get these big releases out first.

The key takeaways are that I want to essentially freeze Marten 7.* for everything but bug fixes right now, and probably freeze Wolverine 3.* for new feature development after a last wave of pull requests gets pulled in over the next couple days.

I’m admittedly too drowsy and lazy to write much tonight, so here’s just a dump of what I wrote up for the rest of our core team to review. I think we’re already at the point where we’re ready to just work on documentation and a few last touches, so the vast majority of this doesn’t get done in time, but here’s the full brainstorm:

First though, what’s been done:

  • .NET 6 & 7 were dropped
  • Updated to Npgsql 9 across the board
  • Dropped all synchronous APIs in Marten
  • Deleted some [Obsolete] APIs in Marten
  • Consolidation of supporting libraries to a single JasperFx library
  • JasperFx has that new consolidated configuration option for common configuration like application assembly, code generation, and the stateful resource AutoCreate mode
  • Pulled out event projections and core event store abstractions to a new JasperFx.Events library
    • Removed code generation from all projections
    • Better explicit code options for aggregations and event projections
  • Wolverine 4 better handles envelope storage & the transactional inbox/outbox for modular monoliths
  • Improved “Descriptor” model to describe the static configuration of Wolverine and/or Marten applications that we’ll use for CritterWatch too
  • Expanded commands for dead letter queue management in Wolverine that were meant for CritterWatch
  • Multi-tenancy options in Wolverine for SQL Server or PostgreSQL w/o Marten, multi-tenancy usage with EF Core

Punchlist?

  1. Marten 7.40.4 release w/ a pair of outstanding PRs
    1. Cherry pick commits to Marten “master”
  2. JasperFx & JasperFx.Events 1.0
    1. Documentation website?
  3. Weasel “master” branch
    1. All tests should be passing
  4. Marten “master” branch
    1. All tests should be passing
    2. Documentation website should be building – that’s going to take some effort because of code samples
    3. Get Anne’s PR for tutorials in (cool new guided tour of building a system using Event Sourcing and Event Driven Architecture with first Marten, then Wolverine)
    4. Stream Compacting feature – for a JasperFx customer (this is definitely in for Marten 8, this is a big improvement for keeping a larger system running fast over time by compacting the database)
    5. Fix the optimized projection rebuild options? Or rip it out and leave it for CritterWatch?
    6. Ability to overwrite the event timestamp (relatively easy)
    7. Migration guide 
    8. Figure out the proper behavior of “Live” aggregations when there’s some ShouldDelete() action going on
  5. Wolverine
    1. One last 3.14 release with easy to grab pull requests and bug fixes
    2. Rebase on 3.14
    3. Fork off the 3.0 branch
    4. 4.0 becomes main branch
    5. All tests should be passing
    6. Documentation website should build
    7. Migration guide
  6. Critter Watch preparation
    1. When integrated w/ CritterWatch, Wolverine can build the descriptor model for the entire application, including EventStoreUsage. No idea where this work stands right now. Did quite a bit earlier this year, then went off in a different direction
    2. Review all Open Telemetry usage and activity naming across Marten and especially Wolverine. Add Open Telemetry & Metrics metadata to the descriptor model sent to CritterWatch. I think this is somewhat likely to get done before Wolverine 4.0.
    3. Ability to send messages from CritterWatch to Wolverine. Might push through some kind of message routing and/or message handler extensibility

Nice to do?

  1. Consider consolidating the stateful resource / AutoCreate configuration so there are fewer things to configure. See Managing Auto Creation of Database or Message Broker Resources in the Critter Stack vNext
  2. Programmatic message routing in Wolverine that varies based on the message contents? This is around options to route a message to one of a set of destinations based on the message core. Thinking about concurrency here. Could be done later.
  3. More open issues in the Marten 8 milestone, but it’s about time to drop any issue that isn’t a breaking change
  4. More open issues in the Wolverine 4 milestone or Wolverine in general
  5. Ermine/Polecat readiness? (Marten ported to SQL Server)
    1. Spike it out? 
    2. Look for opportunities to pull shared items into Weasel?