Wolverine 1.11.0 was released this week (here’s the release notes) with a small improvement to its ability to subscribe to Marten events captured within Wolverine message handlers or HTTP endpoints. Since Wolverine 1.0, users have been able to opt into having Marten forward events captured within Wolverine handlers to any known Wolverine subscribers for that event with the EventForwardingToWolverine() option.
The latest Wolverine release adds the ability to automatically publish an event as a different message using the event data and its metadata as shown in the sample code below:
builder.Services.AddMarten(opts =>
{
var connectionString = builder.Configuration.GetConnectionString("marten");
opts.Connection(connectionString);
})
// Adds Wolverine transactional middleware for Marten
// and the Wolverine transactional outbox support as well
.IntegrateWithWolverine()
.EventForwardingToWolverine(opts =>
{
// Setting up a little transformation of an event with its event metadata to an internal command message
opts.SubscribeToEvent<IncidentCategorised>().TransformedTo(e => new TryAssignPriority
{
IncidentId = e.StreamId,
UserId = e.Data.UserId
});
});
This isn’t a general purpose outbox; rather, Wolverine immediately publishes the captured events according to its normal publishing rules at the time the Marten transaction is committed.
So in this sample handler:
public static class CategoriseIncidentHandler
{
public static readonly Guid SystemId = Guid.NewGuid();
// This Wolverine handler appends an IncidentCategorised event to an event stream
// for the related IncidentDetails aggregate referred to by the CategoriseIncident.IncidentId
// value from the command
[AggregateHandler]
public static IEnumerable<object> Handle(CategoriseIncident command, IncidentDetails existing)
{
if (existing.Category != command.Category)
{
// Wolverine will transform this event to a TryAssignPriority message
// on the successful commit of the transaction wrapping this handler call
yield return new IncidentCategorised
{
Category = command.Category,
UserId = SystemId
};
}
}
}
To try to close the loop, when Wolverine handles the CategoriseIncident message, it will:
Potentially append an IncidentCategorised event to the referenced event stream
Try to transform that event to a new TryAssignPriority message
Commit the changes queued up to the underlying Marten IDocumentSession unit of work
If the transaction is successful, publish the TryAssignPriority message — which in this sample case would be routed to a local queue within the Wolverine application and handled in a different thread later
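The downstream handler for the TryAssignPriority message isn’t shown in this post, but a minimal sketch might look like the following. To be clear, the Incident document type and the priority-assignment rule here are assumptions purely for illustration; only the TryAssignPriority message itself comes from the sample above.

```csharp
using Marten;

// Hypothetical downstream handler for the transformed message.
// The Incident document and the "Medium" priority rule are
// illustrative assumptions, not part of the original sample.
public static class TryAssignPriorityHandler
{
    public static async Task Handle(
        TryAssignPriority command,
        IDocumentSession session,
        CancellationToken cancellation)
    {
        var incident = await session.LoadAsync<Incident>(command.IncidentId, cancellation);

        // Only assign a priority if the incident exists and
        // doesn't already have one
        if (incident is null || incident.Priority is not null) return;

        incident.Priority = "Medium";
        session.Update(incident);

        // Wolverine's transactional middleware commits the session
        // after this handler succeeds
    }
}
```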
That’s a lot of text and gibberish, but all I’m trying to say is that you can make Wolverine reliably react to events captured in the Marten event store.
JasperFx Software is officially ready to provide formal paid support for Marten, Wolverine, or any other JasperFx dependency of the previous two tools. Starting now, potential users can feel safe in adopting Marten or Wolverine for mission critical applications knowing that there’s a backing company ready to support them in their usage of these tools.
This is an important step for us to make the Critter Stack a viable platform over the long term for our users. Our long term goal is to make the “Critter Stack” the single best set of tooling for creating server side applications in .NET in general and more specifically the best tooling for building solutions with Event Sourcing and CQRS on any platform.
In the future, we’ll also be announcing add on products for the Critter Stack tools that will provide advanced usages and deployment strategies including:
Zero downtime deployments
A blue/green deployment capability
Enhanced scalability for the Marten event store functionality
Management consoles for Wolverine messaging, dead letter queue management, and Marten event storage and projection oversight
We would like to thank our vibrant community for all of your support, encouragement, and enthusiasm over these past years since Marten arrived on the scene in 2016. We look forward to engaging with you all as we have embarked on building a sustainable business model around these tools.
JasperFx Software will be shortly announcing the availability of official support plans for Marten, Wolverine, and other JasperFx open source tools. We’re working hard to build a sustainable ecosystem around these tools so that companies can feel confident in making a technical bet on these high productivity tools for .NET server side development.
I’ll be presenting a short talk at .NET Conf 2023 entitled “CQRS with Event Sourcing using the Critter Stack.” It’s going to be a quick dive into how to use Marten and Wolverine to build a very small system utilizing a CQRS Architecture with Event Sourcing as the persistence strategy.
Hopefully, I’ll be showing off:
How Wolverine’s runtime architecture is significantly different than other .NET tools and why its approach leads to much lower code ceremony and potentially higher performance
Marten and PostgreSQL providing a great local developer story both in development and in integration testing
How the Wolverine + Marten integration makes your domain logic easily unit testable without resorting to complicated Clean/Onion/Hexagonal Architectures
Wolverine’s built in integration testing support that you’ll wish you had today in other .NET messaging tools
The built in tooling for unraveling Wolverine or Marten’s “conventional magic”
Here’s the talk abstract:
CQRS with Event Sourcing using the “Critter Stack”
Do you have a system where you think would be a good fit for a CQRS architecture that also uses Event Sourcing for at least part of its persistence strategy? Are you intimidated by the potential complexity of that kind of approach? Fear not, using a combination of the PostgreSQL-backed Marten library for event sourcing and its newer friend Wolverine for command handling and asynchronous messaging, I’ll show you how you can quickly get started with both CQRS and Event Sourcing. Once we get past the quick start, I’ll show you how the Critter Stack’s unique approach to the “Decider” pattern will help you create robust command handlers with very little code ceremony while still enjoying easy testability. Moving beyond basic command handling, I’ll show you how to reliably subscribe to and publish the events or other messages created by your command handlers through Wolverine’s durable outbox and direct subscriptions to Marten’s event storage.
You can’t really get Midjourney to create an image of a wolverine without veering into trademark violations, so look at the weasel and marten up there working on a website application together!
Before I show the new functionality, let’s imagine that you have a simple web service for invoicing where you’re using Marten as a document database for persistence. You might have a very simplistic web service for exposing a single Invoice like this (and yes, I know you’d probably want to do some kind of transformation to a view model but put that aside for a moment):
[WolverineGet("/invoices/longhand/id")]
[ProducesResponseType(404)]
[ProducesResponseType(200, Type = typeof(Invoice))]
public static async Task<IResult> GetInvoice(
Guid id,
IQuerySession session,
CancellationToken cancellationToken)
{
var invoice = await session.LoadAsync<Invoice>(id, cancellationToken);
if (invoice == null) return Results.NotFound();
return Results.Ok(invoice);
}
It’s not that much code, but there’s still some repetitive boilerplate, especially if you’re going to be a completist about your OpenAPI metadata. The design and usability aesthetic of Wolverine is to reduce code ceremony as much as possible without sacrificing performance or observability, so let’s look at a newer alternative.
Next, I’m going to install the new WolverineFx.Http.Marten Nuget to our web service project, and write this new endpoint using the [Document] attribute:
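The code sample for that endpoint didn’t survive into this text. Reconstructed from the description that follows and the test below (so treat the exact route and method name as assumptions), it would look something like this:

```csharp
// Wolverine resolves the "id" route argument, loads the Invoice
// document through Marten, and returns a 404 automatically if
// the document does not exist
[WolverineGet("/invoices/{id}")]
public static Invoice Get([Document] Invoice invoice) => invoice;
```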
The code up above is an exact functional equivalent to the first code sample, and even produces the exact same OpenAPI metadata (or at least tries to; OpenAPI has been a huge bugaboo for Wolverine because so much of the support inside of AspNetCore is hard wired for MVC Core). Notice, though, how much less you have to do. The method is synchronous, so that’s a little less ceremony. It’s a pure function, so even if there were code to transform the invoice data to an API specific shape, you could unit test this method without any infrastructure involved or using something like Alba. Better yet, Wolverine itself is handling the “return 404 if the Invoice is not found” behavior, as shown in this unit test from Wolverine itself (using Alba):
[Fact]
public async Task returns_404_on_id_miss()
{
// Using Alba to run a request for a non-existent
// Invoice document
await Scenario(x =>
{
x.Get.Url("/invoices/" + Guid.NewGuid());
x.StatusCodeShouldBe(404);
});
}
Simple enough, but now let’s look at a new HTTP-centric mechanism for the Wolverine + Marten “Aggregate Handler” workflow for writing CQRS “Write” handlers using Marten’s event sourcing. You might want to glance at the previous link for more context before proceeding, or refer back to it later at least.
The main change here is that folks asked for the ability to provide the aggregate identity through a route parameter, and to enforce a 404 response code if the aggregate does not exist.
Using an “Order Management” problem domain, here’s what an endpoint method to ship an existing order could look like:
[WolverinePost("/orders/{orderId}/ship2"), EmptyResponse]
// The OrderShipped return value is treated as an event being posted
// to a Marten event stream
// instead of as the HTTP response body because of the presence of
// the [EmptyResponse] attribute
public static OrderShipped Ship(ShipOrder2 command, [Aggregate] Order order)
{
if (order.HasShipped)
throw new InvalidOperationException("This has already shipped!");
return new OrderShipped();
}
Notice the new [Aggregate] attribute on the Order argument. At runtime, this code is going to:
Take the “orderId” route argument, parse that to a Guid (because that’s the identity type for an Order)
Use that identity — and any version information on the request body or a “version” route argument — to use Marten’s FetchForWriting() mechanism to both load the latest version of the Order aggregate and to opt into optimistic concurrency protections against that event stream.
Return a 404 response if the aggregate does not already exist
Pass the Order aggregate into the actual endpoint method
Take the OrderShipped event returned from the method, and apply that to the Marten event stream for the order
Commit the Marten unit of work
As always, the goal of this workflow is to turn Wolverine endpoint methods into low ceremony, synchronous pure functions that are easily testable with unit tests.
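Since Ship() is a pure function, unit tests need nothing but the aggregate itself. A sketch of such tests is below; the endpoint class name OrderEndpoint and the construction details of Order and ShipOrder2 are assumptions for illustration:

```csharp
// Hypothetical unit tests for the Ship() endpoint method above,
// using xUnit and Shouldly. No database, no HTTP, no IoC container.
public class ShipOrderTests
{
    [Fact]
    public void shipping_an_already_shipped_order_is_rejected()
    {
        var order = new Order { HasShipped = true };

        Should.Throw<InvalidOperationException>(
            () => OrderEndpoint.Ship(new ShipOrder2(), order));
    }

    [Fact]
    public void shipping_a_live_order_emits_the_event()
    {
        var order = new Order { HasShipped = false };

        // The returned event is what Wolverine would append
        // to the order's event stream
        OrderEndpoint.Ship(new ShipOrder2(), order)
            .ShouldBeOfType<OrderShipped>();
    }
}
```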
I’ve recently fielded some user problems with Wolverine’s transactional inbox/outbox subsystem going absolutely haywire. After asking a plethora of questions, I finally realized that the underlying issue was using Wolverine within AWS Lambda or Azure Functions, where the process is short lived.
Wolverine has heretofore been optimized for running in multiple, long lived process nodes because that’s typical for asynchronous messaging architectures. By not getting a chance to cleanly shut down its background processing, users were getting a ton of junk data in Wolverine’s durable message tables that was causing all kinds of aggravation.
To nip that problem in the bud, Wolverine 1.10 introduced a new concept of durability modes to allow you to optimize Wolverine for different types of basic usage:
public enum DurabilityMode
{
/// <summary>
/// The durability agent will be optimized to run in a single node. This is very useful
/// for local development where you may be frequently stopping and restarting the service
///
/// All known agents will automatically start on the local node. The recovered inbox/outbox
/// messages will start functioning immediately
/// </summary>
Solo,
/// <summary>
/// Normal mode that assumes that Wolverine is running on multiple load balanced nodes
/// with messaging active
/// </summary>
Balanced,
/// <summary>
/// Disables all message persistence to optimize Wolverine for usage within serverless functions
/// like AWS Lambda or Azure Functions. Requires that all endpoints be inline
/// </summary>
Serverless,
/// <summary>
/// Optimizes Wolverine for usage as strictly a mediator tool. This completely disables all node
/// persistence including the inbox and outbox
/// </summary>
MediatorOnly
}
Focusing on just the serverless scenario, you want to turn off all of Wolverine’s durable node tracking, leader election, agent assignment, and long running background processes of all types — and now you can do that just fine like so:
using var host = await Host.CreateDefaultBuilder()
.UseWolverine(opts =>
{
opts.Services.AddMarten("some connection string")
// This adds quite a bit of middleware for
// Marten
.IntegrateWithWolverine();
// You want this maybe!
opts.Policies.AutoApplyTransactions();
// But wait! Optimize Wolverine for usage within Serverless
// and turn off the heavy duty, background processes
// for the transactional inbox/outbox
opts.Durability.Mode = DurabilityMode.Serverless;
}).StartAsync();
I’m helping a JasperFx Software client get a new system off the ground that’s using both Hot Chocolate for GraphQL and Marten for event sourcing and general persistence. That’s led to a couple blog posts so far:
Today though, I want to talk about some early ideas for automating integration testing of GraphQL endpoints. Before I show my intended approach, here’s a video from ChiliCream (the company behind Hot Chocolate) showing their recommendations for testing:
Now, to be honest, I don’t agree with their recommended approach. I played a lot of sports growing up in a small town, and one of my coaches’ favorite sayings actually applies here:
If you want to be good, practice like you play
every basketball coach I ever played for
That saying really just meant to try to do things well in practice so that it would carry right through into the real games. In the case of integration testing, I want to be testing against the “real” application configuration including the full ASP.NET Core middleware stack and the exact Marten and Hot Chocolate configuration for the application instead of against a separately constructed IoC and Hot Chocolate configuration. In this particular case, the application is using multi-tenancy through a separate database per tenant strategy with the tenant selection at runtime being ultimately dependent upon expected claims on the ClaimsPrincipal for the request.
All that being said, I’m unsurprisingly opting to use the Alba library within xUnit specifications to test through the entire application stack with just a few overrides of the application. My usual approach with xUnit.Net and Alba is to create a shared context that manages the lifecycle of the bootstrapped application in memory like so:
public class AppFixture : IAsyncLifetime
{
    public IAlbaHost Host { get; private set; }

    public async Task InitializeAsync()
    {
        // This is bootstrapping the actual application using
        // its implied Program.Main() set up
        Host = await AlbaHost.For<Program>(x => { });
    }

    // Clean up the in memory host when the test collection is finished
    public async Task DisposeAsync()
    {
        await Host.DisposeAsync();
    }
}
Moving on to the GraphQL mechanics, what I’ve come up with so far is to put a GraphQL query and/or mutation in a flat file within the test project. I hate not having the test inputs in the same code file as the test, but I’m trying to offset that by spitting out the GraphQL query text into the test output to make it a little easier to troubleshoot failing tests. The Alba mechanics — so far — look like this (simplified a bit from the real code):
public Task<IScenarioResult> PostGraphqlQueryFile(string filename)
{
// This ugly code is just loading up the GraphQL query from
// a named file
var path = AppContext
.BaseDirectory
.ParentDirectory()
.ParentDirectory()
.ParentDirectory()
.AppendPath("GraphQL")
.AppendPath(filename);
var queryText = File.ReadAllText(path);
// Building up the right JSON to POST to the /graphql
// endpoint
var dictionary = new Dictionary<string, string>();
dictionary["query"] = queryText;
var json = JsonConvert.SerializeObject(dictionary);
// Write the GraphQL query being used to the test output
// just as information for troubleshooting
this.output.WriteLine(queryText);
// Using Alba to run a GraphQL request end to end
// in memory. This would throw an exception if the
// HTTP status code is not 200
return Host.Scenario(x =>
{
// I'm omitting some code here that we're using to mimic
// the tenant detection in the real code
x.Post.Url("/graphql").ContentType("application/json");
// Dirty hackery.
x.ConfigureHttpContext(c =>
{
var stream = c.Request.Body;
// This encoding turned out to be necessary
// Thank you Stackoverflow!
stream.WriteAsync(Encoding.UTF8.GetBytes(json));
stream.Position = 0;
});
});
}
That’s the basics of running the GraphQL request through, but part of the value of Alba in testing more traditional “JSON over HTTP” endpoints is being able to easily read the HTTP outputs with Alba’s built in helpers that use the application’s JSON serialization setup. I was missing that initially with the GraphQL usage, so I added this extra helper for testing a single GraphQL query or mutation at a time where there is a return body from the mutation:
public async Task<T> PostGraphqlQueryFile<T>(string filename)
{
// Delegating to the previous method
var result = await PostGraphqlQueryFile(filename);
// Get the raw HTTP response
var text = await result.ReadAsTextAsync();
// I'm using Newtonsoft.Json to get into the raw JSON
// a little bit
var json = (JObject)JsonConvert.DeserializeObject(text);
// Make the test fail if the GraphQL response had any errors
json.ContainsKey("errors").ShouldBeFalse($"GraphQL response had errors:\n{text}");
// Find the *actual* response within the larger GraphQL response
// wrapper structure
var data = json["data"].First().First().First().First();
// This would vary a bit in your application
var serializer = JsonSerializer.Create(new JsonSerializerSettings
{
ContractResolver = new CamelCasePropertyNamesContractResolver()
});
// Deserialize the raw JSON into the response type for
// easier access in tests because "strong typing for the win!"
return serializer.Deserialize<T>(new JTokenReader(data));
}
And after all that, that leads to integration tests in test fixture classes subclassing our IntegrationContext base type like this:
public class SomeTestFixture : IntegrationContext
{
public SomeTestFixture(ITestOutputHelper output, AppFixture fixture) : base(output, fixture)
{
}
[Fact]
public async Task perform_mutation()
{
var response = await this.PostGraphqlQueryFile<SomeResponseType>("someGraphQLMutation.txt");
// Use the strong typed response object in the
// "assert" part of your test
}
}
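The IntegrationContext base type referenced above isn’t shown in this post. A minimal sketch of what it might look like is below; the collection name and exact member layout are assumptions, inferred from the constructor signature and the helper methods shown earlier:

```csharp
// Share one bootstrapped AppFixture across the whole test collection
[CollectionDefinition("integration")]
public class IntegrationCollection : ICollectionFixture<AppFixture>
{
}

[Collection("integration")]
public abstract class IntegrationContext
{
    protected readonly ITestOutputHelper output;

    protected IntegrationContext(ITestOutputHelper output, AppFixture fixture)
    {
        this.output = output;
        Host = fixture.Host;
    }

    public IAlbaHost Host { get; }

    // The PostGraphqlQueryFile() helpers shown earlier
    // would live on this class
}
```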
Summary
We’ll see how it goes, but this harness has already given me some repeatable steps for tweaking transaction management and multi-tenancy without breaking the actual code. With the custom harness around it, I think we’ve made the GraphQL endpoint testing somewhat declarative.
I’m seeing an increasing amount of interest in exposing Marten data behind GraphQL endpoints from folks in the Critter Stack Discord channels and from current JasperFx Software clients. After having mostly let other folks handle the Marten + Hot Chocolate combination, I finally spent some significant time looking into what it takes to put Marten behind Hot Chocolate’s GraphQL handling — and I unfortunately saw some real issues for unwary users that I wrote about last week in Hot Chocolate, GraphQL, and the Critter Stack.
Today, I want to jot down my thoughts about how a good GraphQL layer for Marten could be constructed, with a couple caveats that hopefully much of this will be possible once I know much more about Hot Chocolate internals and that the rest of the Marten core team and I have zero interest in building a GraphQL layer from scratch.
Command Batching
A big part of GraphQL usage is wanting a way to aggregate queries from your client to the backend without making the many network round trips that pretty well destine you for poor performance. Great, awesome, but on the server side, Hot Chocolate runs every query in parallel, which for Marten means either opening a session for each query or serializing the usage of Marten’s sessions and therefore losing the parallelization.
Instead of that parallelization, what I’d love to do is cut in higher up in the GraphQL execution pipeline and instead, batch up the queries into a single database command. What we’ve repeatedly found over 8 years of Marten development (where did the time go?) is that batching database queries into a single network round trip to a PostgreSQL database consistently leads to better performance than making serialized requests. And that’s even with the more complex query building we do within Marten, Weasel, and Wolverine.
Streaming Marten JSON
In the cases where you don’t need to do any transformation of the JSON data being fetched by Marten into the GraphQL results (and remember, it is legal to return more fields than the client actually requested), Marten has an option for very fast HTTP services where it can just happily stream the server stored JSON data right to the HTTP response byte by byte. That’s vastly more efficient than the normal “query data, transform that to objects, then use a JSON serializer to write those objects to HTTP” mechanics.
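For reference, Marten exposes that streaming capability through its Marten.AspNetCore helpers, roughly like so. The Invoice endpoint here is an illustrative assumption, and you should verify the exact extension method against the current Marten documentation:

```csharp
// Sketch: stream the server-stored JSON for a single document
// straight to the HTTP response, with no deserialize/serialize
// round trip in between
[HttpGet("/invoices/{id}")]
public Task Get(Guid id, [FromServices] IQuerySession session)
{
    return session.Json.WriteById<Invoice>(id, HttpContext);
}
```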
More Efficient Parsing
Go easy commenting on this one, because this is all conjecture on my part here.
The process of going from a GraphQL query to actual results (which then have to be serialized down to the HTTP response) in Hot Chocolate + Marten is what a former colleague of mine would refer to as a “crime against computer science”:
1. Hot Chocolate gets the raw string for the GraphQL request that’s sorta like JSON, but definitely not compliant JSON
2. The GraphQL is (I’m guessing) translated to some kind of intermediate model
3. When using a Hot Chocolate query based on returning a LINQ IQueryable — and most Hot Chocolate samples do this — Hot Chocolate is building up a LINQ Expression on the fly
4. Marten’s LINQ provider is then taking that newly constructed LINQ Expression, and parsing that to create first an intermediate model representing the basics of the operation (are we fetching a list? limiting or skipping results? transforming the raw document data? where/order clauses?)
5. Marten’s LINQ provider takes Marten’s intermediate model and creates a model that represents fragments of SQL and also determines a query handler strategy for the actual results (list results? FirstOrDefault()? Single()? Count()?)
6. Marten evaluates all these SQL fragments to build up a PostgreSQL SQL statement, executes that, and uses its query handler to resolve the actual resulting documents
If you read that list above and thought to yourself, that sounds like a ton of object allocations and overhead and I wonder if that could end up being slow, yeah, me, too.
What I’d ideally like to see is a model where Marten can take whatever GraphQL’s intermediate model is and effectively skip from #2 straight down to #5/6. I’d also love to see some kind of way to cache “query plans” in a similar way to Marten’s compiled query mechanism, where repetitive patterns of GraphQL queries can be cached to skip even more of the parsing and LINQ query/SQL generation/handler strategy selection overhead.
Batching Mutations to Marten
Betting this would be the easiest thing to pull off. Instead of depending on ambient transactions in .NET (ick), I’d like to be able to look ahead at all the incoming mutations, and if they are all Marten related, use Marten’s own unit of work mechanics and native database transactions.
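As a rough illustration of what that could mean, the work from several incoming mutations could be queued into one Marten session and committed in a single native PostgreSQL transaction. The document types and variable names below are placeholders:

```csharp
// One session acts as the unit of work for the whole batch
await using var session = store.LightweightSession();

// Queue up the work from each incoming mutation...
session.Store(updatedInvoice);
session.Store(newCustomer);
session.Delete<Invoice>(cancelledInvoiceId);

// ...then commit everything in one native database transaction
await session.SaveChangesAsync();
```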
Wrapping Up
That’s it for now. Not every blog post has to be War and Peace :-)
I might be back next week with an example of how to do integration testing of GraphQL endpoints with Alba — right after I learn how to do that so I can show a JasperFx client.
As part of an ongoing JasperFx client engagement, Wolverine (1.9.0) just added some new options for event streaming from Wolverine applications. The immediate need was to support messaging with the MQTT protocol for usage inside of a new system in the “Internet of Things” problem space. Knowing that a different JasperFx client is going to need to support event subscriptions with Apache Kafka, it was also convenient to finally add the much requested option for Kafka support within Wolverine while the similar MQTT work was still fresh in my mind.
While the new MQTT transport option is documented, the Kafka transport documentation is still on the way, so I’m going to focus on that first.
To get started with Kafka within a Wolverine application, add the WolverineFx.Kafka Nuget to your project. Next, add the Kafka transport option, any messaging subscription rules, and the topics you want your application to listen to with code like this:
using var host = await Host.CreateDefaultBuilder()
.UseWolverine(opts =>
{
opts.UseKafka("localhost:29092");
// Just publish all messages to Kafka topics
// based on the message type (or message attributes)
// This will get fancier in the near future
opts.PublishAllMessages().ToKafkaTopics();
// Or explicitly make subscription rules
opts.PublishMessage<ColorMessage>()
.ToKafkaTopic("colors");
// Listen to topics
opts.ListenToKafkaTopic("red")
.ProcessInline();
opts.ListenToKafkaTopic("green")
.BufferedInMemory();
// This will direct Wolverine to try to ensure that all
// referenced Kafka topics exist at application start up
// time
opts.Services.AddResourceSetupOnStartup();
}).StartAsync();
I’m very sure that these two transports (and shortly a third option for Apache Pulsar) will need to be enhanced when they meet real users and unexpected use cases, but I think there’s a solid foundation ready to go.
In the near future, JasperFx Software will be ready to start offering official support contracts and relationships for both Marten and Wolverine. In the slightly longer term, we’re hoping to create some paid add on products (with support!) for Wolverine for “big, serious enterprise usage.” One of the first use cases I’d like us to tackle with that initiative will be a more robust event subscription capability from Marten’s event sourcing through Wolverine’s messaging capabilities. Adding options especially for Kafka messaging and also for MQTT, Pulsar, and maybe SignalR is an obvious foundational piece to make that a reality.
I was foolish enough to glance at my speaker feedback from my talk at KCDC this summer where I gave an updated version of my Concurrency and Parallelism talk. Out of a combination of time constraints and the desire to shamelessly promote the Critter Stack tools, I mostly used sample problems and solutions from my own work with Marten and Wolverine. One red card evaluation complained that the talk was useless to him (and I have no doubt it was a “him”) because it didn’t focus on mainstream .NET tools. I’m going to do the same thing here, mostly out of time constraints, but I would hope you would take away some understanding of the conceptual patterns I’m discussing here rather than being hung up on the exact choice of tooling.
This post continues my new series about the usefulness of old design patterns that I started with The Lowly Strategy Pattern is Still Useful. Today I want to talk about a handful of design patterns commonly implemented inside of mainstream persistence tooling that you probably already use today. In all cases, the original terminology I’m using here comes from Martin Fowler’s seminal Patterns of Enterprise Application Architecture book from the early 00’s. I’ll be using Marten (of course) for all the examples, but these patterns all exist within Entity Framework Core as well. Again, anytime I write about design pattern usage, I urge you to pay more attention to the concepts, roles, and responsibilities within code without getting too hung up on implementation details.
I seriously doubt that most of you will ever purposely sit down and write your own implementations of these patterns, but it’s always helpful to understand how the tools and technical layers directly underneath your code actually work.
The first two patterns are important for performance and sometimes even just scoping within complex system operations. In a subsequent post I think I’d like to tackle patterns for data consistency and managing concurrency.
Quick Note on Marten Mechanics
If you’re not already familiar with it, Marten is a library in .NET that turns PostgreSQL into a full featured document database and event store. When you integrate Marten into a typical .NET system, you will probably use this idiomatic approach to add Marten:
// This is the absolute, simplest way to integrate Marten into your
// .NET application with Marten's default configuration
builder.Services.AddMarten(options =>
{
// Establish the connection string to your Marten database
options.Connection(builder.Configuration.GetConnectionString("Marten")!);
});
Using the AddMarten() method adds a service called IDocumentSession from Marten into your application’s IoC container with a Scoped lifetime, meaning that a single IDocumentSession should be created, shared, and used within a single HTTP request or within the processing of a single message within a messaging framework. That lifetime is important for understanding how Marten’s IDocumentSession (and similarly EF Core’s DbContext) uses the Identity Map and Unit of Work patterns explained below.
If you are more familiar with EF Core, just translate Marten’s IDocumentSession to EF Core’s DbContext in this post if that helps.
Identity Map
Ensures that each object gets loaded only once by keeping every loaded object in a map. Looks up objects using the map when referring to them.
In enterprise software systems I’ve frequently hit code that tries to implement a single, logical transaction across multiple internal functions or services in a large call stack. This situation usually arises out of the sheer complexity of the business rules and the build up over time of “special” condition handling for whatever cockamamie logic that additional customers require. Unfortunately, this arrangement can frequently lead to duplicated database queries for the same reference data that is needed by completely different areas of code within the single, logical transaction — which can easily lead to very poor performance in your system.
In my experience, chattiness (making many network round trips) to the database has been maybe the single most common source of poor system performance. Followed closely by chattiness between user interface clients and the backend services. The identity map mechanics shown here can be an easy way to mitigate at least the first problem.
This is where the “Identity Map” pattern can help. Let’s say that in your code you frequently need to load information about the User entities within your system. Here’s a little demonstration of what the identity map actually does for you in terms of scoped caching:
using var store = DocumentStore.For("some connection string");
// Chiefs great Tamba Hali!
var user = new User { FirstName = "Tamba", LastName = "Hali" };
// Marten assigns the identity for the User as it
// persists the new document
await store.BulkInsertAsync(new[] { user });
// Open a Marten session with the identity map
// functionality
await using var session = store.IdentitySession();
// First request for this document, so this call would
// hit the database
var user2 = await session.LoadAsync<User>(user.Id);
// This time it would be loaded from the identity map
// in memory
var user3 = await session.LoadAsync<User>(user.Id);
// Just to prove that
user2.ShouldBeSameAs(user3);
// And also...
var user4 = await session.Query<User>()
.FirstAsync(x => x.FirstName == "Tamba");
// Yup, Marten has to actually query the database, but still
// finds the referenced document from the identity map when
// it resolves the results from the raw PostgreSQL data
user4.ShouldBeSameAs(user2);
With the “Identity Map” functionality, the Marten session is happily able to avoid making repeated requests to the database for the same information across multiple attempts to access that same data.
In bigger call stacks where there’s a real need to potentially access the same data at different times, the Identity Map is a great advantage. However, in smaller usages the Identity Map is nothing but extra overhead, as your persistence tooling has to track the data it loads in some kind of in-memory key/value storage. Especially in cases where you need to load quite a bit of data at one time, the identity map can be a significant drag both in terms of memory usage and performance.
Not to worry though, at least for Marten we can purposely create “lightweight” sessions that leave out the identity map tracking altogether like this:
using var store = DocumentStore.For("some connection string");
// Create a lightweight session without any
// identity map overhead
using var session = store.LightweightSession();
or globally within our application like so:
builder.Services.AddMarten(options =>
{
// Establish the connection string to your Marten database
options.Connection(builder.Configuration.GetConnectionString("Marten")!);
})
// Tell Marten to use lightweight sessions
// for the default IoC registration of
// IDocumentSession
.UseLightweightSessions();
You’re unlikely to ever purposely build your own implementation of the Identity Map pattern, but it’s in many common persistence tools and it’s still quite valuable to understand that behavior and also when you would rather bypass that usage to be more efficient.
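To make those mechanics concrete, here’s a hypothetical, greatly simplified sketch of the Identity Map pattern itself. The type and member names here are mine for illustration, not Marten’s, and the “database” is faked with a delegate:

```csharp
using System;
using System.Collections.Generic;

public record User(Guid Id, string FirstName, string LastName);

// A hypothetical, bare-bones identity map: the first Load() for a
// given id hits the "database", and every subsequent Load() for the
// same id within this session returns the exact same object from
// the in-memory cache
public class IdentityMapSession
{
    private readonly Dictionary<Guid, User> _identityMap = new();
    private readonly Func<Guid, User> _loadFromDatabase;

    public int DatabaseHits { get; private set; }

    public IdentityMapSession(Func<Guid, User> loadFromDatabase)
        => _loadFromDatabase = loadFromDatabase;

    public User Load(Guid id)
    {
        // Already seen this id? Return the cached instance
        if (_identityMap.TryGetValue(id, out var cached)) return cached;

        // First request, so go to the "database" and remember the result
        DatabaseHits++;
        var user = _loadFromDatabase(id);
        _identityMap[id] = user;
        return user;
    }
}
```

Loading the same id twice only costs one round trip, and both loads return the same object reference, which is exactly the behavior the Marten sample above demonstrates.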
Unit of Work
Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems.
Data consistency is a pretty big deal in most enterprise systems, and that makes us developers have to care about our transactional boundaries to ensure that related changes succeed or fail in one operation. This is where the Unit of Work pattern implemented by tools like Marten comes into play.
For Marten, the IDocumentSession is the unit of work as shown below:
public static async Task DoWork(IDocumentSession session)
{
DoWorkPart1(session);
DoWorkPart2(session);
DoWorkPart3(session);
// Make all the queued up persistence
// operations in one database transaction
await session.SaveChangesAsync();
}
public static void DoWorkPart1(IDocumentSession session)
{
session.Store(new User{FirstName = "Travis", LastName = "Kelce"});
session.DeleteWhere<User>(x => x.Department == "Wide Receiver");
}
public static void DoWorkPart2(IDocumentSession session)
{
session.Store(new User{FirstName = "Patrick", LastName = "Mahomes"});
session.Events.StartStream<Game>(new KickedOff());
}
public static void DoWorkPart3(IDocumentSession session)
{
session.Store(new User{FirstName = "Chris", LastName = "Jones"});
}
When IDocumentSession.SaveChangesAsync() is called, it executes database commands for all the new documents stored and the deletion operation queued up across the different helper methods, all at one time. Marten lets us focus on business logic and on expressing what database changes should be made, while it handles the actual transaction boundaries for us when it writes to the database.
A couple more things to note about the code above:
If you don’t need to read any data first, Marten doesn’t even open a database connection until you call the SaveChangesAsync() method. That’s important to know because database connections are expensive within your system, and you want them to be as short-lived as possible. In a manual implementation without a unit of work tracker of some sort, you often open a connection and start a transaction that you then pass around within your code — which leads to holding onto connections longer and risks destabilizing your system through connection exhaustion. And don’t blow that off, because that happens quite frequently when we developers are less than perfect with our database connection hygiene.
As I said earlier, Marten registers the IDocumentSession in your IoC container as Scoped, meaning that the same session would be shared by all objects created by the same scoped container within an AspNetCore request or inside message handling frameworks like Wolverine, NServiceBus, or MassTransit. That scoping is important to make the transactional boundaries through the session’s unit of work tracking actually work across different functions within the code.
I’m not sure about other tools, but Marten also batches the various database commands into a single request to the database when SaveChangesAsync() is called. We’ve consistently found that to be a very important performance optimization.
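Purely for illustration again, here’s a hypothetical, stripped-down sketch of the Unit of Work pattern: operations only queue up in memory, and nothing touches the “database” until the single SaveChanges() call. All the names below are mine, not Marten’s:

```csharp
using System;
using System.Collections.Generic;

// A hypothetical, greatly simplified unit of work: Store() and
// Delete() only queue work in memory, while SaveChanges() flushes
// the whole queue at once. In a real tool, that flush is where the
// connection is opened, the transaction started, and the batched
// commands committed
public class UnitOfWork
{
    private readonly List<string> _pendingOperations = new();

    public IReadOnlyList<string> CommittedOperations { get; private set; }
        = Array.Empty<string>();

    public void Store(string document) => _pendingOperations.Add($"INSERT {document}");
    public void Delete(string document) => _pendingOperations.Add($"DELETE {document}");

    public void SaveChanges()
    {
        // Commit everything queued so far as one logical batch
        CommittedOperations = _pendingOperations.ToArray();
        _pendingOperations.Clear();
    }
}
```

The point of the sketch is the shape: callers express *what* should change, and a single flush decides *when* the database work happens, which is why the connection lifetime can stay so short.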
Next time…
I’d like to dive into patterns for managing data concurrency.
Hey folks, this is more a brain dump to collect my own thoughts than any kind of tome of accumulated wisdom and experience. Please treat this accordingly, and absolutely chime in on the Critter Stack Discord discussion going on about this right now. I’m also very willing and maybe even likely to change my mind about anything I’m going to say in this post.
There’s been some recent interest and consternation about the combination of Marten within Hot Chocolate as a GraphQL framework. At the same time, I am working with a new JasperFx client who wants to use Hot Chocolate with Marten’s event store functionality behind mutations and projected data behind GraphQL queries.
Long story short, Marten and Hot Chocolate do not mix well without some significant thought and deviation from normal, out of the box Marten usage. Likewise, I’m seeing some significant challenges in using Wolverine behind Hot Chocolate mutations. The rest of this post is a rundown of the issues, sticking points, and possible future ameliorations to make this combination more effective for our various users.
Connections are Sticky in Marten
If you use the out of the box IServiceCollection.AddMarten() mechanism to add Marten into a .NET application, you’re registering Marten’s IQuerySession and IDocumentSession as a Scoped lifetime — which is optimal for usage within short lived ASP.Net Core HTTP requests or within message bus handlers (like Wolverine!). In both of those cases, the session can be expected to have a short lifetime and generally be running in a single thread — which is good because Marten sessions are absolutely not thread safe.
However, for historical reasons (integration with Dapper was a major use case in early Marten usage, so there’s some optimization for that with Marten sessions), Marten sessions have a “sticky” connection lifecycle where an underlying Npgsql connection is retained on the first database query until the session is disposed. Again, if you’re utilizing Marten within ASP.Net Core controller methods or Minimal API calls or Wolverine message handlers or most other service bus frameworks, the underlying IoC container of your application is happily taking care of resource disposal for you at the right times in the request lifecycle.
The last sentence is one of the most important, but poorly understood advantages of using IoC containers in applications in my opinion.
Ponder the following Marten usage:
public static async Task using_marten(IDocumentStore store)
{
// The Marten query session is IDisposable,
// and that absolutely matters!
await using var session = store.QuerySession();
// Marten opens a database connection at the first
// need for that connection, then holds on to it
var doc = await session.LoadAsync<User>("jeremy");
// other code runs, but the session is still open
// just in case...
// The connection is closed as the method exits
// and the session is disposed
}
The problem with Hot Chocolate comes in because Hot Chocolate tries to parallelize queries when you get multiple queries in one GraphQL request — and since that query batching was pretty well the raison d’être for GraphQL in the first place, you should assume that’s quite common!
Now, consider a naive usage of a Marten session in a Hot Chocolate query:
public async Task<SomeEntity> GetEntity(
[Service] IQuerySession session,
Input input)
{
// load data using the session
}
Without taking some additional steps to serialize access to the IQuerySession across Hot Chocolate queries, you will absolutely hit concurrency errors when Hot Chocolate tries to parallelize data fetching. You can beat this by either forcing Hot Chocolate to serialize access like so:
builder.Services
.AddGraphQLServer()
// Serialize access to the IQuerySession within Hot Chocolate
.RegisterService<IQuerySession>(ServiceKind.Synchronized)
or by making the session lifetime in your container Transient by doing this:
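As a sketch of that second option, and assuming the standard AddMarten() registration is already in place, you could override the IDocumentSession registration with a Transient one along these lines. Treat this as an assumption on my part, not a blessed approach:

```csharp
// ASSUMPTION: this overrides the default Scoped IDocumentSession
// registration made by AddMarten() with a Transient lifetime, so
// every resolution gets its own short-lived session
builder.Services.AddTransient<IDocumentSession>(sp =>
    sp.GetRequiredService<IDocumentStore>().LightweightSession());
```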
The first choice will potentially slow down your GraphQL endpoints by serializing access to the IQuerySession while fetching data. The second choice is a non-idiomatic usage of Marten that can foul up Marten usage within non-GraphQL operations, as you could end up with separate Marten sessions when you really meant to be using a shared instance.
For Marten V7, we’re going to strongly consider some kind of query runner that does not have sticky connections for the express purpose of simplifying Hot Chocolate + Marten integration, but I can’t promise any particular timeline for that work. You can track that work here though.
Multi-Tenancy and Session Lifecycles
Multi-Tenancy throws yet another spanner into the works. Consider the following Hot Chocolate query method:
public IQueryable<User> GetUsers(
[Service] IDocumentStore documentStore, [GlobalState] string tenant)
{
using var session = documentStore.LightweightSession(tenant);
return session.Query<User>();
}
Assuming that you’ve got some kind of Hot Chocolate interceptor to detect the tenant id for you, and that value is communicated through Hot Chocolate’s global state mechanism, you might think to open a Marten session directly like the code above. That code will absolutely not work under any kind of system load because it puts you into a damned if you do, damned if you don’t situation. If you dispose the session before this method completes, the IQueryable execution will throw an ObjectDisposedException when Hot Chocolate tries to execute the query. If you *don’t* dispose the session, the IoC container for the request scope doesn’t know about it, so it can’t dispose it for you, and Marten will hang on to the open database connection until garbage collection comes for it — and under a significant load, that means your system will behave very badly when the database connection pool is exhausted!
What we need is some way for our sessions to be created for the right tenant for the current request, but with the session tracked somehow so that the scoped IoC container can clean up the open sessions at the end of the request. As a first pass, I’m using this crude approach, with a service that’s registered with the IoC container with a Scoped lifetime:
/// <summary>
/// This will be Scoped in the container per request, "knows" what
/// the tenant id for the request is. Also tracks the active Marten
/// session
/// </summary>
public class ActiveTenant : IDisposable
{
public ActiveTenant(IHttpContextAccessor contextAccessor, IDocumentStore store)
{
if (contextAccessor.HttpContext is not null)
{
// Try to detect the active tenant id from
// the current HttpContext
var context = contextAccessor.HttpContext;
if (context.Request.Headers.TryGetValue("tenant", out var tenant))
{
var tenantId = tenant.FirstOrDefault();
if (tenantId.IsNotEmpty())
{
this.Session = store.QuerySession(tenantId!);
}
}
}
this.Session ??= store.QuerySession();
}
public IQuerySession Session { get; }
public void Dispose()
{
this.Session.Dispose();
}
}
Now, rewrite the Hot Chocolate query from way up above with:
public IQueryable<User> GetUsers(
[Service] ActiveTenant tenant)
{
return tenant.Session.Query<User>();
}
That does still have to be paired with this Hot Chocolate configuration to dodge the concurrent access problems like so:
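A sketch of what that pairing might look like, assuming the same ServiceKind.Synchronized mechanism shown earlier applies equally to the ActiveTenant wrapper:

```csharp
builder.Services
    .AddGraphQLServer()
    // ASSUMPTION: serialize Hot Chocolate's access to the scoped
    // ActiveTenant service, and therefore to the single Marten
    // IQuerySession it wraps, to dodge concurrent access problems
    .RegisterService<ActiveTenant>(ServiceKind.Synchronized);
```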
I took some time this morning to research Hot Chocolate’s Mutation model (think “writes”). Since my client is using Marten as an event store and I’m me, I was looking for opportunities to integrate the Critter Stack tools behind those mutations.
What I’ve found so far has been a series of blockers once you zero in on the fact that Hot Chocolate is built around the possibility of having zero to many mutation messages in any one request — and that that request should be treated as a logical transaction such that every mutation should either succeed or fail together. With that being said, I see the blockers as:
Wolverine doesn’t yet support message batching in any kind of built-in way, and is unlikely to do so before a 2.0 release that isn’t even so much as a glimmer in my eyes yet
Hot Chocolate depends on ambient transactions (Boo!) to manage the transaction boundaries. That by itself almost knocks out the out of the box Marten integration and forces you to use more custom session mechanics to enlist in ambient transactions.
The existing Wolverine transactional outbox depends on an explicit “Flush” operation after the actual database transaction is committed. That’s handled quite gracefully by Wolverine’s Marten integration in normal usage (in my humble and very biased opinion), but that can’t work across multiple mutations in one GraphQL request
There is a mechanism to replace the transaction boundary management in Hot Chocolate, but it was very clearly built around ambient transactions and it has a synchronous signature to commit the transaction. Like any sane server side development framework, Wolverine performs the IO-intensive database transactional mechanics and outbox flushing operations with asynchronous methods. To fit that within Hot Chocolate’s transactional boundary abstraction would require turning the asynchronous Marten and Wolverine APIs into synchronous calls with GetAwaiter().GetResult(), which is tantamount to painting a bullseye on your chest and daring the Fates to make your application crater with deadlocks under load.
I think at this point, my recommended approach is going to be to forgo integrating Wolverine into Hot Chocolate mutations altogether, with some combination of:
Don’t use Hot Chocolate mutations whatsoever if there’s no need for the operation batching and use old fashioned ASP.Net Core with or without Wolverine’s HTTP support
Or document a pattern for using the Decider pattern within Hot Chocolate as an alternative to Wolverine’s “aggregate handler” usage. The goal here is to document a way for developers to keep infrastructure out of business logic code and maximize testability
If using Hot Chocolate mutations, I think there’s a need for a better outbox subscription model directly against Marten’s event store. The approach Oskar outlined here would certainly be a viable start, but I’d rather have an improved version of that built directly into Wolverine’s Marten integration. The goal here is to allow for an Event Driven Architecture which Wolverine supports quite well and the application in question could definitely utilize, but do so without creating any complexity around the Hot Chocolate integration.
In the long, long term:
Add a message batch processing option to Wolverine that manages transactional boundaries between messages for you
Have a significant palaver between the Marten/Wolverine core teams and the fine folks behind Hot Chocolate to iron a bit of this out
My Recommendations For Now
Honestly, I don’t think I would recommend using GraphQL in your system at all unless you’re building some kind of composite user interface where GraphQL would be beneficial in reducing chattiness between your user interface and backing services by allowing unrelated components in your UI to happily batch up requests to your server. Maybe also if you were using GraphQL as a service gateway to combine disparate data sources on the server side in a consistent way, but even then I wouldn’t automatically use GraphQL.
I’m not knowledgeable enough to say how much GraphQL usage would help speed up your user interface development, so take all that I said in the paragraph above with a grain of salt.
At this point I would urge folks to be cautious about using the Critter Stack with Hot Chocolate. Marten can be used if you’re aware of the potential problems I discussed above. Even when we beat the sticky connection thing and the session lifecycle problems, Marten’s basic model of storing JSON in the database is really not optimized for plucking out individual fields in Select() transforms. While Marten does support Select() transforms, they may not be as efficient as the equivalent functionality on top of a relational database model would be. It’s possible that GraphQL might be a better fit with Marten if you were primarily using projected read models purposely designed for client consumption through GraphQL, or even projecting event data to flat tables that are queried by Hot Chocolate.
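For context on that Select() point, a transform like the hypothetical query below (my example, not from the samples above) forces Marten to pluck individual members out of the stored JSON rather than read plain columns:

```csharp
// ASSUMPTION: a representative Marten LINQ query of mine. Marten
// compiles the Select() into SQL that extracts individual fields
// from the stored JSONB document, which works, but is typically
// not as cheap as selecting ordinary columns from a flat
// relational table
var names = await session.Query<User>()
    .Where(x => x.LastName == "Kelce")
    .Select(x => new { x.FirstName, x.LastName })
    .ToListAsync();
```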
Wolverine with Hot Chocolate is maybe not such a good fit if the transactional boundary issues above would be a problem for you.
I would urge you to do load testing with any usage of Hot Chocolate, as I think from peeking into its internals that it’s not the most efficient server side tooling around. Again, that doesn’t mean that you will automatically have performance problems with Hot Chocolate, but I think you should be cautious with its usage.
In general, I’d say that GraphQL creates way too much abstraction over your underlying data storage — and my experience consistently says that abstracted data access can lead to some very poor system performance by both making your application harder to understand and by eliminating the usage of advanced features of your data storage tooling behind least common denominator abstractions.
This took much longer than I wanted it to, as always. I might write a smaller follow up on how I’d theoretically go about building an optimized GraphQL layer from scratch for Marten — which I have zero intention of ever doing, but it’s a fun thought experiment.