Update on Jasper v2 with an actual alpha

First off, my super power (stop laughing at me!) is having a much longer attention span than the average software developer. In positive ways, this has enabled me to tackle very complex problems. In negative ways, I’ve probably wasted a tremendous amount of time in my career working on systems or projects long after they had probably already failed and I just wouldn’t admit it.

So late last year I started working on a reboot of Jasper, my attempt at creating a “next generation” messaging framework for .Net. The goal of Jasper has changed quite a bit since I started jotting down notes for it in 2014, but the current vision is to be a highly productive command execution engine and asynchronous messaging tool for .Net with less code ceremony than the currently popular tools in this space.
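To make "less code ceremony" a little more concrete, here's a rough sketch of the kind of message handler Jasper aims for. The message and handler names are made up for illustration and the exact conventions may shift during the 2.0 alpha, but the idea is that handlers are discovered by naming convention with no required base class, marker interface, or attributes:

public record DebitAccount(Guid AccountId, decimal Amount);

// Discovered by convention: a public class with a "Handler" suffix
// and a Handle() method for the message type
public class DebitAccountHandler
{
    public void Handle(DebitAccount command)
    {
        // carry out the command here
    }
}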

I kicked out a Jasper v2.0.0-alpha-1 release this week just barely in time for my talk at That Conference yesterday (but didn’t end up showing it at all). Right now, the intermediate goals to get to a full Jasper 2.0 rebooted project are to:

  • Finish the baked in Open Telemetry support. It’s there, but there are holes in what’s being captured
  • Get the interop with MassTransit via RabbitMQ working for more scenarios. I’ve got a successful proof of concept of bi-directional interaction between Jasper and MassTransit services
  • Finish documentation for the new 2.0 version. I moved the docs to VitePress and started re-writing the docs from scratch, and that takes time

The first two bullet points are all about getting Jasper ready to be something I could dogfood at work.

While I absolutely intend both Jasper and Marten to be perfectly usable without the other, there’s also going to be some specific integration between Jasper and Marten to create a full blown, opinionated CQRS stack for .Net development (think Axon for .Net, but hopefully with much less code ceremony). For this combination, the Marten team is talking about adding messaging subscriptions for the Marten event store functionality, Jasper middleware to reduce repetitive CQRS handler code, and using the outbox functionality in Jasper to also integrate Marten with external messaging infrastructure.

I’ll kick out actual content about all this in the next couple weeks, but a couple folks have noticed the big uptick in Jasper work and asked what was going on, so here’s a little blog post on it:)

Resetting Marten Database State Between Tests

TL;DR: Marten has a new method in V5 called ResetAllData() that’s very handy for rolling back database state to a known point in automated tests.

I’m a big believer in utilizing intermediate level integration tests. By this I mean the middle layer of the typical automated testing pyramid where you’re most definitely testing through your application’s infrastructure, but not necessarily running the system end to end.

Now, any remotely successful test automation strategy means that you have to be able to exert some level of control over the state of the system leading into a test because all automated tests need the combination of known inputs and expected outcomes. To that end, Marten has built in support for completely rolling back the state of a Marten-ized database between tests that I’ll be demonstrating in this post.

When I’m working on a system that uses a relational database, I’m a fan of using Respawn from Jimmy Bogard, which helps you roll back the state of a database to its beginning point as part of integration test setup. Likewise, Marten has the “clean” functionality for the same purpose:

public async Task clean_out_documents(IDocumentStore store)
{
    // Completely remove all the database schema objects related
    // to the User document type
    await store.Advanced.Clean.CompletelyRemoveAsync(typeof(User));

    // Tear down and remove all Marten related database schema objects
    await store.Advanced.Clean.CompletelyRemoveAllAsync();

    // Deletes all the documents stored in a Marten database
    await store.Advanced.Clean.DeleteAllDocumentsAsync();

    // Deletes all of the persisted User documents
    await store.Advanced.Clean.DeleteDocumentsByTypeAsync(typeof(User));

    // For cases where you may want to keep some document types,
    // but eliminate everything else. This is here specifically to support
    // automated testing scenarios where you have some static data that can
    // be safely reused across tests
    await store.Advanced.Clean.DeleteDocumentsExceptAsync(typeof(Company), typeof(User));
    
    // And get at event storage too!
    await store.Advanced.Clean.DeleteAllEventDataAsync();
}

So that’s tearing down data, but many if not most systems will need some baseline reference data to function. We’re still in business though, because Marten has long had a concept of initial data applied to a document store at startup with the IInitialData interface. To illustrate that interface, here’s a small sample implementation:

    internal class BaselineUsers: IInitialData
    {
        public async Task Populate(IDocumentStore store, CancellationToken cancellation)
        {
            using var session = store.LightweightSession();
            session.Store(new User
            {
                UserName = "magic",
                FirstName = "Earvin",
                LastName = "Johnson"
            });

            session.Store(new User
            {
                UserName = "sircharles",
                FirstName = "Charles",
                LastName = "Barkley"
            });

            await session.SaveChangesAsync(cancellation);
        }
    }

And the BaselineUsers type could be applied like this during initial application configuration:

using var host = await Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services.AddMarten(opts =>
        {
            opts.Connection("some connection string");
        }).InitializeWith<BaselineUsers>();
    }).StartAsync();

Or, maybe a little more likely, if you have some reference data that’s only applicable for your automated testing, we can attach our BaselineUsers data set to Marten, but **only in our test harness** with usage like this:

// First, delegate to your system under test project's
// Program.CreateHostBuilder() method to get the normal system configuration
var host = await Program.CreateHostBuilder(Array.Empty<string>())

    // But next, apply initial data to Marten that we need just for testing
    .ConfigureServices(services =>
    {
        // This will add the initial data to the DocumentStore
        // on application startup
        services.InitializeMartenWith<BaselineUsers>();
    }).StartAsync();

For some background, as of V5 the mechanics for the initial data set feature moved to executing in an IHostedService, so there’s no more issue of asynchronous code being called from synchronous code with the dreaded “will it deadlock or am I feeling lucky?” GetAwaiter().GetResult() mechanics.
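Just to illustrate the shape of that change, here's a rough sketch (illustrative only, not Marten's actual internal class) of a hosted service that applies each registered IInitialData set asynchronously at startup:

// Illustrative sketch only -- roughly how the initial data feature can run
// inside an IHostedService at application startup instead of sync-over-async
internal class ApplyInitialDataHostedService : IHostedService
{
    private readonly IDocumentStore _store;
    private readonly IEnumerable<IInitialData> _initialData;

    public ApplyInitialDataHostedService(IDocumentStore store, IEnumerable<IInitialData> initialData)
    {
        _store = store;
        _initialData = initialData;
    }

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        // Each data set is applied with real async calls, no GetAwaiter().GetResult()
        foreach (var data in _initialData)
        {
            await data.Populate(_store, cancellationToken);
        }
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}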

Putting it all together with xUnit

The way I like to do integration testing with xUnit (the NUnit mechanics would involve static members, but the same concepts of lifetime still apply) is to have a “fixture” class that will bootstrap and hold on to a shared IHost instance for the system under test between tests like this one:

    public class MyAppFixture: IAsyncLifetime
    {
        public IHost Host { get; private set; }

        public async Task InitializeAsync()
        {
            // First, delegate to your system under test project's
            // Program.CreateHostBuilder() method to get the normal system configuration
            Host = await Program.CreateHostBuilder(Array.Empty<string>())

                // But next, apply initial data to Marten that we need just for testing
                .ConfigureServices(services =>
                {
                    services.InitializeMartenWith<BaselineUsers>();
                }).StartAsync();
        }

        public async Task DisposeAsync()
        {
            await Host.StopAsync();
        }
    }

Next, I like to have a base class for integration tests that in this case will consume the MyAppFixture above, but also reset the Marten database between tests with the new V5 IDocumentStore.Advanced.ResetAllData() method, like this one:

    public abstract class IntegrationContext : IAsyncLifetime
    {
        protected IntegrationContext(MyAppFixture fixture)
        {
            Services = fixture.Host.Services;
        }

        public IServiceProvider Services { get; set; }

        public Task InitializeAsync()
        {
            var store = Services.GetRequiredService<IDocumentStore>();

            // This cleans out all existing data, and reapplies
            // the initial data set before all tests
            return store.Advanced.ResetAllData();
        }

        public virtual Task DisposeAsync()
        {
            return Task.CompletedTask;
        }
    }

Do note that I left out some xUnit ICollectionFixture mechanics that you might need to make sure that MyAppFixture is really shared between tests. See xUnit’s Shared Context documentation.
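For completeness, the collection fixture wiring I'm referring to is roughly this (a minimal sketch; the collection name and the test class here are just placeholders):

// Marker class that tells xUnit to share one MyAppFixture instance
// across every test class in the "integration" collection
[CollectionDefinition("integration")]
public class IntegrationCollection : ICollectionFixture<MyAppFixture>
{
}

// Then decorate your test classes with the collection name
[Collection("integration")]
public class UserEndpointTests : IntegrationContext
{
    public UserEndpointTests(MyAppFixture fixture) : base(fixture)
    {
    }
}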

Improving the Development and Production Time Experience with Marten V5

Marten V5 dropped last week, with significant new features for multi-tenancy scenarios and enabling users to use multiple Marten document stores in one .Net application. A big chunk of the V5 work was mostly behind the scenes trying to address user feedback from the much larger V4 release late last year. As always, the Marten documentation is here.

First, why didn’t you just…

I’d advise developers and architects to largely eliminate the word “just” and any other lullaby language from their vocabulary when talking about technical problems and solutions.

That being said:

  • Why didn’t you just use source generators instead? Most of this was done before source generators were released, and source generators are limited to information that’s available at compile time. The dynamic code generation in Marten is potentially using information that is only available at run time
  • Why didn’t you just use IL generation instead? Because I despise working directly with IL and I think that would have dramatically curtailed what was easily possible. It’s also possible that we end up having to go there eventually.

Setting the Stage

Consider this simplistic code to start a new Marten DocumentStore against a blank database and persist a single User document:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

Hopefully that code is simple enough for new users to follow and immediately start being productive with Marten. The major advantage of document databases over the more traditional RDBMS with or without an ORM is the ability to just get stuff done without having to spend a lot of time configuring databases or object to database mappings, or writing anywhere near as much underlying code just to read and write data. To that end, there’s a lot of stuff going on behind the scenes of that code up above.

First off, there’s some automatic database schema management. In the default configuration used up above, Marten is quietly checking the underlying database on the first usage of the User document type to see if the database matches Marten’s configuration for the User document, and applies database migrations at runtime to change the database as necessary.

Secondly, there’s some runtime code generation happening to “bake in” the internal handling of how User documents are read from and written to the database. It’s not apparent here, but there are a lot of knobs you can twist in Marten to change the behavior of how a document type is stored and retrieved from the database (soft deletes, turning on more metadata tracking, turning off default metadata tracking to be leaner, etc.). That behavior even varies between the lightweight session I used up above and the behavior of IDocumentStore.OpenSession() that adds identity map behavior to the session. To be more efficient overall, Marten generates the tightest possible C# code to handle each document type, then in the default mode, actually compiles that code in memory with Roslyn and uses the dynamically built assembly.
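As a quick illustration of those knobs (just a sketch, see the Marten documentation for the full list of options), both document storage behavior and session behavior can be tuned:

var store = DocumentStore.For(opts =>
{
    opts.Connection("connection string");

    // One of the "knobs": opt the User document into soft deletes
    opts.Schema.For<User>().SoftDeleted();
});

// Lightweight session: no identity map tracking, leaner and faster
await using var lightweight = store.LightweightSession();

// OpenSession() adds identity map behavior to the session
await using var tracked = store.OpenSession();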

Cool, right? I’d argue that Marten can make teams be far more productive than they would be with the more typical EF Core or Dapper backed approach. Now let’s move on to the unfortunately very real downsides of Marten’s approach and what we’ve done to improve matters:

  • The dynamic Roslyn code generation can sometimes incur a major “cold start” issue on the very first usage. It’s definitely not consistent, as some people do not see any noticeable impact and other folks tell me they get a 9 second delay on the first usage. This cold start issue is especially problematic for folks using Marten in a Serverless architecture
  • The dynamically generated code can’t be used for any kind of potentially valuable AOT optimization
  • Roslyn usage sometimes causes a big ol’ memory leak no matter what we try. This isn’t consistent, so I don’t know why
  • The database change tracking does do some in memory locking, and that’s been prone to deadlock issues in some flavors of .Net (Blazor, WPF)
  • Some of you won’t want to give your application rights to modify a database at runtime
  • In Marten V4 there were a few too many places where Marten was executing the database change detection asynchronously, but from within synchronous calls using the dreaded .GetAwaiter().GetResult() approach. Occasional deadlock issues occurred, mostly in Marten usage within Blazor.

Database Migration Improvements

Alright, let’s tackle the database migration issues first. Marten has long had some command line support so that you could detect and apply any outstanding database changes from your application itself with this call:

dotnet run -- marten-apply
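In case you haven't used the command line support before, it has to be wired into your application's Program bootstrapping. This is a hedged sketch from memory (the command line support comes from the Marten.CommandLine package, which builds on Oakton; check the Marten command line documentation for the exact setup):

using Oakton;

var builder = Host.CreateDefaultBuilder(args)
    .ConfigureServices(services =>
    {
        services.AddMarten("connection string");
    });

// Replaces the normal Run()/RunAsync() call so that
// "dotnet run -- marten-apply" and friends are available
return await builder.RunOaktonCommands(args);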

If you use the command line tooling for migrations, you can now optimize Marten to just turn off all runtime database migrations like so:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                opts.AutoCreateSchemaObjects = AutoCreate.None;
            });
    }).StartAsync();

Other folks won’t want to use the command line tooling, so there’s another option to do all the database migrations once at application startup, but otherwise completely eliminate all other potential locking in Marten V5. This time I have to use the IHost integration:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                
                // Mild compromise, now I've got to tell
                // Marten about the User document
                opts.RegisterDocumentType<User>();
            })

            // This tells the app to do all database migrations
            // at application startup time
            .ApplyAllDatabaseChangesOnStartup();
    }).StartAsync();

In case you’re wondering, this option is safe to use even if you have multiple application nodes starting up simultaneously. The V5 version here relies on global locks in Postgresql itself to prevent simultaneous database changes that previously resulted in interestingly chaotic failure:(

Pre-building the Generated Types

Now, onto dealing with the dynamic codegen aspect of things. V4 created a “build types ahead” model where you can generate all the dynamic code with this command line call:

dotnet run -- codegen write

You can now completely dodge the runtime code generation issue by this sequence of events:

  1. In your deployment scripts, run dotnet run -- codegen write first
  2. Compile your application, which will embed the newly generated code right into your application’s entry assembly
  3. Use the below setting to completely disable all dynamic codegen:
using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // Turn off all dynamic code generation, but this
                // will blow up if the necessary type isn't already
                // compiled into the entry assembly
                opts.GeneratedCodeMode = TypeLoadMode.Static;
            });
    }).StartAsync();

Again though, this depends on you having all document types registered with Marten instead of depending on runtime discovery as we did in the very first sample in this post — and that’s a bit of friction. What we’ve found is that folks consider the original pre-built generation model to be clumsy, so we went back to the drawing board for Marten V5 and came up with the…

“Auto” Generated Code Mode

For V5, we have the option shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // use pre-built code if it exists, or
                // generate code if it doesn't and "just work"
                opts.GeneratedCodeMode = TypeLoadMode.Auto;
            });
    }).StartAsync();

My thinking here is that you’d just keep this on all the time, and as long as you’re running the application locally or through your integration test suite (you have one of those, right?), you’d have the dynamic types written to your main project’s code automatically (in an /Internal/Generated folder). Unless you purposely add those to your source control’s ignore list, that code will also be checked in. Woohoo, right?

Now, finally let’s put this all together and bundle all of what I would recommend as Marten best practices into the new…

Optimized Artifact Workflow

New in Marten V5 is what I named the “optimized artifact workflow” (I say “I” because I don’t think other folks like the name:)) as shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
            })
            // This is the call you want!
            .OptimizeArtifactWorkflow(TypeLoadMode.Static)
            .ApplyAllDatabaseChangesOnStartup();
    })
    
    // In testing harnesses, or with AWS Lambda / Azure Functions,
    // you may have to help out .Net by explicitly setting
    // the main application assembly
    .UseApplicationProject(typeof(User).Assembly)
    
    .StartAsync();

With the OptimizeArtifactWorkflow(TypeLoadMode.Static) usage above, Marten is running with automatic database management and “Auto” code generation if the host’s environment name is “Development,” as it would typically be on a local developer box. In “Production” mode, Marten is running with all automatic database management disabled at runtime besides the initial database change application at startup. In “Production” mode, Marten is also turning off all dynamic code generation with the assumption that all necessary types can be found in the entry assembly.

The goal here was to have a quick setting that optimized Marten usage in both development and production time without having to add in a bunch of nested conditional logic for IHostEnvironment.IsDevelopment() throughout the IHost configuration code.
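To make that concrete, here's roughly the kind of environment-conditional wiring that OptimizeArtifactWorkflow() is meant to replace (an illustrative sketch, not a recommendation):

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices((context, services) =>
    {
        services.AddMarten(opts =>
        {
            opts.Connection("connection string");

            if (context.HostingEnvironment.IsDevelopment())
            {
                // Local development: let Marten manage the database
                // and generate code on the fly
                opts.AutoCreateSchemaObjects = AutoCreate.CreateOrUpdate;
                opts.GeneratedCodeMode = TypeLoadMode.Auto;
            }
            else
            {
                // Production: no runtime schema changes, only pre-built types
                opts.AutoCreateSchemaObjects = AutoCreate.None;
                opts.GeneratedCodeMode = TypeLoadMode.Static;
            }
        });
    }).StartAsync();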

Exterminating Sync over Async Calls

Back to the very original sample code:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

In Marten V4, the first call to session.Store(user) would trigger the database schema detection, which behind the scenes would end up doing a .GetAwaiter().GetResult() trick to call asynchronous code within the synchronous Store() command (not gonna get into that here, but we eliminated all synchronous database schema detection functionality for unrelated reasons in V4).

In V5, we rewired a lot of the internal guts such that the database schema detection is happening instead in the call to IDocumentSession.SaveChangesAsync(), which is of course, asynchronous. That allowed us to eliminate usages of “sync over async” calls. Likewise, we made similar changes throughout other areas of Marten.

Summary

The hope here is that we can make our users more successful with Marten, and sidestep the problems our users have had specifically with using Marten with AWS Lambda, Azure Functions, Blazor, and inside of WPF applications. I’m also hoping that the OptimizeArtifactWorkflow() usage greatly simplifies the usage of Marten “best practices.”

Working with Multiple Marten Databases in One Application

Marten V5 dropped last week. I covered the new “database per tenant” strategy for multi-tenancy in my previous blog post. Closely related to that feature is the ability to register and work with multiple Marten databases from a single .Net system, and that’s what I want to talk about today.

Let’s say that for whatever reason (but you know there’s some legacy in there somehow), our application is mostly persisted in its own Marten database, but also needs to interact with a completely separate “Invoicing” database on a different database server, with a completely different configuration. With Marten V5 we can register an additional Marten database by first writing a marker interface for that other database:

    // These marker interfaces *must* be public
    public interface IInvoicingStore : IDocumentStore
    {

    }

And now we can register and configure a completely separate Marten database in our .Net system with the AddMartenStore<T>() usage shown below:

using var host = Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        // You can still use AddMarten() for the main document store
        // of this application
        services.AddMarten("some connection string");

        services.AddMartenStore<IInvoicingStore>(opts =>
            {
                // All the normal options are available here
                opts.Connection("different connection string");

                // more configuration
            })
            // Optionally apply all database schema
            // changes on startup
            .ApplyAllDatabaseChangesOnStartup()

            // Run the async daemon for this database
            .AddAsyncDaemon(DaemonMode.HotCold)

            // Use IInitialData
            .InitializeWith(new DefaultDataSet())

            // Use the V5 optimized artifact workflow
            // with the separate store as well
            .OptimizeArtifactWorkflow();
    }).StartAsync();

So here are a few things to talk about from that admittedly busy code sample above:

  1. The IInvoicingStore will be registered in your underlying IoC container with singleton scoping. Marten is quietly making a concrete implementation of your interface for you, similar to how Refit works if you’re familiar with that library.
  2. We don’t yet have a way to register a matching IDocumentSession or IQuerySession type to go with the separate document store. My thought on that is to wait until folks ask for it.
  3. The separate store could happily connect to the same database with a different database schema or connect to a completely different database server altogether
  4. You are able to separately apply all detected database changes on startup
  5. The async daemon can be enabled completely independently for the separate document store
  6. The IInitialData model can be used in isolation with the separate document store for baseline data
  7. The new V5 “optimized artifact workflow” model can be enabled explicitly on each separate document store. This will be the subject of my next Marten related blog post.
  8. It’s not shown up above, but if you really wanted to, you could make the separate document stores use a multi-tenancy strategy with multiple databases
  9. The Marten command line tooling is “multiple database aware,” meaning that it is able to apply changes or assert the configuration on all the known databases at one time or by selecting specific databases by name. This was the main reason the Marten core team did the separate document store story at the same time as the database per tenant strategy.

As I said earlier, we have a service registration for a fully functional DocumentStore implementing our IInvoicingStore that can be injected as a constructor dependency as shown in an internal service of our application:

public class InvoicingService
{
    private readonly IInvoicingStore _store;

    // IInvoicingStore can be injected like any other
    // service in your IoC container
    public InvoicingService(IInvoicingStore store)
    {
        _store = store;
    }

    public async Task DoSomethingWithInvoices()
    {
        // Important to dispose the session when you're done
        // with it
        await using var session = _store.LightweightSession();

        // do stuff with the session you just opened
    }
}

This feature and the multi-tenancy with a database per tenant have been frequent feature requests by Marten users, and it made a lot of sense to tackle them together in V5 because there was quite a bit of overlap in the database change management code to support both. I would very strongly state that a single database should be completely owned by one system, but I don’t know how I really feel about a single system working with multiple databases. Regardless, it comes up often enough that I’m glad we have something in Marten.

I worked with a client system some years back that was a big distributed monolith where the 7-8 separate windows services all talked to the same 4-5 Marten databases, and we hacked together something similar to the new formal support in Marten V5 to accommodate that. I do not recommend getting yourself into that situation though:-)

Multi-Tenancy with Marten

Multitenancy is a reference to the mode of operation of software where multiple independent instances of one or multiple applications operate in a shared environment. The instances (tenants) are logically isolated, but physically integrated.

Gartner Glossary

In this case, I’m referring to “multi-tenancy” in regards to Marten‘s ability to deploy one logical system where the data for each client, organization, or “tenant” is segregated such that users are only ever reading or writing to their own tenant’s data — even if that data is all stored in the same database.

In my research and experience, I’ve really only seen three main ways that folks handle multi-tenancy at the database layer (and this is going to be admittedly RDBMS-centric here):

  1. Use some kind of “tenant id” column in every single database table, then do something behind the scenes in the application layer to always be filtering on that column based on the current user. Marten has supported what I named the “Conjoined” model* since very early versions (see the sketch after this list).
  2. Separate database schema per tenant within the same database. This model is very unlikely to ever be supported by Marten because Marten compiles database schema names into generated code in many, many places.
  3. Using a completely separate database per tenant with identical structures. This approach gives you the most complete separation of data between tenants, and could easily give your system much more scalability when the database is your throughput bottleneck. While you could — and many folks did — roll your own version of “tenant per database” with Marten, it wasn’t supported out of the box.
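As a minimal sketch of the “Conjoined” model from point 1 (assuming the AllDocumentsAreMultiTenanted() policy; check the Marten multi-tenancy documentation for the exact options):

var store = DocumentStore.For(opts =>
{
    opts.Connection("connection string");

    // Adds a tenant id column to every document table and
    // filters on it automatically based on the session's tenant
    opts.Policies.AllDocumentsAreMultiTenanted();
});

// Sessions are opened for a specific tenant id, so reads and writes
// are automatically scoped to that tenant's data
await using var session = store.LightweightSession("tenant1");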

But, drum roll please, Marten V5 that dropped just last week adds out of the box support for doing multi-tenancy with Marten using a separate database for each tenant. Let’s jump right into the simplest possible usage. Let’s say that we have a small system where all we want is for:

  1. Tenants “tenant1” and “tenant2” to be stored in a database named “database1”
  2. Tenant “tenant3” should be stored in a database named “tenant3”
  3. Tenant “tenant4” should be stored in a database named “tenant4”

And that’s that. Just three databases that are known at bootstrapping time. Jumping into the configuration code in a small .Net 6 web api project gives us this code:

var builder = WebApplication.CreateBuilder(args);

var db1ConnectionString = builder.Configuration
    .GetConnectionString("database1");

var tenant3ConnectionString = builder.Configuration
    .GetConnectionString("tenant3");

var tenant4ConnectionString = builder.Configuration
    .GetConnectionString("tenant4");

builder.Services.AddMarten(opts =>
{
    opts.MultiTenantedDatabases(x =>
    {
        // Map multiple tenant ids to a single named database
        x.AddMultipleTenantDatabase(db1ConnectionString, "database1")
            .ForTenants("tenant1", "tenant2");

        // Map a single tenant id to a database, which uses the tenant id as well for the database identifier
        x.AddSingleTenantDatabase(tenant3ConnectionString, "tenant3");
        x.AddSingleTenantDatabase(tenant4ConnectionString, "tenant4");
    });

    // Register all the known document types just
    // to enable database schema management
    opts.Schema.For<User>()
        // This is *only* necessary if you want to put more
        // than one tenant in one database. Which we did.
        .MultiTenanted();
});

Now let’s see this in usage a bit. Knowing that the variable theStore in the test below is the IDocumentStore registered in our system with the configuration code above, this test shows off a bit of the multi-tenancy usage:

[Fact]
public async Task can_use_bulk_inserts()
{
    var targets3 = Target.GenerateRandomData(100).ToArray();
    var targets4 = Target.GenerateRandomData(50).ToArray();

    await theStore.Advanced.Clean.DeleteAllDocumentsAsync();

    // This will load new Target documents into the "tenant3" database
    await theStore.BulkInsertDocumentsAsync("tenant3", targets3);

    // This will load new Target documents into the "tenant4" database
    await theStore.BulkInsertDocumentsAsync("tenant4", targets4);

    // Open a query session for "tenant3". This QuerySession will
    // be connected to the "tenant3" database
    using (var query3 = theStore.QuerySession("tenant3"))
    {
        var ids = await query3.Query<Target>().Select(x => x.Id).ToListAsync();

        ids.OrderBy(x => x).ShouldHaveTheSameElementsAs(targets3.OrderBy(x => x.Id).Select(x => x.Id).ToList());
    }

    using (var query4 = theStore.QuerySession("tenant4"))
    {
        var ids = await query4.Query<Target>().Select(x => x.Id).ToListAsync();

        ids.OrderBy(x => x).ShouldHaveTheSameElementsAs(targets4.OrderBy(x => x.Id).Select(x => x.Id).ToList());
    }
}

So far, so good. There’s a little extra configuration in this case to express the mapping of tenants to databases, but after that, the mechanics are identical to the previous “Conjoined” multi-tenancy model in Marten. However, as the next set of questions will show, there was a lot of thinking and new infrastructure code under the visible surface because Marten can no longer assume that there’s only one database in the system.

(Image: Iceberg, https://commons.wikimedia.org/wiki/File:Iceberg.jpg)

To dive a little deeper, I’m going to try to anticipate the questions a user might have about this new functionality:

Is there a DocumentStore per database, or just one?

DocumentStore is a very expensive object to create because of the dynamic code compilation that happens within it. Fortunately, with this new feature set, there is only one DocumentStore. The one DocumentStore does track the database schema difference detection separately for each database though.

How much can I customize the database configuration?

The out of the box options for “database per tenant” configuration are pretty limited, and we know that they won’t cover every possible need of our users. No worries though, because this is pluggable by writing your own implementation of our ITenancy interface, then setting that on StoreOptions.Tenancy as part of your Marten bootstrapping.

For more examples, here’s the StaticMultiTenancy model that underpins the example usage up above. There’s also the SingleServerMultiTenancy model that will dynamically create a named database on the same database server for each tenant id.

To apply your custom ITenancy model, set that on StoreOptions like so:

var store = DocumentStore.For(opts =>
{
    // NO NEED TO SET CONNECTION STRING WITH THE
    // Tenancy option below
    //opts.Connection("connection string");

    // Apply custom tenancy model
    opts.Tenancy = new MySpecialTenancy();
});

Is it possible to mix “Conjoined” multi-tenancy with multiple databases?

Yes, it is, and the example code above tried to show that. You’ll still have to mark document types as MultiTenanted() to opt into the conjoined multi-tenancy in that case. We supported that model thinking that this would be helpful for cases where the logical tenant of an application may have suborganizations. Whether or not this ends up being useful is yet to be proven.

What about the “Clean” functionality?

Marten has some built in functionality to reset or tear down database state on demand that is frequently used for test automation (think Respawn, but built into Marten itself). With the introduction of database per tenant multi-tenancy, the old IDocumentStore.Advanced.Clean functionality had to become multi-database aware. So when you run this code:

    // theStore is an IDocumentStore
    await theStore.Advanced.Clean.DeleteAllDocumentsAsync();

Marten is deleting all the document data in every known tenant database. To be more targeted, we can also “clean” a single database like so:

    // Find a specific database
    var database = await store.Storage.FindOrCreateDatabase("tenant1");

    // Delete all document data in just this database
    await database.DeleteAllDocumentsAsync();

What about database management?

Marten tries really hard to manage database schema changes for you behind the scenes so that your persistence code “just works.” Arguably the biggest task for per database multi-tenancy was enhancing the database migration code to support multiple databases.

If you’re using the Marten command line support for the system above, this will apply any outstanding database changes to each and every known tenant database:

dotnet run -- marten-apply

But to be more fine-grained, we can choose to apply changes to only the tenant database named “database1” like so:

dotnet run -- marten-apply --database database1

And lastly, you can interactively choose which databases to migrate like so:

dotnet run -- marten-apply -i

In code, you can direct Marten to detect and apply any outstanding database migrations (between how Marten is configured in code and what actually exists in the underlying database) across all tenant databases upon application startup like so:

services.AddMarten(opts =>
{
    // Marten configuration...
}).ApplyAllDatabaseChangesOnStartup();

The migration code above runs in an IHostedService upon application startup. To avoid collisions between multiple nodes in your application starting up at the same time, Marten uses a Postgresql advisory lock so that only one node at a time can be trying to apply database migrations. Lesson learned:)

Or in your own code, assuming that you have a reference to an IDocumentStore object named theStore, you can use this syntax:

// Apply changes to all tenant databases
await theStore.Storage.ApplyAllConfiguredChangesToDatabaseAsync();

// Apply changes to only one database
var database = await theStore.Storage.FindOrCreateDatabase("database1");
await database.ApplyAllConfiguredChangesToDatabaseAsync();

Can I execute a transaction across databases in Marten?

Not natively with Marten, but I think you could pull that off with TransactionScope, and multiple Marten IDocumentSession objects for each database.

Does the async daemon work across the databases?

Yes! Use the IHost integration to set up the async daemon like so:

services.AddMarten(opts =>
{
    // Marten configuration...
})
    // Starts up the async daemon across all known
    // databases on one single node
    .AddAsyncDaemon(DaemonMode.HotCold);

Behind the scenes, Marten is just iterating over all the known tenant databases and actually starting up a separate object instance of the async daemon for each database.

We don’t yet have any way of distributing projection work across application nodes, but that is absolutely planned.

Can I rebuild a projection by database? Or by all databases at one time?

Oopsie. In the course of writing this blog post I realized that we don’t yet support “database per tenant” with the command line projections option. You can create a daemon instance programmatically for a single database like so:

// Rebuild the TripProjection on just the database named "database1"
using var daemon = await theStore.BuildProjectionDaemonAsync("database1");
await daemon.RebuildProjection<TripProjection>(CancellationToken.None);

*I chose the name “Conjoined,” but don’t exactly remember why. I’m going to claim that it was taken from the “Conjoiner” sect in Alastair Reynolds’ “Revelation Space” series.

Marten V5 is out!

The Marten team published Marten V5.0 today! It’s not as massive a leap as the Marten V4 release late last year was (and a much, much easier transition from 4 to 5 than 3 to 4 was:)), but I think this addresses a lot of the user issues folks have had with V4 and makes Marten a much better tool in production and in development.

Some highlights:

  • The release notes and the 5.0 GitHub milestone issues.
  • The closed GitHub milestone just to prove we were busy
  • Fully supports .Net 6 and the latest version of Npgsql for folks who use Marten in combination with Dapper or *gasp* EF Core
  • Marten finally supports doing multi-tenancy through a “database per tenant” strategy with Marten fully able to handle schema migrations across all the known databases!
  • There were a lot of improvements to the database change management and the “pre-built code generation” model has a much easier to use alternative now. See the Development versus Production Usage. Also see the new AddMarten().ApplyAllDatabaseChangesOnStartup() option here.
  • I went through the Marten internals with a fine toothed comb to try and eliminate async within sync calls using .GetAwaiter().GetResult() to try to prevent deadlock issues that some users had reported with, shall we say, “alternative” Marten usages.
  • You can now add and resolve additional document stores in one .Net application.
  • There’s a new option for “custom aggregations” in the event sourcing support for advanced aggregations that fall outside of what was currently possible. This still allows for the performance optimizations we did for Marten V4 aggregates without having to roll your own infrastructure.

As always, thank you to Oskar Dudycz and Babu Annamalai for all their contributions as Marten is fortunately a team effort.

I’ll blog more later this week and next on the big additions. 5.0.1 will inevitably follow soon with who knows what bug fixes. And after that, I’m taking a break on Marten development for a bit:)

Ruminations on 20 Years of being a .Net Developer

.Net turned 20 years old and retrospectives are all over the web right now — so here’s mine!

First off, here’s what’s most important about my 20 years of being a .Net developer:

  • I’ve made a good living as a .Net developer
  • I’ve met a lot of great people over the years through the .Net community
  • I’ve had the opportunity to travel quite a bit around North America and Europe to technical conferences because of my involvement with .Net
  • Most of the significant technical achievements in my career involved .Net in some way (ironically though, the most successful projects from a business perspective were a VB6/Javascript web application and a Node.js/React.js application much more recently)

Moreover, I think .Net as a technology is in a good place today as a result of the .Net Core generation.

Those things above are far more important than anything else in this write up that might not come off as all that positive.

The Initial Excitement

I didn’t have a computer growing up, so the thought of being a software developer just wasn’t something I could have grasped at that time. I came out of college in the mid-90’s as a mechanical engineer — and to this day wish I’d paid attention at the time to how much more I enjoyed our handful of classes that involved programming than I did my other engineering coursework.

Fortunately, I came into an industry that was just barely scratching the surface of project automation, and there were opportunities everywhere to create little bits of homegrown coding solutions to automate some of our engineering work or just do a lot better job of tracking materials. Being self-taught and already working heavily with Windows and Microsoft tools, I naturally fell into coding with Office VBA, ASP “classic”, MS Access, and later Visual Basic 6 (VB6) and Oracle PL/SQL. As the dot-com bubble was blooming, I was able to turn my “shadow IT” work into a real development role in a Microsoft-centric company that at that time was largely building software with the old Microsoft DNA stack.

I first worked with .Net to do an architectural spike with VB.Net in the fall of 2002 as a potential replacement for a poorly performing VB6 shipping system. That spike went absolutely nowhere, but I appreciated how the new .Net languages and platform seemed to remove the training wheels feel of classic VB6. Now we had full fledged inheritance, interfaces, and real OOP without the friction of COM Hell, just like the Java folks did! We even had real stack trace messages that gave you contextual information about runtime errors. Don’t laugh since that’s something most of you reading this take for granted today, but that was literally the thing that helped us convince management to let us start working with .Net early.

My second “real” job as a software developer was my introduction to Agile Software Development — and that’s when the initial excitement about .Net all wore off for me.

The early mainstream tools in .Net like WebForms were very poorly suited for Agile software processes like TDD and even CI (.Net was difficult to script on the build server back then). Many if not most .Net development shops had the software as construction metaphor philosophy where teams first designed a relational database structure, then built the middle and front end layers to reflect the database structure. The unfortunate result of that approach was to greatly couple application logic and functionality with the database infrastructure and further make Agile techniques harder to use.

So fine, let’s talk about ALT.Net for just a minute and quickly move on…

In the early to mid-2000s, as many of us trying to adopt Agile development with .Net were becoming increasingly frustrated with the state of .Net tooling, Ruby on Rails popped up. I know that Rails has probably lost a great deal of its shine in the past dozen years or so, but at the time it was brand new, Rails made ASP.Net development look like a giant turd.

Right at the time that the rest of the development world seemed to be changing and improving fast, the .Net world was insular and largely a monoculture and self-contained echo chamber revolving around Microsoft. The catalyzing event for me was the MVP summit in 2007 when the poor unsuspecting Entity Framework team did an introductory demo of early EF and were completely blindsided by the negative feedback that I and many of the original instigators of ALT.Net gave them at the time.

Long story short, we thought that the early version of Entity Framework was a complete disaster on technical grounds (and I still do to this day), and because of Microsoft’s complete dominance of mindshare in the .Net ecosystem, that we were all going to end up being forced to use it in our shops.

That experience initially led to the EF Vote of No Confidence (read it at your peril, I thought the wording was overwrought at the time and it hasn’t aged well). More importantly, that was the beginning of the ALT.Net community that drastically changed my own personal path within .Net.

If you care, you can read my post The Very Last ALT.Net Retrospective I’ll Ever Write from a couple years ago.

I’m happy to say that I think that EF Core of today is perfectly fine if you need a heavy ORM, but the Entity Framework V1 at that time was a different beast altogether and best forgotten in my opinion.

ASP.Net MVC and .Net OSS

Scott Guthrie was at the first big ALT.Net open spaces event in Austin in the fall of ’07 to give the first public demo of what became ASP.Net MVC. More than ALT.Net itself, I think the advent of MVC did a tremendous amount of good to open up the .Net community to ideas from the outside and led to a lot of improvements. I thought then and now that early MVC was mediocre in terms of its technical approach, but hey, it got us off of WebForms and moving toward a better place.

I thought that OSS development really matured after MVC kind of broke the dam for alternative approaches in .Net. Nuget itself made a huge difference for OSS adoption in .Net — but, yes, I will point out that the community itself tried a couple times to create a packaging specification for .Net before Microsoft finally stepped in and sort of crushed one of those attempts by co-opting the name.

Even outside of MVC I thought that .Net tools got quite a bit better for Agile development, and I myself was deeply immersed in a couple OSS efforts (that largely failed, but let’s just move on). .Net is probably never going to be great for OSS development because of Microsoft’s complete dominance and their tendency, purposely or not, to squash OSS alternatives within the .Net ecosystem.

Project K and .Net Core

I’m admittedly getting tired of writing this, so I’m gonna quickly say that I think that “Project K” and .Net Core maybe saved .Net from irrelevance. I think that Microsoft has done a tremendous job making .Net a modern platform. In particular, I think these things are a big deal:

  • Being cross-platform and container friendly
  • The dotnet CLI makes project scripting of .Net applications so, so much easier than it used to be
  • The newer SDK project system made Nuget management so much better than it was before. Not to take anything away from Paket, but I think the newer project system plus the new dotnet CLI was a game changer
  • The performance improvements that have been baked into the newer versions of the .Net framework
  • The core, opinionated IHost, IConfiguration, IHostedService, etc. abstractions have done a lot of good to make it easier to spin up new systems with .Net. I think that’s also made it much easier for OSS authors and users to extend .Net

Granted, I routinely get irritated watching .Net MVP types gush about the .Net teams adding new features or ideas that were previously invented in other ecosystems or earlier .Net OSS projects that never took hold. So after a mostly positive post, I’ll leave off with a couple criticisms:

  • .Net would be in a much better place if we weren’t throttled by how fast Microsoft teams can innovate, and if innovations from the greater .Net community instead had a chance to gain widespread adoption
  • The MVP program is a net negative for .Net in my opinion. I think the MVP program generates way too much incentive to focus on tools from Microsoft to the exclusion of anything else. One of the drivers for founding ALT.Net to me back in the day was the tendency of .Net developers to jump on anything new coming from Microsoft without first considering if that new thing was actually any good.

.Net 6 WebApplicationBuilder and Lamar

TL;DR — The latest Lamar V8.0.1 release has some bug fixes and mild breaking changes around the .Net Core DI integration that eliminates user reported problems with the new .Net 6 bootstrapping.

Hey, before I jump into the Lamar improvements for .Net 6, read Jimmy Bogard’s latest post for an example reason why you would opt to use Lamar over the built in DI container.

I’ve had a rash of error reports against Lamar when used with the new WebApplicationBuilder bootstrapping model that came with ASP.Net Core 6. Fortunately, the common culprit (in ahem oddball .Net Core mechanics more than Lamar itself) was relatively easy to find, and the most recent Lamar V8 made some minor adjustments to the .Net Core adapter code to fix the issues.

To use Lamar with the new .Net 6 bootstrapping model, you need to install the Lamar.Microsoft.DependencyInjection Nuget and use the UseLamar() extension method on IHostBuilder to opt into using Lamar in place of the built in DI container.

You can find more information about using Lamar with the new WebApplicationBuilder model and Minimal APIs in the Lamar documentation.

As an example, consider this simplistic system from the Lamar testing code:

var builder = WebApplication.CreateBuilder(args);

// use Lamar as DI.
builder.Host.UseLamar((context, registry) =>
{
    // register services using Lamar
    registry.For<ITest>().Use<MyTest>();
    
    // Add your own Lamar ServiceRegistry collections
    // of registrations
    registry.IncludeRegistry<MyRegistry>();

    // discover MVC controllers -- this was problematic
    // inside of the UseLamar() method, but is "fixed" in
    // Lamar V8
    registry.AddControllers();
});

var app = builder.Build();
app.MapControllers();

// Add Minimal API routes
app.MapGet("/", (ITest service) => service.SayHello());

app.Run();

Notice that we’re adding service registrations directly within the nested lambda passed into the UseLamar() method. In the previous versions of Lamar, those service registrations were completely isolated and additive to the service registrations in the Startup.ConfigureServices() — and that was very rarely an issue. In the new .Net 6 model, that became problematic as some of Microsoft’s out of the box service registration extension methods like AddControllers() depend on state being smuggled through the service collection and did not work inside of the UseLamar() method before Lamar v8.

The simple “fix” in Lamar v8 was to ensure that the service registrations inside of UseLamar() were done additively to the existing set of service registrations built up by the core .Net host building like so:

        /// <summary>
        /// Shortcut to replace the built in DI container with Lamar using service registrations
        /// dependent upon the application's environment and configuration.
        /// </summary>
        /// <param name="builder"></param>
        /// <param name="registry"></param>
        /// <returns></returns>
        public static IHostBuilder UseLamar(this IHostBuilder builder, Action<HostBuilderContext, ServiceRegistry> configure = null)
        {
            return builder
                .UseServiceProviderFactory<ServiceRegistry>(new LamarServiceProviderFactory())
                .UseServiceProviderFactory<IServiceCollection>(new LamarServiceProviderFactory())
                .ConfigureServices((context, services) =>
                {
                    var registry = new ServiceRegistry(services);
                
                    configure?.Invoke(context, registry);
                
                    // Hack-y, but this makes everything work as 
                    // expected
                    services.Clear();
                    services.AddRange(registry);

#if NET6_0_OR_GREATER
                    // This enables the usage of implicit services in Minimal APIs
                    services.AddSingleton(s => (IServiceProviderIsService) s.GetRequiredService<IContainer>());
#endif
                    
                });
        }

The downside of this “fix” was that I eliminated all other overloads of the UseLamar() extension method that relied on custom Lamar ServiceRegistry types. You can still use the IncludeRegistry<T>() method to use custom ServiceRegistry types though.
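For reference, the MyRegistry type referenced in the earlier sample would just be a normal Lamar ServiceRegistry subclass along these lines (the registrations shown here are placeholders):

public class MyRegistry : ServiceRegistry
{
    public MyRegistry()
    {
        // Placeholder registrations -- use whatever your application needs
        For<IClock>().Use<SystemClock>();

        // Lamar's type scanning conventions work here too
        Scan(s =>
        {
            s.TheCallingAssembly();
            s.WithDefaultConventions();
        });
    }
}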

As always, if you have any issues with Lamar with or without ASP.Net Core, the Lamar Gitter room is the best and fastest way to ask questions.

Unit Tests for Expected Exceptions

I generally write code for tools or libraries used by other developers instead of business facing features, so I frequently run into the need to communicate invalid operations, incorrect configuration assertions, or just provide more contextual information about failures to those other developers. In these cases where the exception logic is important, I will write unit tests against the code that should be throwing an exception in certain cases.

When you test expected exception flow, you need to do these things:

  1. Set up the expected failure case that should result in the exception (Duh.)
  2. Call the code that should be throwing an exception
  3. Assert that an exception was thrown, and that it was the expected type of exception. This is an important point because doing this naively can result in false positive test results (see the naive example after this list).
  4. Potentially make additional assertions against the message or other details of the thrown exception
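Here's an example of the naive pattern I'm warning about in point 3. If the code under test never throws, the catch block never runs and the test silently passes (the method being called here is just a placeholder):

[Fact]
public void naive_expected_exception_test()
{
    try
    {
        // Placeholder for whatever code is supposed to throw
        CodeThatShouldThrow();

        // BUG: nothing here fails the test if no exception was thrown
    }
    catch (InvalidOperationException ex)
    {
        ex.Message.ShouldContain("some expected text");
    }
}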

I generally use Shouldly in my project work for test assertions, and it comes with a mechanism for testing expected exceptions. Using an example from Marten’s Linq support, we needed to tell users when they were using an unsupported .Net type in their Linq query with a more useful exception, so we have this test case for that exception workflow:

        [Fact]
        public void get_a_descriptive_exception_message()
        {
            var ex = Should.Throw<BadLinqExpressionException>(() =>
            {
                // This action is executed by Shouldly inside
                // a try/catch block that asserts on the expected
                // exception
                theSession
                    .Query<MyClass>()
                    .Where(x => x.CustomObject == new CustomObject())
                    .ToList();
            });

            ex.Message.ShouldBe("Marten cannot support custom value types in Linq expression. Please query on either simple properties of the value type, or register a custom IFieldSource for this value type.");
        }

That’s using Shouldly, but Fluent Assertions has a very similar mechanism. My strong recommendation is that you use one of these two libraries anytime you are testing expected exception flow because it’s repetitive ceremony to test the expected exception flow with raw try/catch blocks and also easy to forget to even assert that an exception was thrown.
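For comparison, the roughly equivalent Fluent Assertions usage looks something like this (a sketch; see the Fluent Assertions documentation for the exact message matching rules, which support wildcards):

Action act = () => theSession
    .Query<MyClass>()
    .Where(x => x.CustomObject == new CustomObject())
    .ToList();

act.Should().Throw<BadLinqExpressionException>()
    .WithMessage("Marten cannot support custom value types in Linq expression*");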

Actually, I’d go farther and recommend you pretty well always use either Shouldly (my preference) or Fluent Assertions (bigger, and more popular in general) in your testing projects. I think these libraries do a lot to make tests easier to read, quicker to write, and frequently easier to troubleshoot failing tests as well.

Lastly, if you want to understand what the Shouldly Should.Throw<TException>(Action) method is really doing, here’s an older helper class I used in projects before Shouldly was around that does effectively the same thing (the usage is `Exception<T>.ShouldBeThrownBy(Action)`):

    public static class Exception<T> where T : Exception
    {
        public static T ShouldBeThrownBy(Action action)
        {
            T exception = null;

            try
            {
                action();
            }
            catch (Exception e)
            {
                exception = e.ShouldBeOfType<T>();
            }

            // This is important, we need to protect against false
            // positive results by asserting that no exception was
            // thrown at the expected time and cause this test to fail
            exception.ShouldNotBeNull("An exception was expected, but not thrown by the given action.");

            return exception;
        }
    }

Batch Querying with Marten

Before I talk about the batch querying feature set in Marten, let’s take a little detour through a common approach to persistence in .Net architectures that commonly causes the exact problem that Marten’s batch querying seeks to solve.

I’ve been in several online debates lately about the wisdom or applicability of granular repository abstractions over underlying persistence infrastructure like EF Core or Marten, like this sample below:

    public interface IRepository<T>
    {
        Task<T> Load(Guid id, CancellationToken token = default);
        Task Insert(T entity, CancellationToken token = default);
        Task Update(T entity, CancellationToken token = default);
        Task Delete(T entity, CancellationToken token = default);

        IQueryable<T> Query();
    }

That’s a pretty common approach, and I’m sure it’s working out for some people in at least simpler CRUD-centric applications. Unfortunately though, that reliance on fine-grained repositories also breaks down badly in more complicated systems where a single logical operation may need to span multiple entity types. Coincidentally, I have frequently seen this kind of fine grained abstraction directly lead to performance problems in the systems I’ve helped with after their original construction over the past 6-8 years.

For an example, let’s say that we have a message handler that will need to access and modify data from three different entity types in one logical transaction. Using the fine grained repository strategy, we’d have something like this:

    public class SomeMessage
    {
        public Guid UserId { get; set; }
        public Guid OrderId { get; set; }
        public Guid AccountId { get; set; }
    }

    public class Handler
    {
        private readonly IUnitOfWork _unitOfWork;
        private readonly IRepository<Account> _accounts;
        private readonly IRepository<User> _users;
        private readonly IRepository<Order> _orders;

        public Handler(
            IUnitOfWork unitOfWork,
            IRepository<Account> accounts,
            IRepository<User> users,
            IRepository<Order> orders)
        {
            _unitOfWork = unitOfWork;
            _accounts = accounts;
            _users = users;
            _orders = orders;
        }

        public async Task Handle(SomeMessage message)
        {
            // The potential performance problem is right here.
            // Multiple round trips to the database
            var user = await _users.Load(message.UserId);
            var account = await _accounts.Load(message.AccountId);
            var order = await _orders.Load(message.OrderId);

            var otherOrders = await _orders.Query()
                .Where(x => x.Amount > 100)
                .ToListAsync();

            // Carry out rules and whatnot

            await _unitOfWork.Commit();
        }
    }

So here’s the problem with the code up above as I see it:

  1. You’re having to inject separate dependencies for the matching repository type for each entity type, and that adds code ceremony and noise code.
  2. The code is making repeated round trips to the database server every time it needs more data. This is a contrived example, and it’s only 4 trips, but in real systems this could easily be many more. To make this perfectly clear, one of the very most pernicious sources of slow code is chattiness (frequent network round trips) between the application layer and backing database.

Fortunately, Marten has a facility called batch querying that we can use to fetch multiple data queries at one time, and even start processing against the earlier results while the later results are still being read. To use that, we’ve got to ditch the “one size fits all, least common denominator” repository abstraction and use the raw Marten IDocumentSession service as shown in this version below:

    public class MartenHandler
    {
        private readonly IDocumentSession _session;

        public MartenHandler(IDocumentSession session)
        {
            _session = session;
        }

        public async Task Handle(SomeMessage message)
        {
            // Not gonna lie, this is more code than the first alternative
            var batch = _session.CreateBatchQuery();

            var userLookup = batch.Load<User>(message.UserId);
            var accountLookup = batch.Load<Account>(message.AccountId);
            var orderLookup = batch.Load<Order>(message.OrderId);
            var otherOrdersLookup = batch.Query<Order>().Where(x => x.Amount > 100).ToList();

            await batch.Execute();

            // We can immediately start using the data from earlier
            // queries in memory while the later queries are still processing
            // in the background for a little bit of parallelization
            var user = await userLookup;
            var account = await accountLookup;
            var order = await orderLookup;

            var otherOrders = await otherOrdersLookup;

            // Carry out rules and whatnot

            // Commit any outstanding changes with Marten
            await _session.SaveChangesAsync();
        }
    }

The code above creates a single, batched query for the four queries this handler needs, meaning that Marten is making a single database query for the four SELECT statements. As an improvement in the Marten V4 release, the results coming back from Postgresql are processed in a background Task, meaning that in the code above we can start working with the initial Account, User, and Order data while Marten is still building out the last Order results (remember that Marten has to deserialize JSON data to build out your documents and that can be non-trivial for large documents).

I think these are the takeaways for the before and after code here:

  1. Network round trips are expensive and chattiness can be a performance bottleneck, but batch querying approaches like Marten’s can help a great deal.
  2. Putting your persistence tooling behind least common denominator abstractions like the IRepository<T> approach shown above eliminates the ability to use advanced features of your actual persistence tooling. That’s a serious drawback because it disallows the usage of the exact features that allow you to create high performance solutions — and this isn’t specific to using Marten as your backing persistence tooling.
  3. Writing highly performant code can easily mean writing more code as you saw above with the batch querying. The point there being to not automatically opt for the most highly performant approach if it’s unnecessary and more complex than a slower, but simpler approach. Premature optimization and all that.

I’m only showing a small fraction of what the batch query supports, so certainly check out the documentation for more examples.
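As a small teaser of that bigger feature set, here are a few of the other operations I believe the batched query supports (from memory, so double check the batched queries documentation), where session is an IDocumentSession:

var batch = session.CreateBatchQuery();

// Aggregate and single-item operations can be batched too
var orderCount = batch.Query<Order>().Count();
var hasBigOrders = batch.Query<Order>().Where(x => x.Amount > 1000).Any();
var firstUser = batch.Query<User>().Where(x => x.UserName == "magic").FirstOrDefault();

await batch.Execute();

// Each variable above is a Task that completes as its results come back
var count = await orderCount;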