My Thoughts on Code “Modernization”

Some of this is going to be specific to a .Net ecosystem, but most of what I’m talking about here I think should be applicable to most development shops. This is more or less a companion white paper for a big internal presentation I did at work this week.

My team at work is tasked with a multi-year code and architecture modernization across our large technical platforms. To give just a little bit of context, it’s a familiar story. We have some very large, very old, complex monolithic systems in production using some technologies, frameworks, and libraries that in a perfect world we’d like to update or replace. Since quite a bit of that code was written back when Test Driven Development was just a twinkle in Kent Beck’s eye, the automated test coverage on parts of the code isn’t what we’d like it to be.

With all that said, to any of my colleagues that read this, I’d say that we’re in much better shape quality and ecosystem wise than the average shop with old, continuously developed systems.

During a recent meeting right before Christmas, one of my colleagues had the temerity to ask “what’s the end goal of modernization and when can we say we’re done?” — which set off some furious thinking, conversations within the team, and finally a presentation to the rest of our development groups.

We came up with these three main goals for our modernization efforts:

  1. Arrive at a point where we can practice Continuous Delivery (CD) within all our major product lines
  2. Improved Developer (and Tester) Happiness
  3. System Performance

Arguably, being able to practice Continuous Delivery with a corresponding DevOps culture would help us achieve the other two goals, so I’m almost ready to declare that our main goal. Everything else that’s been on our “modernization agenda” is really just an intermediate step on the way to the goal of continuous delivery, or another goal that is at least partially unlocked by the advances we’ll have to make in order to get to continuous delivery.

Intermediate Steps

Speaking of the major intermediate or enabling steps we’ve identified, I took a shot at capturing them in a diagram for our future CD strategy, and the sections below walk through each one.

Upgrading to .Net vLatest

Upgrading from the full “classic” Windows-only version of .Net to the latest version of .Net and ASP.Net Core is taking up most of our hands-on focus right now. There are probably some performance gains to be had by merely updating to the latest .Net 5/6, but I see the big advantages of the latest .Net versions as being much more container friendly and allowing us flexibility on hosting options (Linux containers) compared to where we are now. I personally think that the recent generations of .Net and ASP.Net Core are far easier to work with in automated testing scenarios, and that should hopefully be a major enabler of CD processes for us.

Most importantly of all, I’d like to get back to using a Mac for daily development work, so there’s that.

Improved Automated Testing

We’re fortunately starting from a decent base of test automation, but there’s plenty of opportunities to get better before we can support more frequent releases. (I’ve written quite a bit about automated testing here). Long story short, I think we have some opportunities to:

  1. Get better at writing testable code for easier and more effective unit testing
  2. Introduce a lot more integration testing in the middle zone of the stereotypical “test pyramid”
  3. Cut back on expensive Selenium-based testing wherever possible in favor of some other form of more efficient test automation. See Jeremy’s Only Rule of Testing.

Since all of this is interrelated anyway, “testability” is absolutely one of the factors we’ll use to decide where service boundaries are as we try to slice our large monoliths into smaller, more focused services. If it’s not valuable to test a service by itself without including other services, then that service boundary is probably wrong.

Containerization

This comes up a lot at work, but I’d call this mostly an enabling step toward cloud hosting and easier, more incremental deployment than we have today rather than any kind of end in itself, especially in areas where we need elastic scaling. I think being able to run our services in containers is also going to be helpful for the occasional time when you need to test locally against multiple services or processes.

And yeah, we could try to do a lift and shift to move our big full .Net framework apps to virtual machines in the cloud or try out Windows containers, but previous analysis has suggested that that’s not viable for us. Plus nobody wants to do that.

Open Telemetry Tracing and Production Monitoring

This effort is fortunately well underway, but one of our intermediate goals is to apply effective Open Telemetry tracing through all our products, and I say that for these reasons:

  1. It enables us to use a growing off the shelf ecosystem of visualization and metrics tooling
  2. I think it’s an invaluable debugging tool, especially when you have asynchronous messaging or dependencies on external systems — and we’re only going to be increasing our reliance on messaging as we move more and more to micro-services
  3. Open Telemetry is very handy in diagnosing performance or throughput problems by allowing you to “see” the context of what is happening within and across systems during a logical business operation.

To the last point, my key example of this was helping a team last year analyze some performance issues in their web services. An experienced developer will probably look through database logs to identify slow queries that might explain the poor performance as one of their first steps, but in this case that turned up no single query that was slow enough to explain the performance issues. Fortunately, I was able to diagnose the issue as an N+1 query issue by reading through the code, but let’s just say that I got lucky.

If we’d had Open Telemetry tracing between the web service calls and the database queries that each service invocation made, I think we would have quickly seen the relationship between the slow web service calls and the sheer number of little database queries issued during those requests, which should have led the team to immediately suspect an N+1 problem.
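
To make that a little more concrete, here’s a rough sketch (not our actual setup) of wiring OpenTelemetry tracing into an ASP.Net Core service so that incoming HTTP requests and the database commands they trigger land in the same trace. The exact package and extension method names vary a bit between OpenTelemetry SDK versions, and the service name and Jaeger exporter are just placeholders:

using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

public static class TracingSetup
{
    // Call this from the application's service registration code
    public static void AddTracing(IServiceCollection services)
    {
        services.AddOpenTelemetryTracing(tracing =>
        {
            tracing
                .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("issue-tracker-api"))
                // One span per incoming HTTP request
                .AddAspNetCoreInstrumentation()
                // One child span per ADO.Net command, which is exactly what makes
                // an N+1 query pattern show up as a wall of tiny database spans
                .AddSqlClientInstrumentation()
                // Ship the spans somewhere they can be visualized (Jaeger, Zipkin, etc.)
                .AddJaegerExporter();
        });
    }
}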

As for production monitoring, we of course already do that but there’s some opportunity to be more responsive at least to performance issues detected by the monitoring rules. We’re working under the assumption that deploying more often and more incrementally means that we’ll also have to be better at detecting production issues. Not that you purposely try to let problems get through testing, but if we’re going to convince the greater company that it’s safe to deploy small changes in an automated fashion, we need to have ways to rapidly detect when new problems in production are introduced.

Again, the general theme is for us to be resilient and adaptive because problems are inevitable — but don’t let the fear of potential problems put us into an analysis paralysis spiral.

Cloud Hosting

I think that’s a major enabler of continuous delivery, with the real goal for us being more flexible in how our development, testing, and production environments are configured as we continue to break up the monolith codebases and change our current architecture. I’d also love for us to be able to flexibly spin up environments for testing on demand, and tear them down when they’re not needed without a lot of formal paperwork in the middle.

There might also be an argument for shifting to the cloud if we could reduce hosting and production support costs along the way, but I think there’s a lot of analysis left to do before we can make that claim to the folks in the high backed chairs.

System Performance

Good runtime performance and meeting our SLAs are absolutely vital for us as a medical analytics company. I wrestled quite a bit with making this a first class goal of our “modernization” initiative and came down on the side of “yes, but…” My thinking here, with some agreement from other folks, is that system performance issues will be much easier to address when we’re backed by a continuous delivery backbone.

There’s something to be said for doing upfront architecture work to consider known performance risks before a single line of code is written, but the truth is that a great deal of the code is already written. Moreover, the performance issues and bottlenecks that pop up in production aren’t always where we would have expected them to be during upfront architecture efforts anyway.

Improving performance in a complicated system is generally going to require a lot of measurement and iteration. Knowing that, having a faster release cycle made safe by effective automated test coverage should help us react more quickly to performance problems or take advantage of newer ideas to improve performance as we learn more about how our systems behave or gain some insights into client data sets. Likewise, we’ll have to improve our production monitoring and instrumentation anyway to enable continuous delivery, and we’re hopeful that that will also help us more quickly identify and diagnose performance issues.

To phrase this a bit more bluntly, I believe that upfront design and architecture can be valuable and sometimes necessary, but consistent success in software development is more likely a result of feedback and adaptation over time than being dependent on getting everything right the first time.

Ending this post abruptly….

I’m tired, it’s late, and I’m going to play the trick of making this a blog series instead of one gigantic post that never gets finished. In following posts, I’d like to discuss my thoughts on:

  • Creating the circumstances for “Developer Happiness” with some thinking about what kind of organizational structure and technical ecosystem allows developers and testers to be maximally productive and at least have a chance to be happy within their roles
  • Some thinking around micro-services and micro-frontends as we try to break up the big ol’ monoliths with some focus on intermediate steps to get there

My professional and OSS aspirations for 2022

I trot out one of these posts at the beginning of each year, but this time around it’s “aspirations” instead of “plans” because a whole lot of stuff is gonna be a repeat from 2020 and 2021 and I’m not going to lose any sleep over what doesn’t get done in the New Year or not be open to brand new opportunities.

In 2022 I just want the chance to interact with other developers. I’ll be at ThatConference in Round Rock, TX (in January, or maybe May?) speaking about Event Sourcing with Marten (my first in-person conference since late 2019). Other than that, my only goal for the year (Covid-willing) is to maybe speak at a couple more in-person conferences just to be able to interact with other developers in real space again.

My peak as a technical blogger was the late aughts, and I think I’m mostly good with not sweating any kind of attempt to regain that level of readership. I do plan to write material that I think would be useful for my shop, or just about what I’m doing in the OSS space when I feel like it.

Which brings me to the main part of this post, my involvement with the JasperFx (Marten, Lamar, etc.) family of OSS projects (plus Storyteller), which takes up most of my extracurricular software-related time. Just for an idea of the interdependencies, here are the highlights of the JasperFx world:

.NET Transactional Document DB and Event Store on PostgreSQL

Marten took a big leap forward late in 2021 with the long running V4.0 release. I think that release might have been the single biggest, most complicated OSS release that I’ve ever been a part of — FubuMVC 1.0 notwithstanding. There’s also a 5.0-alpha release out that addresses .Net 6 support and the latest version of Npgsql.

Right now Marten is a victim of its own success, and our chat room is almost constantly hair on fire with activity, which directly led to some planned improvements for V5 (hopefully by the end of January?) in this discussion thread:

  • Multi-tenancy through a separate database per tenant (long planned, long delayed, finally happening now)
  • Some kind of ability to register and resolve services for more than one Marten database in a single application
  • And related to the previous two bullet points, improved database versioning and schema migrations that could accommodate there being more than one database within a single .Net codebase
  • Improve the “generate ahead” model to make it easier to adopt. Think faster cold start times for systems that use Marten

Beyond that, some of the things I’d like to maybe do with Marten this year are:

  • Investigate the usage of Postgresql table partitioning and database sharding as a way to increase scalability — especially with the event sourcing support
  • Projection snapshotting
  • In conjunction with Jasper, expand Marten’s asynchronous projection support to shard projection work across multiple running nodes, introduce some sort of optimized, no downtime projection rebuilds, and add some options for event streaming with Marten and Kafka or Pulsar
  • Try to build an efficient GraphQL adapter for Marten. And by efficient, I mean that you wouldn’t have to bounce through a Linq translation first and hopefully could opt into Marten’s JSON streaming wherever possible. This isn’t likely, but sounds kind of interesting to play with.

In a perfect, magic, unicorns and rainbows world, I’d love to see the Marten backlog in GitHub get under 50 items and stay there permanently. Commence laughing at me on that one :(

Jasper is a toolkit for common messaging scenarios between .Net applications with a robust in process command runner that can be used either with or without the messaging.

I started working on rebooting Jasper with a forthcoming V2 version late last year, and made quite a bit of progress before Marten got busy and the .Net 6 release necessitated other work. There’s a non-zero chance I will be using Jasper at work, which makes that a much more viable project. I’m currently in flight with:

  • Building Open Telemetry tracing directly into Jasper
  • Bi-directional compatibility with MassTransit applications (absolutely necessary to adopt this in my own shop).
  • Performance optimizations
  • .Net 6 support
  • Documentation overhaul
  • Kafka as a message transport option (Pulsar was surprisingly easy to add, and I’m hopeful that Kafka is similar)

And maybe, just maybe, I might extend Jasper’s somewhat unique middleware approach to web services utilizing the new ASP.Net Core Minimal API support. The idea there is to more or less create an improved version of the old FubuMVC idiom for building web services.

Lamar is a modern IoC container and the successor to StructureMap

I don’t have any real plans for Lamar in the new year, but there are some holes in the documentation, and a couple advanced features could sure use some additional examples. 2021 ended up being busy for Lamar though with:

  1. Lamar v6 added interception (finally), a new documentation website, and a facility for overriding services at test time
  2. Lamar v7 added support for IAsyncEnumerable (also finally), a small enhancement for the Minimal API feature in ASP.Net Core, and .Net 6 support

Add Robust Command Line Options to .Net Applications

Oakton did have a major v4/4.1 release to accommodate .Net 6 and ASP.Net Core Minimal API usage late in 2021, but I have yet to update the documentation. I would like to shift Oakton’s documentation website to VitePress first. The only other plan I have for Oakton this year is to maybe see whether there’d be a good way for Oakton to enable “buddy” command line tools for your application (like the dotnet ef tool) using the HostFactoryResolver class.

The bustling metropolis of Alba, MO

Alba is a wrapper around the ASP.Net Core TestServer for declarative, in process testing of ASP.Net Core web services. I don’t have any plans for Alba in the new year other than responding to issues and smoothing out any rough edges that turn up through my shop’s usage of Alba.

Alba did get a couple major releases in 2021 though:

  1. Alba 5.0 streamlined the entry API to mimic IHost, converted the documentation website to VitePress, and introduced new facilities for dealing with security in testing.
  2. Alba 6.0 added support for WebApplicationFactory and ASP.Net Core 6

Solutions for creating robust, human readable acceptance tests for your .Net or CoreCLR system and a means to create “living” technical documentation.

Storyteller has been mothballed for years, and I was ready to abandon it last year, but…

We still use Storyteller for some big, long running integration style tests in both Marten and Jasper where I don’t think xUnit/NUnit is a good fit, and I think maybe I’d like to reboot Storyteller later this year. The “new” Storyteller (I’m playing with the idea of calling it “Bobcat” as it might be a different tool) would be quite a bit smaller and much more focused on enabling integration testing rather than trying to be a BDD tool.

Not sure what the approach might be, it could be:

  • “Just” write some extension helpers to xUnit or NUnit for more data intensive tests
  • “Just” write some extension helpers to SpecFlow
  • Rebuild the current Storyteller concept, but also support a Gherkin model
  • Something else altogether?

My goal, if this happens, is to have a tool for automated testing that maybe:

  • Supports much more data intensive tests
  • Better handles integration tests
  • Has strong support for test parallelization and even test run sharding in CI
  • Could help write characterization tests with a record/replay kind of model against existing systems (I’d *love* to have this at work)
  • Has some kind of model that is easy to use within an IDE like Rider or VS, even if there is a separate UI like Storyteller has today

And I’d still like to rewrite a subset of the existing Storyteller UI as an excuse to refresh my front end technology skillset.

To be honest, I don’t feel like Storyteller has ever been much of a success, but it’s the OSS project of mine that I’ve most enjoyed working on and most frequently used myself.

Weasel

Weasel is a set of libraries for database schema management and ADO.Net helpers that we spun out of Marten during its V4 release. I’m not super excited about doing this, but Weasel is getting some sort of database migration support very soon. Weasel isn’t documented itself yet, so that’s the only major plan other than supporting whatever Marten and/or Jasper needs this year.

Baseline

Baseline is a grab bag of helpers and extension methods that dates back to the early FubuMVC project. I haven’t done much with Baseline in years, and it might be time to prune it a little bit as some of what Baseline does is now supported in the .Net framework itself. The file system helpers especially could be pruned down, but then also get asynchronous versions of what’s left.

StructureMap

I don’t think that I got a single StructureMap question last year and stopped following its Gitter room. There are still plenty of systems using StructureMap out there, but I think the mass migration to either Lamar or another DI container is well underway.

Marten’s Compiled Query Feature

TL;DR: Marten’s compiled query feature makes using Linq queries significantly more efficient at runtime if you need to wring out just a little more performance in your Marten-backed application.

I was involved in a twitter conversation today that touched on the old Specification pattern of describing a reusable database query with an object (watch it, that word is overloaded in the software development world and even refers to separate design patterns). I mentioned that Marten actually has an implementation of this pattern that we call Compiled Queries.

Jumping right into a concrete example, let’s say that we’re building an issue tracking system because we hate Jira so much that we’d rather build one completely from scratch. At some point you’re going to want to query for all open issues currently assigned to a user. Assuming our new Marten-backed issue tracker has a document type called Issue, a compiled query class for that would look like this:

    // ICompiledListQuery<T> is from Marten
    public class OpenIssuesAssignedToUser: ICompiledListQuery<Issue>
    {
        public Expression<Func<IMartenQueryable<Issue>, IEnumerable<Issue>>> QueryIs()
        {
            return q => q
                .Where(x => x.AssigneeId == UserId)
                .Where(x => x.Status == "Open");
        }
        // This is an input parameter to the query
        public Guid UserId { get; set; }
    }

And now in usage, we’ll just spin up a new instance of the OpenIssuesAssignedToUser to query for the open issues for a given user id like this:

    var store = DocumentStore.For(opts =>
    {
        opts.Connection("some connection string");
    });

    await using var session = store.QuerySession();

    var issues = await session.QueryAsync(new OpenIssuesAssignedToUser
    {
        UserId = userId // passing in the query parameter to a known user id
    });
    
    // do whatever with the issues

Other than the weird method signature of the QueryIs() method, that class is pretty simple if you’re comfortable with Marten’s superset of Linq. Compiled queries can be valuable anywhere where the old Specification (query objects) pattern is useful, but here’s the cool part…

Compiled Queries are Faster

Linq has been an awesome addition to the .Net ecosystem, and it’s usually the very first thing I mention when someone asks me why they should consider .Net over Java or any other programming ecosystem. On the down side though, it’s complicated as hell, there’s some runtime overhead to generating and parsing Linq queries at runtime, and most .Net developers don’t actually understand how it works internally under the covers.

The best part of the compiled query feature in Marten is that on the first usage of a compiled query type, Marten memoizes its “query plan” for the represented Linq query so there’s significantly less overhead for subsequent usages of the same compiled query type within the same application instance.

To illustrate what’s happening when you issue a Linq query, consider the same logical query as above, but this time in inline Linq:


    var issues = await session.Query<Issue>()
        .Where(x => x.AssigneeId == userId)
        .Where(x => x.Status == "Open")
        .ToListAsync();

    // do whatever with the issues

When the Query() code above is executed, Marten is:

  1. Building an entire object model in memory using the .Net Expression model.
  2. Linq itself never executes any of the code within Where() or Select() clauses; instead it parses and interprets that Expression object model with a series of internal Visitor types.
  3. The result of visiting the Expression model is that a corresponding, internal IQueryHandler object is created that “knows” how to build up the SQL for the query, how to process the resulting rows returned by the database, and how to coerce the raw data into the desired results (JSON deserialization, stashing things in identity maps or dirty checking records, etc.).
  4. Executing the IQueryHandler, which in turn writes out the desired SQL query to the outgoing database command
  5. Make the actual call to the underlying Postgresql database to return a data reader
  6. Interpret the data reader and coerce the raw records into the desired results for the Linq query

Sounds kind of heavyweight when you list it all out. When we move the same query to a compiled query, we only have to incur the cost of parsing the Linq query Expression model once, and Marten “remembers” the exact SQL statement, how to map query inputs like OpenIssuesAssignedToUser.UserId to the right database command parameter, and even how to process the raw database results. Behind the scenes, Marten is generating and compiling a new class at runtime to execute the OpenIssuesAssignedToUser query like this (I reformatted the generated source code just a little bit here):

using System.Collections.Generic;
using Marten.Internal;
using Marten.Internal.CompiledQueries;
using Marten.Linq;
using Marten.Linq.QueryHandlers;
using Marten.Testing.Documents;
using NpgsqlTypes;
using Weasel.Postgresql;

namespace Marten.Testing.Internals.Compiled
{
    public class
        OpenIssuesAssignedToUserCompiledQuery: ClonedCompiledQuery<IEnumerable<Issue>, OpenIssuesAssignedToUser>
    {
        private readonly HardCodedParameters _hardcoded;
        private readonly IMaybeStatefulHandler _inner;
        private readonly OpenIssuesAssignedToUser _query;
        private readonly QueryStatistics _statistics;

        public OpenIssuesAssignedToUserCompiledQuery(IMaybeStatefulHandler inner, OpenIssuesAssignedToUser query,
            QueryStatistics statistics, HardCodedParameters hardcoded): base(inner, query, statistics, hardcoded)
        {
            _inner = inner;
            _query = query;
            _statistics = statistics;
            _hardcoded = hardcoded;
        }


        public override void ConfigureCommand(CommandBuilder builder, IMartenSession session)
        {
            var parameters = builder.AppendWithParameters(
                @"select d.id, d.data from public.mt_doc_issue as d where (CAST(d.data ->> 'AssigneeId' as uuid) = ? and  d.data ->> 'Status' = ?)");

            parameters[0].NpgsqlDbType = NpgsqlDbType.Uuid;
            parameters[0].Value = _query.UserId;
            _hardcoded.Apply(parameters);
        }
    }

    public class
        OpenIssuesAssignedToUserCompiledQuerySource: CompiledQuerySource<IEnumerable<Issue>, OpenIssuesAssignedToUser>
    {
        private readonly HardCodedParameters _hardcoded;
        private readonly IMaybeStatefulHandler _maybeStatefulHandler;

        public OpenIssuesAssignedToUserCompiledQuerySource(HardCodedParameters hardcoded,
            IMaybeStatefulHandler maybeStatefulHandler)
        {
            _hardcoded = hardcoded;
            _maybeStatefulHandler = maybeStatefulHandler;
        }


        public override IQueryHandler<IEnumerable<Issue>> BuildHandler(OpenIssuesAssignedToUser query,
            IMartenSession session)
        {
            return new OpenIssuesAssignedToUserCompiledQuery(_maybeStatefulHandler, query, null, _hardcoded);
        }
    }
}

What else can compiled queries do?

Besides being faster than raw Linq and being useful as the old reliable Specification pattern, compiled queries can be very valuable if you absolutely insist on mocking or stubbing the Marten IQuerySession/IDocumentSession. You should never, ever try to mock or stub the IQueryable interface with a dynamic mock library like NSubstitute or Moq, but mocking the IQuerySession.Query<T>(T query) method is pretty straightforward.

Most of the Linq support in Marten is usable within compiled queries — even the Include() feature for querying related document types in one round trip. There’s even an ability to “stream” the raw JSON byte array data from compiled query results directly to the HTTP response body in ASP.Net Core for Marten’s “ludicrous speed” mode.

Build Automation on a Database Backed .Net System

Hey, I blog a lot about the OSS tools I work on, so this week I’m going in a different direction and blogging about other OSS tools I use in daily development. In no small part, this blog post is a demonstration to some of my colleagues to get them to weigh in on the approach I took here.

I’ve been dragging my feet for way, way too long at work on what’s going to be our new centralized identity provider service based on Identity Server 5 from Duende Software. It is the real world, so for the first phase of things, the actual user credentials are stored in an existing Sql Server database, with roughly a database-per-client strategy of multi-tenancy. For this new server, I’m introducing a small lookup database to store the locations of the client specific databases. So the new server works with a small constellation of databases: the shared lookup database plus a separate credential database per client.

After some initial spiking, the first serious thing I did was to set up the automated developer build for the codebase. For local development, I need a script that:

  • Sets up multiple Sql Server databases for local development and testing
  • Restores Nuget dependencies
  • Builds the actual C# code (and might later delegate to NPM if there’s any JS/TS code in the user interface)
  • Runs all the tests in the codebase

For very simple projects I’ll just use the dotnet command line to run tests in CI builds or at Git commit time. Likewise in Node.js projects, npm by itself is frequently good enough. If the C# code were all there was, dotnet test would be enough of a build script in this Identity Server project, but the database requirements are enough to justify a more complex build automation approach.

Build Scripting with Bullseye

Until very recently, I still used the Ruby-based Rake tooling for build scripting, but Ruby as a scripting language has definitely fallen out of favor in .Net circles. After Babu Annamalai introduced Bullseye/SimpleExec into Marten, I’m now using Bullseye as my go-to build scripting tool.

At least in my development circles, make-like, task-oriented build automation tools have definitely lost popularity in recent years. But in this identity server project, that’s exactly what I want for build automation. My task-oriented build scripting tool of choice for .Net work is the combination of Bullseye with SimpleExec. Bullseye itself is very easy to use because you’re using C# in a small .Net project. Because it’s just a .Net console application, you also have complete access to Nuget libraries — as we’ll exploit in just a bit.

To get started with Bullseye, I created a folder called build off of the repository root of my identity server codebase, and created a small .Net console application that I also call build. You can see an example of this in the Lamar codebase.

Because we’ll need this in a minute, I’ll also place some wrapper scripts at the root directory of the repository to call the build project called build.cmd, build.ps1, and build.sh for Windows, Powershell, and *nix development. The build.cmd file is just delegating to the new build project and passing all the command line variables like so:

@echo off

dotnet run --project build/build.csproj -c Release -- %*
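
The build.sh counterpart for *nix is presumably the same one-liner, just in bash (my sketch, not copied from an actual repository):

#!/usr/bin/env bash
set -euo pipefail

# Delegate straight to the Bullseye-based build project, passing along any arguments
dotnet run --project build/build.csproj -c Release -- "$@"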

Back to the new build project, I added Nuget references to Bullseye and SimpleExec. In the Program.Main() function (this could be a little simpler with the new streamlined .Net 6 entry point), I’ll add a couple static namespace declarations:

using static Bullseye.Targets;
using static SimpleExec.Command;

Now we’re ready to write our first couple tasks directly into the Program code file. I still prefer to have separately executable tasks restoring Nugets, compiling, and running all the tests so you can run partial builds at will. In this case, using some sample code from the Oakton build script:

// Just delegating to the dotnet cli to restore nugets
Target("restore", () =>
{
    Run("dotnet", "restore src/Oakton.sln");
});

// compile the whole solution, but after running
// the restore task
Target("compile",  DependsOn("restore"),() =>
{
    Run("dotnet",
        $"build src/Oakton.sln --no-restore");
});

Target("test", DependsOn("compile"),() =>
{
    RunTests("Tests");
});

// Little helper function to execute tests by test project name
// still just delegating to the command line
private static void RunTests(string projectName, string directoryName = "src")
{
    Run("dotnet", $"test --no-build {directoryName}/{projectName}/{projectName}.csproj");
}

We’ve got a couple more steps to make this a full build script. We also need this code at the very bottom of our Program.Main() function to actually run tasks:

RunTargetsAndExit(args);

I typically have an explicit “default” task that gets executed when you just type build / ./build.sh that usually just includes other named tasks. In the case of Oakton, it runs the unit test task plus another task called “commands” that smoke tests several command line calls:

Target("default", DependsOn("test", "commands"));

Usually, I’ll also use a “ci” task that is intended for continuous integration builds that is a superset of the default build task with extra integration tests or Nuget publishing (this isn’t as common now that we tend to use separate GitHub actions for Nuget publishing). In Oakton’s case the “ci” task is exactly the same:

Target("ci", DependsOn("default"));

After all that is in place, and working in Windows at the moment, I like to make git commits with the old “check in dance” like this:

build && git commit -a -m "some commit message"

Less commonly, but still valuable, let’s say that Microsoft has just released a new version of .Net that causes a cascade of Nuget updates and other havoc in your projects. While working through that, I’ll frequently do something like this to work out Nuget resolution issues:

git clean -xfd && build restore

Or once in a while the IDE build error window can be misleading, so I’ll build from the command line with:

build compile

So yeah, most of the build “script” I’m showing here is just delegating to the dotnet CLI and it’s not very sophisticated. I still like having this file so I can jump between my projects and just type “build” or “build compile” without having to worry about what the solution file name is, or telling dotnet test which projects to run. That being said though, let’s jump into something quite a bit more complicated.

Adding Sql Server and EF Core into the Build Script

For the sake of testing my new little identity server, I need at least a couple different client databases plus the lookup database. Going back to first principles of Agile Development practices, it should be possible for a brand new developer to do a clean clone of the new identity server codebase and very quickly be running the entire service and all its tests. I’m going to pull that off by adding new tasks to the Bullseye script to set up databases and automate all the testing.

First up, I don’t need very much data for testing, so I’m more than good enough just running Sql Server in Docker, so I’ll add this docker-compose.yml file to my repository:

version: '3'
services:
  sqlserver:
    image: "microsoft/mssql-server-linux"
    ports:
      - "1435:1433"
    environment:
      - "ACCEPT_EULA=Y"
      - "SA_PASSWORD=P@55w0rd"
      - "MSSQL_PID=Developer"

The only thing interesting to note is that I mapped a non-default port number (1435) to this container for the sole sake of being able to run it in parallel with the Sql Server installation I have to have for other projects at work. Back to Bullseye, and I’ll add a new task to delegate to docker compose to start up Sql Server:

Target("docker-up", async () =>
{
    await RunAsync("docker", "compose up -d");
});
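
As an aside, the ConnectionSource class referenced in the next couple of snippets isn’t shown in this post. Presumably it’s just a small static helper pointing at the non-default 1435 port from the docker-compose file above, something like this hypothetical reconstruction:

// Hypothetical reconstruction; the real ConnectionSource isn't shown in this post
public static class ConnectionSource
{
    // Port 1435 matches the non-default mapping in docker-compose.yml above
    public const string ConnectionString =
        "Server=localhost,1435;User Id=sa;Password=P@55w0rd;";

    public static string ConnectionStringForDatabase(string databaseName)
        => $"{ConnectionString}Database={databaseName};";
}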

And trust me on this one, the Docker setup is asynchronous, so you actually need to make your build script wait a little bit until the new Sql Server database is accessible before doing anything else. For that purpose, I use this little function:

public static async Task WaitForDatabaseToBeReady()
{
    Console.WriteLine("Waiting for Sql Server to be available...");
    var stopwatch = new Stopwatch();
    stopwatch.Start();
    while (stopwatch.Elapsed.TotalSeconds < 30)
    {
        try
        {
            // ConnectionSource is really just exposing a constant
            // with the known connection string to the Dockerized
            // Sql Server
            await using var conn = new SqlConnection(ConnectionSource.ConnectionString);
            await conn.OpenAsync();

            var cmd = conn.CreateCommand();
            cmd.CommandText = "select 1";
            await cmd.ExecuteReaderAsync();

            Console.WriteLine("Sql Server is up and ready!");
            return;
        }
        catch (Exception)
        {
            await Task.Delay(250);
            Console.WriteLine("Database not ready yet, trying again.");
        }
    }
}
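
Presumably that wait gets folded into the docker-up target itself, so anything that depends on it (like the database target shown later) doesn’t start until Sql Server is actually accepting connections. A sketch of what I mean:

Target("docker-up", async () =>
{
    await RunAsync("docker", "compose up -d");

    // Block dependent targets until the containerized Sql Server
    // is actually ready to accept connections
    await WaitForDatabaseToBeReady();
});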

Next, I need some code to create additional databases (I’m sure you can do this somehow in the docker compose file itself, but I didn’t know how at the time and this was easy). I’m going to omit the actual CREATE DATABASE calls, but just know there’s a method with this signature on a static class in my build project called Database:

public static async Task BuildDatabases()

I’m using EF Core for data access in this project, and also using EF Core migrations to do database schema building, so we’ll want the dotnet ef tooling available. I added a task for just that:

Target("install-ef", IgnoreIfFailed(() =>
    Run("dotnet", "tool install --global dotnet-ef")
));

The dotnet ef command line has a less than memorable usage pattern, so I made a little helper function that’s gonna get called for different combinations of EF Core context name and database connection strings:

public static async Task RunEfUpdate(string contextName, string databaseName)
{
    Console.WriteLine($"Running EF Migration for context {contextName} on database '{databaseName}'");

    // ConnectionSource is a little helper specific to my 
    // identity server project
    var connection = ConnectionSource.ConnectionStringForDatabase(databaseName);
    await Command.RunAsync("dotnet",
        $"ef database update --project src/ProjectName/ProjectName.csproj --context {contextName} --connection \"{connection}\"");
}

For a little more context, I have two separate EF Core DbContext classes (obfuscated from the real code):

  1. LookupDbContext — the “master” registry of client databases by client id
  2. IdentityDbContext — addresses a single client database holding user credentials

And now, after all that work, here’s a Bullseye script that can stand up a new Sql Server database in Docker, build the required databases if necessary, establish baseline data, and run the correct EF Core migrations as needed:

Target("database", DependsOn("docker-up"), async () =>
{
    // "Database" is a static class in my build project where
    // I've dumped database helper code
    await Database.BuildDatabases();

    // RunEfUpdate is delegating to dotnet ef
    await Database.RunEfUpdate("LookupDbContext", "identity");
    
    // Not shown, but fleshing out some static lookup data
    // with straight up SQL calls
    
    // Running migrations on all three test databases for client
    // credential databases
    await Database.RunEfUpdate("IdentityDbContext", "environment1");
    await Database.RunEfUpdate("IdentityDbContext", "environment2");
    await Database.RunEfUpdate("IdentityDbContext", "environment3");
});

Now, the tests for this identity server are almost all going to be integration tests, so I won’t even bother separating out integration tests from unit tests. That being said, our main test library is going to require the Sql Server database built above to be available before the tests are executed, so I’m going to add a dependency to the test task like so:

// The database is required
Target("test", DependsOn("compile", "database"), () =>
{
    RunTests("Test Project Name");
});

Now, when someone does a quick clone of this codebase, they should be able to just run the build.cmd/ps1/sh script and, assuming that they already have the correct version of .Net installed plus Docker Desktop:

  1. Have all the nuget dependencies restored
  2. Compile the entire solution
  3. Start a new Sql Server instance in Docker with all testing databases built out with the correct database structure and lookup data
  4. Have executed all the automated tests

Bonus Section: Integration with GitHub Actions

I’m a little bit old school with CI. I grew up in the age when you tried to keep your CI set up as crude as possible and mostly just delegated to a build script that did all the actual work. To that end, if I’m using Bullseye as my build scripting tool and GitHub Actions for CI, I delegate to Bullseye like this from the Oakton project:

name: .NET

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
    
env:
  config: Release
  disable_test_parallelization: true
  DOTNET_CLI_TELEMETRY_OPTOUT: 1
  DOTNET_SKIP_FIRST_TIME_EXPERIENCE: 1

jobs:
  build:

    runs-on: ubuntu-latest
    timeout-minutes: 20

    steps:
    - uses: actions/checkout@v2

    - name: Setup .NET 5
      uses: actions/setup-dotnet@v1
      with:
        dotnet-version: 5.0.x
    - name: Setup .NET 6
      uses: actions/setup-dotnet@v1
      with:
        dotnet-version: 6.0.x
    - name: Build Script
      run: dotnet run -p build/build.csproj -- ci

The very bottom line of code is the pertinent part that delegates to our Bullseye script and runs the “ci” target that’s my own idiom. Part of the point here is to have the build script steps committed and versioned to source control — which these days is also done with the YAML GitHub action definition files, so that’s not as important as it used to be. What is still important today is that coding in YAML sucks, so I try to keep most of the actual functionality in nice, clean C#.

Bonus: Why didn’t you…????

  • Why didn’t you just use MSBuild? It’s possible to use MSBuild as a task runner, but no thank you. I was absolutely sick to death of coding via XML in NAnt when MSBuild was announced, and I’ll admit that I never gave MSBuild the time of day. I’ll pass on more coding in Xml.
  • Why didn’t you just use Nuke or Cake? I’ve never used Nuke and can’t speak to it. I’m not a huge Cake fan, and Bullseye is a simpler model to me.
  • Why didn’t you just use Powershell? You end up making powershell scripts call other scripts and it clutters the file system up.

Alba 6.0 is friendly with .Net 6, Minimal API, and WebApplicationFactory

Alba is a small open source library for integration testing against ASP.Net Core HTTP endpoints that makes the underlying ASP.Net Core TestServer easier and more declarative to use within tests.

Continuing a busy couple weeks of OSS work getting tools on speaking terms with .Net 6, Alba v6.0 was released early this week with support for .Net 6 and the new WebApplication bootstrapping model within ASP.Net Core 6.0. Before I dive into the details, a big thanks to Hawxy who did most of the actual coding for this release.

The biggest change was getting Alba ready to work with the new WebApplicationBuilder and WebApplicationFactory models in ASP.Net Core such that Alba can be used with any typical way to bootstrap an ASP.Net Core project. See the Alba Setup page in the documentation for more details.

Using Alba with Minimal API Projects

From Alba’s own testing, let’s say you have a small Minimal API project that’s bootstrapped like this in your web services Program file:

using System;
using Microsoft.AspNetCore.Builder;

var builder = WebApplication.CreateBuilder(args);

// Add services to the container.

var app = builder.Build();

// Configure the HTTP request pipeline.

app.UseHttpsRedirection();


app.MapGet("/", () => "Hello World!");
app.MapGet("/blowup", context => throw new Exception("Boo!"));

app.Run();

Alba’s old (and still supported) model of using the application’s HostBuilder from the .Net 5 project templates is no help here, but that’s okay, because Alba now also understands how to use WebApplicationFactory to bootstrap the application shown above. Here’s some sample code to do just that in a small xUnit test:

// WebApplicationFactory can resolve old and new style of Program.cs
// .NET 6 style - the global:: namespace prefix would not be required in a normal test project
await using var host = await AlbaHost.For<global::Program>(x =>
{
    x.ConfigureServices((context, services) =>
    {
        services.AddSingleton<IService, ServiceA>();
    });
});

host.Services.GetRequiredService<IService>().ShouldBeOfType<ServiceA>();

var text = await host.GetAsText("/");
text.ShouldBe("Hello World!");
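
The GetAsText() call above is just a convenience shortcut. The more typical, declarative Alba usage goes through its Scenario() API, roughly like this (my example, continuing with the same host as above):

// Run an HTTP request end to end in memory and make
// declarative assertions about the response
await host.Scenario(s =>
{
    s.Get.Url("/");
    s.StatusCodeShouldBeOk();
    s.ContentShouldBe("Hello World!");
});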

And you’re off to the races and authoring integration tests with Alba!

How Alba and ASP.Net Have Evolved

The code that ultimately became Alba has been around for over a decade, and I think it’s a little interesting to see the evolution of web development in .Net through Alba’s history and mine.

1998: I built a couple internal, “shadow IT” applications for my engineering team with ASP “classic” and fell in love with web development

2003: I was part of a small team building a new system on the brand new ASP.Net WebForms application model and fell out of love with web development for several years

2008-2014: In the heady days of ALT.Net (and with the opening provided by ASP.Net MVC to finally consider alternatives to the abomination that was WebForms) I was the lead of an alternative web framework in .Net called FubuMVC. What is now “Alba” started as a subsystem within FubuMVC itself that we used to test fubu’s content negotiation subsystem completely in memory without having to run a web server.

~2015: I ripped Alba into its own library and ported the code to work against the OWIN model

2017: Alba 1.0 was ported to the new ASP.Net Core, but used its own special sauce to run HTTP requests in memory with stubbed out HttpContext objects

2018: Alba 2.0 accommodated a world of changes from the release of ASP.Net Core 2.*. There were temporarily separate Nugets for ASP.Net Core 1 and ASP.Net Core 2 because the models were so different. That sucked.

2019: Alba 3.0 was released supporting ASP.Net Core 3.*, and ditched all support for anything on the full .Net framework. At this point Alba’s internals were changed to utilize the ASP.Net Core TestServer and HostBuilder models

2020: Alba 4.0 supported ASP.Net Core 5.0

August 2021: Alba 5.0 added a new extension model with initial extensions for testing applications secured by JWT bearer tokens

December 2021: .Net 6 came with a lot of changes to the ASP.Net Core bootstrapping model, so here we are with a brand new Alba 6.0.

Lots and lots of changes in the web development world within .Net, and I’m betting that’s not completely done changing. For my part, Alba isn’t the most widely used library, but there’s more than enough usage for me to feel good about a piece of fubumvc living on. Plus we use it at work for integration testing, so Alba is definitely going to live on.

Lamar v7 meets .Net 6, Minimal APIs, and IAsyncDisposable

It’s been a busy couple weeks in OSS world for me scurrying around and getting things usable in .Net 6. Today I’m happy to announce the release of Lamar 7.0. The Nuget for Lamar itself and Lamar.Microsoft.DependencyInjection with adjusted dependencies for .Net 6 went up yesterday, and I made some additions to the documentation website just now. There are no breaking changes in the API, but Lamar dropped all support for any version of .Net < .Net 5.0. Before I get into the highlights, I’d like to thank:

  • Babu Annamalai for making the docs so easy to re-publish
  • Khalid Abuhakmeh and Stephan Steiger for their help with the Minimal API support
  • Andrew Lock for writing some very helpful blog posts about new .Net 6 internals that have helped me get through .Net 6 improvements to several tools the past couple weeks.

Lamar and Minimal API

Lamar v7 adds some specific support for better usability of the new Minimal API feature in ASP.Net Core. Below is the sample we use in the Lamar documentation and the internal tests:

var builder = WebApplication.CreateBuilder(args);

// use Lamar as DI.
builder.Host.UseLamar((context, registry) =>
{
    // register services using Lamar
    registry.For<ITest>().Use<MyTest>();
    registry.IncludeRegistry<MyRegistry>();

    // add the controllers
    registry.AddControllers();
});


var app = builder.Build();
app.MapControllers();

// [FromServices] is NOT necessary when using Lamar v7
app.MapGet("/", (ITest service) => service.SayHello());

app.Run();

Lamar and IAsyncDisposable

Just copy/pasting the documentation here…

The Lamar IContainer itself, and all nested containers (scoped containers in .Net DI nomenclature) implement both IDisposable and IAsyncDisposable. It is not necessary to call both Dispose() and DisposeAsync() as either method will dispose all tracked IDisposable / IAsyncDisposable objects when either method is called.

// Asynchronously disposing the container
await container.DisposeAsync();

The following table explains what method is called on a tracked object when the creating container is disposed:

If an object implements…              Container.Dispose()                         Container.DisposeAsync()
IDisposable                           Dispose()                                   Dispose()
IAsyncDisposable                      DisposeAsync().GetAwaiter().GetResult()     DisposeAsync()
IDisposable and IAsyncDisposable      DisposeAsync()                              DisposeAsync()

If any objects are being created by Lamar that only implement IAsyncDisposable, it is probably best to strictly use Container.DisposeAsync() to avoid any problematic mixing of sync and async code.
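
As a quick usage sketch of what that means in practice (continuing with the container from the snippet above, and assuming a hypothetical registered IWidget service):

// Nested (scoped) containers follow the same rules, so prefer await using
// when any IAsyncDisposable services might be resolved from the scope
await using (var nested = container.GetNestedContainer())
{
    var widget = nested.GetInstance<IWidget>();
    widget.DoStuff();
}
// DisposeAsync() has now been called on the nested container, which in turn
// disposed any tracked IDisposable / IAsyncDisposable objects it created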

Multi-Tenancy with Marten

We’ve got an upcoming Marten 5.0 release ostensibly to support breaking changes related to .Net 6, but that also gives us an opportunity to consider work that would result in breaking API changes. A strong candidate for V5 right now is finally adding long delayed first class support for multi-tenancy through separate databases.

Let’s say that you’re building an online database-backed, web application of some sort that will be servicing multiple clients. At a minimum, you need to isolate data access so that client users can only interact with the data for the correct client or clients. Ideally, you’d like to get away with only having one deployed instance of your application that services the users of all the clients. In other words, you want to support “multi-tenancy” in your architecture.

Software multitenancy is a software architecture in which a single instance of software runs on a server and serves multiple tenants.

Multi-tenancy on Wikipedia

For the rest of this post, I’m going to use the term “tenant” to refer to whatever the organizational entity is that owns separate database data. Depending on your business domain, that could be a client, a sub-organization, a geographic area, or some other organizational concept.

Fortunately, if you use Marten as your backing database store, Marten has strong support for multi-tenancy with new improvements in the recent V4 release and more potential improvements tentatively planned for V5.

There are three basic approaches to segregating tenant data in a database:

  1. Single database, single schema, but use a field or property in each table to denote the tenant. This is Marten’s approach today with what we call the “Conjoined” model. The challenge here is that all queries and writes to the database need to take into account the currently used tenant — and that’s where Marten’s multi-tenancy support helps a great deal. Database schema management is easier with this approach because there’s only one set of database objects to worry about. More on this later.
  2. Separate schema per tenant in a single database. Marten does not support this model, and it doesn’t play well with Marten’s current internal design. I seriously doubt that Marten will ever support this.
  3. Separate database per tenant. This has been in Marten’s backlog forever, and maybe now is the time this finally gets done (plenty of folks have used Marten this way already with custom infrastructure on top of Marten, but there’s some significant overhead). I’ll speak to this much more in the last section of this post.

Basic Multi-Tenancy Support in Marten

To set up multi-tenancy in your document storage with Marten, we can set up a document store with these options:

    var store = DocumentStore.For(opts =>
    {
        opts.Connection("some connection string");

        // Let's just say that each and every document
        // type is going to be multi-tenanted
        opts.Policies.AllDocumentsAreMultiTenanted();

        // Or you can do this document type by document type
        // if some document types are not related to a tenant
        opts.Schema.For<User>().MultiTenanted();
    });

There’s a couple other ways to opt document types into multi-tenancy, but you get the point. With just this, we can start a new Marten session for a particular tenant and carry out basic operations isolated to a single tenant like so:

    // Open a session specifically for the tenant "tenant1"
    await using var session = store.LightweightSession("tenant1");

    // This would return *only* the admin users from "tenant1"
    var users = await session.Query<User>().Where(x => x.Roles.Contains("admin"))
        .ToListAsync();

    // This user would be automatically be tagged as belonging to "tenant1" 
    var user = new User {UserName = "important_guy", Roles = new string[] {"admin"}};
    session.Store(user);

    await session.SaveChangesAsync();

The key thing to note here is that other than telling Marten which tenant you want to work with as you open a new session, you don’t have to do anything else to keep the tenant data segregated as Marten is dealing with those mechanics behind the scenes on all queries, inserts, updates, and deletions from that session.

Awesome, except that some folks needed to occasionally do operations against multiple tenants at one time…

Tenant Spanning Operations in Marten V4

The big improvement in Marten V4 for multi-tenancy was in making it much easier to work with data from multiple tenants in one document session. Marten has long had the ability to query data across tenants with the AnyTenant() or TenantIsOneOf() Linq extensions, like so:

    var allAdmins = await session.Query<User>()
        .Where(x => x.Roles.Contains("admin"))
        
        // This is a Marten specific extension to Linq
        // querying
        .Where(x => x.AnyTenant())
        
        .ToListAsync();

Which is great for what it is, but there wasn’t any way to know what tenant each document returned belonged to. We made a huge effort in V4 to expand Marten’s document metadata capabilities, and part of that is the ability to write the tenant id to a document being fetched from the database by Marten. The easiest way to do that is to have your document type implement the new ITenanted interface like so:

    public class MyTenantedDoc: ITenanted
    {
        public Guid Id { get; set; }
        
        // This property will be set by Marten itself
        // when the document is persisted or loaded
        // from the database
        public string TenantId { get; set; }
    }

So now we at least have the ability to know which documents we queried across the tenants belong to which tenant.

The next thing folks wanted from V4 was the ability to make writes against multiple tenants with one single document session in a single unit of work. To that end, Marten V4 introduced the concept of ITenantOperations to log operations against specific tenants besides the tenant the current session was opened for. And all those operations should be committed to the underlying Postgresql database as a single transaction.

To make that concrete, here’s some sample code, but this time adding two new User documents with the same user name to two different tenants by tenant id:

    // Same user name, but in different tenants
    var user1 = new User {UserName = "bob"};
    var user2 = new User {UserName = "bob"};

    // This exposes operations against only tenant1
    session.ForTenant("tenant1").Store(user1);

    // This exposes operations that would apply to 
    // only tenant2
    session.ForTenant("tenant2").Store(user2);
 
    // And both operations get persisted in one transaction
    await session.SaveChangesAsync();

So that’s the gist of the V4 multi-tenancy improvements. We also finally support multi-tenancy within the asynchronous projection support, but I’ll blog about that some other time.

Now though, it’s time to consider…

Database per Tenant

To be clear, I’m looking for any possible feedback about the requirements for this feature in Marten. Blast away here in comments, or here’s a link to the GitHub issue, or go to Gitter.

You can achieve multi-tenancy through a database per tenant, and many folks successfully have, just by keeping an otherwise identically configured DocumentStore per named tenant in memory with the only difference being the connection string (there’s a rough sketch of that workaround after the list below). That certainly can work, especially with a low number of tenants. There are a few problems with that approach though:

  • You’re on your own to configure that in the DI container within your application
  • DocumentStore is a relatively expensive object to create, and it potentially generates a lot of runtime objects that get held in memory. You don’t really want a bunch of those hanging around
  • Going around AddMarten() negates the Marten CLI support, which is the easiest possible way to manage Marten database schema migrations. Now you’re completely on your own about how to do database migrations without using pure runtime database patching — which we do not recommend in production.
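
To be concrete about the hand-rolled approach I’m describing above, here’s a rough sketch. The connection strings and the StoreForTenant helper are made up for illustration, but DocumentStore.For() and the Connection() option are the normal Marten configuration calls:

    // Hand-rolled "database per tenant": cache a full DocumentStore
    // per tenant id, with only the connection string differing
    var stores = new ConcurrentDictionary<string, IDocumentStore>();

    IDocumentStore StoreForTenant(string tenantId)
    {
        return stores.GetOrAdd(tenantId, id => DocumentStore.For(opts =>
        {
            // Look up the tenant's connection string however your
            // application does that (this one is obviously fake)
            opts.Connection($"Host=localhost;Database=tenant_{id};Username=marten");
        }));
    }

    // Every session for "tenant1" now goes to tenant1's own database
    using var session = StoreForTenant("tenant1").OpenSession();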

So let’s just call it a given that we do want to add some formal support for multi-tenancy through separate databases per tenant to Marten. Moreover, database per tenant has been in our backlog forever, but it’s been pushed off every time as we struggled to get big Marten releases out the door.

I think there’s some potential for this story to cause breaking API changes (I don’t have anything specific in mind, it just seems likely to me), which makes it a very good candidate to get in place for Marten V5. From the backlog issue writeup I made back in 2017:

  • Have all tenants tracked in memory, such that a single DocumentStore can share all the expensive runtime built internal objects across tenants
  • A tenanting strategy that can look up the database connection string per tenant and create sessions for the separate tenants (there’s a hypothetical sketch of what I mean right after this list). There’s actually an interface hook in Marten all ready to go that may serve out of the box when we do this (I meant to do this work years ago, but it just didn’t happen).
  • At development time (AutoCreate != AutoCreate.None), be able to spin up a new database on the fly for a tenant if it doesn’t already exist
  • “Know” what all the existing tenants are so that we could apply database migrations from the CLI or through the DocumentStore schema migration APIs
  • Extend the CLI support to support multiple tenant databases
  • Make the database registry mechanism a little bit pluggable. I’m thinking that some folks will have just a few tenants and would be fine writing everything into a static configuration file, while other folks may have a *lot* of tenants (I’ve personally worked on a system that had >100 separate tenant databases in one deployed application), so they may want a “master” database that tracks the tenant registrations instead
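
To make the “look up the connection string per tenant” idea a little more concrete, here’s a purely hypothetical sketch of the kind of pluggable registry I have in mind. None of these type or member names are real Marten APIs today; it’s just an illustration of the requirements above:

    // Purely hypothetical, NOT actual Marten API. The idea is a pluggable
    // registry that maps a tenant id to its own database while a single
    // DocumentStore shares all the expensive runtime internals
    public interface ITenantDatabaseRegistry
    {
        // Resolve the database connection string for one tenant
        string ConnectionStringFor(string tenantId);

        // Enumerate every known tenant so schema migrations and CLI
        // commands can be applied across all the tenant databases
        IReadOnlyList<string> AllKnownTenantIds();
    }

    // The simple case: a handful of tenants held in static configuration
    public class StaticTenantRegistry : ITenantDatabaseRegistry
    {
        private readonly Dictionary<string, string> _connections;

        public StaticTenantRegistry(Dictionary<string, string> connections)
        {
            _connections = connections;
        }

        public string ConnectionStringFor(string tenantId) => _connections[tenantId];

        public IReadOnlyList<string> AllKnownTenantIds() => _connections.Keys.ToList();
    }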

JasperFx OSS Plans for .Net 6 (Marten et al)

I’m going to have to admit that I got caught flat-footed by the .Net 6 release a couple weeks ago. I hadn’t really been paying much attention to the forthcoming changes, maybe got cocky about how easy the transition from netcoreapp3.1 to .Net 5 was, and have been unpleasantly surprised by how much work it’s going to take to move some OSS projects up to .Net 6, all at the same time that the advanced users of the world are clamoring for all their dependencies to target .Net 6 yesterday.

All that being said, here’s my running list of plans to get the projects in the JasperFx GitHub organization successfully targeting .Net 6. I’ll make edits to this page as things get published to Nuget.

Baseline

Baseline is a grab bag utility library full of extension methods that I’ve relied on for years. Nobody uses it directly per se, but it’s a dependency of just about every other project in the organization, so it went first with the 3.2.2 release adding a .Net 6 target. No code changes were necessary other than adding .Net 6 to the CI testing. Easy money.

Oakton

EDIT: Oakton v4.0 is up on Nuget. WebApplication is supported, but you can’t override configuration in commands with this model like you can with the HostBuilder-only model. I’ll do a follow up at some point to fill in this gap.

Oakton is a tool to add extensible command line options to .Net applications based on the HostBuilder model. Oakton is my problem child right now because it’s a dependency in several other projects and its current model does not play nicely with the new WebApplicationBuilder approach for configuring .Net 6 applications. I’d also like to get the Oakton documentation website moved to the VitePress + MarkdownSnippets model we’re using now for Marten and some of the other JasperFx projects. I think I’ll take a shortcut here and publish the Nuget and let the documentation catch up later.

Alba

Alba is an automated testing helper for ASP.Net Core. Just like Oakton, Alba worked very well with the HostBuilder model, but was thrown for a loop with the new WebApplicationBuilder configuration model that’s the mechanism for using the new Minimal API (*cough* inevitable Sinatra copy *cough*) model. Fortunately though, Hawxy came through with a big pull request to make Alba finally work with the WebApplicationFactory model that can accommodate the new WebApplicationBuilder model, so we’re back in business soon. Alba 5.1 will be published soon with that work after some documentation updates and hopefully some testing with the Oakton + WebApplicationBuilder + Alba model.

EDIT: Alba 7.0 is up with the necessary changes, but the docs will come later this week

Lamar

Lamar is an IoC/DI container and the modern successor to StructureMap. The biggest issue with Lamar on .Net 6 was Nuget dependencies on the IServiceCollection model, plus needing some extra implementation to light up the implied service model of Minimal APIs. All the current unit tests and even integration tests with ASP.Net Core are passing on .Net 6. What’s left to finish up a new Lamar 7.0 release:

  • One .Net 6 related bug in the diagnostics
  • Better Minimal API support
  • Upgrade Oakton & Baseline dependencies in some of the Lamar projects
  • Documentation updates for the new IAsyncDisposable support and usage with WebApplicationBuilder with or without Minimal API usage

EDIT: Lamar 7.0 is up on Nuget with .Net 6 support

Marten/Weasel

We just made the gigantic V4 release a couple months ago knowing that we’d have to follow up quickly with a V5 release with a few breaking changes to accommodate .Net 6 and the latest version of Npgsql. We are having to make a full point release, so that opens the door for other breaking changes that didn’t make it into V4 (don’t worry, I think shifting from V4 to V5 will be easy for most people). The other Marten core team members have been doing most of the work for this so far, but I’m going to jump into the fray later this week to do some last minute changes:

  • Review some internal changes to Npgsql that might have performance impacts on Marten
  • Consider adding an event streaming model within the new V4 async daemon, for folks that want to use it to publish events to some kind of transport (Kafka? some kind of queue?) with strict ordering. This won’t be much yet, but it keeps coming up, so we might as well consider it.
  • Multi-tenancy through multiple databases. It keeps coming up, and potentially causes breaking API changes, so we’re at least going to explore it

I’m trying not to slow down the Marten V5 release with .Net 6 support for too long, so this is all either happening really fast or not at all. I’ll blog more later this week about multi-tenancy & Marten.

Weasel is a spin-off library from Marten for database change detection and ADO.Net helpers that are now reused in other projects. It will be published simultaneously with Marten.

Jasper

Oh man, I’d love, love, love to have Jasper 2.0 done by early January so that it’ll be available for usage at my company on some upcoming work. This work is on hold while I deal with the other projects, my actual day job, and family and stuff.

Integration Testing: Lessons from Storyteller and Other Thoughts

Continuing my series about automated testing. I’m wrestling quite a bit right now with the tooling for integration testing both at work and in some of my more complicated OSS projects like Marten or Jasper. I happily use xUnit.Net for unit testing on side projects, and we generally use NUnit at work. Both tools are well suited for unit testing, but leave something to be desired for integration testing in my opinion. At the same time, one of my long running OSS projects is Storyteller, which was originally meant as a Behavior Driven Development (BDD) tool (think Gherkin tools or the much older FitNesse tooling), but really turned out to mostly be advantageous for big, data intensive integration testing.

To paraphrase Miracle Max, Storyteller isn’t dead, it’s just mostly dead. Storyteller never caught on, the React UI is a bazillion versions out of date, buggy as hell, and probably isn’t worth updating. I’m busy with other things. Gherkin/Cucumber tools suck all the oxygen out of the room for alternative approaches to BDD or really even just integration testing. Plus our industry’s zombie-like fixation on trying to automate all testing through Selenium is absolutely killing off all the enthusiasm for any other automated testing techniques — again, just my personal opinion.

However, I still want something better than what we have today, and there were some useful things that Storyteller provided for automated testing that I still want in automated testing tools. And of course there are also some things that I’d like to have where Storyteller fell short.

Things that I thought were valuable in Storyteller:

  • Rendering the test results as HTML with failures inline helps relate test failures to test inputs, expectations, or actions
  • BDD tests are generally expressed through a mix of “flow based” expressions (think about the English-like Given/When/Then nomenclature of Gherkin specifications) and “table based” testing that was very common in the older FIT/FitNesse style of BDD. A lot of data intensive problem domains can be best tested through declarative “table-driven” tests. For example, consider insurance quoting logic, or data transformations, or banking transactions where you’re pushing around a lot of data and also verifying matrices of expected results. I still very strongly believe that Storyteller’s focus on example tables and set verifications is the right approach for these kinds of problem domains. The example tables make the expression of the tests be both very declarative and readily reviewable by domain experts. The set verification approach in Storyteller is valuable because it will show you which values are missing or unexpected for easy test failure diagnosis. Table-driven testing is somewhat supported in the Gherkin specification, but I don’t think that’s well utilized in that space.
  • Being able to embed system diagnostics and logging directly into the test results such that the application logs are perfectly correlated to the integration test running has been very advantageous. As an example, in Jasper testing I pipe in all the system events around messages being sent, received, and executed so that you can use that data to understand how any given test behaved. I’ve found this feature to be incredibly helpful in solving the inevitable test failures.
  • Like many integration testing tools, Storyteller generally allows you to evaluate all the assertions of a long running test even after some condition or action has failed. This is the opposite of what we’re taught for fine-grained unit testing, but very advantageous in integration testing where scenarios run very slowly and you want to maximize your feedback cycle. That approach can also easily lead to runaway test runs. Picking on Selenium testing yet again, imagine you have an integration test that uses Selenium to pull open a certain page in a web application and then verify the expected values of several screen elements. Following Selenium best practices, you have some built-in “waiters” that try to slow down the test until the expected elements are visible on the screen. If the web application itself chokes completely on the web page, your Selenium-heavy test might very easily loop through and time out on the visibility of every single expected element. That’s slow. Even worse — and I’ve seen this happen in multiple shops — what if the whole web application is down, but each and every Selenium-heavy test still runs and repeatedly times out all the various “waiters” in the test? That’s a hung build that can run all night (again, been there, done that). To that end, Storyteller has the concepts of “critical” or “catastrophic” errors in the test engine. For example, if you were using Storyteller to drive Selenium, you could teach Storyteller to treat the failure to load a web page as a “critical” exception that will stop the current test from continuing so you can “fail fast”. Likewise, if the very home page of the web application can’t be loaded (status code 500 y’all), you can have Storyteller throw a “catastrophic” exception that will pull the cord on the whole test suite run and stop all test executions regardless of any kind of test retry policy. Fail fast FTW!
  • Storyteller did a lot to make test data set up be as declarative as possible. I think this is hugely valuable for data-intensive problem domains like basically any enterprise application.
  • Targeted test retries in CI. You have to opt into retries on a test-by-test basis in Storyteller, and I think that was the right decision as opposed to letting every test failure get retried in CI.
  • Something I stole years ago from a former colleague at ThoughtWorks, Storyteller differentiates between tests that are “Acceptance” tests in the development lifecycle and tests that are labeled as “Regression” — meaning that those tests absolutely have to pass in CI while “Acceptance” tests are known to be a work in progress. I think that workflow is valuable for doing BDD style testing where the tests are specifications about how the system *should* behave.
  • Storyteller had a “step through” mode where you could manually step through the specification at execution time much like you would regular code in a debugger. That was hugely valuable.
  • Storyteller kept performance data about individual test steps as part of the test results, and even allowed you to specify performance thresholds that could fail a test if the performance did not satisfy that threshold. That’s not to say Storyteller specifications were anything more than a good complement to real load or performance testing, but it was still very helpful to diagnose long running tests and very frequently identified real application performance bottlenecks.

Things that weren’t good about Storyteller, and you should do better:

  • The IDE integration was nonexistent. At a bare minimum you need easy ways to run individual or groups of specifications at will from your IDE for a quick feedback cycle. The best I ever did with Storyteller was an xUnit.Net bridge that was only so-so. Maybe the new “dotnet test” specification could help here
  • It’s a Gherkin world and the rest of us just live in it. If there were ever a serious attempt to reboot Storyteller, it would absolutely have to be Gherkin compatible. I think there would be Storyteller-specific customizations as well though, because I despise the silly Given/When/Then handcuffs. Any new Storyteller absolutely has to allow devs to efficiently build and run new specifications without having to jump into an external tool of some sort
  • It’s got to be more asynchronous code friendly because that’s the world we live in

I’ve got a lot more notes about specific use cases I’ll have to dig out later.

What am I going to do?

For Jasper & Marten where there’s no quick win to replace Storyteller, I think I might cut a quickie Storyteller v6 that updates everything to .Net 5/6 and makes the testing framework much more asynchronous code friendly as a short term thing.

Longer term, I don’t know. At work I think the best approach would be to convince myself to like SpecFlow and possibly build some of Storyteller’s functionality as add-ons to cover what SpecFlow is missing. I’ve got worlds of notes about a rebooted Storyteller or maybe a completely different approach I’ve nicknamed “Bobcat” that would be an xUnit-like tool that’s just very optimized for integration testing (shared setup data, teardown, smarter heuristics for parallelizing tests). I’m also somewhat curious to see if the rebooted Fixie could be molded into a better tool for integration testing.

Honestly, if a newly rebooted Storyteller happens, it’ll be an opportunity for me to refresh my UI/UX skills.

Rebooting Jasper

Jasper is a long running OSS passion project of mine. As it is now, Jasper is a command processing tool similar to Brighter or MassTransit that can be used as either an in memory mediator tool (like a superset of MediatR) or as a service bus framework for asynchronous messaging. Jasper was originally conceived as a way to recreate the “good” parts of FubuMVC like low code ceremony, minimal intrusion of the framework into application code, and effective usage of the Russian Doll model for the execution pipeline. At the same time though, I wanted Jasper to improve upon the earlier FubuMVC architecture by maximizing performance, minimizing object allocations, simplifying configuration and bootstrapping, and making it much easier for developers to troubleshoot runtime issues.

I actually did cut a Jasper V1 release early in the COVID-19 pandemic, but otherwise dropped it to focus on Marten and stopped paying any attention to it. With Marten V4 in the books, I’m going back to working on Jasper for a little bit. For right now, I’m thinking that the Jasper V2 work is something like this:

  1. Upgrading all the dependencies and targeting .Net 5/6 (basically done)
  2. Benchmarking and optimizing the core runtime internals. Sometimes the best way to improve a codebase is to step away from it for quite a bit and come back in with fresh perspectives. There’s also some significant lessons from Marten V4 that might apply to Jasper
  3. Build in Open Telemetry tracing through Jasper’s pipeline. Really all about getting me up to speed on distributed tracing.
  4. Support the AsyncAPI standard (Swagger for asynchronous messaging). I’m really interested in this, but haven’t taken much time to dive into it yet
  5. Wire compatibility with NServiceBus so a Jasper app can talk bi-directionally with an NServiceBus app
  6. Same with MassTransit. If I decide to pursue Jasper seriously, I’d have to do that to have any shot at using Jasper at work
  7. More transport options. Right now there’s a Kafka & Pulsar transport option stuck in PR purgatory from another contributor. Again, a learning opportunity.
  8. Optimize the heck out of the Rabbit MQ usage.
  9. Go over the usability of the configuration. To be honest here, I’m less than thrilled with our MassTransit usage and the hoops you have to jump through to bend it to our will, and I’d like to see if I could do better with Jasper
  10. Improve the documentation website (if I’m serious about Jasper)
  11. Play with some kind of Jasper/Azure Functions integration. No idea what that would look like, but the point is to go learn more about Azure Functions
  12. Maybe, but a low priority — I have a working version of FubuMVC style HTTP endpoints in Jasper already. With everybody all excited about the new Minimal API stuff in ASP.Net Core v6, I wouldn’t mind showing a slightly different approach

Integrating Marten and Jasper

Maybe the single biggest reason for me to play more with Jasper is to explore some deeper integration with Marten for some more complicated CQRS and event sourcing architectural problems. Jasper already has an outbox/inbox pattern implementation for Marten. Going farther, I’d like to have out of the box solutions for:

  • Event streaming from Marten to message queues using Jasper
  • An alternative to Marten’s async daemon using Kafka/Pulsar topics
  • Using Jasper to host Marten’s asynchronous projections in a way that distributes work across running nodes
  • Experimenting more with CQRS architectures using Jasper + Marten

Anyway, I’m jotting this down mostly for me, but I’m absolutely happy for any kind of feedback or maybe to see if anyone else would be interested in helping with Jasper development.