
Using Roslyn for Runtime Code Generation in Marten

I’m using Roslyn to dynamically compile and load assemblies built at runtime from generated code in Marten and other than some concern over the warmup time, it’s been going very well so far.

Like so many other developers with more cleverness than sense, I’ve spent a lot of time trying to build Hollywood Principle style frameworks that try to dynamically call application code at runtime through Reflection or some kind of related mechanism. Reflection itself has traditionally been the easiest mechanism to use in .Net to create dynamic behavior at runtime, but it can be a performance problem, especially if you use it naively.

A Look Back at What Came Before…

Taking my own StructureMap IoC tool as an example, over the years I’ve accomplished dynamic runtime behavior in a couple different ways:

  1. Using IL directly via Reflection.Emit, from the original versions through StructureMap 2.5. Working with IL is just barely a higher abstraction than assembly code and I don’t recommend it if your goal is maintainability or making it easy for other developers to work in your code. I don’t miss generating IL by hand whatsoever. For those of you reading this and saying “pfft, IL isn’t so bad if you just understand how it works…”, my advice to you is to immediately go outside and get some fresh air and sunshine because you clearly aren’t thinking straight.
  2. From StructureMap 2.6 on, I crudely used the trick of building Expression trees representing what I needed to do, then compiling those Expression trees into objects of the right Func or Action signatures (a minimal sketch of that trick follows this list). This approach is easier – at least for me – because the Expression model is much closer semantically to the actual code you’re trying to mimic than stack-based IL.
  3. From StructureMap 3.* on, there’s a much more complex dynamic Expression compilation model that’s robust enough to call constructor functions, setter properties, thread in interception, and surround all of that with try/catch logic for expressive exception messages and pseudo stack traces.
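
To make the second approach concrete, here is a minimal sketch of the Expression compilation trick. It builds a factory for Uri objects purely as an illustration; none of this is StructureMap's actual internals:

using System;
using System.Linq.Expressions;

public static class ExpressionCompilationSample
{
    // Model "url => new Uri(url)" as an Expression tree, then compile it once
    // into a Func that can be cached and invoked without any further Reflection
    public static Func<string, Uri> BuildUriFactory()
    {
        var url = Expression.Parameter(typeof (string), "url");
        var ctor = typeof (Uri).GetConstructor(new[] {typeof (string)});
        var body = Expression.New(ctor, url);

        return Expression.Lambda<Func<string, Uri>>(body, url).Compile();
    }
}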

The current dynamic Expression approach in the StructureMap 3/4 internals is mostly working out well, but I barely remember how it works and it would take me a good day just to get back into that code if I ever had to change something.

What if instead we could just work directly in plain old C# that we largely know and understand, but somehow get that compiled at runtime instead? Well, thanks to Roslyn and its “compiler as a service”, we now can.

I’ve said before that I want to eventually replace the Expression compilation with the Roslyn code compilation shown in this post, but I’m not sure I’m ambitious enough to mess with a working project.

How Marten uses Roslyn Runtime Generation 

As I explained in my last blog post, Marten generates some “glue code” to connect a document object to the proper ADO.Net command objects for loading, storing, or deleting. For each document class, Marten generates a class implementing this IDocumentStorage interface:

public interface IDocumentStorage
{
    NpgsqlCommand UpsertCommand(object document, string json);
    NpgsqlCommand LoaderCommand(object id);
    NpgsqlCommand DeleteCommandForId(object id);
    NpgsqlCommand DeleteCommandForEntity(object entity);
    NpgsqlCommand LoadByArrayCommand<TKey>(TKey[] ids);
    Type DocumentType { get; }
}
In the test library, we have a class I creatively called “Target” that I’ve been using to test how Marten handles various .Net Types and queries. At runtime, Marten generates a class called TargetStorage that implements the interface above. Part of the generated code — modified by hand to clean up some extraneous line breaks and to add comments — is shown below:

using Marten;
using Marten.Linq;
using Marten.Schema;
using Marten.Testing.Fixtures;
using Marten.Util;
using Npgsql;
using NpgsqlTypes;
using Remotion.Linq;
using System;
using System.Collections.Generic;

namespace Marten.GeneratedCode
{
    public class TargetStorage : IDocumentStorage, IBulkLoader<Target>, IdAssignment<Target>
    {
        public TargetStorage()
        {
        }

        public Type DocumentType => typeof (Target);

        public NpgsqlCommand UpsertCommand(object document, string json)
        {
            return UpsertCommand((Target)document, json);
        }

        public NpgsqlCommand LoaderCommand(object id)
        {
            return new NpgsqlCommand("select data from mt_doc_target where id = :id").WithParameter("id", id);
        }

        public NpgsqlCommand DeleteCommandForId(object id)
        {
            return new NpgsqlCommand("delete from mt_doc_target where id = :id").WithParameter("id", id);
        }

        public NpgsqlCommand DeleteCommandForEntity(object entity)
        {
            return DeleteCommandForId(((Target)entity).Id);
        }

        public NpgsqlCommand LoadByArrayCommand<T>(T[] ids)
        {
            return new NpgsqlCommand("select data from mt_doc_target where id = ANY(:ids)").WithParameter("ids", ids);
        }

        // I configured the "Date" field to be a duplicated/searchable field in code
        public NpgsqlCommand UpsertCommand(Target document, string json)
        {
            return new NpgsqlCommand("mt_upsert_target")
                .WithParameter("id", document.Id)
                .WithJsonParameter("doc", json).WithParameter("arg_date", document.Date, NpgsqlDbType.Date);
        }

        // This Assign() method would use a HiLo sequence generator for numeric Id fields
        public void Assign(Target document)
        {
            if (document.Id == System.Guid.Empty) document.Id = System.Guid.NewGuid();
        }

        public void Load(ISerializer serializer, NpgsqlConnection conn, IEnumerable<Target> documents)
        {
            using (var writer = conn.BeginBinaryImport("COPY mt_doc_target(id, data, date) FROM STDIN BINARY"))
            {
                foreach (var x in documents)
                {
                    writer.Write(x.Id, NpgsqlDbType.Uuid);
                    writer.Write(serializer.ToJson(x), NpgsqlDbType.Jsonb);
                    writer.Write(x.Date, NpgsqlDbType.Date);
                }
            }
        }
    }
}
Now that you can see what code I’m generating at runtime, let’s move on to a utility for generating the code.


Generating Code with SourceWriter

SourceWriter is a small utility class in Marten that helps you write neatly formatted, indented C# code. SourceWriter wraps a .Net StringWriter for efficient string manipulation and provides some helpers for adding namespace using statements and tracking indention levels for you. After experimenting with some different usages, I mostly settled on using the Write(text) method that allows you to provide a section of code as a multi-line string. The TargetStorage code I showed above is generated from within a class called DocumentStorageBuilder with a call to the SourceWriter.Write() method shown below:

BLOCK:public class {mapping.DocumentType.Name}Storage : IDocumentStorage, IBulkLoader<{mapping.DocumentType.Name}>, IdAssignment<{mapping.DocumentType.Name}>


BLOCK:public {mapping.DocumentType.Name}Storage({ctorArgs})

public Type DocumentType => typeof ({mapping.DocumentType.Name});

BLOCK:public NpgsqlCommand UpsertCommand(object document, string json)
return UpsertCommand(({mapping.DocumentType.Name})document, json);

BLOCK:public NpgsqlCommand LoaderCommand(object id)
return new NpgsqlCommand("select data from {mapping.TableName} where id = :id").WithParameter("id", id);

BLOCK:public NpgsqlCommand DeleteCommandForId(object id)
return new NpgsqlCommand("delete from {mapping.TableName} where id = :id").WithParameter("id", id);

BLOCK:public NpgsqlCommand DeleteCommandForEntity(object entity)
return DeleteCommandForId((({mapping.DocumentType.Name})entity).{mapping.IdMember.Name});

BLOCK:public NpgsqlCommand LoadByArrayCommand<T>(T[] ids)
return new NpgsqlCommand("select data from {mapping.TableName} where id = ANY(:ids)").WithParameter("ids", ids);

BLOCK:public NpgsqlCommand UpsertCommand({mapping.DocumentType.Name} document, string json)
return new NpgsqlCommand("{mapping.UpsertName}")
    .WithParameter("id", document.{mapping.IdMember.Name})
    .WithJsonParameter("doc", json){extraUpsertArguments};

BLOCK:public void Assign({mapping.DocumentType.Name} document)

BLOCK:public void Load(ISerializer serializer, NpgsqlConnection conn, IEnumerable<{mapping.DocumentType.Name}> documents)
BLOCK:using (var writer = conn.BeginBinaryImport("COPY {mapping.TableName}(id, data{duplicatedFieldsInBulkLoading}) FROM STDIN BINARY"))
BLOCK:foreach (var x in documents)
writer.Write(x.Id, NpgsqlDbType.{id_NpgsqlDbType});
writer.Write(serializer.ToJson(x), NpgsqlDbType.Jsonb);



There are a couple of things to note about the code generation above:

  • String interpolation makes this so much easier than I think it would be with just string.Format(). Thank you to the C# 6 team.
  • Each line of code is written to the underlying StringWriter with the level of indention added to the left by SourceWriter itself.
  • The “BLOCK” prefix directs SourceWriter to add an opening brace “{” on the next line, then increment the indention level.
  • The “END” text directs SourceWriter to decrement the current indention level, then write a closing brace “}” on the next line followed by a blank line (a simplified sketch of this handling follows the list).
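
To make the BLOCK/END convention concrete, here is a simplified, purely illustrative sketch of an indention-tracking writer in the same spirit (this is not Marten's actual SourceWriter class):

using System.IO;

public class IndentedSourceWriter
{
    private readonly StringWriter _writer = new StringWriter();
    private int _level;

    // Accepts multi-line templates and translates the "BLOCK:" and "END"
    // markers into braces and indention levels
    public void Write(string text)
    {
        foreach (var raw in text.Split('\n'))
        {
            var line = raw.TrimEnd('\r');

            if (line.StartsWith("BLOCK:"))
            {
                WriteLine(line.Substring("BLOCK:".Length));
                WriteLine("{");
                _level++;
            }
            else if (line.Trim() == "END")
            {
                _level--;
                WriteLine("}");
                WriteLine(string.Empty);
            }
            else
            {
                WriteLine(line);
            }
        }
    }

    private void WriteLine(string line)
    {
        _writer.WriteLine(new string(' ', _level * 4) + line);
    }

    public string Code()
    {
        return _writer.ToString();
    }
}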

Now that we’ve got ourselves some generated code, let’s get Roslyn involved to compile it and actually get at an object of the new Type we want.

Roslyn Compilation with AssemblyGenerator

Based on a blog post by Tugberk Ugurlu, I built the AssemblyGenerator class in Marten shown below that invokes Roslyn to compile C# code and load the new dynamically built Assembly into the application:

public class AssemblyGenerator
{
    private readonly IList<MetadataReference> _references = new List<MetadataReference>();

    public AssemblyGenerator()
    {
        ReferenceAssembly(typeof (Enumerable).Assembly);
    }

    public void ReferenceAssembly(Assembly assembly)
    {
        // Register a metadata reference to the assembly by its file location
        _references.Add(MetadataReference.CreateFromFile(assembly.Location));
    }

    public void ReferenceAssemblyContainingType<T>()
    {
        ReferenceAssembly(typeof (T).Assembly);
    }

    public Assembly Generate(string code)
    {
        var assemblyName = Path.GetRandomFileName();
        var syntaxTree = CSharpSyntaxTree.ParseText(code);

        var references = _references.ToArray();
        var compilation = CSharpCompilation.Create(assemblyName, new[] {syntaxTree}, references,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var stream = new MemoryStream())
        {
            var result = compilation.Emit(stream);

            if (!result.Success)
            {
                var failures = result.Diagnostics.Where(diagnostic =>
                    diagnostic.IsWarningAsError ||
                    diagnostic.Severity == DiagnosticSeverity.Error);

                var message = failures.Select(x => $"{x.Id}: {x.GetMessage()}").Join("\n");
                throw new InvalidOperationException("Compilation failures!\n\n" + message + "\n\nCode:\n\n" + code);
            }

            stream.Seek(0, SeekOrigin.Begin);
            return Assembly.Load(stream.ToArray());
        }
    }
}

At runtime, you use the AssemblyGenerator class by telling it which other assemblies it should reference and giving it the source code to compile:

// Generate the actual source code
var code = GenerateDocumentStorageCode(mappings);

var generator = new AssemblyGenerator();

// Tell the generator which other assemblies that it should be referencing 
// for the compilation

mappings.Select(x => x.DocumentType.Assembly).Distinct().Each(assem => generator.ReferenceAssembly(assem));

// build the new assembly -- this will blow up if there are any
// compilation errors with the list of errors and the actual code
// as part of the exception message
var assembly = generator.Generate(code);

Finally, once you have the new Assembly, use Reflection to find the new Type you want, either by searching through Assembly.GetExportedTypes() or by name. Once you have the Type object, you can build it with Activator.CreateInstance(Type) or any of the other normal Reflection mechanisms.
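
As a small sketch of that last step, continuing from the assembly variable in the snippet above and assuming the generated class is named TargetStorage like the earlier sample:

// Find the generated storage type by name and build it with Activator.
// "TargetStorage" is just the name from the earlier sample
var storageType = assembly
    .GetExportedTypes()
    .Single(x => x.Name == "TargetStorage");

var storage = (IDocumentStorage) Activator.CreateInstance(storageType);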

The Warmup Problem

So I’m very happy with using Roslyn in this way so far, but the initial “warmup” time on the very first usage of the compilation is noticeably slow. It’s a one time hit on startup, but this could get annoying when you’re trying to quickly iterate or debug a problem in code by frequently restarting the application. If the warmup problem really is serious in real applications, we may introduce a mode that just lets you export the generated code to file and have that code compiled with the rest of your project for much faster startup times.

Optimizing for Performance in Marten

For the last couple weeks I’ve been working on a new project called Marten that is meant to exploit Postgresql’s JSONB data type as a full fledged document database for .Net development and a drop in replacement for RavenDb in our production environment. I would say that our primary goal with Marten is improved stability and supportability, but maximizing performance and throughput is a very close second on the priority list.

This is my second update on Marten progress. From last week, also see Marten Development So Far.

So far, I’ve mostly been focusing on optimizing the SQL queries generated by the Linq support for faster fetching. I’ve been experimenting with a few different query modes for the SQL generation based on what fields or properties you’re trying to search on:

  1. By default in the absence of any explicit configuration, Marten tries to use the “jsonb_to_record” function with a LATERAL join approach to optimize queries against members on the root of the document.
  2. You can also force Marten to only use basic Postgresql JSON locators to generate the where clauses in the SQL statements
  3. Finally, if you know that your application will be frequently querying a document type against a certain member, Marten can treat it as a “searchable” field such that it duplicates that data in a normal database field and searches directly against that database field. This mechanism will clearly slow down your inserts and take up somewhat more storage space, but the numbers I’m about to display don’t lie: this is very clearly the fastest way to optimize queries using Marten (so far).

I’ve also experimented with both the Newtonsoft.Json serializer and the faster, but less flexible Jil serializer. Again, the numbers are pretty clear that for bigger result sets, Jil is much faster (NetJSON was a complete bust for me when I tried it). So far I’ve been able to keep Marten serializer-agnostic and I can easily see times when you’d have to opt for Newtonsoft’s flexibility.
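
To show what “serializer-agnostic” means in practice, here is a hedged sketch of the kind of ISerializer abstraction the generated code in these posts depends on. The real interface in Marten may differ, but ToJson() is the only member the generated samples shown here need:

using Newtonsoft.Json;

public interface ISerializer
{
    string ToJson(object document);
    T FromJson<T>(string json);
}

// One implementation backed by Newtonsoft.Json; a Jil-backed implementation
// would plug into the same interface without touching any generated code
public class JsonNetSerializer : ISerializer
{
    public string ToJson(object document)
    {
        return JsonConvert.SerializeObject(document);
    }

    public T FromJson<T>(string json)
    {
        return JsonConvert.DeserializeObject<T>(json);
    }
}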

Default jsonb_to_record/LATERAL JOIN

Using this approach, the SQL generated is:

select d.data from mt_doc_target as d, LATERAL jsonb_to_record(d.data) as l("Date" date) where l."Date" = :arg0

Json Locators Only

While you can configure this behavior on a field by field basis, the quickest way is to just set the default document behavior:

public class JsonLocatorOnly : MartenRegistry
{
    public JsonLocatorOnly()
    {
        // This can also be done with attributes
        For<Target>().PropertySearching(PropertySearching.JSON_Locator_Only);
    }
}
With this setting, the generated SQL is:

select d.data from mt_doc_target as d where CAST(d.data ->> 'Date' as date) = :arg0

Searchable, Duplicated Field

Again, to configure this option, I used this code:

public class DateIsSearchable : MartenRegistry
{
    public DateIsSearchable()
    {
        // This can also be done with attributes
        For<Target>().Searchable(x => x.Date);
    }
}

When I do this, the table for the Target type has an additional field called “date” that will get the value of the Target.Date property every time a Target object is inserted or updated in the database.

The resulting SQL is:

select d.data from mt_doc_target as d where d.date = :arg0

The Performance Results

I created the table below by generating randomized data, then trying to search by a DateTime field using three different mechanisms:

var theDate = DateTime.Today.AddDays(3);
var queryable = session.Query<Target>().Where(x => x.Date == theDate);

In all cases, I used the same sample data for the document count and took an average of running the same query five times after throwing out an initial attempt where Postgresql seemed to be “warming up” the JSONB data.
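
The real test harness lives in the performance_tuning file linked below, but the timing approach boils down to something like this hypothetical sketch:

using System;
using System.Diagnostics;
using System.Linq;

public static class QueryTimer
{
    // Throw out one warmup execution, then average the elapsed time of the
    // next several runs of the same query
    public static double AverageMilliseconds(Action runQuery, int runs = 5)
    {
        runQuery();

        var times = new double[runs];
        for (var i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            runQuery();
            stopwatch.Stop();

            times[i] = stopwatch.Elapsed.TotalMilliseconds;
        }

        return times.Average();
    }
}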

Serializer: JsonNetSerializer

Query Type                       1K     10K     100K     1M
JSON Locator Only                9.6    75.2    691.2    9648
jsonb_to_record + lateral join   10     93.6    922.6    12091.2
searching by duplicated field    2.4    15      169.6    2777.8

Serializer: JilSerializer

Query Type                       1K     10K     100K     1M
JSON Locator Only                6.8    61      594.8    7265.6
jsonb_to_record + lateral join   8.4    86.6    784.2    9655.8
searching by duplicated field    1      8.8     115.4    2234.2

To be honest, I expected the JSONB_TO_RECORD + LATERAL JOIN mechanism to be faster than the JSON locator only approach, but I need to go back and try adding some indexes, since avoiding the type casts that inevitably defeat indexes is supposed to be the benefit of using JSONB_TO_RECORD. I’d be happy to get some Postgresql gurus to weigh in here if there are any reading this.

If you’re curious to see my mechanism for recording this data, see the performance_tuning code file in GitHub.

Bulk Loading Documents

From time to time (testing or data migrations maybe) you’ll have some need to very rapidly load a large set of documents into your database. I added a feature this morning to Marten that exploits Postgresql’s COPY feature supported by Npgsql:

public void load_with_small_batch()
    // This is just creating some randomized
    // document data
    var data = Target.GenerateRandomData(100).ToArray();

    // Load all of these into a Marten-ized database

    // And just checking that the data is actually there;)

Behind the scenes, Marten is using code generated at runtime and compiled by Roslyn to do the bulk loading as efficiently as possible, without any performance hit from using Reflection:

public void Load(ISerializer serializer, NpgsqlConnection conn, IEnumerable<Target> documents)
{
    using (var writer = conn.BeginBinaryImport("COPY mt_doc_target(id, data) FROM STDIN BINARY"))
    {
        foreach (var x in documents)
        {
            writer.Write(x.Id, NpgsqlDbType.Uuid);
            writer.Write(serializer.ToJson(x), NpgsqlDbType.Jsonb);
        }
    }
}

Do note that the code generation mechanism is smart enough to also add any fields or properties of the document type that are marked as duplicated for searching.

Other Outstanding Optimization Tasks 

  • Optimize the mechanics for applying all the changes in a unit of work. I’m hoping that we can do something to reduce the number of network round trips between the application and the postgresql server. My fallback approach is going to be to use a custom PLV8 sproc, but not until we exhaust other possibilities with the Npgsql library.
  • I want some mechanism for queuing up queries and submitting them in one network round trip
  • The ability to make a named, reusable Linq query so you can reuse the underlying ADO.Net command generated from parsing the Linq expression without having to go through all the Expression parsing gymnastics on each usage
  • Really more for scalability than performance, but we’ll get around to asynchronous query methods. I’m just not judging that to be a critical path item right now.
  • It’s probably minor in the grand scheme of things, but the actual Linq expression to Sql query generation is grotesque in how it concatenates strings

Feel very free to make suggestions and other feedback on these items;-)

Testing HTTP Handlers with No Web Server in Sight

FubuMVC 2.0 and 3.0 introduced some tooling I called “Scenarios” that allow users to write mostly declarative integration tests against the entire HTTP pipeline in memory without having to host the application in a web server. I promised a coworker that I would write a blog post about using Scenarios for an internal team that wants to start using it much more in their work. A week of procrastination later and here you go:

NOTE: All samples are using FubuMVC 3.0

Why Integration Tests?

From the very beginning, we tried very hard to make unit testing FubuMVC action methods in isolation as easy as possible. I think we largely succeeded in that goal. However, within the context of handling an HTTP request, FubuMVC, like most web frameworks, will potentially wrap those action methods with various middleware strategies for cross cutting technical concerns like authentication, authorization, logging, transaction management, and content negotiation. At some point, to truly exercise an HTTP endpoint you really do need to write an integration test that exercises the entire chain of HTTP handlers for an HTTP request exactly the way it will be configured inside the running application.

Toward that end, I built a class called EndpointDriver in early versions of FubuMVC that you could use to write integration tests against a FubuMVC application hosted with an embedded Katana server. This early tooling just wrapped WebClient with a FubuMVC specific fluent interface for resolving url’s, setting common options like the content-type and accepts headers, and verifying parts of the HTTP response. Below is a sample from our content negotiation support integration tests in FubuMVC 1.3 (“endpoints” is a reference to the EndpointDriver object for the running application):

public void force_to_json_with_querystring()
    endpoints.Get("conneg/override/Foo?Format=Json", acceptType: "text/html")

EndpointDriver was fine at first, but our test library started getting slower as we added more and more tests and the fluent interface just never kept up with everything we needed for HTTP testing (plus I think that WebClient is awkward to use).

Using OWIN for HTTP “Scenarios”

As part of my FubuMVC 2.0 effort last year, I knew that I wanted a much better mechanism than the older EndpointDriver for doing integration testing of HTTP endpoints. Specifically, I wanted:

  • To be able to run HTTP requests and verify the response without having to take the performance hit of a web server
  • To run a FubuMVC application as it would be configured in production
  • To completely configure any part of an HTTP request
  • To be able to declaratively express multiple assertions against the expected response
  • To utilize FubuMVC’s support for “reverse URL resolution” for more traceable tests
  • Access to the raw HTTP request and response for anything unusual you would need to do that didn’t have a specific helper

The end result was a mechanism I called “Scenario’s” that exploited FubuMVC’s OWIN support to run HTTP requests in memory using this signature off of the new FubuRuntime object I explained in an earlier blog post:

OwinHttpResponse Scenario(Action<Scenario> configuration)

The Scenario object models the HTTP request and provides a way to specify expectations about the HTTP response for commonly used things like HTTP status codes, header values, and the presence of string values in the HTTP response body. If need be, you also have access to FubuMVC’s abstractions for the entire HTTP request and response (more on this later).

To make this concrete, let’s say that you’re working through a “Hello, World” exercise with FubuMVC with this class and action method that just returns the text “Hello, World” when you issue a GET to the root “/” url of an application:

public class HomeEndpoint
{
    public string Index()
    {
        return "Hello, World";
    }
}

A scenario test for the action above would look like this code below:

using (var runtime = FubuRuntime.Basic())
{
    // Execute the home route and verify
    // the response
    runtime.Scenario(_ =>
    {
        _.Get.Url("/");

        _.StatusCodeShouldBeOk();
        _.ContentTypeShouldBe("text/plain");
        _.ContentShouldBe("Hello, World");
    });
}
In the scenario above, I’m issuing a GET request to the “/” url of the application and specifying that the resulting status code should be HTTP 200, “content-type” response header should be “text/plain”, and the exact contents of the response body should be “Hello, World.” When a Scenario is executed, it will run every single assertion instead of quitting on the first failure and report on every failed expectation in the specification output. This behavior is valuable when you have to author specifications with slower running scenario setup.

Specifying Url’s

FubuMVC has a model for reverse URL lookup from any endpoint method or the input model that we exploited in Scenario’s for traceable tests:

host.Scenario(_ =>
{
    // Specify a GET request to the Url that runs an endpoint method:
    _.Get.Action<InMemoryEndpoint>(e => e.get_memory_hello());

    // Or specify a POST to the Url that would handle an input message:
    _.Post

        // This call serializes the input object to Json using the 
        // application's configured JSON serializer and setting
        // the contents on the Request body
        .Json(new HeaderInput {Key = "Foo", Value1 = "Bar"});

    // Or specify a GET by an input object to get the route parameters
    _.Get.Input(new InMemoryInput { Color = "Red" });
});

I like the reverse url lookup instead of specifying Url’s directly in the scenarios because:

  1. It makes your scenario tests traceable to the actual handling code
  2. It insulates your scenarios from changes to the Url structures later

Checking the Response Body

For the 3.0 work I did a couple months ago, I fleshed out the Scenario support with more mechanisms to analyze the HTTP response body:

host.Scenario(_ =>
{
    // set up a request here

    // Read the response body as text
    var bodyText = _.Response.Body.ReadAsText();

    // Read the response body by deserializing Json
    // into a .net type with the application's
    // configured Json serializer
    var output = _.Response.Body.ReadAsJson<MyResponse>();

    // If you absolutely have to work with Xml...
    var xml = _.Response.Body.ReadAsXml();
});

Some Other Things…

I’ll happily explain the details of this list on request, but here are some other attributes of Scenario’s that FubuMVC supports right now:

  • You can specify expected values for HTTP response headers
  • You can assert on status codes and descriptions
  • There are helpers to send Json or Xml serialized data based on an input object message
  • There is a mechanism that allows you to disable all security middleware in the application for a single Scenario, which has frequently been helpful in testing
  • You have access to the underlying IoC container for the running application from the Scenario if you need to resolve and use application services
  • FubuMVC is now StructureMap 4.0-only for its IoC usage, so we’re able to rely on StructureMap’s child container feature to resolve services during a Scenario execution from a unique child container per run. This allows you to replace services in your application with fakes, mocks, and stubs in a way that prevents your fake services from impacting more than one test.

Scenarios in Jasper

If you didn’t see my blog post earlier this year, FubuMVC is getting a complete reboot into a new project called Jasper late this year/early next year. I absolutely plan on bringing the Scenario support forward into Jasper very early, but this time around we’re completely dropping all of FubuMVC’s HTTP abstractions in favor of directly using the OWIN environment dictionary as the single model of HTTP requests and responses. My thought right now is that we’ll invest heavily in extension methods hanging off of IDictionary<string, object> for commonly used operations against that OWIN dictionary.
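
As a purely hypothetical sketch of that idea (none of this is Jasper's actual API), extension methods against the OWIN environment dictionary might look something like this, using the standard OWIN keys:

using System.Collections.Generic;

public static class OwinEnvironmentExtensions
{
    // Pull the request path out of the OWIN environment dictionary
    public static string RequestPath(this IDictionary<string, object> env)
    {
        object path;
        return env.TryGetValue("owin.RequestPath", out path) ? (string) path : null;
    }

    // Write the response status code back into the OWIN environment dictionary
    public static void StatusCode(this IDictionary<string, object> env, int statusCode)
    {
        env["owin.ResponseStatusCode"] = statusCode;
    }
}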

To some extent, we’re hoping as well that there will be a good ecosystem of OWIN helpers from other people and projects that will be usable from within Jasper.


Marten Development So Far (Postgresql as Doc Db)

Last week I mentioned that I had started a new OSS project called “Marten” that aims to allow .Net developers to treat Postgresql 9.5 (we’re using the new “upsert” functionality) as a document database using Postgresql’s JSONB data type. We’ve already had some interest and feedback on Github and the Gitter room — plus links to at least three other ongoing efforts to do something similar with Postgresql that I’m interpreting as obvious validation for the basic idea.

Please feel very free to chime in on the approach or requirements here, on Github, or in the Gitter room. We’re going to proceed with this project regardless at work, but I’d love to see it also be a viable community project with input from outside our little development organization.

What’s Already Done

I’d sum up the Marten work as “so far, so good”. If you look closely into the Marten code, do know that I have been purposely standing up the functionality with simple mechanics and naive implementations. My philosophy here is to get the functionality up with good test coverage before starting any heavy optimization work.

As of now:

  • Our thought is that the main service facade to Marten is the IDocumentSession interface that very closely mimics the same interface in RavenDb. This work is for my day job at Extend Health, and our immediate goal is to move systems off of RavenDb early next year, so I think that design decision is pretty understandable. That doesn’t mean that that’ll be the only way to interact with Marten in the long run.
  • In the “development mode”, Marten is able to create database tables and an “upsert” stored procedure for any new document type it encounters in calls to the IDocumentSession.
  • The real DocumentSession facade can store documents, load documents by either a single id or an array of ids, and delete documents by the same.
  • DocumentSession implements a “unit of work” with similar usage to RavenDb’s.
  • You can completely bypass the Linq provider I’m describing in the next section and just use raw SQL to fetch documents
  • A DocumentCleaner service that you can use to tear down document data or even the schema objects that Marten builds inside of automated testing harnesses

Linq Support

I don’t think I need to make the argument that Marten is going to be more usable and definitely more popular if it has decent Linq support. While I was afraid that building a Linq provider on top of the Postgresql JSON operators was going to be tedious and hard, the easy to use Relinq library has made it just “tedious.”

As early as next week I’m going to start working over the Linq support and the SQL it generates to try to optimize searching.

The Linq support hangs off of the IDocumentSession.Query<T>() method like so:

        public void query()
        {
            theSession.Store(new Target{Number = 1, DateOffset = DateTimeOffset.Now.AddMinutes(5)});
            theSession.Store(new Target{Number = 2, DateOffset = DateTimeOffset.Now.AddDays(1)});
            theSession.Store(new Target{Number = 3, DateOffset = DateTimeOffset.Now.AddHours(1)});
            theSession.Store(new Target{Number = 4, DateOffset = DateTimeOffset.Now.AddHours(-2)});
            theSession.Store(new Target{Number = 5, DateOffset = DateTimeOffset.Now.AddHours(-3)});

            theSession.SaveChanges();

            theSession.Query<Target>()
                .Where(x => x.DateOffset > DateTimeOffset.Now).ToArray()
                .Select(x => x.Number)
                .ShouldHaveTheSameElementsAs(1, 2, 3);
        }

For right now, the Linq IQueryable support includes the following (a combined usage sketch follows the list):

  • IQueryable.Where() support with strings, int’s, long’s, decimal’s, DateTime’s, enumeration values, and boolean types.
  • Multiple or chained Where().Where().Where() clauses like you might use when you’re calculating optional where clauses or letting multiple pieces of code add additional filters
  • “&&” and “||” operators in the Where() clauses
  • Deep nested properties in the Where() clauses like x.Address.City == “Austin”
  • First(), FirstOrDefault(), Single(), and SingleOrDefault() support for the IQueryable
  • Count() and Any() support
  • Contains(), StartsWith(), and EndsWith() support for string values — but it’s case sensitive right now. Case-insensitive searches are probably going to be an “up-for-grabs” task;)
  • Take() and Skip() support for paging
  • OrderBy() / ThenBy() / OrderByDescending() support
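
Here is a combined sketch of several of those operators against the Target document from the earlier samples. The String member used with StartsWith() is an assumption for illustration; the other members appear in the samples above:

// Chained Where() clauses, "&&", ordering, and paging in one query
var page = theSession.Query<Target>()
    .Where(x => x.Number > 2 && x.Date >= DateTime.Today)
    .Where(x => x.String.StartsWith("A"))
    .OrderBy(x => x.Date)
    .Skip(10)
    .Take(10)
    .ToArray();

// Count() and Any() work off the same Where() support
var count = theSession.Query<Target>().Where(x => x.Number == 5).Count();
var any = theSession.Query<Target>().Where(x => x.Number == 5).Any();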

Right now, I’m using my audit of our largest system at work that uses RavenDb to guide and prioritize the Linq support. The only thing missing for us is searching within child collections of a document.

What we’re missing right now is:

  • Projections via IQueryable.Select(). Right now you have to do IQueryable.ToArray() to force the documents into memory before trying to use Select() projections.
  • Last() and LastOrDefault()
  • A lot of things I probably hadn’t thought about at all;-)

Using Roslyn for Runtime Code Compilation

We’ll see if this turns out to be a good idea or not, but as of today Marten is using Roslyn to generate strategy classes that “know” how to build database commands for updating, deleting, and loading document data for each document type instead of using Reflection or IL emitting or compiling Expression’s on the fly. Other than the “warm up” performance hit on doing the very first compilation, this is working smoothly so far. We’ll be watching it for performance. I’ll blog about that separately sometime soon-ish.

Next Week: Get Some Data and Optimize!

My focus for Marten development next week is on getting a non-trivial database together and working on pure optimization. My thought is to grab data from Github using Octokit.Net to build a semi-realistic document database of users, repositories, and commits from all my other OSS projects. After that, I’m going to try out:

  • Using GIN indexes against the jsonb data to see how that works
  • Trying to selectively duplicate data into normal database fields for lightweight sql searches and indexes
  • Trying to use Postgresql’s jsonb_to_record functionality inside of the Linq support to see if that makes searches faster
  • I’m using Newtonsoft.Json as the JSON serializer right now thinking that I’d want the extra flexibility later, but I want to try out Jil too for the comparison
  • After the SQL generation settles down, try to clean up the naive string concatenation going on inside of the Linq support
  • Optimize the batch updates through DocumentSession.SaveChanges(). Today it’s just making individual sql commands in one transaction. For some optimization, I’d like to at least try to make the updates happen in fewer remote calls to the database. My fallback plan is to use a *gasp* stored procedure using postgresql’s PLV8 javascript support to take any number of document updates or deletions as a single json payload.

That list above is enough to keep me busy next week, but there’s more in the open Github issue list and we’re all ears about whatever we’ve missed, so feel free to add more feature requests or comment on existing issues.

Why “Marten?”

One of my colleagues was sneering at the name I was using, so I googled for “natural predators of ravens” and the marten was one of the few options, so we ran with it.

My .Net Unboxed 2015 Wrapup

I had a blast this week at the .Net Unboxed conference in Dallas. The content and speaker lineup was good, the vibe was great, the venue and location were great, and it was remarkably well organized. My hat is completely off to the organizers and I sincerely hope they’re up for doing this again next year.

For my part, I thought my Storyteller 3 talk went well and I was thrilled with the interest and questions I got about it later. I’ll definitely share a link to the recording once it’s posted.

Some thoughts and highlights in no particular order:

  • Strong naming in .Net continues to be a major source of angst and frustration for those of us heavily involved in OSS or simply wanting to consume OSS projects. Now that Nuget makes it somewhat easier to push out incremental releases and bug fix releases, strong naming is causing more and more headaches. I enjoyed my conversations with Daniel Plaisted of Microsoft who for the very first time has convinced me that anybody in Redmond understands how much trouble strong naming is causing. I’m still iffy on having to take on the overhead of ilrepack/ilmerge in publishing or doing the “Newtonsoft Lie to Your Users” version strategy. I think that CoreCLR’s much looser usage of strong naming might very well be enough reason for us as a community to hurry up and get our code up on the new runtime. In the meantime, I’ll be closely following the new Strongnamer project as a possible way to eliminate some of the pain for non-CoreCLR packages.
  • I’m not buying that DNX is going to be usable until later next year. I think, and conversations this week reinforced this idea, that I very much like the ASP.Net team’s general vision for vNext, but they’ve just bitten off more than they can handle.
  • Nik Molnar gave me a compliment about my UI work on Storyteller that made my day since I’m infamously bad at UI design and layout. I’m pretty sure he followed that up with “but, [something negative]” but I didn’t pay any attention to that part;)
  • I started to get a little irritated during one talk and wanted to start arguing with the speaker, so I quietly snuck out and went to the other ongoing talk *just* in time to hear Jimmy Bogard telling folks how he made a mistake by copying the old static ObjectFactory idea from StructureMap. I’ve apologized in public dozens of times on that one and I’m sure I’ll have to do it plenty more times. Sigh.
  • I did enjoy Jimmy’s talk on his experiences running OSS projects and appreciated his candor about the earlier decisions and approaches that didn’t necessarily work out. For my money, many of the best conference talks are about lessons learned from mistakes and fixing problems.
  • I definitely appreciated the lack of “let me tell you how wonderful I am and can I have an MVP award now?” talks that so frequently pop up in many .Net-centric “eyes-forward” conferences. I love how interactive the talks were and how engaged the audiences were in asking questions. I especially enjoy it when talks seem to be just a way of jumpstarting conversations.
  • I’ve thought for over a year that the forthcoming “K”/DNX work from Redmond was probably going to suck all the oxygen out of the room for alternative frameworks and I think you’ve definitely seen that happen. On a much more positive note, I think that we might see a resurgence of those things next year as we get to start taking advantage of the improvements to the .Net framework. More and more, I’m hearing about folks treating DNX as almost a reset for .Net OSS, and that might not be a terrible thing.
  • I enjoyed the talk on Falcor.Net and I’m very interested in Falcor in general as a possibly easier – or at least less weird – approach than GraphQL for React.js client to server communication.




Postgresql as a Document Db for .Net Development

I’m one of those guys who normally doesn’t like to talk much about new OSS projects until there’s a lot to show, but just for fun this time, I’m gonna talk about something that I’ve just barely started in the hopes of getting some feedback and because there’s already been some interest from outside my company. Besides, it’s not like the ways I’ve run OSS projects in the past have been all that successful anyway.

We use RavenDb at work in several projects, and while I still think there are some great features and attributes in RavenDb for easy development, it hasn’t held up very well in production usage and we want to replace it next year. I’ve gotten to spend some time over the past couple weeks laying out the skeleton of a new project on GitHub we’re calling “Marten” that will in theory allow us to treat Postgresql as a document database for .Net development.

We want to keep what we see as the advantages of RavenDb:

  • Schema-less development based on our objects without any kind of ORM mapping or limitations on object structure
  • The ability to quickly get a clean database per automated test for reliable testing
  • Linq support — I’ve already gotten some Linq support for basic operators using Re-linq and I’ve been pleasantly surprised at how well that went.
  • Batched updates and the built in unit of work — my working theory is to use DbDataAdapter’s to rig up batched updates
  • Deferred and/or batched queries — at least one of our apps is getting killed by network chattiness, so this is going to be a pretty high priority

In the end, what we’d really like to have is all the development advantages of RavenDb and document databases, but have full ACID support, all the DevOps tooling that already exists around Postgresql, and sit on top of a proven database engine.

Roadmap and Contributing

I’ve done enough spiking and proof of concept type work to feel like this is viable — pending performance testing down the road of course. I spent this morning trying to write up my thoughts on where we should go with this thing in the GitHub issue list, mostly as a way to start a detailed conversation about what this thing should be and where it’s going to go. If you’ve got any opinions, we’d love to hear them either on individual issues or in the Gitter room.

Roughly speaking, the features we’re thinking about are:

  • Support basic document saving and retrieval through a new IDocumentSession service facade purposely modeled after RavenDb’s
  • Basic Linq support against documents
  • The ability to bypass Linq and provide the raw SQL yourself when necessary (already working)
  • Schema creation and migration support for deployments
  • Read side/view projections in the database?
  • Some way to define and use indexes in queries
  • For lack of a better term, “Stored Procedures” that let you generate SQL queries from a convoluted Linq expression once and reuse across requests
  • Maybe make this thing a plugin or separate provider for EF7. I’m not sure there’s a technical reason to do that yet, but you know it’d make a lot more people interested in this thing

Maybe just as a vanity project for my own satisfaction, we’d also like to build an EventStore capability into Marten, including user-defined projections, using Postgresql’s ability to embed Javascript.

If you have any interest in contributing or following this thing, hit us up in the Gitter room or start weighing in on GitHub issues.

Marten in Action

To see the itty bit that’s done so far in action, say that you have a .Net type representing a document like this one from my test project:

    // The IDocument interface is just a temporary crutch
    // for now. It won't be necessary in the end
    public class User : IDocument
    {
        public User()
        {
            Id = Guid.NewGuid();
        }

        public Guid Id { get; set; }

        public string FirstName { get; set; }
        public string LastName { get; set; }

        public string FullName
        {
            get { return "{0} {1}".ToFormat(FirstName, LastName); }
        }
    }

Starting from a blank Postgresql 9.5 schema (because we’re already depending on the new “upsert” capabilities) with Marten’s version of IDocumentSession, I’ll create a new User object, save it, then load a new copy of it from the database by its id:

        public void persist_and_reload_a_document()
        {
            var user = new User { FirstName = "James", LastName = "Worthy" };

            // theSession is Marten's IDocumentSession service
            theSession.Store(user);
            theSession.SaveChanges();

            // Marten is NOT coupled to StructureMap, but
            // I found it convenient to use StructureMap for object assembly
            // in the tests
            using (var session2 = theContainer.GetInstance<IDocumentSession>())
            {
                var user2 = session2.Load<User>(user.Id);
            }
        }

Behind the scenes, Marten sees that it doesn’t have a preexisting table to store User documents, so it quietly makes us one like this:

CREATE TABLE public.mt_doc_user
(
  id uuid NOT NULL,
  data jsonb NOT NULL,
  CONSTRAINT pk_mt_doc_user PRIMARY KEY (id)
);
Right now, we’re only adding an Id field as the primary key and a second JSONB field to hold the actual document representation. Later on we’ll probably add timestamps, version numbers, or duplicate selected fields in the document structure for more efficient querying and indexing.

Why didn’t you use…

Because a flood of “why not Y” questions inevitably follow any statement of “we chose X”:

  • I’ve seen too many stories about MongoDb losing data and Postgresql v. MongoDb performance comparisons.
  • SimpleDb does look cool, but I’m not a huge fan of their query language and for some crazy reason, our organization (including me) is suddenly being very conservative about trying newer databases.
  • I need to do more research on Kafka before I can answer that one
  • I really don’t want to have to fall back all the way to developing applications primarily on an RDBMS. I’ve had enough of heavy ORM’s, the only somewhat more palatable micro-ORM’s, and writing procedural code using raw tabular data.


The name “Marten” has already stuck in conversations at work, so we’re keeping it for now. Besides, look how cute martens are:


Storyteller 3: Executable Specifications and Living Documentation for .Net

tl;dr: The open source Storyteller 3 is an all new version of an old tool that my shop (and others) use for customer facing acceptance tests, large scale test automation, and “living documentation” generation for code-centric systems.

A week from today I’m giving a talk at .Net Unboxed on Storyteller 3, an open source tool largely built by myself and my colleagues for creating, running, and managing Executable Specifications against .Net projects, based on what we feel are the best practices learned from over a decade of working with automated integration testing. As I’ll try to argue in my talk and in subsequent blog posts, I think Storyteller 3 is the most complete approach in the .Net ecosystem for reliably and economically writing large scale automated integration tests.

My company and a couple other early adopters have been using Storyteller for daily work since June and the feedback has been pleasantly positive so far. Now is as good a time as any to make a public beta release for the express purpose of getting more feedback on the tool so we can continue to improve it prior to an official 3.0 release in January.

If you’re interested in kicking the tires on Storyteller, the latest beta is available now. For help getting started, see our tutorial and getting started pages.

Some highlights:

I gave a talk at work in March previewing Storyteller 3 that at least discusses the goals and philosophy behind the tool and Storyteller’s approach to acceptance tests and integration tests; the tool has improved a lot since then.

A Brief History

I had a great time at Codemash this year catching up with old friends. I was pleasantly surprised when I was there to be asked several times about the state of Storyteller, an OSS project others had originally built in 2008 as a replacement for FitNesse as our primary means of expressing and executing automated customer facing acceptance tests. Frankly, I always thought that Storyteller 1 and the incrementally better Storyteller 2 were failures in terms of usability and I was so burnt out on working with it that I had largely given up on it and ignored it for years.

Unfortunately, my shop has a large investment in Storyteller tests and our largest and most active project was suffering with heinously slow and unreliable Storyteller regression test suites that probably caused more harm than good with their support costs. After a big town hall meeting to decide whether to scrap and replace Storyteller with something else, we instead decided to try to improve Storyteller to avoid having to rewrite all of our tests. The result has been an effective rewrite of Storyteller with an all new client. While trying very hard to mostly preserve backward compatibility with the previous version in its public API’s, the .Net engine is also a near rewrite in order to squeeze out as much performance and responsiveness as we could.


The official 3.0 release is going to happen in early January to give us a chance to possibly get more early user feedback and maybe to get some more improvements in place. You can see the currently open issue list on GitHub. The biggest things outstanding on our roadmap are:

  • Modernize the client technology to React.js v0.14 and introduce Redux and possibly RxJS as a precursor to doing any big improvements to the user interface and trying to improve the performance of the user interface with big specification suites
  • A “step through” mode in the interactive specification running so users can step through a specification like you would in a debugger
  • The big one, allow users to author the actual specification language in the user interface editor with some mechanics to attach that language to actual test support code later
