Plough Horses and Context Shifting

My grandfather told me a story one time about how he used to plough with a horse not too far from town. Then and now, there's a factory right at the edge of town. In those days that factory would blow a whistle at quitting time. My grandfather's plough horse would finish the row they were on when it heard the whistle, but it would absolutely refuse to do anything else after that.

Mostly I just miss my grandfather, and this is just an excuse to share a story I've always liked for some reason. On another occasion he tried to do that old timer thing of telling me how hard he had it as a youngster because he had to ride a horse so many miles into town to get to his job at an absurdly early hour, then got all misty-eyed and ruined the effect with "…and sometimes I'd give that stallion his head and we'd go flying…"

How is that relevant, you ask? As I get a little older, I know that I'm less able to make large context shifts late in the afternoon as quitting time gets closer and closer. If I finish something pretty complicated after lunch and the next thing up is also going to be complicated or, worse, involve a pretty large context shift into a different problem space, I know better than to even try to start that next thing. Instead, I'll do some paper and pencil work to task out the next day's work, or switch to some easier work or correspondence.

Since I’m largely able to set my own hours and I’m mostly unencumbered by meetings (don’t hate me for that), I can actually think about how to optimize my work schedule to the work I’m doing. One thing I’ve learned is that you do best week over week if you pace yourself to the old XP idea of a sustainable pace. For me, that’s also knowing when to push hard and when it’s best to either quit and rest or just start lining up the next day’s push.

Marten is Ready for Early Adopters

I’ve been using RavenDb for development over the past several years and I’m firmly convinced that there’s a pretty significant productivity advantage to using document databases over relational databases for many systems. For as much as I love many of the concepts and usability of RavenDb, it isn’t running very successfully at work and it’s time to move our applications to something more robust. Fortunately, we’ve been able to dedicate some time toward using Postgresql as a document database. We’ve been able to do this work as a new OSS project called Marten. Our hope with Marten has been to retain the development time benefits of document databases (along with an easy migration path away from RavenDb) with a robust technological foundation — and even I’ll admit that it will occasionally be useful to fall back to using Postgresql as a relational database where that is still advantageous.

I feel like Marten is at a point where it's usable, and what we really need most is some early adopters who will kick the tires on it, give some feedback about how well it works, what's missing that would make it easier to use, and how it's performing in their systems. Fortunately, as of today, Marten now has (drum roll please):

And of course, the Marten Gitter room is always open for business.

An Example Quickstart

To get started with Marten, you need two things:

  1. A Postgresql database schema (either v9.4 or v9.5)
  2. The Marten nuget installed into your application

After that, the quickest way to get up and running is shown below with some sample usage:

var store = DocumentStore.For("your connection string");

Now you need a document type that will be persisted by Marten:

    public class User
    {
        public Guid Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public bool Internal { get; set; }
        public string UserName { get; set; }
    }

As long as a type can be serialized and deserialized by the JSON serializer of your choice and has a public field or property called “Id” or “id”, Marten can persist and load it back later.

To persist and load documents, you use the IDocumentSession interface:

    using (var session = store.LightweightSession())
    {
        var user = new User {FirstName = "Han", LastName = "Solo"};
        session.Store(user);

        session.SaveChanges();
    }
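
Loading documents back out goes through the same session. Here's a quick sketch; Load and the Linq-backed Query are the same methods shown in the Marten posts further down this page, and someUserId is just a placeholder for a known document id:

    using (var session = store.LightweightSession())
    {
        // Load a single document by its id
        var user = session.Load<User>(someUserId);

        // Or query documents with the Linq support
        var solos = session.Query<User>()
            .Where(x => x.LastName == "Solo")
            .ToArray();
    }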

My Thoughts on Choosing and Using Persistence Tools

By no means is this post a comprehensive examination of every possible type of persistence tool, software system, or even team role. It’s just my opinions and experiences — and even though I’ve been a software developer far longer than many folks, there are lots of types of systems and architectures I’ve never gotten a chance to work on. I’m also primarily a developer, so I’m definitely not representing the DBA viewpoint.

I had an interesting exchange on twitter last week when an ex-Austinite I hadn’t seen in years asked me if I still used NHibernate. At the same time, there’s some consternation at my work about our usage of RavenDb as a document database and some question about how we might replace that later. Because I happen to be working on potentially using Postgresql as a complete replacement for RavenDb and a possible replacement for some of our older Sql Server-based event sourcing tooling, I thought it would be helpful to go over what I think about “persistence” tools these days.

To sum up my feelings on the subject:

  • I think NoSQL document databases can make development much more productive than old-fashioned relational database usage — especially when your data is hierarchical
  • I’m completely done with heavyweight ORM’s like NHibernate or Entity Framework and I would recommend against the very concept at this point. I think the effort to directly persist a rich domain model to a relational database (or really just any database at all) was a failure.
  • I’m mostly indifferent to the so-called “micro-” ORM’s. I think they’re definitely an easy way to query and manipulate relational databases from middle tier code, but the usage I’ve seen at work makes me think they’re just a quick way to very tightly couple your application code to the details of the database schema.
  • I think that Event Sourcing inside a CQRS architecture can be very effective where that style fits, but it’s a mess when it’s used where not really appropriate.
  • I have mostly given up on the old Extreme Programming idea that you could happily build most of the application without having to worry about any kind of database until near the end of the project. If database performance is any kind of project risk, you’ve got to deal with that early. If your database choice is going to have an impact on your application code, then that too has to be dealt with earlier. If you can model your application as consuming JSON feeds, maybe you can get away with delaying the database.
  • EDIT: A couple folks have either asked about or recommended just using raw ADO.Net code. My feeling on that subject hasn't changed in years: if you're writing raw ADO.Net code, you're probably stealing from your employer.

Root Causes

When I judge whether or not a persistence tool is a good fit for how I prefer to work, I’m thinking about these low level first causes:

  • Is the tool ACID-compliant, or will it at least manage not to lose important data? Losing data tends to make our business folks angry. I'm just too old and conservative to screw around with any of the database tools that don't support ACID.
  • Will the tool have a negative impact on our ability to evolve the structure of the application? Is it cheap for me to make incremental changes to the system state? I strongly believe that tools and technologies that don’t allow for easy system evolution make software development efforts fragile by forcing you to be right in your upfront designs.
  • What’s the impact going to be on automated test efforts? For all of its problems for us, RavenDb has to be the undisputed champion of testability because of how absurdly easy it is to establish and tear down system state between tests for reliable automated tests.
  • How much mismatch is there between the shape of the data that my business logic, user interface, or API handlers are going to need and how the database technology needs to store it? That old impedance mismatch issue can suck down a lot of developer time if you choose poorly.

Why I prefer Document Db’s over Relational Db’s for Most Development

Even though RavenDb isn’t necessarily working out for us, I still believe that document databases make sense and can lead to better productivity than relational databases. Doing a side by side comparison on some of my “first causes” above:

  • Evolutionary software design. Changing your usage of a relational database can easily involve changes to the DDL, database migrations, and possibly ORM mappings (the old, dreaded "wormhole anti-pattern" problem). I think this is an area where "schemaless" document databases are a vast improvement for developer productivity because I only have to change my document type in code and go. It's vastly less work when there's only one model to change.
  • Testability. I think it's more mechanical work to set up and tear down system state for automated tests with relational databases versus a document database. Relational integrity is a great thing when you need it, but it adds some extra work to test setup just to make the database shut up and let me insert my data. My clear experience from automated testing against both types of database engines is that it's much simpler working with document databases.
  • The Impedance Mismatch Challenge. Again, this is an area where I much prefer document databases when I generally want to store and retrieve hierarchical data. I also prefer document databases where data collections may have a great deal of polymorphism.

When would I still opt for a Relational Database?

Besides the overwhelming inertia of relational databases (everybody knows RDBMS tools and there are seemingly an infinite number of management and reporting tools to support RDBMS's), there are still some places where I would opt for a relational database:

  • Reporting applications. Not that it's impossible in other kinds of databases, but there are so many decent existing solutions for reporting against RDBMS's.
  • If I were still a consultant, an RDBMS is a perfectly acceptable choice for conservative clients.
  • Applications that will require a lot of ad hoc queries. Much of my early career was spent trying to make sense of large engineering and construction databases that frequently went off the rails.
  • Batch jobs, not that I really wanna ever build systems like that again.
  • Systems with a lot of flat, two-dimensional data.

The Pond Scum Shared Database Anti-Pattern

If you've worked around me long enough, you've surely heard me use the phrase "sharing a database is like drug abusers sharing needles." I've frequently bumped into what I call the "pond scum anti-pattern," where an enterprise has one giant shared database with lots of little applications floating around it that modify and read pretty much the same set of database tables. It's common, but so awfully harmful.

The indirect coupling between applications is especially pernicious because it's probably not very obvious how any given change to the database will impact all the little applications that float around it. My strong preference is for application databases rather than the giant shared database. That might very well lead to some duplication or, worse, some inconsistency in data across applications, but we can't solve everything in one already too long blog post;)

And to prove that this topic of the “shared database” problem is a long, never-ending problem, here’s a blog post from me on the same subject from 2005.

What about…?

  • MongoDb? I know some people like it and I’ve had some feedback on Marten that we should be patterning its usage on MongoDb rather than mostly on RavenDb. I’ve just seen too many stories about MongoDb losing data or having inadequate transactional integrity support.
  • Graph databases like Neo4J? I think they sound very interesting and there’s a project or two I’ve done that I thought might have benefited from using a graph database, but I’ve never used one. Someday.
  • Rails' ActiveRecord? Even though I never made the jump to Ruby like so many of my other ALT.Net friends from a decade ago did, there was a time when I thought Ruby on Rails was the coolest thing ever. That day has clearly passed. I'm really not wild about any persistence tool that forces you to lock your application code to the shape of the database.
  • CSLA is apparently still around. To say the least, I’m not a fan. Too much harmful coupling between business logic and infrastructure, poor for evolutionary design in my opinion.

StructureMap 4.0 is Out!

tl;dr: StructureMap 4.0 went live to Nuget today with CoreCLR support, better performance, far better conventional registration via type scanning, and many improvements specifically targeted toward StructureMap integration into ASP.Net MVC 5 & 6 applications.

StructureMap 4.0 officially went live on Nuget today. The release notes page is updated for 4.0 and you can also peruse the raw GitHub issue list for the 4.0 milestone to see what’s new and what’s been fixed since 3.1.6.

Even though StructureMap has been around forever in .Net OSS terms (since June of 2004!), there are still new things to do, obsolete things to remove, and continuing needs to adapt to what users are actually trying to accomplish with the tool. As such, StructureMap 4.0 represents the lessons we’ve learned in the past couple years since the big 3.0 release. 4.0 is a much smaller set of changes than 3.0 and mostly contains performance and diagnostic improvements.

For the very first time since the long forgotten 2.5 release way back in 2008, I’m claiming that the StructureMap documentation site is completely up to date and effectively comprehensive. Failing that of course, the StructureMap Gitter room is open for questions.

This time around, I’d like to personally thank Kristian Hellang for being patient while we worked through issues with lifecycle and object disposal patterns for compliance with the new ASP.Net vNext dependency injection usage and the new StructureMap.DNX nuget for integrating StructureMap into vNext applications. I’d also like to thank Dmytro Dziuma for some pretty significant improvements to StructureMap runtime performance and his forthcoming packages for AOP with StructureMap and the rebuilt StructureMap.AutoFactory library. I’d also like to thank Oren Novotny for his help in moving StructureMap to the CoreCLR.

Some Highlights:

  • The StructureMap nuget now targets .Net 4.0, the CoreCLR via the "dotnet" profile, and various Windows Phone and Android targets via the PCL. While the early feedback on CoreCLR usage has been positive, I think you still have to assume that support is unproven and early.
  • The internals of the type scanning and conventional registration have been completely overhauled to optimize container bootstrapping time, and there are some new diagnostics to allow users to unwind frequent problems with type scanning registrations. The mechanism for custom conventions is a breaking change in 4.0; see the documentation for the details.
  • The lifecycle management had to be significantly changed and enhanced for ASP.Net vNext compliance. More on this in a later blog post.
  • Likewise, there are some new rules and behavior for how and when StructureMap will track and dispose IDisposable’s.
  • Performance improvements in general and some optimizations targeted specifically at integration with ASP.Net MVC (I don’t approve of how the ASP.Net team has done their integration, but .Net is their game and it was important to harden StructureMap for their goofy usage)
  • More robustness in heavy multi-threaded access
  • The constructor selection is a little smarter
  • ObjectFactory is gone, baby, gone. Jimmy Bogard will have to find some new reason to mock my code;)
  • If you absolutely have to use them, there is better support for customizing registration and behavior with attributes
  • The Registry class moved to the root "StructureMap" namespace, so do watch that. It's bugged me for years, so I went ahead and fixed it this time since we were going for a new full point release.
  • 4.0 introduces a powerful new mechanism for establishing policies and conventions on how objects are built at runtime. My hope is that this will solve some of the user questions and problems that I’ve gotten in the past couple years. There will definitely be follow up post on that.

And of course, you can probably expect a 4.0.1 release soon for any issues that pop up once folks use this in real project work. At a minimum, there’ll be updates for the CoreCLR support once the dust settles on all that churn.

Optimizing Marten Part 2

This is an update to an earlier blog post on optimizing for performance in Marten. Marten is a new OSS project I’m working on that allows .Net applications to treat the Postgresql database as a document database. Our hope at work is that Marten will be a more performant and easier to support replacement in our ecosystem for RavenDb (and possibly a replacement event store mechanism inside of the applications that use event sourcing, but that’s going to come later).

Before we can think about using Marten for real, we're undergoing some efforts to optimize the performance of both reading and writing data from Postgresql.

Optimizing Queries with Indexes

In my previous post, my former colleague Joshua Flanagan suggested using the Postgresql containment operator and gin indexes as part of my performance comparisons. I added the ability to define database indexes for Marten document types like this:

public class ContainmentOperator : MartenRegistry
{
    public ContainmentOperator()
    {
        // For persisting a document type called 'Target'
        For<Target>()

            // Use a gin index against the json data field
            .GinIndexJsonData()

            // directs Marten to try to use the containment
            // operator for querying against this document type
            // in the Linq support
            .PropertySearching(PropertySearching.ContainmentOperator);
    }
}

and like this for indexing what we're calling "searchable" fields, where Marten duplicates some element of a document into a separate database column for optimized searching:

public class DateIsSearchable : MartenRegistry
{
    public DateIsSearchable()
    {
        // This can also be done with attributes
        // This automatically adds a "BTree" index
        For<Target>().Searchable(x => x.Date);
    }
}

As of now, when you choose to make a field or property of a document “searchable”, Marten is automatically adding a database index to that column on the document storage table. By default, the index is the standard Postgresql btree index, but you do have the ability to override how the index is created.

Now that we have support for querying using the containment operator and support for defining indexes, I reran the query performance tests and updated the results with some new data:

Serializer: JsonNetSerializer

Query Type                           1K     10K     100K
JSON Locator Only                     7     77.2    842.4
jsonb_to_record + lateral join      9.4     88.6   1170.4
searching by duplicated field         1     16.4    135.4
searching by containment operator   4.6     14.8    132.4

Serializer: JilSerializer

Query Type                           1K     10K     100K
JSON Locator Only                     6     54.8    827.8
jsonb_to_record + lateral join      8.6     76.2   1064.2
searching by duplicated field         1      6.8     64
searching by containment operator     4      7.8     66.8

Again, searching by a field that is duplicated as a simple database column with a btree index is clearly the fastest approach. The containment operator plus a gin index comes in second, and may be the best choice when you will have to issue many different kinds of queries against the same document type. Based on this data, I think that we're going to make the containment operator the preferred way of querying json documents, but fall back to using the json locator approach for all other query operators besides equality tests.
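
For reference, the containment operator works by testing whether the stored jsonb document contains a given json fragment, which is exactly what a gin index can accelerate. A rough sketch of what the generated SQL would look like for an equality search (illustrative, not Marten's verbatim output):

select d.data from mt_doc_target as d where d.data @> :arg0

where the :arg0 parameter is a json fragment such as {"Date": "2015-12-03"}.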

I still think that we have to ship with Newtonsoft.Json as our default json serializer because of F# and polymorphism concerns among other things, but if you can get away with it for your document types, Jil is clearly much faster.

There is some conversation in the Marten Gitter room about possibly adding gin indexes to every document type by default, but I think we first need to pay attention to the data in the next section:

Insert Timings

The querying is definitely important, but we certainly want the write side of Marten to be fast too. We’ve had what we call “BulkInsert” support using Npgsql & Postgresql’s facility for bulk copying. Recently, I’ve changed Marten’s internal unit of work class to issue all of its delete and “upsert” commands in one single ADO.Net DbCommand to try to execute multiple sql statements in a single network round trip.
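
A minimal sketch of the batching idea (the names here are hypothetical, not Marten's actual unit of work internals, and the real code also marks the json parameter as jsonb):

public NpgsqlCommand BuildBatchedUpsert(IEnumerable<User> documents, ISerializer serializer)
{
    // Concatenate one upsert call per document into a single command text
    // so the whole unit of work goes over in one network round trip
    var command = new NpgsqlCommand();
    var sql = new StringBuilder();
    var i = 0;

    foreach (var doc in documents)
    {
        // mt_upsert_user is a stand-in for the generated upsert sproc
        sql.AppendLine($"select mt_upsert_user(:id{i}, :doc{i});");
        command.Parameters.AddWithValue("id" + i, doc.Id);
        command.Parameters.AddWithValue("doc" + i, serializer.ToJson(doc));
        i++;
    }

    command.CommandText = sql.ToString();
    return command;
}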

My best friend the Oracle database guru (I'll know if he reads this because he'll be groaning about the Oracle part;)) suggested that this approach might not matter much compared to issuing multiple ADO.Net commands against the same stateful transaction and connection, but we were both surprised by how much difference batching the SQL commands turned out to make.

To better understand the impact on insert timing using our bulk insert facility, the new batched update mechanism, and the original “ADO.Net command per document update” approach, I ran a series of tests that tried to insert 500 documents using each technique.

Because we also need to understand the implications on insertion and update timing of using the searchable, duplicated fields and gin indexes (there is some literature in the Postgresql docs stating that gin indexes could be expensive on the write side), I ran each permutation of update strategy against three different indexing strategies on the document storage table:

  1. No indexes whatsoever
  2. A duplicated field with a btree index
  3. Using a gin index against the JSON data column

And again, just for fun, I used both the Newtonsoft.Json and Jil serializers to also understand the impact that they have on performance.

You can find the code I used to make these tables in GitHub in the insert_timing class.

Using Newtonsoft.Json as the Serializer

Index                       Bulk Insert   Batch Update   Command per Document
No Index                             62            149                    244
Duplicated Field w/ Index            53            152                    254
Gin Index on Json                    96            186                    300

Using Jil as the Serializer

Index                       Bulk Insert   Batch Update   Command per Document
No Index                             47            134                    224
Duplicated Field w/ Index            57            151                    245
Gin Index on Json                    79            180                    270

As you can clearly see, the new batch update mechanism looks to be a pretty big win for performance over our original, naive "command per document" approach. The only downside is that this technique has a ceiling on how many or how large the documents can be before the single command exceeds technical limits. For right now, I think I'd like to simply beat that problem with documentation pushing users toward the bulk insert mechanism for large data sets. In the longer term, we'll throttle the batch update by paging updates into some to-be-determined number of document updates at a time.

The key takeaway for me just reinforces the very first lesson I had drilled into me about software performance: network round trips are evil. We are certainly reducing the number of network round trips between our application and the database server by utilizing the command batching.

You can also see that using a gin index slows down the document updates considerably. I think the only good answer to users is that they’ll have to do performance testing as always.

Other Optimization Things

  • We've been able to cut down on Reflection hits and dynamic runtime behavior by using Roslyn as a crude metaprogramming mechanism to just codegen the document storage code.
  • Again in the theme of reducing network round trips, we’re going to investigate being able to batch up deferred queries into a single request to the Postgresql database.
  • We’re not sure about the details yet, but we’ll be investigating approaches for using asynchronous projections inside of Postgresql (maybe using Javascript running inside of the database, maybe .Net code in an external system, maybe both approaches).
  • I’m leaving the issues out in http://up-for-grabs.net, but we’ll definitely add the ability to just retrieve the raw JSON so that HTTP endpoints could stream data to clients without having to take the unnecessary hit of deserializing to a .Net type just to immediately serialize right back to JSON for the HTTP response. We’ll also support a completely asynchronous querying and update API for maximum scalability.

Using Roslyn for Runtime Code Generation in Marten

I’m using Roslyn to dynamically compile and load assemblies built at runtime from generated code in Marten and other than some concern over the warmup time, it’s been going very well so far.

Like so many other developers with more cleverness than sense, I’ve spent a lot of time trying to build Hollywood Principle style frameworks that try to dynamically call application code at runtime through Reflection or some kind of related mechanism. Reflection itself has traditionally been the easiest mechanism to use in .Net to create dynamic behavior at runtime, but it can be a performance problem, especially if you use it naively.

A Look Back at What Came Before…

Taking my own StructureMap IoC tool as an example, over the years I’ve accomplished dynamic runtime behavior in a couple different ways:

  1. Using Reflection.Emit to generate IL directly, from the original versions through StructureMap 2.5. Working with IL is just barely a higher abstraction than assembly code, and I don't recommend it if your goal is maintainability or making it easy for other developers to work in your code. I don't miss generating IL by hand whatsoever. For those of you reading this and saying "pfft, IL isn't so bad if you just understand how it works…", my advice to you is to immediately go outside and get some fresh air and sunshine because you clearly aren't thinking straight.
  2. From StructureMap 2.6, I crudely used the trick of building Expression trees representing what I needed to do, then compiling those Expression trees into objects of the right Func or Action signatures (see the small example after this list). This approach is easier – at least for me – because the Expression model is much closer semantically to the actual code you're trying to mimic than the stack-based IL.
  3. From StructureMap 3.* on, there’s a much more complex dynamic Expression compilation model that’s robust enough to call constructor functions, setter properties, thread in interception, and surround all of that with try/catch logic for expressive exception messages and pseudo stack traces.

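As a tiny, standalone illustration of the Expression trick in item 2 (my own sketch, not StructureMap's internals), this builds and compiles a "() => new StringBuilder()" factory:

using System;
using System.Linq.Expressions;
using System.Text;

public class ExpressionCompilationDemo
{
    public static Func<StringBuilder> BuildFactory()
    {
        // Model "new StringBuilder()" as an Expression tree...
        var newExpression = Expression.New(typeof (StringBuilder));

        // ...then compile it once into a plain delegate. Invoking the
        // delegate afterwards has no per-call Reflection cost
        return Expression.Lambda<Func<StringBuilder>>(newExpression).Compile();
    }
}
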
The current dynamic Expression approach in the StructureMap 3/4 internals is mostly working out well, but I barely remember how it works and it would take me a good day just to get back into that code if I ever had to change something.

What if instead we could just work directly in plain old C# that we largely know and understand, but somehow get that compiled at runtime instead? Well, thanks to Roslyn and its “compiler as a service”, we now can.

I’ve said before that I want to eventually replace the Expression compilation with the Roslyn code compilation shown in this post, but I’m not sure I’m ambitious enough to mess with a working project.

How Marten uses Roslyn Runtime Generation 

As I explained in my last blog post, Marten generates some “glue code” to connect a document object to the proper ADO.Net command objects for loading, storing, or deleting. For each document class, Marten generates an IDocumentStorage class with this signature:

public interface IDocumentStorage
{
    NpgsqlCommand UpsertCommand(object document, string json);
    NpgsqlCommand LoaderCommand(object id);
    NpgsqlCommand DeleteCommandForId(object id);
    NpgsqlCommand DeleteCommandForEntity(object entity);
    NpgsqlCommand LoadByArrayCommand<TKey>(TKey[] ids);
    Type DocumentType { get; }
}

In the test library, we have a class I creatively called "Target" that I've been using to test how Marten handles various .Net Types and queries. At runtime, Marten generates a class called TargetStorage that implements the interface above. Part of the generated code — modified by hand to clean up some extraneous line breaks and to add comments — is shown below:

using Marten;
using Marten.Linq;
using Marten.Schema;
using Marten.Testing.Fixtures;
using Marten.Util;
using Npgsql;
using NpgsqlTypes;
using Remotion.Linq;
using System;
using System.Collections.Generic;

namespace Marten.GeneratedCode
{
    public class TargetStorage : IDocumentStorage, IBulkLoader<Target>, IdAssignment<Target>
    {
        public TargetStorage()
        {

        }

        public Type DocumentType => typeof (Target);

        public NpgsqlCommand UpsertCommand(object document, string json)
        {
            return UpsertCommand((Target)document, json);
        }

        public NpgsqlCommand LoaderCommand(object id)
        {
            return new NpgsqlCommand("select data from mt_doc_target where id = :id").WithParameter("id", id);
        }

        public NpgsqlCommand DeleteCommandForId(object id)
        {
            return new NpgsqlCommand("delete from mt_doc_target where id = :id").WithParameter("id", id);
        }

        public NpgsqlCommand DeleteCommandForEntity(object entity)
        {
            return DeleteCommandForId(((Target)entity).Id);
        }

        public NpgsqlCommand LoadByArrayCommand<T>(T[] ids)
        {
            return new NpgsqlCommand("select data from mt_doc_target where id = ANY(:ids)").WithParameter("ids", ids);
        }

        // I configured the "Date" field to be a duplicated/searchable field in code
        public NpgsqlCommand UpsertCommand(Target document, string json)
        {
            return new NpgsqlCommand("mt_upsert_target")
                .AsSproc()
                .WithParameter("id", document.Id)
                .WithJsonParameter("doc", json).WithParameter("arg_date", document.Date, NpgsqlDbType.Date);
        }

        // This Assign() method would use a HiLo sequence generator for numeric Id fields
        public void Assign(Target document)
        {
            if (document.Id == System.Guid.Empty) document.Id = System.Guid.NewGuid();
        }

        public void Load(ISerializer serializer, NpgsqlConnection conn, IEnumerable<Target> documents)
        {
            using (var writer = conn.BeginBinaryImport("COPY mt_doc_target(id, data, date) FROM STDIN BINARY"))
            {
                foreach (var x in documents)
                {
                    writer.StartRow();
                    writer.Write(x.Id, NpgsqlDbType.Uuid);
                    writer.Write(serializer.ToJson(x), NpgsqlDbType.Jsonb);
                    writer.Write(x.Date, NpgsqlDbType.Date);
                }
            }
        }
    }
}

Now that you can see what code I’m generating at runtime, let’s move on to a utility for generating the code.

SourceWriter

SourceWriter is a small utility class in Marten that helps you write neatly formatted, indented C# code. SourceWriter wraps a .Net StringWriter for efficient string manipulation and provides some helpers for adding namespace using statements and tracking indention levels for you. After experimenting with some different usages, I mostly settled on using the Write(text) method that allows you to provide a section of code as a multi-line string. The TargetStorage code I showed above is generated from within a class called DocumentStorageBuilder with a call to the SourceWriter.Write() method shown below:

            writer.Write(
                $@"
BLOCK:public class {mapping.DocumentType.Name}Storage : IDocumentStorage, IBulkLoader<{mapping.DocumentType.Name}>, IdAssignment<{mapping.DocumentType.Name}>

{fields}

BLOCK:public {mapping.DocumentType.Name}Storage({ctorArgs})
{ctorLines}
END

public Type DocumentType => typeof ({mapping.DocumentType.Name});

BLOCK:public NpgsqlCommand UpsertCommand(object document, string json)
return UpsertCommand(({mapping.DocumentType.Name})document, json);
END

BLOCK:public NpgsqlCommand LoaderCommand(object id)
return new NpgsqlCommand(`select data from {mapping.TableName} where id = :id`).WithParameter(`id`, id);
END

BLOCK:public NpgsqlCommand DeleteCommandForId(object id)
return new NpgsqlCommand(`delete from {mapping.TableName} where id = :id`).WithParameter(`id`, id);
END

BLOCK:public NpgsqlCommand DeleteCommandForEntity(object entity)
return DeleteCommandForId((({mapping.DocumentType.Name})entity).{mapping.IdMember.Name});
END

BLOCK:public NpgsqlCommand LoadByArrayCommand<T>(T[] ids)
return new NpgsqlCommand(`select data from {mapping.TableName} where id = ANY(:ids)`).WithParameter(`ids`, ids);
END


BLOCK:public NpgsqlCommand UpsertCommand({mapping.DocumentType.Name} document, string json)
return new NpgsqlCommand(`{mapping.UpsertName}`)
    .AsSproc()
    .WithParameter(`id`, document.{mapping.IdMember.Name})
    .WithJsonParameter(`doc`, json){extraUpsertArguments};
END

BLOCK:public void Assign({mapping.DocumentType.Name} document)
{mapping.IdStrategy.AssignmentBodyCode(mapping.IdMember)}
END

BLOCK:public void Load(ISerializer serializer, NpgsqlConnection conn, IEnumerable<{mapping.DocumentType.Name}> documents)
BLOCK:using (var writer = conn.BeginBinaryImport(`COPY {mapping.TableName}(id, data{duplicatedFieldsInBulkLoading}) FROM STDIN BINARY`))
BLOCK:foreach (var x in documents)
writer.StartRow();
writer.Write(x.Id, NpgsqlDbType.{id_NpgsqlDbType});
writer.Write(serializer.ToJson(x), NpgsqlDbType.Jsonb);
{duplicatedFieldsInBulkLoadingWriter}
END
END
END

END

");
        }

There are a couple of things to note about the code generation above:

  • String interpolation makes this so much easier than I think it would be with just string.Format(). Thank you to the C# 6 team.
  • Each line of code is written to the underlying StringWriter with the level of indention added to the left by SourceWriter itself
  • The “BLOCK” prefix directs SourceWriter to add an opening brace “{” to the next line, then increment the indention level
  • The “END” text directs SourceWriter to decrement the current indention level, then write a closing brace “}” to the next line and a blank line after that.

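As a tiny, hypothetical illustration of those rules (my own example, not code from Marten), a call like this:

writer.Write(@"
BLOCK:public class Foo
public void Bar(){}
END
");

would emit:

public class Foo
{
    public void Bar(){}
}
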
Now that we’ve got ourselves some generated code, let’s get Roslyn involved to compile it and actually get at an object of the new Type we want.

Roslyn Compilation with AssemblyGenerator

Based on a blog post by Tugberk Ugurlu, I built the AssemblyGenerator class in Marten shown below that invokes Roslyn to compile C# code and load the new dynamically built Assembly into the application:

public class AssemblyGenerator
{
    private readonly IList<MetadataReference> _references = new List<MetadataReference>();

    public AssemblyGenerator()
    {
        ReferenceAssemblyContainingType<object>();
        ReferenceAssembly(typeof (Enumerable).Assembly);
    }

    public void ReferenceAssembly(Assembly assembly)
    {
        _references.Add(MetadataReference.CreateFromFile(assembly.Location));
    }

    public void ReferenceAssemblyContainingType<T>()
    {
        ReferenceAssembly(typeof (T).Assembly);
    }

    public Assembly Generate(string code)
    {
        var assemblyName = Path.GetRandomFileName();
        var syntaxTree = CSharpSyntaxTree.ParseText(code);

        var references = _references.ToArray();
        var compilation = CSharpCompilation.Create(assemblyName, new[] {syntaxTree}, references,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));


        using (var stream = new MemoryStream())
        {
            var result = compilation.Emit(stream);

            if (!result.Success)
            {
                var failures = result.Diagnostics.Where(diagnostic =>
                    diagnostic.IsWarningAsError ||
                    diagnostic.Severity == DiagnosticSeverity.Error);


                var message = failures.Select(x => $"{x.Id}: {x.GetMessage()}").Join("\n");
                throw new InvalidOperationException("Compilation failures!\n\n" + message + "\n\nCode:\n\n" + code);
            }

            stream.Seek(0, SeekOrigin.Begin);
            return Assembly.Load(stream.ToArray());
        }
    }
}

At runtime, you use the AssemblyGenerator class by telling it which other assemblies it should reference and giving it the source code to compile:

// Generate the actual source code
var code = GenerateDocumentStorageCode(mappings);

var generator = new AssemblyGenerator();

// Tell the generator which other assemblies that it should be referencing 
// for the compilation
generator.ReferenceAssembly(Assembly.GetExecutingAssembly());
generator.ReferenceAssemblyContainingType<NpgsqlConnection>();
generator.ReferenceAssemblyContainingType<QueryModel>();
generator.ReferenceAssemblyContainingType<DbCommand>();
generator.ReferenceAssemblyContainingType<Component>();

mappings.Select(x => x.DocumentType.Assembly).Distinct().Each(assem => generator.ReferenceAssembly(assem));

// build the new assembly -- this will blow up if there are any
// compilation errors with the list of errors and the actual code
// as part of the exception message
var assembly = generator.Generate(code);

Finally, once you have the new Assembly, use Reflection to find the new Type you want, either by searching through Assembly.GetExportedTypes() or by name. Once you have the Type object, you can build that object through Activator.CreateInstance(Type) or any of the other normal Reflection mechanisms.
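
In code, that last step is just a couple of lines (a sketch; the lookup by type name is illustrative):

var storageType = assembly
    .GetExportedTypes()
    .Single(x => x.Name == "TargetStorage");

// Build the document storage object out of the newly compiled assembly
var storage = (IDocumentStorage) Activator.CreateInstance(storageType);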

The Warmup Problem

So I’m very happy with using Roslyn in this way so far, but the initial “warmup” time on the very first usage of the compilation is noticeably slow. It’s a one time hit on startup, but this could get annoying when you’re trying to quickly iterate or debug a problem in code by frequently restarting the application. If the warmup problem really is serious in real applications, we may introduce a mode that just lets you export the generated code to file and have that code compiled with the rest of your project for much faster startup times.

Optimizing for Performance in Marten

For the last couple weeks I've been working on a new project called Marten that is meant to exploit Postgresql's JSONB data type as a full-fledged document database for .Net development and a drop-in replacement for RavenDb in our production environment. I think that I would say that our primary goal with Marten is improved stability and supportability, but maximizing performance and throughput is a very close second in the priority list.

This is my second update on Marten progress. From last week, also see Marten Development So Far.

So far, I’ve mostly been focusing on optimizing the SQL queries generated by the Linq support for faster fetching. I’ve been experimenting with a few different query modes for the SQL generation based on what fields or properties you’re trying to search on:

  1. By default in the absence of any explicit configuration, Marten tries to use the “jsonb_to_record” function with a LATERAL join approach to optimize queries against members on the root of the document.
  2. You can also force Marten to only use basic Postgresql JSON locators to generate the where clauses in the SQL statements
  3. Finally, if you know that your application will be frequently querying a document type against a certain member, Marten can use a "searchable" field such that it duplicates that data in a normal database field and searches directly against that database field. This mechanism will clearly slow down your inserts and take up somewhat more storage space, but the numbers I'm about to display don't lie: this is very clearly the fastest way to optimize queries using Marten (so far).

I’ve also experimented with both the Newtonsoft.Json serializer and the faster, but less flexible Jil serializer. Again, the numbers are pretty clear that for bigger result sets, Jil is much faster (NetJSON was a complete bust for me when I tried it). So far I’ve been able to keep Marten serializer-agnostic and I can easily see times when you’d have to opt for Newtonsoft’s flexibility.
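
For reference, the serializer abstraction Marten works against looks roughly like this (a sketch inferred from the usages shown in these posts; the FromJson signature in particular is my assumption):

public interface ISerializer
{
    // Used when persisting and bulk loading documents
    string ToJson(object document);

    // Used when hydrating documents from the jsonb data column
    T FromJson<T>(string json);
}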

Default jsonb_to_record/LATERAL JOIN

Using this approach, the SQL generated is:

select d.data from mt_doc_target as d, LATERAL jsonb_to_record(d.data) as l("Date" date) where l."Date" = :arg0

Json Locators Only

While you can configure this behavior on a field by field basis, the quickest way is to just set the default document behavior:

public class JsonLocatorOnly : MartenRegistry
{
    public JsonLocatorOnly()
    {
        // This can also be done with attributes
        For<Target>().PropertySearching(PropertySearching.JSON_Locator_Only);
    }
}

With this setting, the generated SQL is:

select d.data from mt_doc_target as d where CAST(d.data ->> 'Date' as date) = :arg0

Searchable, Duplicated Field

Again, to configure this option, I used this code:

public class DateIsSearchable : MartenRegistry
{
    public DateIsSearchable()
    {
        // This can also be done with attributes
        For<Target>().Searchable(x => x.Date);
    }
}

When I do this, the table for the Target type has an additional field called “date” that will get the value of the Target.Date property every time a Target object is inserted or updated in the database.
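
The document storage table ends up shaped roughly like this (a sketch based on the columns used in these posts, not Marten's exact DDL):

create table mt_doc_target (
    id   uuid  not null primary key,
    data jsonb not null,
    date date
);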

The resulting SQL is:

select d.data from mt_doc_target as d where d.date = :arg0

The Performance Results

I created the table below by generating randomized data, then trying to search by a DateTime field using three different mechanisms:

var theDate = DateTime.Today.AddDays(3);
var queryable = session.Query<Target>().Where(x => x.Date == theDate);

In all cases, I used the same sample data for the document count and took an average of running the same query five times after throwing out an initial attempt where Postgresql seemed to be “warming up” the JSONB data.
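
Conceptually, the measurement loop is nothing fancier than this sketch (the actual code lives in the performance_tuning file linked below):

// 'session' is an open document session; the query is the one shown above
Action runQuery = () => session.Query<Target>().Where(x => x.Date == theDate).ToArray();

// Throw out an initial attempt so Postgresql can "warm up" the JSONB data
runQuery();

var watch = new Stopwatch();
var times = new List<long>();

for (var i = 0; i < 5; i++)
{
    watch.Restart();
    runQuery();
    watch.Stop();
    times.Add(watch.ElapsedMilliseconds);
}

var average = times.Average();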

Serializer: JsonNetSerializer

Query Type                        1K     10K     100K        1M
JSON Locator Only                9.6    75.2    691.2      9648
jsonb_to_record + lateral join    10    93.6    922.6   12091.2
searching by duplicated field    2.4      15    169.6    2777.8

Serializer: JilSerializer

Query Type                        1K     10K     100K        1M
JSON Locator Only                6.8      61    594.8    7265.6
jsonb_to_record + lateral join   8.4    86.6    784.2    9655.8
searching by duplicated field      1     8.8    115.4    2234.2

To be honest, I expected the JSONB_TO_RECORD + LATERAL JOIN mechanism to be faster than the JSON locator only approach, but I need to go back and try adding some indexes; avoiding the object casts that inevitably defeat indexes is supposed to be the benefit of using JSONB_TO_RECORD in the first place. I'd be happy to get some Postgresql gurus to weigh in here if there are any reading this.

If you’re curious to see my mechanism for recording this data, see the performance_tuning code file in GitHub.

Bulk Loading Documents

From time to time (testing or data migrations maybe) you’ll have some need to very rapidly load a large set of documents into your database. I added a feature this morning to Marten that exploits Postgresql’s COPY feature supported by Npgsql:

public void load_with_small_batch()
{
    // This is just creating some randomized
    // document data
    var data = Target.GenerateRandomData(100).ToArray();

    // Load all of these into a Marten-ized database
    theSession.BulkLoad(data);

    // And just checking that the data is actually there;)
    theSession.Query<Target>().Count().ShouldBe(data.Length);
    theSession.Load<Target>(data[0].Id).ShouldNotBeNull();
}

Behind the scenes, Marten is using code generated at runtime and compiled by Roslyn to do the bulk loading as efficiently as possible without any hit from using reflection:

public void Load(ISerializer serializer, NpgsqlConnection conn, IEnumerable<Target> documents)
{
    using (var writer = conn.BeginBinaryImport("COPY mt_doc_target(id, data) FROM STDIN BINARY"))
    {
        foreach (var x in documents)
        {
            writer.StartRow();
            writer.Write(x.Id, NpgsqlDbType.Uuid);
            writer.Write(serializer.ToJson(x), NpgsqlDbType.Jsonb);
        }
    }
}

Do note that the code generation mechanism is smart enough to also add any fields or properties of the document type that are marked as duplicated for searching.

Other Outstanding Optimization Tasks 

  • Optimize the mechanics for applying all the changes in a unit of work. I’m hoping that we can do something to reduce the number of network round trips between the application and the postgresql server. My fallback approach is going to be to use a custom PLV8 sproc, but not until we exhaust other possibilities with the Npgsql library.
  • I want some mechanism for queuing up queries and submitting them in one network round trip
  • The ability to make a named, reusable Linq query so you can reuse the underlying ADO.Net command generated from parsing the Linq expression without having to go through all the Expression parsing gymnastics on each usage
  • Really more for scalability than performance, but we’ll get around to asynchronous query methods. I’m just not judging that to be a critical path item right now.
  • It’s probably minor in the grand scheme of things, but the actual Linq expression to Sql query generation is grotesque in how it concatenates strings

Feel very free to make suggestions and other feedback on these items;-)

Testing HTTP Handlers with No Web Server in Sight

FubuMVC 2.0 and 3.0 introduced some tooling I called “Scenarios” that allow users to write mostly declarative integration tests against the entire HTTP pipeline in memory without having to host the application in a web server. I promised a coworker that I would write a blog post about using Scenarios for an internal team that wants to start using it much more in their work. A week of procrastination later and here you go:

NOTE: All samples are using FubuMVC 3.0

Why Integration Tests?

From the very beginning, we tried very hard to make unit testing FubuMVC action methods in isolation as easy as possible. I think we largely succeeded in that goal. However, within the context of handling an HTTP request, FubuMVC, like most web frameworks, will potentially wrap those action methods with various middleware strategies for cross-cutting technical concerns like authentication, authorization, logging, transaction management, and content negotiation. At some point, to truly exercise an HTTP endpoint you really do need to write an integration test that exercises the entire chain of HTTP handlers for an HTTP request exactly the way it will be configured inside the running application.

Toward that end, I built a class called EndpointDriver in early versions of FubuMVC that you could use to write integration tests against a FubuMVC application hosted with an embedded Katana server. This early tooling just wrapped WebClient with a FubuMVC specific fluent interface for resolving url’s, setting common options like the content-type and accepts headers, and verifying parts of the HTTP response. Below is a sample from our content negotiation support integration tests in FubuMVC 1.3 (“endpoints” is a reference to the EndpointDriver object for the running application):

[Test]
public void force_to_json_with_querystring()
{
    endpoints.Get("conneg/override/Foo?Format=Json", acceptType: "text/html")
        .ContentTypeShouldBe(MimeType.Json)
        .ReadAsJson<OverriddenResponse>()
        .Name.ShouldEqual("Foo");
}

EndpointDriver was fine at first, but our test library started getting slower as we added more and more tests and the fluent interface just never kept up with everything we needed for HTTP testing (plus I think that WebClient is awkward to use).

Using OWIN for HTTP “Scenarios”

As part of my FubuMVC 2.0 effort last year, I knew that I wanted a much better mechanism than the older EndpointDriver for doing integration testing of HTTP endpoints. Specifically, I wanted:

  • To be able to run HTTP requests and verify the response without having to take the performance hit of a web server
  • To run a FubuMVC application as it would be configured in production
  • To completely configure any part of an HTTP request
  • To be able to declaratively express multiple assertions against the expected response
  • To utilize FubuMVC’s support for “reverse URL resolution” for more traceable tests
  • Access to the raw HTTP request and response for anything unusual you would need to do that didn’t have a specific helper

The end result was a mechanism I called “Scenario’s” that exploited FubuMVC’s OWIN support to run HTTP requests in memory using this signature off of the new FubuRuntime object I explained in an earlier blog post:

OwinHttpResponse Scenario(Action<Scenario> configuration)

The Scenario object models the HTTP request and provides a way to specify expectations about the HTTP response for commonly used things like HTTP status codes, header values, and checking for the presence of string values in the HTTP response body. If need be, you also have access to FubuMVC's abstractions for the entire HTTP request and response (more on this later).

To make this concrete, let’s say that you’re working through a “Hello, World” exercise with FubuMVC with this class and action method that just returns the text “Hello, World” when you issue a GET to the root “/” url of an application:

public class HomeEndpoint
{
    public string Index()
    {
        return "Hello, World";
    }
}

A scenario test for the action above would look like this code below:

using (var runtime = FubuRuntime.Basic())
{
    // Execute the home route and verify
    // the response
    runtime.Scenario(_ =>
    {
        _.Get.Url("/");

        _.StatusCodeShouldBeOk();
        _.ContentShouldBe("Hello, World");
        _.ContentTypeShouldBe("text/plain");
    });
}

In the scenario above, I’m issuing a GET request to the “/” url of the application and specifying that the resulting status code should be HTTP 200, “content-type” response header should be “text/plain”, and the exact contents of the response body should be “Hello, World.” When a Scenario is executed, it will run every single assertion instead of quitting on the first failure and report on every failed expectation in the specification output. This behavior is valuable when you have to author specifications with slower running scenario setup.

Specifying Url’s

FubuMVC has a model for reverse URL lookup from any endpoint method or the input model that we exploited in Scenario’s for traceable tests:

host.Scenario(_ =>
{
    // Specify a GET request to the Url that runs an endpoint method:
    _.Get.Action<InMemoryEndpoint>(e => e.get_memory_hello());

    // Or specify a POST to the Url that would handle an input message:
    _.Post

        // This call serializes the input object to Json using the 
        // application's configured JSON serializer and setting
        // the contents on the Request body
        .Json(new HeaderInput {Key = "Foo", Value1 = "Bar"});

    // Or specify a GET by an input object to get the route parameters
    _.Get.Input(new InMemoryInput { Color = "Red" });
});

I like the reverse url lookup instead of specifying Url’s directly in the scenarios because:

  1. It makes your scenario tests traceable to the actual handling code
  2. It insulates your scenarios from changes to the Url structures later

Checking the Response Body

For the 3.0 work I did a couple months ago, I fleshed out the Scenario support with more mechanisms to analyze the HTTP response body:

host.Scenario(_ =>
{
    // set up a request here

    // Read the response body as text
    var bodyText = _.Response.Body.ReadAsText();

    // Read the response body by deserializing Json
    // into a .net type with the application's
    // configured Json serializer
    var output = _.Response.Body.ReadAsJson<MyResponse>();

    // If you absolutely have to work with Xml...
    var xml = _.Response.Body.ReadAsXml();
});

Some Other Things…

I’ll happily explain the details of this list on request, but here are some other attributes of Scenario’s that FubuMVC supports right now:

  • You can specify expected values for HTTP response headers
  • You can assert on status codes and descriptions
  • There are helpers to send Json or Xml serialized data based on an input object message
  • There is a mechanism that allows you to disable all security middleware in the application for a single Scenario, which has been frequently helpful in testing
  • You have access to the underlying IoC container for the running application from the Scenario if you need to resolve and use application services
  • FubuMVC is now StructureMap 4.0-only for its IoC usage, so we’re able to rely on StructureMap’s child container feature to resolve services during a Scenario execution from a unique child container per run. This allows you to replace services in your application with fakes, mocks, and stubs in a way that prevents your fake services from impacting more than one test.
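
The child container trick just described looks roughly like this with StructureMap itself (a general sketch of the pattern; IClock and FakeClock are hypothetical services, and this isn't FubuMVC's internal wiring):

public interface IClock { DateTime Now { get; } }
public class FakeClock : IClock { public DateTime Now { get; set; } }

// Each Scenario run gets its own child container
var child = container.CreateChildContainer();

// Overrides registered on the child never leak into the parent container,
// so a faked service only impacts this one test
child.Configure(_ => _.For<IClock>().Use<FakeClock>());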

Scenarios in Jasper

If you didn’t see my blog post earlier this year, FubuMVC is getting a complete reboot into a new project called Jasper late this year/early next year. I absolutely plan on bringing the Scenario support forward into Jasper very early, but this time around we’re completely dropping all of FubuMVC’s HTTP abstractions in favor of directly using the OWIN environment dictionary as the single model of HTTP requests and responses. My thought right now is that we’ll invest heavily in extension methods hanging off of IDictionary<string, object> for commonly used operations against that OWIN dictionary.
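
To make that concrete, here's the flavor of extension method I have in mind (the owin.* dictionary keys are standard OWIN, but these helper names are hypothetical and not Jasper code):

using System.Collections.Generic;

public static class OwinEnvironmentExtensions
{
    // "owin.RequestPath" is one of the standard OWIN environment keys
    public static string RequestPath(this IDictionary<string, object> env)
    {
        return env["owin.RequestPath"] as string;
    }

    public static void StatusCode(this IDictionary<string, object> env, int statusCode)
    {
        env["owin.ResponseStatusCode"] = statusCode;
    }
}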

To some extent, we’re hoping as well that there will be a good ecosystem of OWIN helpers from other people and projects that will be usable from within Jasper.

Marten Development So Far (Postgresql as Doc Db)

Last week I mentioned that I had started a new OSS project called "Marten" that aims to allow .Net developers to treat Postgresql 9.5 (we're using the new "upsert" functionality) as a document database using Postgresql's JSONB data type. We've already had some interest and feedback on Github and the Gitter room — plus links to at least three other ongoing efforts to do something similar with Postgresql, which I'm interpreting as obvious validation for the basic idea.

Please feel very free to chime in on the approach or requirements here, on Github, or in Gitter. We're going to proceed with this project regardless at work, but I'd love to see it also be a viable community project with input from outside our little development organization.

What’s Already Done

I'd sum up the Marten work as "so far, so good." If you look closely into the Marten code, do know that I have been purposely standing up the functionality with simple mechanics and naive implementations. My philosophy here is to get the functionality up with good test coverage before starting any heavy optimization work.

As of now:

  • Our thought is that the main service facade to Marten is the IDocumentSession interface that very closely mimics the same interface in RavenDb. This work is for my day job at Extend Health, and our immediate goal is to move systems off of RavenDb early next year, so I think that design decision is pretty understandable. That doesn’t mean that that’ll be the only way to interact with Marten in the long run.
  • In the “development mode”, Marten is able to create database tables and an “upsert” stored procedure for any new document type it encounters in calls to the IDocumentSession.
  • The real DocumentSession facade can store documents, load documents by either a single id or an array of id's, and delete documents by the same.
  • DocumentSession implements a “unit of work” with similar usage to RavenDb’s.
  • You can completely bypass the Linq provider I’m describing in the next section and just use raw SQL to fetch documents
  • A DocumentCleaner service that you can use to tear down document data or even the schema objects that Marten builds inside of automated testing harnesses

Linq Support

I don’t think I need to make the argument that Marten is going to be more usable and definitely more popular if it has decent Linq support. While I was afraid that building a Linq provider on top of the Postgresql JSON operators was going to be tedious and hard, the easy to use Relinq library has made it just “tedious.”

As early as next week I’m going to start working over the Linq support and the SQL it generates to try to optimize searching.

The Linq support hangs off of the IDocumentSession.Query<T>() method like so:

        public void query()
        {
            // Store a handful of documents with DateOffset values before and after "now"
            theSession.Store(new Target{Number = 1, DateOffset = DateTimeOffset.Now.AddMinutes(5)});
            theSession.Store(new Target{Number = 2, DateOffset = DateTimeOffset.Now.AddDays(1)});
            theSession.Store(new Target{Number = 3, DateOffset = DateTimeOffset.Now.AddHours(1)});
            theSession.Store(new Target{Number = 4, DateOffset = DateTimeOffset.Now.AddHours(-2)});
            theSession.Store(new Target{Number = 5, DateOffset = DateTimeOffset.Now.AddHours(-3)});

            // Flush the pending documents to Postgresql in one transaction
            theSession.SaveChanges();

            // Only the documents with a future DateOffset should come back
            theSession.Query<Target>()
                .Where(x => x.DateOffset > DateTimeOffset.Now).ToArray()
                .Select(x => x.Number)
                .ShouldHaveTheSameElementsAs(1, 2, 3);
        }

For right now, the Linq IQueryable support includes:

  • IQueryable.Where() support for strings, ints, longs, decimals, DateTimes, enumeration values, and booleans.
  • Multiple or chained Where().Where().Where() clauses like you might use when you’re calculating optional where clauses or letting multiple pieces of code add additional filters
  • “&&” and “||” operators in the Where() clauses
  • Deep nested properties in the Where() clauses like x.Address.City == "Austin" (there's a short sketch using this after the list)
  • First(), FirstOrDefault(), Single(), and SingleOrDefault() support for the IQueryable
  • Count() and Any() support
  • Contains(), StartsWith(), and EndsWith() support for string values — but it’s case sensitive right now. Case-insensitive searches are probably going to be an “up-for-grabs” task;)
  • Take() and Skip() support for paging
  • OrderBy() / ThenBy() / OrderByDescending() support

Right now, I’m using my audit of our largest system at work that uses RavenDb to guide and prioritize the Linq support. The only thing missing for us is searching within child collections of a document.

What we’re missing right now is:

  • Projections via IQueryable.Select(). Right now you have to do IQueryable.ToArray() to force the documents into memory before trying to use Select() projections.
  • Last() and LastOrDefault()
  • A lot of things I probably hadn’t thought about at all;-)

Using Roslyn for Runtime Code Compilation

We'll see if this turns out to be a good idea or not, but as of today Marten is using Roslyn to generate strategy classes that "know" how to build database commands for updating, deleting, and loading document data for each document type, instead of using Reflection, IL emitting, or compiling Expressions on the fly. Other than the "warm up" performance hit of the very first compilation, this is working smoothly so far. We'll be watching it for performance. I'll blog about that separately sometime soon-ish.
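
The Roslyn mechanics themselves are pretty simple. This is only a rough sketch of the general technique rather than Marten's actual internals, and the class name and minimal reference list are purely illustrative:

    using System;
    using System.IO;
    using System.Reflection;
    using Microsoft.CodeAnalysis;
    using Microsoft.CodeAnalysis.CSharp;

    public static class RuntimeCompiler
    {
        public static Assembly CompileAssembly(string sourceCode)
        {
            var syntaxTree = CSharpSyntaxTree.ParseText(sourceCode);

            var compilation = CSharpCompilation.Create(
                assemblyName: "Generated" + Guid.NewGuid().ToString("N"),
                syntaxTrees: new[] { syntaxTree },
                references: new[]
                {
                    // Real usage needs more references than just the core library
                    MetadataReference.CreateFromFile(typeof(object).Assembly.Location)
                },
                options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

            using (var stream = new MemoryStream())
            {
                var result = compilation.Emit(stream);
                if (!result.Success)
                {
                    throw new InvalidOperationException("Code generation failed");
                }

                // Load the compiled assembly straight from memory -- no files on disk
                return Assembly.Load(stream.ToArray());
            }
        }
    }

From there, the generated strategy type can be instantiated with Activator.CreateInstance and cached by document type so the compilation hit only happens once.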

Next Week: Get Some Data and Optimize!

My focus for Marten development next week is on getting a non-trivial database together and working on pure optimization. My thought is to grab data from Github using Octokit.Net to build a semi-realistic document database of users, repositories, and commits from all my other OSS projects. After that, I'm going to try out:

  • Using GIN indexes against the jsonb data to see how that works (there's a small sketch of the idea after this list)
  • Trying to selectively duplicate data into normal database fields for lightweight sql searches and indexes
  • Trying to use Postgresql’s jsonb_to_record functionality inside of the Linq support to see if that makes searches faster
  • I’m using Newtonsoft.Json as the JSON serializer right now thinking that I’d want the extra flexibility later, but I want to try out Jil too for the comparison
  • After the SQL generation settles down, try to clean up the naive string concatenation going on inside of the Linq support
  • Optimize the batch updates through DocumentSession.SaveChanges(). Today it’s just making individual sql commands in one transaction. For some optimization, I’d like to at least try to make the updates happen in fewer remote calls to the database. My fallback plan is to use a *gasp* stored procedure using postgresql’s PLV8 javascript support to take any number of document updates or deletions as a single json payload.
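
For the GIN index experiment, the idea is just to add an index over the jsonb column on a generated document table. A hypothetical example through Npgsql is below; the mt_doc_user table and data column names are assumptions about Marten's generated schema rather than a documented contract:

    using Npgsql;

    public static class IndexExperiment
    {
        public static void AddGinIndex(string connectionString)
        {
            using (var conn = new NpgsqlConnection(connectionString))
            {
                conn.Open();

                // Table and column names are assumptions about the generated schema
                var sql = "CREATE INDEX IF NOT EXISTS idx_mt_doc_user_data " +
                          "ON mt_doc_user USING GIN (data jsonb_path_ops)";

                using (var cmd = new NpgsqlCommand(sql, conn))
                {
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }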

That list above is enough to keep me busy next week, but there’s more in the open Github issue list and we’re all ears about whatever we’ve missed, so feel free to add more feature requests or comment on existing issues.

Why “Marten?”

One of my colleagues was sneering at the name I was using, so I googled for “natural predators of ravens” and the marten was one of the few options, so we ran with it.

My .Net Unboxed 2015 Wrapup

I had a blast this week at the .Net Unboxed conference in Dallas. The content and speaker lineup were strong, the vibe was great, the venue and location were excellent, and it was remarkably well organized. My hat is completely off to the organizers and I sincerely hope they're up for doing this again next year.

For my part, I thought my Storyteller 3 talk went well and I was thrilled with the interest and questions I got about it afterward. I'll definitely share a link to the recording once it's available.

Some thoughts and highlights in no particular order:

  • Strong naming in .Net continues to be a major source of angst and frustration for those of us heavily involved in OSS or simply wanting to consume OSS projects. Now that Nuget makes it somewhat easier to push out incremental releases and bug fix releases, strong naming is causing more and more headaches. I enjoyed my conversations with Daniel Plaisted of Microsoft, who for the very first time convinced me that somebody in Redmond understands how much trouble strong naming is causing. I'm still iffy on having to take on the overhead of ilrepack/ilmerge in publishing or doing the "Newtonsoft Lie to Your Users" version strategy. I think that CoreCLR's much looser usage of strong naming might very well be enough reason for us as a community to hurry up and get our code onto the new runtime. In the meantime, I'll be closely following the new Strongnamer project as a possible way to eliminate some of the pain for non-CoreCLR packages.
  • I'm not buying that DNX is going to be usable until later next year. Conversations this week reinforced my sense that I very much like the ASP.Net team's general vision for vNext, but that they've simply bitten off more than they can handle.
  • Nik Molnar gave me a compliment about my UI work on Storyteller that made my day since I’m infamously bad at UI design and layout. I’m pretty sure he followed that up with “but, [something negative]” but I didn’t pay any attention to that part;)
  • I started to get a little irritated during one talk and wanted to start arguing with the speaker, so I quietly snuck out and went to the other ongoing talk *just* in time to hear Jimmy Bogard telling folks how he made a mistake by copying the old static ObjectFactory idea from StructureMap. I’ve apologized in public dozens of times on that one and I’m sure I’ll have to do it plenty more times. Sigh.
  • I did enjoy Jimmy’s talk on his experiences running OSS projects and appreciated his candor about the earlier decisions and approaches that didn’t necessarily work out. For my money, many of the best conference talks are about lessons learned from mistakes and fixing problems.
  • I definitely appreciated the lack of “let me tell you how wonderful I am and can I have an MVP award now?” talks that so frequently pop up in many .Net-centric “eyes-forward” conferences. I love how interactive the talks were and how engaged the audiences were in asking questions. I especially enjoy it when talks seem to be just a way of jumpstarting conversations.
  • I’ve thought for over a year that the forthcoming “K”/DNX work from Redmond was probably going to suck all the oxygen out of the room for alternative frameworks and I think you’ve definitely seen that happen. On a much more positive note, I think that we might see a resurgence of those things next year as we get to start taking advantage of the improvements to the .Net framework. More and more, I’m hearing about folks treating DNX as almost a reset for .Net OSS, and that might not be a terrible thing.
  • I enjoyed the talk on Falcor.Net and I’m very interested in Falcor in general as a possibly easier – or at least less weird – approach than GraphQL for React.js client to server communication.