Proposal for StructureMap 5

EDIT 9/12: Meh, I had a couple twitter exchanges and Gitter questions today that reminded me why I’ve been wanting to walk away from StructureMap. Having to support not just SM but really how SM is used internally inside of tools like ASP.Net MVC Core, MediatR, and half a hundred other frameworks is just wearing me down too much. If anyone else is interested in taking on any of this, I’ll happily help out, but otherwise I think I’m just going to leave this alone. Besides, there’s the new built in IoC in ASP.Net and 30 or so other OSS competitors.

 

So I’ve been more or less burned out on StructureMap development and support for quite some time (it’s been better lately though). That being said, there’s a ton of people using it (it averages just shy of 1,000 downloads a day). It has also been put through the wringer by a lot of users, which, remarkably enough, exposes and leads to fixing a lot of bugs and usability problems. If you don’t believe me, check out this folder of all the tests for bug regressions and fixes.

StructureMap 4.* has a couple ongoing issues that should get addressed some day if the project is going to keep going on:

  1. StructureMap has fallen behind many or most of the other IoC containers in the public performance benchmarks. I think those benchmarks are mostly over simplified BS, but still, there’s the pride factor
  2. ASP.Net Core DI compliance has been a huge pain in the ass to the point where I’ve openly wondered if the ASP.Net team has purposely tried to sabotage and wipe out all the existing ecosystem of IoC containers in .Net
  3. The child container behavior (which I don’t personally use) has been problematic as StructureMap’s more creative users have found several permutations of this and that where the child container model has broken down a bit

So, here’s my thoughts about a possible direction for a future StructureMap 5.0:

  • Keep the API almost completely backward compatible with StructureMap 4.* except for a few places impacted by the next bullet point
  • Completely redesign the internal data structures as a performance optimization. The current structure isn’t terribly different from the very earliest StructureMap versions and there’s absolutely room to cut out some performance fat there
  • Take a dependency on Microsoft.Extensions.DependencyInjection.Abstractions and merge in the functionality that today is in StructureMap.Microsoft.DependencyInjection so that it’s easier to get the compliance against ASP.Net Core right by having everything in one place. My thought here too is that we would somehow use their configuration abstractions, but supplemented with the existing StructureMap configuration options somehow as a kind of buddy class extension. Not sure how that one’s gonna work out yet.
  • Look for opportunities to make the dynamic Expressions that are built up to actually create objects more efficient by inlining some operations and generally reducing the number of times it bounces through dictionary structures. I know a lot more about building dynamic Expressions than I did several years ago when I moved StructureMap off of IL generation, so surely there’s some opportunity there
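To make the "inlining" idea in the last bullet a little more concrete, here is a minimal sketch using only System.Linq.Expressions. The IWidget, Widget, WidgetUser, and BuilderSketch names are made up for illustration and this is not StructureMap's actual internal code. It just shows a dependency being built as a nested "new" expression inside one compiled delegate, instead of being looked up at runtime:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical types, only here to have something to build
public interface IWidget { }
public class Widget : IWidget { }

public class WidgetUser
{
    public IWidget Widget { get; }
    public WidgetUser(IWidget widget) { Widget = widget; }
}

public static class BuilderSketch
{
    // Builds and compiles the equivalent of: () => new WidgetUser(new Widget())
    // The Widget dependency is inlined as a nested "new" expression,
    // so invoking the delegate does no dictionary lookups at all.
    public static Func<WidgetUser> CompileInlinedBuilder()
    {
        var widgetCtor = typeof(Widget).GetConstructor(Type.EmptyTypes);
        var userCtor = typeof(WidgetUser).GetConstructor(new[] { typeof(IWidget) });

        var newWidget = Expression.New(widgetCtor);
        var newUser = Expression.New(userCtor, newWidget);

        return Expression.Lambda<Func<WidgetUser>>(newUser).Compile();
    }
}
```

Calling BuilderSketch.CompileInlinedBuilder() once and caching the resulting Func is the point of the exercise: the expensive part is the compilation, and every later call is just a plain delegate invocation.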

Alright, so my personal conundrum is simply wondering do I care enough to do this as an exercise in crafting performant data structures and micro-optimization, or call StructureMap 4.5 the end of the road and just continue to try to address bugs and user questions for the time being.

I’d kind of like to hear from some StructureMap users or contributors to see how much they’d want the performance and ASP.Net Core compatibility, and then see if anyone would want to help out if we go with this.

 


How we did (and did not) improve performance and efficiency in Marten 2.0

Marten 2.0 was released yesterday, and one of the improvements is somewhat significantly better runtime performance and far better memory utilization in applications that use Marten. For today’s blog post, here’s what we did and tried to get there:

  • Avoiding Json strings whenever possible. Some time last year Ayende wrote a “review” of Marten on his blog before almost immediately retracting it. While I didn’t agree with most of his criticisms, he did call out Marten for being inefficient in its Json serialization by reading and writing the full Json strings instead of opting for more efficient mechanisms of reading or writing via byte arrays or Stream’s. The “write” side of this problem was largely solved in Marten 2.0, but after some related changes in the underlying Npgsql library, the “read” side of Marten uses TextReader’s as the input to Json serialization, therefore bypassing the need to create then immediately tear down string objects. These changes reduced the memory allocations in Marten almost by half, with maybe a 15-20% improvement in performance.
  • StringBuilder for all SQL command build up. I know what you’re thinking, “duh, StringBuilder is way more efficient than string concatenation,” but Marten got off the ground by mostly using string interpolation and concatenation. For 2.0, I went back over all that code and switched to StringBuilder’s, which has the nice impact of reducing memory utilization quite a bit (it didn’t make that much difference in performance). I absolutely don’t regret starting with simpler, cruder mechanisms to get things working before pulling in this optimization.
  • FastExpressionCompiler – Marten heavily uses dynamically generated Expression’s that are then compiled to Func or Action’s for document persistence and loading. The excellent FastExpressionCompiler library from Maksim Volkau replaces the built in Expression compilation with a new model that both produces delegates that are faster at runtime and reduces the compilation time of these expressions. Using FastExpressionCompiler makes Marten bootstrap faster, which made a huge improvement in Marten’s test suite execution. I measured about a 10% improvement in throughput in Marten’s benchmarks just by using this library
  • Newtonsoft.Json 9 to 10 and back to 9 – Newtonsoft.Json 10 was measurably slower in the Marten benchmarks, so we reverted back to 9.0.1. Bummer. You can always opt for Jil or other alternatives for considerably faster json serialization, but we found too many cases where Jil errored out on document types that Newtonsoft.Json handled just fine, so we stuck with Newtonsoft as the default based on the idea that the code should at least work;)
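For the StringBuilder bullet above, this is roughly the shape of the change. It is a simplified sketch and not Marten's real command building code: SqlSketch and BuildSelect are hypothetical names, and the table name in the usage note is only illustrative.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SqlSketch
{
    // Hypothetical helper: appends each piece to one StringBuilder
    // instead of allocating a new string for every concatenation
    public static string BuildSelect(string table, IReadOnlyList<string> whereFragments)
    {
        var sql = new StringBuilder();
        sql.Append("select data from ").Append(table);

        for (var i = 0; i < whereFragments.Count; i++)
        {
            // First fragment starts the where clause, the rest are and'ed on
            sql.Append(i == 0 ? " where " : " and ");
            sql.Append(whereFragments[i]);
        }

        return sql.ToString();
    }
}
```

Something like SqlSketch.BuildSelect("mt_doc_user", new[] { "a = 1", "b = 2" }) yields "select data from mt_doc_user where a = 1 and b = 2" with a single final string allocation.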

 

What’s left to do for performance?

  • I’m sure we could get better with our mechanics for byte[] or char[] pooling and probably some buffering in the ADO.Net manipulation during async methods
  • We know there are some places where the Linq provider generates Sql that isn’t as efficient as it could be. We might try to tackle this tactically, use case by use case, but I’m hoping for the version of Postgresql after 10 to get their improved Json querying functionality based on JsonPath before we do anything big to the Linq support.

Marten 2.0 is Out!


 

I was just able to push the official Marten 2.0 nuget — and update the documentation after the Github outage today settled down;) The “2.0” moniker reflects the fact that there are some breaking API changes, but it’s doubtful that a typical user would even see them. A few operations moved off of IDocumentStore.Advanced and the Linq extensibility interface changed somewhat.

I’m going to be lazy and leave blog posts with actual content for later this week, but the highlights are:

  • Better performance and less memory usage — I’ll blog about what we did tomorrow
  • Much more flexibility in the event store and hopefully improved usability
  • Explicit insert and update document operations as opposed to the default “upsert” functionality
  • Multi-tenancy support within a single database
  • Persist and query documents serialized with camel casing (or snake casing) — a big request from several users who wanted to be able to stream the raw document json in Http services
  • The ability to run Marten with PLV8 disabled in environments where that extension is not (yet) available *cough* Azure *cough*

It’s not the slightest bit interesting to end users, but there was a massive change to the Marten internals for checking, updating, and creating schema objects in the underlying Postgresql database. That change has made it much easier to introduce changes of all kinds into Marten, and should allow for an easy extensibility model later.

The entire list of changes and contributions is here on the Github milestone page.

Thank you to…

I’m going to miss someone here, but the long list of folks who deserve some thanks for this release:

  • A special thanks to Joona-Pekka for tackling documentation updates and some uglier fixes in this release
  • James Hopper
  • Szymon Kulec for his help in the performance updates
  • Jarrod Alexander
  • Babu Annamalai for getting us running on AppVeyor, TravisCI, and up on the VS2017 project system
  • Eric Green, Daniel Wertheim, Wastaz, Marc Piechura, and Jeff Doolittle for their input to the event store functionality in this release
  • Bibodha Neupane (my colleague who’s been dogfooding the multi-tenancy support on one of our projects)
  • James Farrer
  • Michał Gajek
  • Eric J. Smith
  • Drew Peterson

and other folks that I surely missed.

Marten has probably been the best OSS project I’ve ever been a part of in terms of community input and involvement and I’m looking forward to seeing where it goes next.

 

What’s next?

Marten 2.1 will actually drop pretty soon with some in flight functionality that wasn’t quite ready today. And since nothing in this world attracts user bugs like a major version release, assume that a bug fix release is shortly forthcoming;)

 

 

 

 

Message Handlers in the new Jasper Service Bus

A couple months ago I blogged a little bit about yet another OSS service bus project my shop is building out for messaging in .Net Core systems called Jasper that serves as a wire compatible successor to the tooling we use in older .Net applications. While it’s already in production systems at work and doing fine, I have no clue if it’ll have any success as an OSS project. At the very least I’m going to squeeze some blog posts out of the process of building it and here we are.

Service bus frameworks are definitely an example of the Hollywood Principle where a framework handles much of the event handling and workflow while delegating to your application specific code through some kind of interface or idiom. In most of the cases I’ve seen over the years in .Net, you’ll see some kind of interface like the one below that allows you to plug your message handlers into your service bus infrastructure:

public interface IHandler<T>
{
    Task Handle(T message);
}

I’ve certainly used this approach in a handful of cases, and there’s even some direct support for auto-registering this kind of service strategy inside of StructureMap if you want to roll your own framework. It’s easy to understand, adds some level of discoverability, and might help guide users. It’s also somewhat limiting in flexibility and the copious usage of generics can easily lead users into some bad places — and I say that partially based on a decade of helping folks with generics on the StructureMap user lists.

Jasper takes a different approach that relies much more on naming conventions and method signatures. To make that concrete, here’s the very simplest form of message handlers you can use in Jasper and, if you don’t mind, let me leave how this could possibly work efficiently for a follow-up post (spoiler alert: Roslyn is awesome):

public class ExampleHandler
{
    public void Handle(Message1 message)
    {
        // Do work synchronously
    }

    public Task Handle(Message2 message)
    {
        // Do work asynchronously
        return Task.CompletedTask;
    }
}

Out of the box, Jasper finds and uses message handling methods by searching for concrete classes whose names are suffixed with either “Handler” or “Consumer” (there’s some historical reasons for having both) and then discovers message handling actions by analyzing the public methods on those classes for message handling candidates.
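Just to illustrate the convention, here is a rough reflection-based sketch of that kind of discovery. This is not Jasper's actual implementation: HandlerDiscoverySketch is a made-up name, and treating "any public method with at least one parameter" as a candidate is my simplifying assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class Message1 { }

public class ExampleHandler
{
    public void Handle(Message1 message) { }
}

// Not picked up: the class name has no "Handler" or "Consumer" suffix
public class Unrelated
{
    public void Handle(Message1 message) { }
}

public static class HandlerDiscoverySketch
{
    public static IEnumerable<MethodInfo> FindHandlerMethods(IEnumerable<Type> candidates)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance
            | BindingFlags.Static | BindingFlags.DeclaredOnly;

        return candidates
            // Concrete classes, plus static classes (abstract and sealed in IL)
            .Where(t => t.IsClass && (!t.IsAbstract || t.IsSealed))
            .Where(t => t.Name.EndsWith("Handler") || t.Name.EndsWith("Consumer"))
            .SelectMany(t => t.GetMethods(flags))
            // Assumption: a candidate is any ordinary public method
            // that takes at least one parameter (the message)
            .Where(m => !m.IsSpecialName && m.GetParameters().Length > 0);
    }
}
```

Passing typeof(ExampleHandler) and typeof(Unrelated) to FindHandlerMethods only returns the Handle method on ExampleHandler.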

Right off the bat, you can see that Jasper allows you to write either synchronous or asynchronous methods to handle messages, so no more phantom “return Task.CompletedTask;” lines cluttering up your code.

Moreover, you can wring out a little more performance in your system by using static methods instead:

public static class StaticHandler
{
    public static Task Handle(Message3 message)
    {
        return Task.CompletedTask;
    }
}

It might be advantageous to use this approach to reduce memory allocations at run time and should give you slightly more efficient IL. Of course, any handler method, static or otherwise, isn’t terribly helpful unless you can get at the services within your application that you’ll need to invoke to process the message.

To that end Jasper gives you a couple possibilities. First, you can do the idiomatic, constructor injection approach like this:

public class ServiceUsingHandler
{
    private readonly IService _service;

    public ServiceUsingHandler(IService service)
    {
        _service = service;
    }

    public void Handle(Message1 message)
    {
        // do something with _service to handle this thing
    }
}

At the moment, Jasper would revert to spinning up a StructureMap nested container and use that to build out the ServiceUsingHandler objects, something like this:

// _root is a reference to the application's root
// container
using (var nested = _root.GetNestedContainer())
{
    // Build the handler, with its dependencies, from the nested container
    var serviceUsingHandler = nested.GetInstance<ServiceUsingHandler>();
    var message1 = (Message1)context.Envelope.Message;
    serviceUsingHandler.Handle(message1);
}

Using that approach enables Jasper to build objects of your handler classes with whatever dependencies you would need. Alternatively, you can also use “method injection” in your handlers like this:

public void Handle(
    Message2 message, 
    IService service, 
    Envelope envelope
)
{
    // handle the message
}

In the example above, Jasper “knows” how to resolve both the IService and Envelope dependencies before calling into the Handle() method. The Envelope object is Jasper’s version of an envelope wrapper that gives you more metadata about the current message. Instead of pushing everything through the constructor function, you can opt for potentially simpler and cleaner code by opting for method injection instead. In the case of the Envelope, that is not even available through the IoC container.

More about this in a later post, but ironically as the author of literally the oldest IoC container in the .Net ecosystem, I’m trying hard to reduce Jasper’s usage of IoC containers at runtime.

The last thing I wanted to show here was Jasper’s concept of cascading messages that we used with some success in the earlier FubuMVC service bus. It’s very common for the handling of the original message to trigger additional “cascading” messages. In most service bus frameworks, that’d probably be something like this:

public class MessageHandler
{
    // Successfully handling Message1 will generate
    // a Message2 going out, sent explicitly through the bus
    public void Handle(Message1 message, IServiceBus bus)
    {
        bus.Send(new Message2());
    }
}

You can absolutely do that in Jasper as well, but we also support a policy where object(s) returned from a handler method are considered to be outgoing messages that are sent out as part of considering the message request complete. In its simplest usage, that may look like this:

public class MessageHandler
{
    // Successfully handling Message1 will generate
    // a Message2 going out
    public Message2 Handle(Message1 message)
    {
        return new Message2();
    }
}

public class AsyncMessageHandler
{
    // Same thing, but async
    public Task<Message2> Handle(Message1 message)
    {
        return Task.FromResult(new Message2());
    }
}

To get a little more complex, let’s say that you’re neck deep in CQRS jargon and when your service receives a “Command1” you raise one or more domain events that are handled separately. With cascading messages, that can look like this:

public IEnumerable<object> Handle(Command1 command)
{
    yield return new Event1();
    yield return new Event2();
}

In this case, each object returned is an outgoing message. I like the cascading message approach because it makes your message handlers easier to test with pure state-based testing.
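As a sketch of what that state-based testing can look like, with no service bus, mocks, or container involved: the handler is just called and the returned objects are inspected. Command1Handler and Command1HandlerSpec are illustrative names, and I'm using a plain exception instead of a test framework assertion only to keep the sample self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Command1 { }
public class Event1 { }
public class Event2 { }

public class Command1Handler
{
    // Each object returned is treated as an outgoing, cascading message
    public IEnumerable<object> Handle(Command1 command)
    {
        yield return new Event1();
        yield return new Event2();
    }
}

public static class Command1HandlerSpec
{
    // Call the handler directly and check only what it returns
    public static void handling_command1_cascades_both_events()
    {
        var outgoing = new Command1Handler().Handle(new Command1()).ToArray();

        if (outgoing.Length != 2 || !(outgoing[0] is Event1) || !(outgoing[1] is Event2))
        {
            throw new InvalidOperationException("Expected an Event1 followed by an Event2");
        }
    }
}
```

In a real project the body of that method would be an ordinary test in whatever framework you already use, asserting on the returned messages.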

There’s plenty more going on with this feature, but my wife really needs me to get out of the office to go help with the little ones, so there’s going to have to be more later;)

How Should Microservices Communicate?

We do quite a bit of distributed development and inter-service messaging at work. Some of this is done through exposing HTTP services. For asynchronous messaging between systems, my shop uses FubuMVC and its .Net Core replacement “Jasper” as a service bus (translate “Jasper” to “MassTransit” or “NServiceBus” when you read this). This blog post is a draft of our architectural team’s advice to our teams on choosing which option to use for their projects as part of our nascent microservice architecture approach. If any of my colleagues see this and disagree with me, don’t worry because one way or another this is going to be a living document and you’ll get to have input to this.

Microservices will generally need to send or process messages from other microservices or clients. To that end, it’s worth considering your options for inter-service communication.

We commonly use either HTTP services or the Jasper/FubuMVC service bus to communicate between services. Before you choose what tooling to use for service to service communication, first think about what your messaging requirements are. Service to service communication is roughly going to fall into these categories:

  1. Publish/Subscribe – asynchronously broadcast a message to all interested subscribers without expecting an immediate response. For the purpose of differentiation with “fire and forget,” let’s say that this also implies guaranteed delivery, meaning that messages are persisted durably until they are able to be published. The Jasper/FubuMVC service bus tools accomplish guaranteed delivery through the durable “store and forward” mechanism in LightningQueues and eventually RabbitMQ as we transition to Docker’ized hosting.
  2. Request/Reply – invoke another service while expecting a matching response. Querying data from a web service is an example. Sending a message through the service bus with the expectation of a response is also an example. The query handlers are an example of request/reply
  3. Fire and Forget – sending a request and not caring about any kind of response or whether or not the request is really received. This pattern is mostly appropriate for messages where you’re more concerned about performance and it’s not vital for the messages to be processed. The intra-node communication that Jasper/FubuMVC uses to coordinate subscriptions and health checks is done through LightningQueues in its “fire and forget” mode.

 

Use HTTP services if:

  • Your service is going to be exposed to external users of your API
  • Your service will need to be consumed by a web browser client
  • You are exposing data query endpoints to other services, as in the other services need to request information and use that data immediately
  • You do not need guaranteed delivery
  • You do not exactly know upfront what other mechanisms that future clients of your microservice will support. The idea here is that HTTP is essentially ubiquitous across platforms
  • You want to expose your service to non-.Net clients. It might be perfectly possible to use our existing service bus from other platforms, but in this case, HTTP endpoints are probably much less friction

 

Use a Service Bus if:

  • You need durable, publish/subscribe semantics. If your service does not need to wait for a reply or acknowledgement from the downstream system, you probably want publish/subscribe.
  • If you need to send the same messages to multiple subscribers
  • If you need to support “dynamic subscriptions” that allow other services to register with your service to receive event messages from your service
  • If you want fire and forget messaging, use the service bus with the non-persistent mode in LightningQueues. (think “ZeroMQ”)
  • You may need to take advantage of the “delayed messages” feature in Jasper/FubuMVC
  • You need to implement some kind of long-lived, saga workflow
  • While it is possible to throttle HTTP requests, it is probably easier and more effective to accommodate surge loads through the message queues behind the service bus
  • If the ordering of message processing is important, you probably need to be queueing within a service bus

 

Gray Areas

It’s not a perfectly black and white choice between using HTTP versus messaging with a service bus. The service bus also supports the request/reply pattern and you could happily use HTTP for fire and forget messaging. Both approaches can be scaled horizontally with our current technology stack. To muddle the picture even more, Jasper will eventually include an HTTP transport as well for more efficient request/reply support. If you feel like it’s unclear which direction to go, it is more than acceptable to choose the technology that the project team is most comfortable with. In all likelihood, that is going to mean using the more common ASP.Net Core stack for HTTP services rather than the somewhat custom service bus technology we use today.

 

Avoid These Integration Approaches

There will inevitably be reasons why we have to use options in this list because of external clients, but all the same, it is highly recommended that you do not use these integration approaches:

  • Publishing file drops to the file system and monitoring folders
  • Publishing files to FTP servers
  • Integration through shared databases. Relational databases aren’t efficient queueing mechanisms anyway, and we really don’t want the hard coupling between services that comes from sharing an underlying database

 

 

Disagree? Have something to add? Feel very free to help me make this list better by dropping a comment;-)

 

 

An Early Look at Multi-Tenancy in Marten 2.0

The code shown in this post is in flight and I’m just writing this post to try to get more feedback and suggestions on the approach we’re taking so far before doing anything silly like making an official release.

The Marten community has been working toward a 2.0 release some time in the next couple months (hopefully in June for my own peace of mind). Since it is a full point release, we can entertain breaking API changes and major restructuring of the code. The big ticket items have been improving performance, reducing memory usage inside of Marten, and a yet-to-be-completely-defined overhaul of the event store. The biggest change by far in terms of development time is the introduction of multi-tenancy support within Marten.

From Wikipedia:

The term “software multitenancy” refers to a software architecture in which a single instance of software runs on a server and serves multiple tenants. A tenant is a group of users who share a common access with specific privileges to the software instance.

The gist of multi-tenancy is that you are able to store and retrieve data tied to a tenant (client/customer/etc.), preferably in a way that prevents one tenant’s users from seeing or editing data from other tenants — and yes, I have indeed seen systems that screwed up on this in harmful ways.

To make this a little more concrete, here’s a sample:

[Fact]
public void use_multiple_tenants()
{
    // Set up a basic DocumentStore with multi-tenancy
    // via a tenant_id column
    var store = DocumentStore.For(_ =>
    {
        // This sets up the DocumentStore to be multi-tenanted
        // by a tenant_id column
        _.Connection(ConnectionSource.ConnectionString)
            .MultiTenanted();
    });

    // Write some User documents to tenant "tenant1"
    using (var session = store.OpenSession("tenant1"))
    {
        session.Store(new User{UserName = "Bill"});
        session.Store(new User{UserName = "Lindsey"});
        session.SaveChanges();
    }

    // Write some User documents to tenant "tenant2"
    using (var session = store.OpenSession("tenant2"))
    {
        session.Store(new User { UserName = "Jill" });
        session.Store(new User { UserName = "Frank" });
        session.SaveChanges();
    }

    // When you query for data from the "tenant1" tenant,
    // you only get data for that tenant
    using (var query = store.QuerySession("tenant1"))
    {
        query.Query<User>()
            .Select(x => x.UserName)
            .ToList()
            .ShouldHaveTheSameElementsAs("Bill", "Lindsey");
    }

    using (var query = store.QuerySession("tenant2"))
    {
        query.Query<User>()
            .Select(x => x.UserName)
            .ToList()
            .ShouldHaveTheSameElementsAs("Jill", "Frank");
    }
}

There are three basic possibilities for multi-tenancy that we are considering or building:

  1. Separate database per tenant: For maximum separation of different clients’ data, you can opt to store the information in separate databases with the same schema structure, with the obvious downside being more complicated deployments and quite possibly requiring more hosting infrastructure. At runtime, when you tell Marten what the tenant is, behind the scenes it will look up the database connection information for that tenant and possibly create a missing tenant database on the fly in development modes. We don’t quite have this scenario supported yet, but we’ve done a lot of preparatory work in Marten’s internals to enable this mechanism to work without having to blow up application memory by duplicating objects underneath the DocumentStore objects for each tenant.
  2. Separate schema per tenant — Using a separate schema in the same database for each tenant might be a great compromise between data separation and server utilization. Unfortunately, some Marten internals are making this one harder than it should be. Today, you can opt to stick different document types into different schemas. My theory is that if we could eliminate that feature, we could drastically simplify this scenario.
  3. Multi-tenancy in a single table with a tenant id — The third possibility is to store all tenant data in the same tables, but use a new “tenant_id” column to distinguish between tenants. Marten needs to be smart enough to quietly filter all queries based on the current tenant and to always write documents to the current tenant id. Likewise, Marten has been changed so that you cannot modify data from any other tenant than the current tenant for a session. Most of the work to support this option is already done and I expect this to be the most commonly used approach.
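To illustrate the rule behind option #3, here is a tiny in-memory sketch. It has nothing to do with Marten's real internals (TenantedStoreSketch and its Session class are invented for this example); it only shows the two behaviors described above: every write is stamped with the session's tenant id, and every query is quietly filtered by it.

```csharp
using System.Collections.Generic;
using System.Linq;

public class TenantedStoreSketch
{
    // Stand-in for a single table with a tenant_id column
    private readonly List<(string TenantId, string UserName)> _rows
        = new List<(string TenantId, string UserName)>();

    public Session OpenSession(string tenantId) => new Session(_rows, tenantId);

    public class Session
    {
        private readonly List<(string TenantId, string UserName)> _rows;
        private readonly string _tenantId;

        public Session(List<(string TenantId, string UserName)> rows, string tenantId)
        {
            _rows = rows;
            _tenantId = tenantId;
        }

        // Writes always go to the session's own tenant
        public void Store(string userName) => _rows.Add((_tenantId, userName));

        // Reads are always filtered to the session's own tenant
        public IReadOnlyList<string> Query() =>
            _rows.Where(r => r.TenantId == _tenantId)
                 .Select(r => r.UserName)
                 .ToList();
    }
}
```

A session opened for "tenant1" never sees rows stored through a "tenant2" session, even though everything lives in the same list.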

Right now, we’re very close to fully supporting #3, and not too far away from #1 either. I have a theory that we could support a kind of hybrid of #1 and either #2 or #3 that could be the basis for sharding Marten databases.

We *could* also do multi-tenancy by having separate tables per tenant in the same schema, but that’s way more work inside of Marten internals and I just flat out don’t want to do that.

So, um, what do you think? What would you use or change?

Storyteller 4.2: ASP.Net Core, Databases, Json

I was just able to push the official Nugets for Storyteller 4.2 with some cool new features we built for my shop’s internal automated testing, including:

  • Storyteller 4.2
  • dotnet-storyteller 1.1.2
  • Storyteller.AspNetCore 1.0
  • Storyteller RDBMS 1.0
  • StorytellerRunner 1.1.2 (used by dotnet storyteller)
  • StorytellerRunnerCsproj 4.2 (the classic csproj/appdomain runner for .Net 4.6 apps)

The entire list of Github issues in the 4.2 release is here.

The Highlights

  1. Built in support to make declarative checks against the expected structure of a Json string via the JsonComparisonFixture class
  2. Support for using Storyteller to write specifications against ASP.Net Core applications via the new Storyteller.AspNetCore nuget. See also Using Storyteller with ASP.Net Core Systems.

  3. Support for addressing and verifying databases with the new Storyteller.RDBMS nuget. See also A Concept for Integrated Database Testing within Storyteller.
  4. New Fixture base classes for checking model state (CheckModelFixture), setting up model state (ModelFixture), and executing API’s that can be treated as “one model in, one model out” using the new ApiFixture
  5. A new extension model for the Storyteller engine

What’s Coming Next for Storyteller?

  • The big thing coming next is a dotnet test adapter for VS2017 so that you can easily kick off or debug Storyteller specifications from within Visual Studio.Net or JetBrains Rider
  • Fleshing out the Selenium add-on
  • It’s an oddball thing, but we have a proof of concept for an approach to test React/Redux frontends subcutaneously with Storyteller. If that works out, we’ll be publishing that add-on as well