Tag Archives: Marten

Marten 2.4.0 — now plays nicer with others

I was able to push a Marten 2.4.0 release and updated documentation yesterday with several bug fixes and some new small, but important features. The key additions are:

  1. Opening up a Marten session with a existing native Npgsql connection and/or transaction. When you do this, you can also direct Marten whether or not it “owns” the transaction lifecycle and whether or not Marten is responsible for committing or rolling back the transaction on IDocumentSession.SaveChanges(). Since this was a request from one of the Particular Software folks, I’m assuming you’ll shortly see Marten-backed saga persistence in NServiceBus soon;-)
  2. Opening a Marten session that enlists in the current TransactionScope. Do note that this feature is only available when either targeting the full .Net framework (> .Net 4.6) or Netstandard 2.0.
  3. Ejecting a document from a document session. Other folks have asked for that over time, but strangely enough, it got done quickly when I wanted it for something I was building. Weird how it works that way sometimes.
  4. Taking advantage of Marten’s schema management capabilities to register “feature schema objects” for additional database schema objects.

I don’t know the timing, but there were some new features that got left out because I got impatient to push this release, and we’ve had some recent feature requests that aren’t too crazy. Marten will return next in “2.5.0.”

 

 

 

 

 

 

 

Advertisements

Retrospective on Marten at 2 Years Old

I made the very first commit to Marten two years ago this week. Looking at the statistics, it’s gotten just shy of 2,000 commits since then from almost 60 contributors. It’s not setting any kind of world records for usage, but it’s averaging a healthy (for a .Net OSS project) 100+ downloads a day.

Marten was de facto sponsored by my shop because we intended all along to use it as a way to replace RavenDb in our ecosystem with Postgresql. Doing Marten out in the open as an open source project hosted in GitHub has turned out to be hugely advantageous because we’ve had input, contributions, and outright user testing from so many external folks before we even managed to put Marten into our biggest projects. Arguably — and this frustrates me more than a little bit — Marten has been far more successful in other shops that in my own.

I’ve been very pleasantly surprised by how the Marten community came together and how much positive contribution we’ve gotten on new features, documentation, and answering user questions in our Gitter room. At this point, I don’t feel like Marten is just my project anymore and that we’ve genuinely got a healthy group of contributors and folks answering user questions (which is contributing greatly to my mental health).

Early adopters are usually the best users to deal with because they’re more understanding and patient than the folks that come much later when and if your tool succeeds. There’s been a trend that I absolutely love in Marten where we’ve been able to collect a lot of bug reports as a pull request with failing tests that show you exactly what’s wrong. For a project that’s so vulnerable to permutation problems, that’s been a life send. Moreover, we’ve had enough users using it in lots of different things that’s led to the discovery and resolution of a lot of functionality and usability problems.

I’m a little bit disappointed by the uptake in Marten usage, because I think it’s hugely advantageous for developer productivity over ORM’s like Entity Framework and definitely more productive in many problem domains than using a relational database straight up. I don’t know if that’s mostly because the .Net community just isn’t very accepting of tools like this that are outside of the mainstream, we haven’t been able to break through in terms of promoting it, or if it just isn’t that compelling to the average .Net developer. I strongly suspect that Marten would be far more successful if it had been built on top of Sql Server, and we might test that theory if Sql Server ever catches up to Postgresql in terms of JSON and Javascript support (it’s not even close yet).

For some specific things:

  • Postgresql is great for developers just out of the sheer ease of installing it in developer or testing environments
  • I thought going into Marten that the Linq support would be the most problematic thing. After working on the Linq support for quite awhile, I now think that the Linq support is the most problematic and time consuming thing to work on and it’s likely that folks will never stop coming up with new usage scenarios
  • The Linq support would be so much easier and probably more performant when Postgresql gets their proposed JsonPath querying feature. Again, I don’t think that Sql Server’s JSON support is adequate to support Marten’s feature set, but they at least went for JsonPath in their Json querying.
  • A lot of other people contributed here too, but Marten has been a great learning experience on asynchronous code that’s helping me out quite a bit in other projects
  • The event sourcing feature has been a mixed bag for me. My shop hasn’t ended up adopting it, so I’m not dogfooding that work at all — but guess what seems to be the most popular part of Marten to the outside world? The event sourcing support wouldn’t be viable if we didn’t have so much constructive feedback and help from other people.
  • I think it was advantageous to have the documentation done very early and constantly updated as we went
  • After my FubuMVC flop, I swore that if I tried to do another big OSS project that I’d try much harder to build community, document it early, and promote it more effectively. To that end, you can see or hear more about Marten on DotNetRocks, the NoSQL podcast, the Cross Cutting Concerns podcast, a video on Channel 9Herding Code, a recent conversation on Hanselminutes, and a slew of blog posts as we went.

Let my close by thanking the Marten community. I might fight burnout occasionally or get grumpy about the internal politics around Marten at work, but y’all have been fantastic to interact with and I really enjoy the Marten community.

How we did (and did not) improve performance and efficiency in Marten 2.0

Marten 2.0 was released yesterday, and one of the improvements is somewhat significantly runtime performance and far better memory utilization in applications that use Marten. For today’s blog post, here’s what we did and tried to get there:

  • Avoiding Json strings whenever possible. Some time last year Ayende wrote a “review” of Marten on his blog before almost immediately retracting it. While I didn’t agree with most of his criticisms, he did call out Marten for being inefficient in its Json serialization by reading and writing the full Json strings instead of opting for more efficient mechanisms of reading or writing via byte arrays or Stream’s. The “write” side of this problem was largely solved in Marten 2.0, but after some related changes in the underlying Npgsql library, the “read” side of Marten uses TextReader’s as the input to Json serialization, therefore bypassing the need to create then immediately tear down string objects. These changes reduced the memory allocations in Marten almost by half, with maybe a 15-20% improvement in performance.
  • StringBuilder for all SQL command build up. I know what you’re thinking, “duh, StringBuilder is way more efficient than string concatenation,” but Marten got off the ground by mostly using string interpolation and concatenation. For 2.0, I went back over all that code and switched to StringBuilder’s, which has the nice impact of reducing memory utilization quite a bit (it didn’t make that much difference in performance). I absolutely don’t regret starting with simpler, cruder mechanisms to get things working before pulling in this optimization.
  • FastExpressionCompiler – Marten heavily uses dynamically generated Expression’s that are then compiled to Func or Action’s for document persistence and loading. The excellent FastExpressionCompiler library from Maksim Volkau replaces the built in Expression compilation with a new model that results both in delegates that are faster in runtime, and also reduces the compilation time of these expressions. Using FastExpressionCompiler makes Marten bootstrap faster, which made a huge improvement in Marten’s test suite execution. I measured about a 10% throughput performance in Marten’s benchmarks just by using this library
  • Newtonsoft.Json 9 to 10 and back to 9 – Newtonsoft.Json 10 was measurably slower in the Marten benchmarks, so we reverted back to 9.0.1. Bummer. You can always opt for Jil or other alternatives for considerably faster json serialization, but we found too many cases where Jil errored out on document types that Newtonsoft.Json handled just fine, so we stuck with Newtonsoft as the default based on the idea that the code should at least work;)

 

What’s left to do for performance?

  • I’m sure we could get better with our mechanics for byte[] or char[] pooling and probably some buffering in the ADO.Net manipulation during async methods
  • We know there are some places where the Linq provider generates Sql that isn’t as efficient as it could be. We might try to tackle this tactically in use case by use case, but I’m hoping for the version of Postgresql after 10 to get their improved Json querying functionality based on JsonPath before we do anything big to the Linq support.

Marten 2.0 is Out!

banner

 

I was just able to push the official Marten 2.0 nuget — and update the documentation after the Github outage today settled down;) The “2.0” moniker reflects the fact that there are some breaking API changes, but it’s doubtful that a typical user would even see them. A few operations moved off of IDocumentStore.Advanced and the Linq extensibility interface changed somewhat.

I’m going to be lazy and leave blog posts with actual content for later this week, but the highlights are:

  • Better performance and less memory usage — I’ll blog about what we did tomorrow
  • Much more flexibility in the event store and hopefully improved usability
  • Explicit insert and update document operations as opposed to the default “upsert” functionality
  • Multi-tenancy support within a single database
  • Persist and query documents serialized with camel casing (or snake casing) — a big request from several users who wanted to be able to stream the raw document json in Http services
  • The ability to run Marten with PLV8 disabled in environments where that extension is not (yet) available *cough* Azure *cough*

It’s not the slightest bit interesting to end users, but there was a massive change to the Marten internals for checking, updating, and creating schema objects in the underlying Postgresql database. That change has made it much easier to introduce changes of all kinds into Marten, and should allow for an easy extensibility model later.

The entire list of changes and contributions is here on the Github milestone page.

Thank you to…

I’m going to miss someone here, but the long list of folks who deserve some thanks for this release:

  • A special thanks to Joona-Pekka for tackling documentation updates and some uglier fixes in this release
  • James Hopper
  • Szymon Kulec for his help in the performance updates
  • Jarrod Alexander
  • Babu Annamalai for getting us running on AppVeyor, TravisCI, and up on the VS2017 project system
  • Eric Green, Daniel Wertheim, Wastaz, Marc Piechura, and Jeff Doolittle for their input to the event store functionality in this release
  • Bibodha Neupane (my colleague who’s been dogfooding the multi-tenancy support on one of our projects)
  • James Farrer
  • Michał Gajek
  • Eric J. Smith
  • Drew Peterson

and other folks that I surely missed.

Marten has probably been the best OSS project I’ve ever been a part of in terms of community input and involvement and I’m looking forward to seeing where it goes next.

 

What’s next?

Marten 2.1 will actually drop pretty soon with some in flight functionality that wasn’t quite ready today. And since nothing in this world attracts user bugs like a major version release, assume that a bug fix release is shortly forthcoming;)

 

 

 

 

An Early Look at Multi-Tenancy in Marten 2.0

The code shown in this post is in flight and I’m just writing this post to try to get more feedback and suggestions on the approach we’re going so far before doing anything silly like making an official release.

The Marten community has been working toward a 2.0 release some time in the next couple months (hopefully in June for my own peace of mind). Since it is a full point release, we can entertain breaking API changes and major restructuring of the code. The big ticket items have been improving performance, reducing memory usage inside of Marten, a yet-to-be-completely-defined overhaul of the event store. The biggest change by far in terms of development time is the introduction of multi-tenancy support within Marten.

From Wikipedia:

The term “software multitenancy” refers to a software architecture in which a single instance of software runs on a server and serves multiple tenants. A tenant is a group of users who share a common access with specific privileges to the software instance.

The gist of multi-tenancy is that you are able to store and retrieve data tied to a tenant (client/customer/etc.), preferably in a way that prevents one tenant’s users from seeing or editing data from other tenants — and yes, I have indeed seen systems that screwed up on this in harmful ways.

To make this a little more concrete, here’s a sample:

[Fact]
public void use_multiple_tenants()
{
    // Set up a basic DocumentStore with multi-tenancy
    // via a tenant_id column
    var store = DocumentStore.For(_ =>
    {
        // This sets up the DocumentStore to be multi-tenanted
        // by a tenantid column
        _.Connection(ConnectionSource.ConnectionString)
            .MultiTenanted();
    });

    // Write some User documents to tenant "tenant1"
    using (var session = store.OpenSession("tenant1"))
    {
        session.Store(new User{UserName = "Bill"});
        session.Store(new User{UserName = "Lindsey"});
        session.SaveChanges();
    }

    // Write some User documents to tenant "tenant2"
    using (var session = store.OpenSession("tenant2"))
    {
        session.Store(new User { UserName = "Jill" });
        session.Store(new User { UserName = "Frank" });
        session.SaveChanges();
    }

    // When you query for data from the "tenant1" tenant,
    // you only get data for that tenant
    using (var query = store.QuerySession("tenant1"))
    {
        query.Query<User>()
            .Select(x => x.UserName)
            .ToList()
            .ShouldHaveTheSameElementsAs("Bill", "Lindsey");
    }

    using (var query = store.QuerySession("tenant2"))
    {
        query.Query<User>()
                .Select(x => x.UserName)
                .ToList()
                .ShouldHaveTheSameElementsAs("Jill", "Frank");
    }
}

There are three basic possibilities for multi-tenancy that we are considering or building:

  1. Separate database per tenant — For maximum separation of different client’s data, you can opt to store the information in separate databases with the same schema structure, with the obvious downside being more complicated deployments and quite possibly requiring more hosting infrastructure. At runtime, when you tell Marten what the tenant is, and behind the scenes it will look up the database connection information for that tenant and possibly create a missing tenant database on the fly in development modes. We don’t quite have this scenario supported yet, but we’ve done a lot of preparatory work in Marten’s internals to enable this mechanism to work without having to blow up application memory by duplicating objects underneath the DocumentStore objects for each tenant.
  2. Separate schema per tenant — Using a separate schema in the same database for each tenant might be a great compromise between data separation and server utilization. Unfortunately, some Marten internals are making this one harder than it should be. Today, you can opt to stick different document types into different schemas. My theory is that if we could eliminate that feature, we could drastically simplify this scenario.
  3. Multi-tenancy in a single table with a tenant id — The third possibility is to store all tenant data in the same tables, but use a new “tenant_id” column to distinguish between tenants. Marten needs to be smart enough to quietly filter all queries based on the current tenant and to always write documents to the current tenant id. Likewise, Marten has been changed so that you cannot modify data from any other tenant than the current tenant for a session. Most of the work to support this option is already done and I expect this to be the most commonly used approach.

Right now, we’re very close to fully supporting #3, and not too far away from #1 either. I have a theory that we could support a kind of hybrid of #1 and either #2 or #3 that could be the basis for sharding Marten databases.

We *could* also do multi-tenancy by having separate tables per tenant in the same schema, but that’s way more work inside of Marten internals and I just flat out don’t want to do that.

So, um, what do you think? What would you use or change?

Marten 1.3 is Out: Bugfixes, Usability Improvements, and a lot less Memory Usage

I just uploaded Marten 1.3.0 to Nuget (but note that Nuget has had issues today with the index updating being delayed). This release is mostly bugfixes, but there’s some new functionality, and significant improvements to performance on document updates and bulk inserts. You can see the entire list of changes here with some highlights below.

I’d like to thank Marten contributors Eric Green, James Hopper, Michał Gajek, Barry Hagan, and Babu Annamalai for their contributions in this release. A special thanks goes out to Szymon Kulec for all his efforts in both Marten and Npgsgl to reduce Marten’s memory allocations.

Thanks to Phillip Haydon There’s a slew of new documentation on our website about Postgresql for Sql Server folks.

What’s New?

It wasn’t a huge release for new features, but these were added:

  1. New “AsPagedList()” helper for fetching documents by page
  2. Query for deleted, not deleted, or all documents marked as “soft deleted
  3. Indexes on Marten’s metadata columns
  4. Querying by the document metadata

What’s Next?

The next release is going to be Marten 2.0 because we need to make a handful of breaking API changes (don’t worry, it’s very unlikely that most users would hit this). The big ticket item is a lot more work to reduce memory allocations throughout Marten. The other, not-in-the-slightest-bit-sexy change is to standardize and streamline Marten’s facilities for database change tracking with the hope that this work will make it far easier to start adding new features again.