Marten 2.0 is Out!


I was just able to push the official Marten 2.0 nuget — and update the documentation once today’s Github outage settled down;) The “2.0” moniker reflects the fact that there are some breaking API changes, but it’s doubtful that a typical user would even notice them. A few operations moved off of IDocumentStore.Advanced, and the Linq extensibility interface changed somewhat.

I’m going to be lazy and leave blog posts with actual content for later this week, but the highlights are:

  • Better performance and less memory usage — I’ll blog about what we did tomorrow
  • Much more flexibility in the event store and hopefully improved usability
  • Explicit insert and update document operations as opposed to the default “upsert” functionality (see the sketch after this list)
  • Multi-tenancy support within a single database
  • Persist and query documents serialized with camel casing (or snake casing) — a big request from several users who wanted to be able to stream the raw document JSON in HTTP services
  • The ability to run Marten with PLV8 disabled in environments where that extension is not (yet) available *cough* Azure *cough*
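For instance, the explicit operations mentioned above look something like this sketch (the User type and the existingUser variable are placeholders; Store() keeps the default upsert behavior):

using (var session = store.OpenSession())
{
    // Store() keeps the familiar "upsert" behavior
    session.Store(new User { UserName = "Bill" });

    // Insert() should fail at SaveChanges() if a document
    // with the same id already exists
    session.Insert(new User { UserName = "Jill" });

    // Update() should fail if the document does not
    // already exist in the database
    session.Update(existingUser);

    session.SaveChanges();
}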

It’s not the slightest bit interesting to end users, but there was a massive change to the Marten internals for checking, updating, and creating schema objects in the underlying Postgresql database. That change has made it much easier to introduce changes of all kinds into Marten, and should allow for an easy extensibility model later.

The entire list of changes and contributions is here on the Github milestone page.

Thank you to…

I’m bound to miss someone here, but here’s the long list of folks who deserve thanks for this release:

  • A special thanks to Joona-Pekka for tackling documentation updates and some uglier fixes in this release
  • James Hopper
  • Szymon Kulec for his help in the performance updates
  • Jarrod Alexander
  • Babu Annamalai for getting us running on AppVeyor, TravisCI, and up on the VS2017 project system
  • Eric Green, Daniel Wertheim, Wastaz, Marc Piechura, and Jeff Doolittle for their input to the event store functionality in this release
  • Bibodha Neupane (my colleague who’s been dogfooding the multi-tenancy support on one of our projects)
  • James Farrer
  • Michał Gajek
  • Eric J. Smith
  • Drew Peterson

and other folks that I surely missed.

Marten has probably been the best OSS project I’ve ever been a part of in terms of community input and involvement and I’m looking forward to seeing where it goes next.

 

What’s next?

Marten 2.1 will actually drop pretty soon with some in-flight functionality that wasn’t quite ready today. And since nothing in this world attracts user bugs like a major version release, assume that a bug fix release is shortly forthcoming;)


An Early Look at Multi-Tenancy in Marten 2.0

The code shown in this post is in flight, and I’m writing this post to try to get more feedback and suggestions on the approach we’ve taken so far before doing anything silly like making an official release.

The Marten community has been working toward a 2.0 release sometime in the next couple of months (hopefully in June, for my own peace of mind). Since it is a full point release, we can entertain breaking API changes and major restructuring of the code. The big ticket items have been improving performance, reducing memory usage inside of Marten, and a yet-to-be-completely-defined overhaul of the event store. The biggest change by far in terms of development time is the introduction of multi-tenancy support within Marten.

From Wikipedia:

The term “software multitenancy” refers to a software architecture in which a single instance of software runs on a server and serves multiple tenants. A tenant is a group of users who share a common access with specific privileges to the software instance.

The gist of multi-tenancy is that you are able to store and retrieve data tied to a tenant (client/customer/etc.), preferably in a way that prevents one tenant’s users from seeing or editing data from other tenants — and yes, I have indeed seen systems that screwed up on this in harmful ways.

To make this a little more concrete, here’s a sample:

[Fact]
public void use_multiple_tenants()
{
    // Set up a basic DocumentStore with multi-tenancy
    // via a tenant_id column
    var store = DocumentStore.For(_ =>
    {
        // This sets up the DocumentStore to be multi-tenanted
        // by a tenant_id column
        _.Connection(ConnectionSource.ConnectionString)
            .MultiTenanted();
    });

    // Write some User documents to tenant "tenant1"
    using (var session = store.OpenSession("tenant1"))
    {
        session.Store(new User{UserName = "Bill"});
        session.Store(new User{UserName = "Lindsey"});
        session.SaveChanges();
    }

    // Write some User documents to tenant "tenant2"
    using (var session = store.OpenSession("tenant2"))
    {
        session.Store(new User { UserName = "Jill" });
        session.Store(new User { UserName = "Frank" });
        session.SaveChanges();
    }

    // When you query for data from the "tenant1" tenant,
    // you only get data for that tenant
    using (var query = store.QuerySession("tenant1"))
    {
        query.Query<User>()
            .Select(x => x.UserName)
            .ToList()
            .ShouldHaveTheSameElementsAs("Bill", "Lindsey");
    }

    using (var query = store.QuerySession("tenant2"))
    {
        query.Query<User>()
            .Select(x => x.UserName)
            .ToList()
            .ShouldHaveTheSameElementsAs("Jill", "Frank");
    }
}

There are three basic possibilities for multi-tenancy that we are considering or building:

  1. Separate database per tenant — For maximum separation of different clients’ data, you can opt to store the information in separate databases with the same schema structure, with the obvious downside being more complicated deployments and quite possibly more hosting infrastructure. At runtime, when you tell Marten what the tenant is, it will look up the database connection information for that tenant behind the scenes and possibly create a missing tenant database on the fly in development mode. We don’t quite have this scenario supported yet, but we’ve done a lot of preparatory work in Marten’s internals to enable this mechanism to work without blowing up application memory by duplicating objects underneath the DocumentStore objects for each tenant.
  2. Separate schema per tenant — Using a separate schema in the same database for each tenant might be a great compromise between data separation and server utilization. Unfortunately, some Marten internals are making this one harder than it should be. Today, you can opt to stick different document types into different schemas. My theory is that if we could eliminate that feature, we could drastically simplify this scenario.
  3. Multi-tenancy in a single table with a tenant id — The third possibility is to store all tenant data in the same tables, but use a new “tenant_id” column to distinguish between tenants. Marten needs to be smart enough to quietly filter all queries based on the current tenant and to always write documents to the current tenant id. Likewise, Marten has been changed so that you cannot modify data from any other tenant than the current tenant for a session. Most of the work to support this option is already done and I expect this to be the most commonly used approach.

Right now, we’re very close to fully supporting #3, and not too far away from #1 either. I have a theory that we could support a kind of hybrid of #1 and either #2 or #3 that could be the basis for sharding Marten databases.

We *could* also do multi-tenancy by having separate tables per tenant in the same schema, but that’s way more work inside of Marten internals and I just flat out don’t want to do that.

So, um, what do you think? What would you use or change?

Marten 1.3 is Out: Bugfixes, Usability Improvements, and a lot less Memory Usage

I just uploaded Marten 1.3.0 to Nuget (but note that Nuget has had issues today with the index updating being delayed). This release is mostly bugfixes, but there’s some new functionality, and significant improvements to performance on document updates and bulk inserts. You can see the entire list of changes here with some highlights below.

I’d like to thank Marten contributors Eric Green, James Hopper, Michał Gajek, Barry Hagan, and Babu Annamalai for their work on this release. A special thanks goes out to Szymon Kulec for all his efforts in both Marten and Npgsql to reduce Marten’s memory allocations.

Thanks to Phillip Haydon, there’s a slew of new documentation on our website about Postgresql for Sql Server folks.

What’s New?

It wasn’t a huge release for new features, but these were added:

  1. New “AsPagedList()” helper for fetching documents by page
  2. Query for deleted, not deleted, or all documents marked as “soft deleted” (see the sketch after this list)
  3. Indexes on Marten’s metadata columns
  4. Querying by the document metadata
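As a rough sketch of the soft-delete querying (assuming a User document configured as soft deleted; the IsDeleted()/MaybeDeleted() helpers reflect my reading of the current API, so treat this as illustrative):

public void query_soft_deleted(IQuerySession session)
{
    // The default: only documents that have *not* been soft deleted
    var active = session.Query<User>().ToList();

    // Only the soft-deleted documents
    var deleted = session.Query<User>()
        .Where(x => x.IsDeleted())
        .ToList();

    // All documents, whether soft deleted or not
    var everything = session.Query<User>()
        .Where(x => x.MaybeDeleted())
        .ToList();
}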

What’s Next?

The next release is going to be Marten 2.0 because we need to make a handful of breaking API changes (don’t worry, it’s very unlikely that most users would hit this). The big ticket item is a lot more work to reduce memory allocations throughout Marten. The other, not-in-the-slightest-bit-sexy change is to standardize and streamline Marten’s facilities for database change tracking with the hope that this work will make it far easier to start adding new features again.

Marten 1.2 — Improved Linq support and way more polish

Marten is a library for .Net that turns Postgresql into a document database and event store.

I just published the Marten 1.2 release to Nuget. While I hoped to fit a lot more new functionality into this release, 1.2 really just adds a lot more polish to Marten by fixing several bugs, making some performance improvements based on my company’s trial-by-fire usage of Marten during our peak “season”, and largely reworking the internals of the Linq support.

Marten continues to have a vibrant community of interested folks and contributors that are helping push the project on. I’m probably missing some names, but I’d like to call out James Hopper, jokokko, Barry Hagan, Alexander Langer, and Robin van der Knaap for their contributions to this release. I’d also like to thank all of you who have opened and commented on Github issues to help improve Marten. If this all keeps up long enough, I may finally stop being so cynical about OSS on the .Net platform;)

Here’s the entire list of changes from the GitHub milestone. The highlights of the 1.2 release are:

  • Support for the SelectMany() operator in Linq queries (this story spurred an absurd amount of rework in our Linq support that I think will make it easier to add more features in subsequent releases); there’s a sketch of this and Distinct() after this list
  • Distinct() Linq query support
  • Named parameter usage in user supplied queries
  • Better logging and exception messages
  • Marten’s sequential Guid algorithm was corrected to order consistently with Postgresql. This should result in better write performance in Marten usage with Guid id’s.
  • Marten tries harder to warn you when you use unsupported Linq operators
  • Several improvements to querying against child collections
  • The ability to use event metadata in the built in aggregation projections
  • Cleaned up some of the database connection mechanics to stop mixing blocking and async calls and to make Marten much more aggressive about closing database connections
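To make the first two items concrete, here’s a quick sketch (the Target document with a Children collection and a Color member is just an illustrative stand-in):

public void select_many_and_distinct(IQuerySession session)
{
    // SelectMany() flattens a child collection across documents
    var children = session.Query<Target>()
        .SelectMany(x => x.Children)
        .ToList();

    // Distinct() de-duplicates a projected member
    var colors = session.Query<Target>()
        .Select(x => x.Color)
        .Distinct()
        .ToList();
}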

 

What’s Next?

I’m not 100% sure I want to commit to another new release before the holiday season, but 1.3 is looking like it’s going to be a lot of improvements for querying against multiple documents, new types of Select() transformations, and working over the internals to optimize performance.

The tentative list of 1.3 enhancements can be seen here.


Marten 1.1 Release Notes

Marten 1.1 was released just now (as in, hold your horses until Nuget gets done indexing it) with an assortment of bug fixes, performance & reliability improvements, and a couple of new convenience methods. As our teams have used Marten more at work, we’ve also had to make some adjustments for running Marten under reduced Postgresql security privileges and with the “AutoCreateSchemaObjects == None” mode. Finally, we had to add a couple new public members to existing API’s, so SemVer rules mean this had to be a minor point bump.

So what’s new or different? You can find the entire 1.1 issue and pull request list in GitHub. The highlights are described below:

Distinct() Support in Linq

From a pull request by John Campion, Marten now supports the Linq Distinct() keyword:

public void use_distinct(IQuerySession session)
{
    // ToList() executes the query; Distinct() by itself
    // only builds up the queryable
    var surnames = session
        .Query<User>()
        .Select(x => x.LastName)
        .Distinct()
        .ToList();
}

Better Connection and Transaction Hygiene

I’m a little embarrassed by this one, but at least we caught it before it did too much harm. Marten had been too aggressive about starting transactions in sessions, which had the effect of making Npgsql send extraneous ROLLBACK; messages to Postgresql to close out the empty transactions. In some failure cases, our team at work was seeing this cause a connection to hang. We made two fixes for this behavior:

First off, if IDocumentSession.SaveChanges() or SaveChangesAsync() is called when there are no outstanding changes queued up, Marten does absolutely nothing. No connection opened, no transaction started, just nothing.

Secondly, Marten now starts transactions lazily within an IDocumentSession. So instead of starting a transaction the first time a session opens a connection to Postgresql, it defers that until SaveChanges() or SaveChangesAsync() is called.

public void lazy_tx(IDocumentSession session)
{
    // Executing this query will *not* start
    // a new transaction
    var users = session
        .Query<User>()
        .Where(x => x.Internal)
        .ToList();

    session.Store(new User {UserName = "lebron"});

    // This starts a transaction against the open
    // connection before doing any writes
    session.SaveChanges();
}

Data Migration Improvements

From our work on moving document storage from RavenDb to Marten (and from other users too), we’ve bumped into a little bit of friction in Marten. The bulk inserts in either of the non-default modes were leaving out the last-modified metadata. That impacts either of these options:

public void bulk_inserts(IDocumentStore store, Target[] documents)
{
    // Leaves any documents that already exist in the database alone
    store.BulkInsert(documents, BulkInsertMode.IgnoreDuplicates);

    // or

    // Overwrites the persisted data for any documents that already exist
    store.BulkInsert(documents, BulkInsertMode.OverwriteExisting);
}

To make it easier to migrate data in documents that use a Hilo sequence for identity assignment, we added a convenience method to establish a new “floor” in the sequence to avoid conflicting with the existing data being brought over from the old system.

public void reset_hilo(IDocumentStore store)
{
    // This resets the Hilo state in the database
    // for the IntDoc document type so that
    // all id's assigned will be greater than the floor
    // value.
    store.Advanced.ResetHiloSequenceFloor<IntDoc>(3000);
}

Do note that it’s possible and even likely that there will be gaps in the id sequence in the database when you do this.


Building Marten’s Async Daemon

A couple weeks ago I wrote a blog post on the new “Async Daemon” feature in Marten. This post is a bit that I cut out of that one, describing the challenges I faced and what I did to slide around the problems. For all the Marten users who have been asking me about writing their own subsystem to read and process events offline, you really want to read this post to understand why that’s much harder than you’d think and why you probably want to just help make the async daemon solid.

The first challenge for the async daemon was “knowing” when there are new events that need to be processed by async projections. When a projection runs, it needs to process the events in the same order that they were captured in. Since the async daemon was inevitably going to use some sort of polling (NOTIFY/LISTEN in Postgresql was not adequate by itself) to read events out of the event table, we needed a very efficient way to be able to page the event fetching without missing events.

We started Marten with the thought that we would try to accomplish that by having the event store enqueue the events in a rolling buffer table that some kind of offline process would poll and read, but we were talked out of that approach in discussions with a Postgresql consultant who was helping us at work. Moreover, as I worked through other use cases to rebuild projections from scratch or add new projections later, we realized that the rolling buffer table would never have worked for the async daemon.

We also experimented with using sequential Guid’s as the global identifier for events in the event store with the idea that we would be able to use that to key off of for the projections by always querying for “Id > [last event id encountered].” In my testing I was unable to get the sequential Guid algorithm to accurately order the event id’s, especially under a heavy parallel load.

In the end, we opted to make the event store table in Marten use a sequential long integer as its primary key, backed by a database SEQUENCE. That gave us a more reliable way to “know” which events were new for each individual projection. In testing I figured out pretty quickly that the async daemon was missing events when there are a lot of concurrent events streaming in, because sequence id’s can be reserved by in-flight transactions. To counteract that problem, I ended up taking a two-step process (sketched in code after this list):

  1. Limit the async daemon to only querying against events that were captured before some time threshold (3 seconds is the default) to avoid missing events that are still in flight
  2. When the async daemon fetches a new page of events, it checks that there are no gaps in the event sequence. If there are, it pauses a little bit and tries again, until either there are no gaps in the sequence or a subsequent fetch turns up the exact same data (leading the async daemon to conclude that the missing events were rejected).
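Conceptually, the gap check on a freshly fetched page of events looks something like this sketch (hypothetical names, not Marten’s actual internals; it assumes the page is ordered by sequence number):

public static bool HasGapsInSequence(long lastEncountered, IReadOnlyList<long> fetchedPage)
{
    var expected = lastEncountered + 1;
    foreach (var sequence in fetchedPage)
    {
        // A skipped number may belong to an in-flight transaction
        // that has reserved a sequence id but not yet committed
        if (sequence != expected) return true;

        expected++;
    }

    return false;
}

If that check fails, the daemon waits briefly and fetches the same page again. Only when a subsequent fetch returns the identical data does it conclude that the missing sequence numbers belonged to rejected events and move on.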

Those two steps — as far as I can tell — have eliminated the problems I was seeing with missing in-flight events. It did completely ruin a family dinner at our favorite Thai restaurant when I couldn’t make myself stop thinking about how to slide around the problems in event ordering;)

The other killer problem was in trying to make the async daemon resilient in the face of potential connectivity problems and occasional projection failures without losing any results. I’ll try to blog about that in a later post.


Proposed Roadmap for Marten 1.0 and Beyond

I’m just thinking out loud here and hoping for some usable feedback.

I feel like Marten is getting very close to an official 1.0 release, and the latest Nuget is marked as 1.0-alpha. The Marten community voted on our minimum feature set for 1.0 earlier this year and we’ve finished everything on that list as of late July (right before I went on a long family vacation;)).

Some thoughts on the big 1.0:

  • I’m a big believer in semantic versioning, so an OSS tool reaching 1.0 is a big deal because that starts the draconian versioning rules about backward compatibility. You want to get pretty close to a livable API before you throw that switch to 1.0.
  • It’s a chicken and egg kind of conundrum. What we need right now is more users spawning yet more feedback about Marten. I’d love to have more usage before flipping Marten to 1.0, but we’ll get a lot more users after we release it as 1.0.
  • In this day and age of package managers like Nuget, it’s a lot less friction to make more frequent releases and update your dependencies, so going 1.0 now knowing that 1.0.* bug fix releases and 1.* feature releases will be coming soon just isn’t that worrisome.
  • I feel pretty good about the document database side of Marten, but the event store functionality is still churning and it’s less mature.
  • We’re basically out of low-hanging-fruit features on the document storage and Linq support.
  • My shop is doing the work right now to transition a very large web application from RavenDb to Marten. Right now I’m thinking that the first version of Marten that goes into production across all of that application will be declared to be 1.0.

All that being said, my best guess for an official Marten 1.0 release is around October 1st. Right now my biggest issues on my plate are really all around schema management and our database team’s requirements for the DDL generation. And more documentation, but that battle never ends. Plenty of pull requests are still flowing in, but I think I’m personally done with any kind of major feature work for awhile unless there’s noticeable demand from the community for specific features.

 

Marten 1.1 and Beyond

Based on our current issue list and requests from the Marten Gitter room, I think this list is where Marten goes next after the 1.0 release:

  • Better support for child collections on documents
  • More types of event store projections — if you’re looking to get into doing some OSS work, I think these are our most approachable stories in the backlog
    • Project to a flat table for better reporting?
    • Projections that use the output of other projections
    • Arbitrary categorization of projected views (by customer, by region, etc.). Some of our users have already done this themselves, but it’s not in Marten itself yet
  • Multi-tenancy support. My thinking right now is that we don’t directly put this into Marten, but make sure that there are adequate hooks to do this easily yourself. There’s a lot more information in the GitHub issue linked to above.
  • Possibly try to support the Linq GroupBy() operator. That might also lead into some kind of map/reduce capability within Marten. We’ve had the feedback that “Marten isn’t a real document db because it doesn’t have map/reduce.” I think that’s nonsense, but we might very well need to have a better story for creating aggregated views into the document state — which may or may not be best done as some kind of formal map/reduce strategy.
  • More support on document structural changes. Marten can already handle transformations of a single document type, but we’ll need to be able to later address document type names being changed, multiple document types getting combined (this is potentially a big deal for one of our systems), and whatever else we bump into next spring when we start optimizing a big system at work;)
  • Being able to do document transformation with more than one document at a time. This would mean being able to use related documents in the same Select() transformation. Also, we’ll probably need to be able to use Javascript transformations across multiple document types.

There’s some other things in the GitHub issue list, but the above is what I’m thinking about right now for 1.1 and beyond.

Thoughts? Concerns? Requests? Let us know either here, the GitHub issue list, or the Marten Gitter room.

 

Proposed Marten Tooling for Database Management

This is an update to an earlier post on schema management using Marten.

At this point, I think the biggest challenges facing us at work for using Marten are strictly in the realm of database change management. To that end, we’re adding what will be a new package for command line tooling around Marten schema management and investigating possible usage of Sqitch to handle database migrations in our ecosystem. The command line usage shown in this post is in Marten master, but not pushed up to Nuget in any way yet. The Sqitch usage here is purely hypothetical.

When you’re using Marten, all the data definition language (DDL) for the underlying Postgresql database is generated by code within Marten to match your application’s configuration. In development, you’d just run with the setting to auto-create database objects on the fly to match the code for faster iteration. For production deployment, however, you probably don’t get to do that, and you’ll need some kind of database migration strategy to get the changes that your Marten application needs into the real database. That’s the gap that this post is trying to fill.
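For example, the development-time setting looks something like this sketch (the connection string is a placeholder, and treat the exact enum names as illustrative):

var store = DocumentStore.For(_ =>
{
    _.Connection("your connection string here");

    // In development: let Marten create or update schema objects
    // on the fly to match the document types it encounters
    _.AutoCreateSchemaObjects = AutoCreate.All;

    // In production: lock the schema down and rely on a
    // migration strategy instead
    // _.AutoCreateSchemaObjects = AutoCreate.None;
});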

 

Command Line Tooling

My concept for supporting command line tooling suitable for build automation at this point is to publish a new library package called Marten.CommandLine that you can use to expose your own application and database through the command line. To use this tooling, follow these steps:

  1. Create a new console application in your solution
  2. Add the forthcoming Marten.CommandLine nuget
  3. Add a reference to the projects in your system that would express the configuration for your Marten-enabled application
  4. In the “Main()” entry point of your new console application, add code like this below to build up your Marten configuration via the StoreOptions class and then delegate to Marten to parse the command line arguments and execute the proper command:
    public class Program
    {
        public static int Main(string[] args)
        {
            var options = buildStoreOptions();

            return MartenCommands.Execute(options, args);
        }

        private static StoreOptions buildStoreOptions()
        {
            // build your own StoreOptions that establishes
            // the configuration of your Marten application, e.g.:
            var options = new StoreOptions();
            options.Connection("your connection string here");

            return options;
        }
    }

You can see an example of building the console application from the SampleConsoleApp project I used in the Marten codebase to test this functionality.

Once you have the code above, you’re actually ready to go. If you’re using the new dotnet CLI, running “dotnet run” in the root of your console application project yields this output listing the valid commands:

------------------------------------------------------------------------------------------------------------------------------------

  Available commands:
------------------------------------------------------------------------------------------------------------------------------------

   apply -> Applies all outstanding changes to the database based on the current configuration
  assert -> Assert that the existing database matches the current Marten configuration
    dump -> Dumps the entire DDL for the configured Marten database
   patch -> Evaluates the current configuration against the database and writes a patch and drop file if there are any differences
------------------------------------------------------------------------------------------------------------------------------------

 

If you’re not using the dotnet CLI yet, you’d just need to compile your new console application like you’ve always done and call the exe directly. If you’re familiar with the *nix style of command line interfaces ala Git, you should feel right at home with the command line usage in Marten.

For the sake of usability, let’s say that you stick a file named “marten.cmd” (or the *nix shell file equivalent) at the root of your codebase like so:

dotnet run --project src/MyConsoleApp %*

All the example above does is delegate any arguments to your console application. Once you have that file, some sample usages are shown below:

# Assert that the database matches the current configuration. This
# command will fail if there are differences
marten assert --log log.txt

# This command tries to update the database
# to reflect the application configuration
marten apply --log log.txt

# This dumps a single file named "database.sql" with 
# all the DDL necessary to build the database to
# match the application configuration
marten dump database.sql

# This dumps the DDL to separate files per document
# type to a folder named "scripts"
marten dump scripts --by-type

# Create a patch file called "patch1.sql" and
# the corresponding rollback file "patch.drop.sql" if any
# differences are found between the application configuration
# and the database
marten patch patch1.sql --drop patch1.drop.sql

In all cases, the commands expose usage help through “marten help [command].” Each of the commands also exposes a “--conn” (or “-c” if you prefer) flag to override the database connection string and a “--log” flag to record all the command output to a file.

 

My Current Thinking about Marten + Sqitch

Our team doing the RavenDb to Marten transition work has turned us on to using Sqitch for database migrations. From my point of view, I like this choice because Sqitch just uses script files in whatever the underlying database’s SQL dialect is. That means that Marten can use our existing “WritePatch()” schema management to tie into Sqitch’s migration scheme.
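In code, generating those script files is a one-liner against the document store (a sketch; the file name here is arbitrary):

public void write_patch(IDocumentStore store)
{
    // Writes a SQL patch file containing the DDL deltas between
    // the application's Marten configuration and the database,
    // which can then be checked into a Sqitch migration
    store.Schema.WritePatch("1.initial.sql");
}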

The way that I think this could work for us is first to have a Sqitch project established in our codebase with its folders for update, rollback, and verify scripts. In our build script that runs in our master continuous integration (CI) build, we would:

  1. Call sqitch to update the CI database (or whatever database we declare to be the source of truth) with the latest known migrations
  2. Call the “marten assert” command shown above to detect if there are outstanding differences between the application configuration and the database by examining the exit code from that command
  3. If there are any differences detected, figure out what the next migration name would be based on our naming convention and use sqitch to start a new migration with that name
  4. Run the “marten patch” command to write the update and rollback scripts to the file locations previously determined in steps 2 & 3
  5. Commit the new migration file back to the underlying git repository

I’m insisting on doing this on our CI server instead of making developers do it locally because I think it’ll lead to less duplicated work and fewer problems from these migrations being created against work-in-progress feature branches.

For production (and staging/QA) deployments, we’d just use sqitch out of the box to bring the databases up to date.

I like this approach because it keeps the monotony of repetitive database change tracking out of our developers’ hair, while also allowing them to integrate database changes from outside of Marten objects into the database versioning.


Moving from RavenDb to Marten

EDIT 8/19: A couple of other things came up about indexing yesterday that I added here.

For the purpose of this post, I’m only talking about the document database features in Marten. Our immediate need is to replace RavenDb before our busy season starts. Using the event store half of Marten probably won’t happen for us until next year.

The planets have finally aligned for us at work to begin converting our largest and most active application from RavenDb to Marten for persistence. I’m meeting with a couple of our teams this morning to talk over the transition, and this blog post is just an attempt to get my talking points prepared for them.

Moving to Marten

First off, Marten is really just a fancy data access library against the outstanding Postgresql database engine. Marten utilizes Postgresql’s JSONB type to efficiently store and query against our document data. We have deliberately based some of the most basic API usage on RavenDb where that made sense in order to make the transition to Marten easier for our teams, but Marten has deviated quite a bit in more advanced usage.

Here’s what I want our teams to know when we switch things over:

  • Marten is ACID all the way down. No more WaitForNonStaleResults() nonsense, no more subtle bugs or unstable automated tests from stale data. Some folks have poked back at this in Marten by claiming that eventual consistency is necessary for performance or scalability. So far, all our experimentation suggests that Marten’s Postgresql-backed writes – with ACID – are measurably faster than RavenDb.
  • Marten does not force you to declare which indexes you want to use for any given query. Postgresql can figure out the most efficient execution plan on its own. This is going to be advantageous for us in a couple of ways: first, by letting us rip out a lot of RavenDb index code; second, by making it much easier to optimize database performance without nearly as much impact on the code as we have today with RavenDb.
  • We need more documentation and blog posts on this topic, but it is perfectly possible to use the relational database features of Postgresql where that’s still valuable.
  • If it’s useful, it is possible to use Dapper in conjunction with Marten and even in the same unit of work/transaction.
  • Just like RavenDb, Marten’s IDocumentSession is effectively the unit of work and should be scoped to a logical transaction. In most cases in our systems that effectively translates to an IDocumentSession per HTTP request or service bus message.
  • There is no hard request throttling in Marten. You should be aware of how many network round trips you’re making during a single operation and there are diagnostics to track that, but Marten will not blow up in production because an operation happened to make too many requests.
  • There’s no equivalent to RavenDb’s embedded data store option. That was the killer feature in RavenDb we’re going to miss the most. Fortunately, it’s pretty easy to spin up Postgresql on your own box. For automated testing scenarios where today we just use a brand new RavenDb data store, we’ll just take advantage of Marten’s “database cleaner” to wipe out state in between tests. In a way, this will simplify some of our testing against distributed systems. If this becomes a problem for test performance, we have a couple fallback plans to either host Postgresql in disposable Docker images or to enhance our testing harnesses to leapfrog clean schemas between tests.
  • Most importantly, if there’s something in Marten you don’t like, you can either do a pull request or at least raise an issue in GitHub where I’ll see it and we can get it fixed. OSS FTW!
  • We don’t use this in our internal systems (but we should), but the “Include()” feature in Marten for fetching related documents in one round trip is quite different from Raven’s.
  • Batch querying in Marten is more explicit and different mechanically than RavenDb’s “Futures.” We should be using this feature to reduce network chattiness between applications and the database.
  • I am highly recommending the usage of the Compiled Query feature in Marten, which has no equivalent in RavenDb, for better runtime performance and even as a declarative query model. This feature can be used in combination with “Include()” and batch querying to maximize the performance of your Marten-backed persistence; see the sketch after this list.
  • You can use any tooling you want that’s compatible with Postgresql to poke and prod a Marten-ized database. I just use pgAdmin, but Datagrip or even just Visual Studio is useful.
  • Marten has quite a few more useful diagnostic abilities you can use to analyze the SQL being generated or track database activity by session. In a later blog post, I’ll talk about the reusable recipe we’ve built for Marten integration into FubuMVC applications.
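As a rough sketch of what a compiled query looks like (the FindUserByUserName class below is illustrative, not from our codebase):

// The Linq expression is parsed and planned once, then reused
// on every execution of this query type
public class FindUserByUserName : ICompiledQuery<User, User>
{
    public string UserName { get; set; }

    public Expression<Func<IQueryable<User>, User>> QueryIs()
    {
        return q => q.FirstOrDefault(x => x.UserName == UserName);
    }
}

// Usage within a session:
// var user = session.Query(new FindUserByUserName { UserName = "bill" });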

 

Why we’re getting off of RavenDb

I’ve been asked several times since we started working on Marten in public what it would take for us to change our minds and continue with RavenDb. I think it’s quite possible that Voron will make a positive difference, but as I’ll explain a little below, we just don’t trust RavenDb’s quality and software engineering practices.

So why are we wanting to move away from RavenDb?

  • We’ve had multiple day+ outages due to RavenDb indexes getting corrupted and being unable to rebuild. That in a nutshell is more than enough reason to move on.
  • We’ve been concerned for years with RavenDb’s internal quality. We’ve experienced a number of regression bugs when changing versions of RavenDb to the point where we’re unwilling to even try upgrading it.
  • Their release and versioning strategies are not consistent with Semantic Versioning, so you never know if you’re going to get breaking changes in minor- or revision-level version changes
  • Unresponsive support when we’ve had production issues with RavenDb
  • We’ve not had a lot of success with the DevOps type tooling around RavenDb (replication, etc.) and we’re hopeful that adopting Postgresql helps out on that front.
  • Resource utilization. RavenDb requires a lot of handholding to keep the memory utilization reasonable. Naive usage of RavenDb almost invariably leads to problems.
  • The stale data issue resulting from RavenDb’s eventual consistency strategy has been a major source of friction for us.