Moving from RavenDb to Marten

EDIT 8/19: A couple of other things about indexing came up yesterday, and I’ve added them here.

For the purpose of this post, I’m only talking about the document database features in Marten. Our immediate need is to replace RavenDb before our busy season starts. Using the event store half of Marten probably won’t happen for us until next year.

The planets have finally aligned for us at work to begin converting our largest and most active application from RavenDb to Marten for persistence. I’m meeting with a couple of our teams this morning to talk over the transition, and this blog post is just an attempt to get my talking points prepared for them.

Moving to Marten

First off, Marten is really just a fancy data access library against the outstanding Postgresql database engine. Marten utilizes Postgresql’s JSONB type to efficiently store and query against our document data. We have deliberately based some of the most basic API usage on RavenDb where that made sense in order to make the transition to Marten easier for our teams, but Marten has deviated quite a bit in more advanced usage.

Here’s what I want our teams to know when we switch things over:

  • Marten is ACID all the way down. No more WaitForNonStaleResults() nonsense, no more subtle bugs or unstable automated tests from stale data. Some folks have pushed back on this aspect of Marten by claiming that eventual consistency is necessary for performance or scalability. So far, all our experimentation suggests that Marten’s Postgresql-backed writes, with ACID, are measurably faster than RavenDb’s.
  • Marten does not force you to declare which indexes you want to use for any given query. Postgresql can figure out the most efficient execution plan itself. This is going to be advantageous for us in a couple of ways: first, by letting us rip a lot of RavenDb index code out; second, by making it much easier to optimize database performance without the heavy impact on the code that this has today with RavenDb.
  • We need more documentation and blog posts on this topic, but it is perfectly possible to use the relational database features of Postgresql where that’s still valuable.
  • If it’s useful, it is possible to use Dapper in conjunction with Marten and even in the same unit of work/transaction.
  • Just like RavenDb, Marten’s IDocumentSession is effectively the unit of work and should be scoped to a logical transaction. In most cases in our systems that effectively translates to an IDocumentSession per HTTP request or service bus message.
  • There is no hard request throttling in Marten. You should be aware of how many network round trips you’re making during a single operation and there are diagnostics to track that, but Marten will not blow up in production because an operation happened to make too many requests.
  • There’s no equivalent to RavenDb’s embedded data store option. That was the killer feature in RavenDb we’re going to miss the most. Fortunately, it’s pretty easy to spin up Postgresql on your own box. For automated testing scenarios where today we just use a brand new RavenDb data store, we’ll just take advantage of Marten’s “database cleaner” to wipe out state in between tests. In a way, this will simplify some of our testing against distributed systems. If this becomes a problem for test performance, we have a couple fallback plans to either host Postgresql in disposable Docker images or to enhance our testing harnesses to leapfrog clean schemas between tests.
  • Most importantly, if there’s something in Marten you don’t like, you can either do a pull request or at least raise an issue in GitHub where I’ll see it and we can get it fixed. OSS FTW!
  • We don’t use this in our internal systems (though we should), but the “Include()” feature in Marten for fetching related documents in one round trip is quite different from Raven’s.
  • Batch querying in Marten is more explicit and different mechanically than RavenDb’s “Futures.” We should be using this feature to reduce network chattiness between applications and the database.
  • I highly recommend using the Compiled Query feature in Marten, which has no equivalent in RavenDb, both for better runtime performance and as a declarative query model. This feature can be used in combination with “Include()” and batch querying to maximize the performance of your Marten-backed persistence.
  • You can use any tooling you want that’s compatible with Postgresql to poke and prod a Marten-ized database. I just use pgAdmin, but DataGrip or even just Visual Studio works too.
  • Marten has quite a few more useful diagnostic abilities you can use to analyze the SQL being generated or track database activity by session. In a later blog post, I’ll talk about the reusable recipe we’ve built for Marten integration into FubuMVC applications.
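
To make a few of those bullets concrete, here is a minimal sketch, not our production code. The Issue document type, the connection string, and the database name are made up for illustration, and the Marten calls are written against the 1.x-era API as I recall it, so check them against the documentation for the version you’re on. It shows a session scoped to one unit of work, a query that sees the committed write without any stale-results wait, a batch query that combines two reads into one round trip, and the “database cleaner” wiping state the way a test harness might.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Marten;

// Hypothetical document type, for illustration only.
public class Issue
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public bool Open { get; set; }
}

public static class Program
{
    public static async Task Main()
    {
        // Assumed connection string; point it at your own local Postgresql.
        using (var store = DocumentStore.For(
            "host=localhost;database=marten_sketch;username=postgres;password=postgres"))
        {
            // Wipe document state, as a test harness might do between tests.
            store.Advanced.Clean.DeleteAllDocuments();

            var issue = new Issue { Id = Guid.NewGuid(), Title = "Replace RavenDb", Open = true };

            // One session per logical transaction (HTTP request or bus message).
            using (var session = store.LightweightSession())
            {
                session.Store(issue);
                session.SaveChanges(); // commits the whole unit of work at once
            }

            using (var query = store.QuerySession())
            {
                // No stale-results wait: the committed write is visible right away.
                var open = query.Query<Issue>().Where(x => x.Open).ToList();
                Console.WriteLine(open.Count);

                // Batch two reads into a single network round trip.
                var batch = query.CreateBatchQuery();
                var byId = batch.Load<Issue>(issue.Id);
                var stillOpen = batch.Query<Issue>().Where(x => x.Open).ToList();
                await batch.Execute();

                Console.WriteLine((await byId).Title);
                Console.WriteLine((await stillOpen).Count);
            }
        }
    }
}
```

Running this needs the Marten NuGet package and a reachable Postgresql database. Compiled queries and “Include()” are left out on purpose to keep the sketch short.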

 

Why we’re getting off of RavenDb

I’ve been asked several times since we started working on Marten in public what it would take for us to change our minds and continue with RavenDb. I think it’s quite possible that Voron will make a positive difference, but as I’ll explain a little below, we just don’t trust RavenDb’s quality and software engineering practices.

So why are we wanting to move away from RavenDb?

  • We’ve had multiple outages lasting a day or more due to RavenDb indexes getting corrupted and being unable to rebuild. That in a nutshell is more than enough reason to move on.
  • We’ve been concerned for years with RavenDb’s internal quality. We’ve experienced a number of regression bugs when changing versions of RavenDb to the point where we’re unwilling to even try upgrading it.
  • Their release and versioning strategies are not consistent with Semantic Versioning, so you never know if you’re going to get breaking changes in minor or revision level version changes.
  • Support has been unresponsive when we’ve had production issues with RavenDb.
  • We’ve not had a lot of success with the DevOps type tooling around RavenDb (replication, etc.) and we’re hopeful that adopting Postgresql helps out on that front.
  • Resource utilization. RavenDb requires a lot of handholding to keep the memory utilization reasonable. Naive usage of RavenDb almost invariably leads to problems.
  • The stale data issue as a result of RavenDb’s eventual consistency strategy has been a major source of friction for us.

 


19 thoughts on “Moving from RavenDb to Marten”

  1. Josh Schwartzberg

    Marten is a great project. I’ve been following it for several months and it aligns with exactly what I wanted; the best of geteventstore + ravendb, but stored in postgres w/ non-eventual consistency as a first class citizen. I will be investigating a migration path for some of our production event-sourced systems and plan to contribute back to the project wherever possible. Thank you Jeremy.

      1. dotjosh

        To clarify, I want synchronous projections within the same transaction that I’m appending to a stream. Can I do that with the projection feature in event store or only if I’m doing my own projections?

    1. gregfyoung

      “synchronous projections in the same transaction”

      Is a valid pattern, however it’s only viable in the most trivial of scenarios and falls apart the moment you leave them. As an example, how do you replay a projection while the system is running (this quickly leads into problems)? Other examples might be “how do I get neo4j in a transaction with sqlserver”, “how can I get 2 instances of my read model for availability purposes?”, “what happens if the read model is overloaded, should we stop accepting writes?”. As I said, it doesn’t hang around long. Most I have seen try it ended up with massive amounts of accidental complexity due to it, all while thinking they were removing complexity.

  2. Double Attack Khalid (@buhakmeh)

    The parallels are really similar to our experience as a team. Trying to explain “eventual consistency” to our users, like Jeremy said, is nonsense as far as the user is concerned. Some of the pain there can be attributed to the development of the systems, so I don’t want to completely put the blame on RavenDB.

    @Jeremy You said something interesting about the testing scenario. To mimic the in-memory features of raven, were you thinking that every time a test ran it would create a new schema, run migrations, seed the data, run the test, then teardown everything? That would be pretty cool.

    We are in the process of migrating from RavenDB to SQL Azure, and sadly the business has been soured by the idea of NoSQL as they feel their data is less accessible if it goes into a NoSQL store. Sad, because personally, I love working with NoSQL databases and there are clearly advantages to NoSQL as a strategy for building applications.

    Anyways, my 2 cents 🙂

    1. christianduhard

      I am always surprised when someone says they can’t explain eventual consistency to users. I find the people who have the most trouble understanding eventual consistency are developers. People naturally work in an eventually consistent manner.

      1. gregfyoung

        @jeremymiller

        People expect that if they just changed data what they see reflects that. What they want is read-your-own-writes consistency in most cases where most developers talk about consistency. There are also a ton (likely most) where people just don’t care …

        Unless you use pessimistic locks everywhere you have already broken your statement that they should be seeing the latest data (the data could have changed while it was on the wire to the user or while they were looking at it).

        When is the last time you saw two users, one adding a new customer and the other searching for the customer, where the first says “I am going to click save now, time your search so it goes at the same time”? If it takes 500ms this is normally acceptable and often required at scale.

        Read-your-own-writes consistency is trivial to implement in any event sourced system though there are many strategies you can use to make it so you don’t have to focus on this.

      2. Khalid Abuhakmeh (@buhakmeh)

        @christianduhard It’s not that devs can’t explain eventual consistency to users. It is that users don’t care. It would be like your car mechanic spending 30 minutes telling you how your brakes work before giving you the keys to go home. It is a detail most drivers don’t care to understand, they just want to know their car will stop when they push the brake pedal down.

  3. Pingback: Dew Drop - August 19, 2016 (#2312) - Morning Dew

  4. Bret Ferrier (@runxc1)

    @buhakmeh If you are moving to Azure then why not the Azure Document DB? It is a document store and supports the MongoDB drivers but also has a SQL-like query language, so it could give you the best of both worlds. I have been following Marten and I know that Jeremy is transitioning from RavenDB, but I feel like Marten should be trying to compete with MongoDB as that is the more widely used document database.

  5. dotnetchris

    > Marten will not blow up in production because an operation happened to make too many requests.

    Can you at least make that an opt-in feature? I personally think all systems should blow up when a user makes an unreasonable and 99% likely errant amount of requests in a single session. Make a user fully demand that much consumption.

  6. Pingback: Proposed Roadmap for Marten 1.0 and Beyond | The Shade Tree Developer

  7. Pingback: Why you should give Marten a look before adopting an ORM | The Shade Tree Developer

  8. Pingback: Marten 1.1 Release Notes | The Shade Tree Developer

  9. Pingback: Marten: my Open Source experience | Szymon Kulec `Scooletz`
