Marten Linq Provider Improvements

A couple years ago I was in a small custom software development shop in Austin as “the .NET guy” for the company. The “Java guy” in the company asked me one day to try to name one think about .NET that he could look at that wasn’t just a copy of something older in the JVM. I almost immediately said for him to look at LINQ (Language INtegrated Query for non .NET folks who might stumble into this), as there isn’t really a one for one equivalent and I’d argue that LINQ is a real advantage within the .NET space.

As the author and primary support person for Marten’s LINQ provider support though, I have a decidedly mixed view of LINQ. It’s undoubtedly a powerful tool for .NET developers, but it’s maybe my least favorite thing to support in my entire OSS purview as a LINQ provider is a permutation hell kind of problem. To put it in perspective, I start making oodles of references to Through the Looking Glass anytime I have to spend some significant amount of time dealing with our LINQ support.

Nevertheless, Marten has an uncomfortably large backlog of LINQ related issues and we had a generous GitHub sponsorship to specifically improve the efficiency of the SQL generated for child collection queries in Marten, so I’ve been working on and off for a couple months to do a complete overhaul of our LINQ support that will land in Marten 7.0 sometime in the next couple months. Just in the last week I finally had a couple breakthroughs I’m ready to share. First though, let’s all get in the right headspace with some psychedelic music:

RIP Tom Petty!

and

And I’m going w/ Grace Potter’s cover version!

Alright, so back to the real problem. When Marten today encounters a LINQ query like this one:

        var results = theSession.Query<Top>().Where(x =>
            x.Middles.Any(m => m.Color == Colors.Green && m.Bottoms.Any(b => b.Name == "Bill")));

Marten generates a really fugly SQL query using PostgreSQL Common Table Expressions to explode out the child collections into flat rows that can then be filtered to matching child rows, then finally uses a sub query filter on the original table to find the right rows. To translate, all that mumbo jumbo I said translates to “a big ass, slow query that doesn’t allow PostgreSQL to utilize its fancy GIN index support for faster JSONB querying.”

The Marten v7 support will be smart enough to “know” when it can generate more efficient SQL for certain child collection filtering. In the case above, Marten v7 can use the PostgreSQL containment operator to utilize the GIN indexing support and just be simpler in general with SQL like this:

select d.id, d.data from public.mt_doc_top as d where CAST(d.data ->> 'Middles' as jsonb) @> :p0 LIMIT :p1
  p0: [{"Color":2,"Bottoms":[{"Name":"Bill"}]}]
  p1: 2

You might have to take my word for it right now that the SQL above is significantly more efficient than the previous LINQ support.

One more sample that I’m especially proud of. Let’s say you use this LINQ query:

        var result = await theSession
            .Query<Root>()
            .Where(r => r.ChildsLevel1.Count(c1 => c1.Name == "child-1.1") == 1)
            .ToListAsync();

This one’s a little more complicated because you need to do a test of the *number* of matching child elements within a child collection. Again, Marten vCurrent will use a nasty and not terribly efficient common table expression approach to give you the right data. For Marten v7, we specifically asked the Marten user base if we could abandon support for any PostgreSQL versions lower than PostgreSQL 12. *That* is letting us use PostgreSQL’s JSONPath query support within our LINQ provider and gets us to this SQL for the LINQ query from up above:

select d.id, d.data from public.mt_doc_root as d where jsonb_array_length(jsonb_path_query_array(d.data, '$.ChildsLevel1[*] ? (@.Name == $val1)', :p0)) = :p1
  p0: {"val1":"child-1.1"}
  p1: 1

It’s still quite a bit away, but the point of this post is that there is some significant improvements coming to Marten’s LINQ provider soon. More importantly to me, finishing this work up and knocking out the slew of open LINQ related GitHub issues will allow the Marten core team to focus on much more exciting new functionality in the event sourcing side of things.

2 thoughts on “Marten Linq Provider Improvements

  1. Would love to see a little more in-depth on the approach you’re taking with the new linq provider for V7. I have seen you’re moving away from re-linq, which – for me – was a default go-to choice when dealing with linq. Can you elaborate a little more on its limitations and the reasons you’re dropping it?

    1. Relinq doesn’t do very much other than help you out with pre-processing literals into simple values for you. That’s really been the only place where I’ve missed Relinq — and even there, I’ve been using FastExpressionCompiler to do that bit of Expression compilation in a faster way.

      The project looks dead and not very well maintained, the extension model is awkward, and it takes a bit of overhead to map Relinq’s intermediate model into something that’s more appropriate for Marten’s particular “query within JSON” needs.

Leave a comment