Optimizing Marten Performance by using “Include’s”

Continuing my series of short blog posts about concepts or new features in Marten, this time I’m going to show how to use the brand new IQueryable<T>().Include() feature to improve performance by reducing the number of network round trips to the underlying Postgresql database. Check out the Marten tag for related blog posts.

In one of the formative experiences of my early software career, our instructor repeated the phrase “network round trips are evil” as a kind of mantra. The point, as I quickly learned then and plenty of times later, is that a “chatty” system making lots of network round trips between boxes can be very slow if you’re not careful.

To that end, many of our early adopters and would-be adopters said that they would switch to Marten if only we had the Include() feature from RavenDb. The point of this feature is to reduce network round trips from your application to the database server by fetching related documents in the same request.

Jumping right to a concrete example, let’s say that your domain has two document types, one called “Issue” and another called “User.” In this case the Issue may have a logical link to the assigned User responsible for addressing the issue, partially shown below:

    public class Issue
    {
        // The AssigneeId would be the Id of the
        // related User document
        public Guid? AssigneeId { get; set; }

    }

If I want to load a single Issue, but also get the assigned User at the same time, I can use Marten’s new “Include()” feature inspired by RavenDb and NHibernate’s QueryOver mechanism:

    [Fact]
    public void simple_include_for_a_single_document()
    {
        var user = new User();
        var issue = new Issue {AssigneeId = user.Id, Title = "Garage Door is busted"};

        theSession.Store<object>(user, issue);
        theSession.SaveChanges();

        using (var query = theStore.QuerySession())
        {
            User included = null;
            var issue2 = query.Query<Issue>()
                // Using the call below, Marten will execute
                // the supplied callback to pass back the related
                // User document assigned to the Issue. The type
                // argument is spelled out because the compiler cannot
                // infer the included type from the callback lambda.
                .Include<User>(x => x.AssigneeId, x => included = x)
                .Where(x => x.Title == issue.Title)
                .Single();

            included.ShouldNotBeNull();
            included.Id.ShouldBe(user.Id);

            issue2.ShouldNotBeNull();

            // All of this was done with exactly one call to Postgresql
            query.RequestCount.ShouldBe(1);
        }
    }

The actual SQL statement sent to Postgresql in the code above would be:

    select d.data, d.id, assignee_id.data, assignee_id.id
    from public.mt_doc_issue as d
    INNER JOIN public.mt_doc_user as assignee_id
        ON CAST(d.data ->> 'AssigneeId' as uuid) = assignee_id.id
    where d.data ->> 'Title' = :arg0
    LIMIT 1

In the code above, I’m using the fairly new “Include()” method to direct Marten to fetch the related User document at the same time it’s retrieving the Issue. We deviated somewhat from RavenDb in this feature. Instead of just adding the included documents to the internal identity map and expecting the user to just “know” that they are cached, we opted to make the included documents accessible to the caller in one of three ways:

  1. Passing a callback function into the Include() method
  2. Passing an IList<T>, where T is the included document type, into Include(). In this case, Marten will fill the list with all the included documents found.
  3. Passing an IDictionary<TKey, T> into the Include() method. In this case, Marten will fill the dictionary with the included documents found, keyed by their Id.
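
Here is a minimal sketch of options 2 and 3, reusing the Issue and User document types from the test above. The exact overload signatures are assumed from the description in the list rather than copied from Marten’s source, and the IncludeSketches class and its method names are made up for illustration, so check the details against the acceptance tests. Running it needs a configured document store backed by a live Postgresql database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Marten;

// Hypothetical helper class, for illustration only.
// Issue and User are the document types shown earlier in this post.
public static class IncludeSketches
{
    // Option 2: Marten fills the supplied list with every included User.
    public static IList<User> AssigneesAsList(IDocumentStore store)
    {
        var users = new List<User>();

        using (var query = store.QuerySession())
        {
            // Assumed overload: Include<TInclude>(idSelector, IList<TInclude>)
            query.Query<Issue>()
                .Include<User>(x => x.AssigneeId, users)
                .ToArray();
        }

        return users;
    }

    // Option 3: Marten fills the supplied dictionary, keyed by User.Id.
    public static IDictionary<Guid, User> AssigneesById(IDocumentStore store)
    {
        var usersById = new Dictionary<Guid, User>();

        using (var query = store.QuerySession())
        {
            // Assumed overload: Include(idSelector, IDictionary<TKey, TInclude>);
            // both type arguments are inferred from the dictionary.
            query.Query<Issue>()
                .Include(x => x.AssigneeId, usersById)
                .ToArray();
        }

        return usersById;
    }
}
```

The dictionary form is handy when you want to walk the returned Issues and look up each assignee by Issue.AssigneeId without another trip to the database.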

 

Since the pace of development on Marten is temporarily outpacing my efforts at keeping the documentation website completely up to date, the best resource for seeing what’s possible with the Include() functionality is our acceptance tests.

Let me end with a couple of salient points about the new Include() functionality:

  • The included documents are resolved through the internal identity map of the current session, so there will not be any duplicates from repeated documents. Think about the case of fetching 100 Issues that are each assigned to one of 5 different Users. In this case, only the 5 distinct User documents would be returned.
  • You can chain multiple Include() calls on one query
  • The Include() functionality is available in the batched query feature
  • This will be a topic for another post, but Marten already supports the creation of foreign key relationships between documents
  • By default, Marten uses an inner join to fetch the included documents, as in the SQL shown above. Just remember that this means nothing will be returned in the case of a NULL Issue.AssigneeId in the example above. I didn’t show it, but there is also an optional argument in the Include() method you can use to make Marten use an outer join instead, so that Issues without an assignee still come back.
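
To make the join and de-duplication points concrete, here is a small sketch in plain LINQ to Objects. It does not touch Marten or Postgresql at all: the in-memory joins only stand in for the SQL joins, Distinct() only mimics what the identity map does, and the JoinSemantics class with its method names is hypothetical. The Issue and User classes here are cut-down stand-ins for the real documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public Guid Id { get; set; } = Guid.NewGuid();
}

public class Issue
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public string Title { get; set; }
    public Guid? AssigneeId { get; set; }
}

// Hypothetical helper, for illustration only.
public static class JoinSemantics
{
    // Inner join: an Issue without a matching User drops out of the results.
    public static List<(Issue Issue, User Assignee)> InnerJoin(
        IEnumerable<Issue> issues, IEnumerable<User> users)
    {
        return issues
            .Join(users, i => i.AssigneeId, u => (Guid?)u.Id, (i, u) => (i, u))
            .ToList();
    }

    // Left outer join: every Issue is kept; Assignee is null when unassigned.
    public static List<(Issue Issue, User Assignee)> LeftOuterJoin(
        IEnumerable<Issue> issues, IEnumerable<User> users)
    {
        return issues
            .GroupJoin(users, i => i.AssigneeId, u => (Guid?)u.Id,
                (i, matches) => (i, matches.FirstOrDefault()))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var users = Enumerable.Range(0, 5).Select(_ => new User()).ToList();

        // 100 Issues spread over 5 Users, plus one unassigned Issue.
        var issues = Enumerable.Range(0, 100)
            .Select(n => new Issue { Title = "Issue " + n, AssigneeId = users[n % 5].Id })
            .ToList();
        issues.Add(new Issue { Title = "Unassigned" });

        var inner = JoinSemantics.InnerJoin(issues, users);
        var outer = JoinSemantics.LeftOuterJoin(issues, users);

        Console.WriteLine(inner.Count); // 100: the unassigned Issue is missing
        Console.WriteLine(outer.Count); // 101: the unassigned Issue is kept
        Console.WriteLine(inner.Select(r => r.Assignee).Distinct().Count()); // 5
    }
}
```

The trade-off is the usual one: the inner join silently filters out Issues that have no assignee, while the outer join keeps them and hands back a null for the included document.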

 

For my next blog post, I’ll talk about our brand new as-of-this-morning “Compiled Query” feature.
