Network Round Trips are Evil

As Houston gets drenched by Hurricane Beryl while I write this, I’m reminded of a formative set of continuing education courses I took when I was living in Houston in the late ’90s and plotting my formal move into software development. Whatever we learned about VB6 in those MSDN classes is long, long since obsolete, but one pithy saying from one of our instructors (who went on to become a Marten user and contributor!) has stuck with me all these years later:

Network round trips are evil

John Cavnar-Johnson

His point then, and my point now quite frequently when working with JasperFx Software clients, is that round trips between browsers and backend web servers, or between application servers and the database, need to be treated as expensive operations, and that some level of request, query, or command batching is often a very valuable optimization in systems design.

Consider my family’s current kitchen predicament as diagrammed above. The very expensive, original refrigerator from our 20-year-old house finally gave up the ghost, and we’ve had it completely removed while we wait on a different one to be delivered. Fortunately, we have a second refrigerator in the garage. When cooking now, though, it’s suddenly a lot more time-consuming to fetch an ingredient, since I can’t just turn around and grab something from a refrigerator that’s a single step away. Now that we have to walk across the house from the kitchen to the garage to get anything, it’s become very helpful to grab as many things as you can in one trip so you’re not constantly running back and forth.

While this issue certainly arises with user interfaces or browser applications making a series of little requests to a backing server, I’m going to focus on database access for the rest of this post. Using a simple Marten example, consider this code where I’m creating five little documents and persisting them to a database:


public static async Task storing_many(IDocumentSession session)
{
    var user1 = new User { FirstName = "Magic", LastName = "Johnson" };
    var user2 = new User { FirstName = "James", LastName = "Worthy" };
    var user3 = new User { FirstName = "Michael", LastName = "Cooper" };
    var user4 = new User { FirstName = "Mychal", LastName = "Thompson" };
    var user5 = new User { FirstName = "Kurt", LastName = "Rambis" };

    session.Store(user1);
    session.Store(user2);
    session.Store(user3);
    session.Store(user4);
    session.Store(user5);

    // Marten will *only* make a single database request here that
    // bundles up "upsert" statements for all five users added above
    await session.SaveChangesAsync();
}

In the code above, Marten issues only a single batched command to the backing database that performs all five “upsert” operations in one network round trip. We were very performance conscious in the early days of Marten development and did quite a bit of experimentation with different options for JSON serialization, how exactly to write SQL that queries inside of JSONB, and even table structure. Consistently, and unsurprisingly, the biggest jump in performance came when we introduced command batching to reduce the number of network round trips between code using Marten and the backing PostgreSQL database. That early performance testing also led to early investments in Marten’s batch querying support and the Include() query functionality that lets Marten users fetch related data with fewer network hops to the database.
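
For a quick taste of that batch querying support, here’s a minimal sketch along the lines of Marten’s batched query API. The shape of the calls reflects my reading of the Marten documentation, so treat this as illustrative rather than definitive:

public static async Task batch_querying(IQuerySession session, Guid userId)
{
    // Start a batched query; nothing is sent to the database yet
    var batch = session.CreateBatchQuery();

    // Each call registers a query and hands back a Task for its eventual result
    var loadUser = batch.Load<User>(userId);
    var findUsers = batch.Query<User>()
        .Where(x => x.LastName == "Johnson")
        .ToList();

    // A single network round trip executes every query registered above
    await batch.Execute();

    // And now the individual results are available
    var user = await loadUser;
    var users = await findUsers;
}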

Just based on my own experience, here are two trends I see in how real-world systems interact with their databases:

  1. There’s a huge performance gain to be made by finding ways to batch database queries
  2. It’s very common for systems in the real world to suffer from performance problems that can at least partially be traced to unnecessary chattiness between an application and its backing database(s)

At a guess, I think the underlying reasons for the chattiness problem are something like:

  • Developers who simply aren’t aware of the expense of network round trips, or who don’t know how to utilize any kind of database query batching to reduce them
  • Wrapper abstractions around the raw database persistence tooling that hide the more powerful APIs that might alleviate the chattiness problem
  • Wrapper abstractions that encourage a pattern of only loading data by keys, one row/object/document at a time
  • Wrapper abstractions around the raw persistence tooling that discourage developers from learning more about the underlying persistence tools they’re using. Don’t underestimate how common that problem is, and I’ve absolutely been guilty of causing that issue myself as a younger “architect” who created those abstractions.
  • Complicated architectural layering that makes it quite difficult to reason about the cause and effect between system inputs and the database queries those inputs spawn. A big call stack of a controller calling a mediator tool that calls one service that calls other services that call different repository abstractions, all making database queries, is a common source of chattiness because it’s hard to even see where the chattiness is coming from just by reading the code.

As you might know if you’ve stumbled across any of my writings or conference talks from the last couple of years, I’m not a big fan of the typical Clean/Onion Architecture approaches. I think they introduce a lot of ceremony code into the mix that causes more harm overall than whatever benefits they bring.

Here’s an example that’s somewhat contrived, but also quite typical of the performance issues I see in real-life systems. Let’s say you’ve got a command handler for a ShipOrder command that needs to access data for both a related Invoice and Order entity. It could look something like this:

public class ShipOrderHandler
{
    private readonly IInvoiceRepository _invoiceRepository;
    private readonly IOrderRepository _orderRepository;
    private readonly IUnitOfWork _unitOfWork;

    public ShipOrderHandler(
        IInvoiceRepository invoiceRepository,
        IOrderRepository orderRepository,
        IUnitOfWork unitOfWork)
    {
        _invoiceRepository = invoiceRepository;
        _orderRepository = orderRepository;
        _unitOfWork = unitOfWork;
    }

    public async Task Handle(ShipOrder command)
    {
        // Making one round trip to get an Invoice
        var invoice = await _invoiceRepository.LoadAsync(command.InvoiceId);

        // Then a second round trip using the results of the first query
        // to fetch follow-up data
        var order = await _orderRepository.LoadAsync(invoice.OrderId);

        // do some logic that changes the state of one or both of these entities

        // Commit the transaction that spans the two entities
        await _unitOfWork.SaveChangesAsync();
    }
}

The code is pretty simple in this case, but we’re still making more database round trips than we absolutely have to, and real enterprise systems can get much, much bigger than my little contrived example and incur a lot more overhead because of the chattiness that the repository abstractions naturally invite.

Let’s try this functionality again, but this time depending directly on the raw persistence tooling (Marten’s IDocumentSession) and using a Wolverine-style command handler to boot, to further reduce the code noise:

public static class ShipOrderHandler
{
    // We're still keeping some separation between the infrastructure and the business
    // logic, but Wolverine lets us do that with simple separate functions instead of
    // having to use all the limiting repository abstractions
    public static async Task<(Order, Invoice)> LoadAsync(IDocumentSession session, ShipOrder command)
    {
        // This is important (I think!): the admittedly complicated
        // Marten usage below fetches both the invoice and its related order in a
        // single network round trip to the database, which can lead to substantially
        // better system performance
        Order order = null;
        var invoice = await session
            .Query<Invoice>()
            .Include<Order>(i => i.OrderId, o => order = o)
            .Where(x => x.Id == command.InvoiceId)
            .FirstOrDefaultAsync();

        return (order, invoice);
    }
    
    public static void Handle(ShipOrder command, Order order, Invoice invoice)
    {
        // do some logic that changes the state of one or both of these entities
        // I'm assuming that Wolverine is handling the transaction boundaries through
        // middleware here
    }
}

In the second code sample, we’ve been able to go straight at the Marten tooling and take advantage of its more advanced functionality to batch up the data fetching for better performance, something that wasn’t easily possible when we were putting repository abstractions between our command handler and the underlying persistence tooling. Moreover, we can more easily reason about the database operations that happen as a result of our command, which can be somewhat obfuscated by the extra layers and code separation that are common in Onion/Clean/Ports and Adapters style approaches.
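
As a quick aside on the transactional middleware assumed in that Handle() comment: with Wolverine, that behavior is opted into at bootstrapping time. Here’s a minimal sketch assuming the Wolverine.Marten integration package and a hypothetical “postgres” connection string name:

var builder = WebApplication.CreateBuilder(args);

// Register Marten, then hand its transaction management over to Wolverine
builder.Services.AddMarten(opts =>
    {
        opts.Connection(builder.Configuration.GetConnectionString("postgres"));
    })
    // This comes from the Wolverine.Marten integration package
    .IntegrateWithWolverine();

builder.Host.UseWolverine(opts =>
{
    // Wraps handlers that touch Marten in transactional middleware, so
    // SaveChangesAsync() is called for you after the handler succeeds
    opts.Policies.AutoApplyTransactions();
});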

It’s not just repository abstractions that cause problems; sometimes seemingly helpful little extension methods can be the source of chattiness too. Here’s a pair of helper extension methods around Marten’s event store functionality that let you start a new event stream, or append a single event to an existing stream, in a single line of code:

public static class DocumentSessionExtensions
{
    public static Task Add<T>(this IDocumentSession documentSession, Guid id, object @event, CancellationToken ct)
        where T : class
    {
        documentSession.Events.StartStream<T>(id, @event);
        return documentSession.SaveChangesAsync(token: ct);
    }

    public static Task GetAndUpdate<T>(
        this IDocumentSession documentSession,
        Guid id,
        int version,
        
        // If we're being finicky about performance here, these kinds of inline
        // lambdas are NOT cheap at runtime and I'm recommending against
        // continuation passing style APIs in application hot paths for
        // my clients
        Func<T, object> handle,
        CancellationToken ct
    ) where T : class =>
        documentSession.Events.WriteToAggregate<T>(id, version, stream =>
            stream.AppendOne(handle(stream.Aggregate)), ct);
}

Fine, right? These potentially make your code cleaner and simpler, but of course, they’re also potentially harmful. Here’s an example using these two extension methods, similar to some code I saw in the wild last week:

public static class Handler
{
    public static async Task Handle(Command command, IDocumentSession session, CancellationToken token)
    {
        var id = CombGuidIdGeneration.NewGuid();
        
        // One round trip
        await session.Add<Aggregate>(id, new FirstEvent(), token);

        if (command.SomeCondition)
        {
            // This actually makes a pair of round trips, one to fetch the current state
            // of the Aggregate compiled from the first event appended above, then
            // a second to append the SecondEvent
            await session.GetAndUpdate<Aggregate>(id, 1, _ => new SecondEvent(), token);
        }
    }
}

I got involved with this code in reaction to some load testing that was producing disappointing results. When I was pulled in, I saw the extra round trips that had snuck in because of the convenience extension methods the team had been using, and suggested a change to something like this (but with Wolverine’s aggregate handler workflow, which simplified the code even more than this):

public static class Handler
{
    public static async Task Handle(Command command, IDocumentSession session, CancellationToken token)
    {
        var events = determineEvents(command).ToArray();
        
        var id = CombGuidIdGeneration.NewGuid();
        session.Events.StartStream<Aggregate>(id, events);

        await session.SaveChangesAsync(token);
    }

    // This was isolated so you can easily unit test the business
    // logic that "decides" what events to append
    public static IEnumerable<object> determineEvents(Command command)
    {
        yield return new FirstEvent();
        if (command.SomeCondition)
        {
            yield return new SecondEvent();
        }
    }
}

The code above cut down the number of network round trips to the database and greatly improved the results of the load testing.
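
And for completeness, here’s a rough sketch of the kind of Wolverine version I alluded to above. This leans on the MartenOps.StartStream() “side effect” from the Wolverine.Marten integration; treat the exact signatures here as my assumption rather than gospel:

public static class Handler
{
    // Wolverine itself executes the returned "side effect" by starting the
    // stream and committing the session, so there's no explicit
    // SaveChangesAsync() call in sight and still only one round trip
    public static IStartStream Handle(Command command)
    {
        var events = determineEvents(command).ToArray();
        return MartenOps.StartStream<Aggregate>(CombGuidIdGeneration.NewGuid(), events);
    }

    // The business logic that "decides" what events to append is still
    // easily unit testable in isolation
    public static IEnumerable<object> determineEvents(Command command)
    {
        yield return new FirstEvent();
        if (command.SomeCondition)
        {
            yield return new SecondEvent();
        }
    }
}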

Summary

If performance is a concern in your system (it isn’t always), you probably need to be cognizant of how chatty your application is in its communication and interaction with the backing database, or with any other remote system or infrastructure your system interacts with at runtime.

Personally, I think that higher ceremony code structures make it much more likely that you’ll incur issues with database chattiness: first by obfuscating your code so that you don’t even easily recognize where the chattiness is, and second by wrapping simplifying abstractions around your database persistence tooling that cut you off from the more advanced functionality for query batching.

And of course, both Wolverine and Marten put a heavy emphasis on reducing code ceremony and code noise in general, because I personally think that’s very valuable in helping teams succeed over time with software systems in the wild. My theory of the case is that, even at the cost of a little bit of “magic”, simply reducing the amount of code you have to wade through in existing systems will make those systems easier to maintain and troubleshoot over time.

And on that note, I’m basically on vacation for the next week, and you can address your complaints about my harsh criticism of Clean/Onion Architectures to the ether:-)
