Using Postgresql Advisory Locks for Leader Election

If you’re running an application with a substantial workload, or just want some kind of high availability, you’re probably running that application across multiple servers (hereafter called “nodes” because who knows where they’re physically running these days). That’s great and all, but it’s not uncommon to need some process to run on only one of those nodes at any one time.

As an example, the Marten event store functionality has a feature to support asynchronous projection builders called the “async daemon” (because I thought that sounded cool at the time). The async daemon is very stateful and can only function while running on one node at a time, but it doesn’t have any existing infrastructure to help you manage that. What we know we need to do for the upcoming Marten v4.0 release is to provide “leader election” to make sure the async daemon is actively building projections on only one node, and that it can be activated on, or fail over to, another node as needed so that one, and only one, node is active at any one time.

From Wikipedia, leader election “is the process of designating a single process as the organizer of some task distributed among several computers.” There’s plenty of prior art for this, but it’s not for the faint of heart. In the past, I tried to do this with FubuMVC using a custom implementation of the Bully Algorithm. Microsoft’s microservices pattern guidance has some .Net-centric approaches to leader election. Microsoft’s new Dapr tool is supposed to support leader election some day.

From my previous experience, building out and especially testing custom election infrastructure was very difficult. As a far easier approach, I’ve used advisory locks in Postgresql in Jasper (along with the Sql Server equivalents) as what I think of as a “poor man’s leader election.”

An advisory lock in Postgresql is an arbitrary, application-managed lock on a resource identified by a numeric key that the application chooses. Postgresql attaches no meaning to the key; it simply tracks these locks so that only one active session can hold a given exclusive lock at any one time. These locks can be held at either:

The connection (session) level, such that the lock, once obtained, is held until it is explicitly released or the database connection is closed.
The transaction level, such that a lock obtained within the course of one Postgresql transaction is held until the transaction is committed, rolled back, or the connection is lost.

As an example, Jasper’s “Durability Agent” is a constantly running process in Jasper applications that tries to read and process any messages persisted in a Postgresql or Sql Server database. Since you certainly don’t want a unique message to be processed by more than one node, the durability agent uses advisory locks to try to temporarily take sole ownership of replaying persisted messages with a workflow similar to this sequence diagram:

[Sequence diagram: Transaction Scoped Advisory Lock Usage]
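A minimal sketch of the transaction scoped pattern, written against plain Npgsql. This is not Jasper’s actual code: the class name, the TryRunExclusively method, and the work delegate are all hypothetical, and the real durability agent surely does more. The idea is only to show the shape of the workflow: open a transaction, try the lock, do the work on that same transaction, and let the commit or rollback release the lock.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Npgsql;

// Hypothetical helper, for illustration only
public static class TransactionScopedLockSketch
{
    // pg_try_advisory_xact_lock returns true or false immediately
    // instead of waiting, and the lock is released automatically
    // when the transaction commits or rolls back
    public const string TryLockSql = "SELECT pg_try_advisory_xact_lock(@id);";

    // Returns true if this node won the lock and ran the work,
    // false if some other session currently holds the lock
    public static async Task<bool> TryRunExclusively(
        string connectionString,
        long lockId,
        Func<NpgsqlConnection, NpgsqlTransaction, CancellationToken, Task> work,
        CancellationToken cancellation = default)
    {
        await using var conn = new NpgsqlConnection(connectionString);
        await conn.OpenAsync(cancellation);

        await using var tx = await conn.BeginTransactionAsync(cancellation);

        await using var cmd = new NpgsqlCommand(TryLockSql, conn, tx);
        cmd.Parameters.AddWithValue("id", lockId);
        var acquired = await cmd.ExecuteScalarAsync(cancellation) is true;

        if (!acquired)
        {
            // Somebody else is already working; try again later
            await tx.RollbackAsync(cancellation);
            return false;
        }

        // Do the work on the same connection and transaction
        // that own the lock
        await work(conn, tx, cancellation);

        // Committing persists the work and releases the lock. If the
        // work throws, disposing the transaction rolls it back, which
        // releases the lock as well
        await tx.CommitAsync(cancellation);
        return true;
    }
}
```

A caller would poll this on a timer and simply skip a cycle whenever it returns false.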

That’s working well so far for Jasper, but in Marten v4.0, we want to use the connection scoped advisory lock for leader election of a long running process for the async daemon.

Sample Usage for Leader Election

Before you look at any of these code samples, just know that this is over-simplified to show the concept, isn’t in production, and would require a copious amount of error handling and logging to be production worthy.

For Marten v4.0, we’ll use the per-connection usage to ensure that the new version of the async daemon will only be running on one node (or at least the actual “leader” process that distributes and assigns work across other nodes if we do it well). The async daemon process itself is probably going to be a .Net Core IHostedService that runs in the background.

As just a demonstrator, I’ve pushed up a little project called AdvisoryLockSpike to GitHub just to show the conceptual usage. First let’s say that the actual worker bee process of the async daemon implements this interface:

public enum ProcessState
{
    Active,
    Inactive,
    Broken
}

public interface IActiveProcess : IDisposable
{
    Task<ProcessState> State();
    
    
    // The way I've done this before, the
    // running code does all its work using
    // the currently open connection or at
    // least checks the connection to "know"
    // that it still has the leadership role
    Task Start(NpgsqlConnection conn);
}
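To make the interface a little more concrete, here is a hedged sketch of what a trivial implementation might look like. ConnectionBoundProcess is a hypothetical name and nothing here comes from Marten; the enum and interface are repeated only so the sketch compiles on its own. It reports Inactive until it is started, Active while the connection it was handed is open, and Broken otherwise.

```csharp
using System;
using System.Data;
using System.Threading.Tasks;
using Npgsql;

// Repeated from the post so that this sketch stands alone
public enum ProcessState { Active, Inactive, Broken }

public interface IActiveProcess : IDisposable
{
    Task<ProcessState> State();
    Task Start(NpgsqlConnection conn);
}

// Hypothetical implementation, for illustration only
public class ConnectionBoundProcess : IActiveProcess
{
    private NpgsqlConnection? _conn;

    public Task<ProcessState> State()
    {
        // Never started on this node
        if (_conn == null) return Task.FromResult(ProcessState.Inactive);

        // The advisory lock lives and dies with the connection, so a
        // connection that is no longer open means leadership is gone
        var state = _conn.State == ConnectionState.Open
            ? ProcessState.Active
            : ProcessState.Broken;

        return Task.FromResult(state);
    }

    public Task Start(NpgsqlConnection conn)
    {
        // Hold on to the connection that owns the lock; the real
        // work would be kicked off here and would use this connection
        _conn = conn;
        return Task.CompletedTask;
    }

    public void Dispose()
    {
        // The hosting service owns the connection, so only drop the reference
        _conn = null;
    }
}
```

One caveat: the State property only reflects what the client already knows, so a dropped network connection may not show up until the next command fails. A sturdier version would probably run a cheap query on the connection as a health check.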

Next, we need something around that to actually deal with the mechanics of trying to obtain the global lock and starting or stopping the active process. Since that’s a background process within an application, I’m going to use the built in BackgroundService in .Net Core with this little class:

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Npgsql;

public class LeaderHostedService<T> : BackgroundService
    where T : IActiveProcess
{
    private readonly LeaderSettings<T> _settings;
    private readonly T _process;
    private NpgsqlConnection _connection;

    public LeaderHostedService(LeaderSettings<T> settings, T process)
    {
        _settings = settings;
        _process = process;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        try
        {
            // Don't try to start right off the bat
            await Task.Delay(_settings.FirstPollingTime, stoppingToken);

            _connection = new NpgsqlConnection(_settings.ConnectionString);
            await _connection.OpenAsync(stoppingToken);

            while (!stoppingToken.IsCancellationRequested)
            {
                var state = await _process.State();
                if (state != ProcessState.Active)
                {
                    // If you can take the global lock, start
                    // the process
                    if (await _connection.TryGetGlobalLock(_settings.LockId, cancellation: stoppingToken))
                    {
                        await _process.Start(_connection);
                    }
                }

                // Start polling again
                await Task.Delay(_settings.OwnershipPollingTime, stoppingToken);
            }
        }
        catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
        {
            // Normal shutdown: Task.Delay throws when the host is
            // stopping, so the loop never exits on its own
        }
        finally
        {
            // Closing the connection ends the session, which releases
            // the advisory lock so another node can take over
            if (_connection != null)
            {
                await _connection.DisposeAsync();
            }
        }
    }
}

To fill in the blanks, the TryGetGlobalLock() method is an extension method helper to call the underlying pg_try_advisory_lock function in Postgresql to try to obtain a global advisory lock for the configured lock id. That extension method is shown below:

// Try to get a global lock with connection scoping
public static async Task<bool> TryGetGlobalLock(this DbConnection conn, int lockId, CancellationToken cancellation = default(CancellationToken))
{
    var c = await conn.CreateCommand("SELECT pg_try_advisory_lock(:id);")
        .With("id", lockId)
        .ExecuteScalarAsync(cancellation);

    return (bool) c;
}

Raw ADO.Net is so verbose and unusable out of the box that I’ve built up a set of extension methods to streamline its usage. That’s why the code above isn’t quite out-of-the-box ADO.Net: the CreateCommand(sql) and With() calls are those helpers.
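For readers without those helpers, a roughly equivalent version can be sketched against plain Npgsql. The class and method names below are hypothetical, and the release method is an addition that the post does not show. The lock id is a long here because the single-argument form of pg_try_advisory_lock takes a bigint.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Npgsql;

// Hypothetical names, for illustration only
public static class RawAdvisoryLockExtensions
{
    public const string TryLockSql = "SELECT pg_try_advisory_lock(@id);";
    public const string UnlockSql = "SELECT pg_advisory_unlock(@id);";

    // Try to take the session (connection) scoped lock without waiting.
    // True means this connection now holds the lock
    public static async Task<bool> TryGetGlobalLockRaw(
        this NpgsqlConnection conn,
        long lockId,
        CancellationToken cancellation = default)
    {
        await using var cmd = new NpgsqlCommand(TryLockSql, conn);
        cmd.Parameters.AddWithValue("id", lockId);
        return await cmd.ExecuteScalarAsync(cancellation) is true;
    }

    // Give the lock back explicitly, e.g. when stepping down gracefully.
    // False means this session was not holding the lock
    public static async Task<bool> ReleaseGlobalLockRaw(
        this NpgsqlConnection conn,
        long lockId,
        CancellationToken cancellation = default)
    {
        await using var cmd = new NpgsqlCommand(UnlockSql, conn);
        cmd.Parameters.AddWithValue("id", lockId);
        return await cmd.ExecuteScalarAsync(cancellation) is true;
    }
}
```

Session level lock requests stack: if the same connection acquires the same lock twice, it has to release it twice before another session can obtain it. Closing the connection releases everything regardless.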

I’m generally a fan of strongly typed configuration, and .Net Core makes that easy now, so I’ll use this class to represent the configuration:

public class LeaderSettings<T> where T : IActiveProcess
{
    public TimeSpan OwnershipPollingTime { get; set; } = 5.Seconds();
    
    // It's a random number here so that if you spin
    // up multiple nodes at the same time, they won't
    // all collide trying to grab ownership at the exact
    // same time
    public TimeSpan FirstPollingTime { get; set; } 
        = new Random().Next(100, 3000).Milliseconds();
    
    // This would be something meaningful
    public int LockId { get; set; }
    
    public string ConnectionString { get; set; }
}
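As for picking “something meaningful” for the LockId, one possible approach (an assumption on my part, not something the post prescribes) is to derive the number from a readable name so that every node computes the same id. AdvisoryLockIds and FromName are hypothetical names. Note that string.GetHashCode() is not suitable here because .Net Core randomizes it per process, so two nodes would disagree on the id.

```csharp
using System.Buffers.Binary;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only
public static class AdvisoryLockIds
{
    // Derives a stable 32 bit lock id from a name such as
    // "marten-async-daemon". Every node running the same code
    // gets the same number for the same name
    public static int FromName(string name)
    {
        using var sha = SHA256.Create();
        var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(name));

        // Take the first four bytes with a fixed byte order so the
        // result does not depend on the machine architecture
        return BinaryPrimitives.ReadInt32BigEndian(hash);
    }
}
```

Since advisory lock keys are shared by everything using the same database, distinct names can in principle collide on the same 32 bit value. A hard-coded constant per application is a perfectly reasonable alternative.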

In this approach, the background services will be constantly polling to try to take over as the async daemon if the async daemon is not active somewhere else. If the current async daemon node fails, its connection will drop, and the global advisory lock will be released and ready for another node to take over. We’ll see how this goes, but the early feedback from my own usage on Jasper and from other Marten contributors’ projects is positive. With this approach, we hope to enable teams to use the async daemon on multi-node deployments of their application with just Marten out of the box and without needing any kind of sophisticated infrastructure for leader election.

8 thoughts on “Using Postgresql Advisory Locks for Leader Election”

chester89 says:

May 6, 2020 at 11:11 am

Hello.
I suppose you chose xact lock for atomicity (e.g. either transaction is committed and the lock is released, or it aborts and lock is kept)?
Let’s say you have two nodes A and B, A acquires a lock, starts running your logic and suddenly the process crashes before you can commit. If the lock is still held, how would you release it from node B?
While Postgres docs claim that pg_advisory_unlock_all is called when the connection is closed cleanly or dropped, I’m not sure how that will play with things like PgBouncer. Your app may well be disconnected from the proxy, but the physical connection will return to the pool.
I think manual lock management (pg_advisory_lock) will be more flexible, but it also enables “I committed, but couldn’t release the lock” scenarios (in case of network failures, for example). You can communicate that the background job has finished fine (write some record inside the transaction), you just weren’t successful in releasing the lock.

chester89 says:

May 6, 2020 at 11:26 am

And then there’s issues like this one. Not sure if many organizations have DBAs and/or end-to-end tests in place thorough enough to check for these settings when they upgrade their infrastructure

chester89 says:

May 6, 2020 at 11:32 am

Not to mention statement pooling disables multi-statement transactions completely.
I’m guessing in this case most of the apps/libraries won’t work at all, so that’s a minor issue

chester89 says:

May 7, 2020 at 6:00 am

Why don’t you use Dapper to hide most of repetitive ADO.NET code?
Don’t want to take another dependency or does it not work in some cases (like postgres arrays)?

jeremydmiller says:

May 7, 2020 at 10:06 am

I don’t think Dapper would really help much, and TBH, I’m not really much of a fan of micro-ORMs

nyx says:

May 7, 2020 at 7:34 pm

Hi, Jeremy!

What can you say about Ayende’s criticism and the issues he describes?

https://ayende.com/blog/190753-C/avoid-rolling-your-own-leader-election-algorithm?Key=3b8a257d-3d2f-407e-81c1-1098a89302c0

1. jeremydmiller says:
  
  May 7, 2020 at 7:41 pm
  
  Oren & I had a Twitter debate about that the other day, which you can find.
  
  I think at some point Oren said that my solution could only possibly ever work if you were passing the connection with the advisory lock around and actually using that connection to do the work represented by the lock. Both of which I intend to do in the real solution, of which this was a lo-fi conceptual demonstration.
  
  There is some prior art here, you can find other folks who’ve used advisory locks as an effective distributed mutex, I’ve used them in a similar way in another project, and I wouldn’t say this is rolling some kind of unique leader election. I also wouldn’t and didn’t call it a generic solution like the Bully Algorithm either, but it will fit exactly what we’re trying to do with the async daemon.
  
  1. nyx says:
    
    May 8, 2020 at 4:41 pm
    
    I thought it through and agree that it should work if you modify database through connection that acquired lock.
    
    Thank you for the idea!