Distributed locking with Postgres advisory locks

Recently, I worked on a simple distributed locking mechanism. The main goal was to enable efficient, concurrent processing of some resources, and to make sure that multiple workers weren’t processing a given resource at the same time. It turned out that Postgres advisory locks are perfect for this task.

Advisory locks in Postgres provide a mechanism to use database locks for arbitrary purposes. Since they are completely meaningless to the database, it’s up to the application to decide what they mean. They can be acquired for a key, which has to be a 64-bit integer.

These locks will not interfere with locks acquired when inserting/updating/deleting rows, they are totally independent.

Advisory locks can be used to control access to shared resources in a concurrent environment.

For instance, in a distributed system, they can be used to make sure that only one worker is processing a particular resource (thus acting as a mutex). This is useful not only to avoid the potential issues with deadlocks (as even the MVCC-based Postgres database has locks), but also to distribute a workload between multiple replicas automatically.

# Types of advisory locks

There are two main flavors of advisory locks:

  • Session-level, and
  • Transaction-level

Session-level advisory locks, once acquired, are held until they are released explicitly, or until the session ends.

Transaction-level advisory locks cannot be released explicitly, and they are released automatically when the transaction ends (commit/rollback).

Which one is better for you depends on your situation. One factor that you might have to take into account is whether you are using PgBouncer, and if so, in which mode. If you use PgBouncer in session-pooling mode, both advisory lock types are available to you, and you can choose between them. However, if you’re using PgBouncer in transaction-pooling mode, you can’t use session-level features, so transaction-level advisory locks are your only option.

In my opinion, transaction-level locks are easier to deal with. True, they can’t be released on-demand. But this means you don’t have to worry about releasing them - it’s automatic whenever the transaction is committed or rolled back.

# Shared vs Exclusive

Both flavors have exclusive and shared modes. Multiple processes can hold a shared lock on the same key, but exclusive lock requests are denied. Exclusive locks cannot be shared.

# To wait or not to wait

When the lock is acquired for a key, your SQL statment for acquiring the lock returns true. What happens when the lock can’t be acquired? It depends on which function you use to request the lock. You can:

  • wait until it’s released: this is done by using pg_advisory_lock.
  • immediately return false: this is done by using pg_try_advisory_lock.

# The failure path

A distributed locking system without a lock expiry seems unusable - what happens if clients die, does the lock just hang around forever, blocking others working on that resource?

No, and that’s the charm of it. Postgres takes care of cleaning up after clients disconnect. Transaction-level locks in particular are cleaned up after the connection is deemed dead based on the TCP settings of the database.

Limitations: this locking mode assumes one central database, on which it relies for keeping a tab on locks. That central database is a single point of failure.

# In action

Let’s see how these locks look like in Postgres. Using transaction-level locks for this example means I have to begin a transaction to use the lock:

postgres=# BEGIN; SELECT pg_try_advisory_xact_lock(1);
(1 row)

The statement returned true so the lock is acquired. It will be held until the transaction is committed or rolled back. We can see it in the pg_locks system view:

postgres=# SELECT locktype, objid, mode from pg_locks;
  locktype  | objid  |      mode       | 
 relation   |        | AccessShareLock | 
 virtualxid |        | ExclusiveLock   | 
 advisory   |     1  | ExclusiveLock   | 
(3 rows)

If I try to acquire the lock on the same resource (on key 1) from another process, it will return false:

postgres=# BEGIN; SELECT pg_try_advisory_xact_lock(1);
(1 row)

When the transaction is committed, the lock is automatically released:

postgres=# COMMIT;
postgres=# SELECT locktype, objid, mode from pg_locks;
  locktype  | objid |       mode       |
 relation   |       |  AccessShareLock |
 virtualxid |       |  ExclusiveLock   |
(2 rows)


# Implementation example

In the following example we’ll assume that we have to solve the following challenge:

  • there is a deployment which consists of some workers (nr. of workers >= 1)
  • the workers operate on some shared resources
  • the deployment has to be scalable by adding more workers at any time
  • we want to avoid concurrent work on the same resource
  • the workers have to distribute the workload among themselves automatically

In addition, we assume that there is a central database to which each worker has access.

We will use transaction-level advisory locks. The function pg_try_advisory_xact_lock has the following characteristics:

  • it will not wait for the lock to become available. It will either obtain the lock immediately and return true, or return false otherwise.
  • if acquired, the lock is automatically released at the end of the current transaction and cannot be released explicitly

At the highest level, the solution is just to iterate over the resources and try to do work on them. The underlying implementation will take care of balancing the workload.

resources = [db1, db2, db3, db4]  # Or files, queues, etc.

for resource in resources:

We have to implement the work_on_resource function so that it respects the advisory locks:

def work_on_resource(resource_id):
    with try_to_lock_shared_resource(resource_id) as locked:
        if not locked:
            # Some other replica is working on it
            logger.info(f"Could not acquire lock for resource {resource_id}")
        logger.info(f"Acquired lock for resource {resource_id}")
        # Do the work on the resource.
        # The lock will be released automatically when we exit the contextmanager.

The try_to_lock_shared_resource function tries to acquire a lock on the resource. This is where we construct the key that we use for the lock. The key has to be a 64-bit integer, so if the resource ID is not an int, we have to cast it to one. We can use a hash function for this.

The built-in hash() function returns different hashes when invoked with the same string in different Python interpreters, because they are salted with an unpredictable random value.

In addition to the identifier of the resource, we use a namespace to make sure that other applications can still lock the same resource. It’s just the workers of the same application that we want to restrict to one.

def try_to_lock_shared_resource(resource_id):
    with connection.begin() as transaction:
        namespace = os.getenv("APP_NAME")
        key = f"{namespace}_{resource_id}"

        # Use hashlib, as the built-in hash function
        # is not consistent between interpreter invocations in Python 3.
        key_hash = hashlib.sha512(key.encode()).hexdigest()
        # Cast the hash to an integer
        advisory_lock_key = int(key_hash, 16)

        # The key has to be a 64-bit integer, so we use the modulo operator here.
        # Hash collisions can occurr but are not very likely.
        key_for_resource = int(key_hash, 16) % 2**63

        yield _lock_resource(transaction, key_for_resource)

There is a slight caveat here: when generating the lock key for the resource, hash collisions can occurr. Since we are using 64-bit integers, the chance of a hash collision is minimal. Whether it’s acceptable or not depends on your use case. See a great post about calculating the chance of hash collisions here.

Finally, the function that executes the lock statement:

def _lock_resource(transaction, resource_id):
    Returns True if it could acquire the lock for the resource,
    Returns False otherwise
    locked = transaction.execute(f"SELECT pg_try_advisory_xact_lock({resource_id})")
    lock_acquired_for_resource = locked.first()[0]
    return lock_acquired_for_resource

🎉 Voilà! Using this pattern, the deployment can be scaled up to multiple workers, and they will distribute the work automatically. If a worker tries to acquire a resource, and can’t do so, then it just moves on to the next resource to work on. I think this is a simple but powerful pattern, and I’m glad I found out about it - thanks to all the authors who have written about on the internet.

# Disclaimer

1) This solution is good for the use-case described above. Note that it’s nothing like Redlock, because the purpose was mostly efficiency. Correctness was secondary, therefore we didn’t need the complexity of having multiple servers to keep tab on locks, and checking for a majority to acquire a lock. Always seek the advice of your senior engineers or other qualified professionals before committing to this.

2) This solution is a kind of a mutex and might not be considered a true distributed locking mechanism in the strictest sense. Martin Kleppman notes:

It’s important to remember that a lock in a distributed system is not like a mutex in a multi-threaded application. It’s a more complicated beast, due to the problem that different nodes and the network can all fail independently in various ways.

Hoever, in the case of transaction-level advisory locks, the node and network failures are taken care of by the Postgres server - transactions are cleaned up when the client disconnects and the connection is considered dead according to the TCP settings on the server (tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count).

# Resources

# Sidenote on Python’s built-in hash function

Consider these calls to hash, returning different results at each invocation:

$ python3 -c "print(hash('some things here'))"

$ python3 -c "print(hash('some things here'))"

$  python3 -c "print(hash('some things here'))"

Compare them to the following, setting the seed to a constant makes the hash consistent:

$ PYTHONHASHSEED=1 python3 -c "print(hash('some things here'))"

$ PYTHONHASHSEED=1 python3 -c "print(hash('some things here'))"

$ PYTHONHASHSEED=1 python3 -c "print(hash('some things here'))"

All this is due to a vulnerability explained here.

Written on November 23, 2019

If you notice anything wrong with this post (factual error, rude tone, bad grammar, typo, etc.), and you feel like giving feedback, please do so by contacting me at samubalogh@gmail.com. Thank you!