Distributed locking with Postgres advisory locks
Recently, I worked on a simple distributed locking mechanism. The main goal was to enable efficient, concurrent processing of some resources, and to make sure that multiple workers weren’t processing a given resource at the same time. It turned out that Postgres advisory locks are perfect for this task.
Advisory locks in Postgres provide a mechanism to use database locks for arbitrary purposes. Since they are completely meaningless to the database, it’s up to the application to decide what they mean. They can be acquired for a key, which has to be a 64-bit integer.
These locks will not interfere with locks acquired when inserting, updating, or deleting rows; they are totally independent.
Advisory locks can be used to control access to shared resources in a concurrent environment.
For instance, in a distributed system, they can be used to make sure that only one worker is processing a particular resource (thus acting as a mutex). This is useful not only to avoid the potential issues with deadlocks (as even the MVCC-based Postgres database has locks), but also to distribute a workload between multiple replicas automatically.
# Types of advisory locks
There are two main flavors of advisory locks:
- Session-level, and
- Transaction-level
Session-level advisory locks, once acquired, are held until they are released explicitly, or until the session ends.
Transaction-level advisory locks cannot be released explicitly, and they are released automatically when the transaction ends (commit/rollback).
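A quick illustration of the two flavors (123 is an arbitrary key):

```sql
-- Session-level: held until explicitly released or the session ends.
SELECT pg_advisory_lock(123);
SELECT pg_advisory_unlock(123);

-- Transaction-level: released automatically when the transaction ends.
BEGIN;
SELECT pg_advisory_xact_lock(123);
COMMIT;  -- lock released here
```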
Which one is better for you depends on your situation. One factor that you might have to take into account is whether you are using PgBouncer, and if so, in which mode. If you use PgBouncer in session-pooling mode, both advisory lock types are available to you, and you can choose between them. However, if you’re using PgBouncer in transaction-pooling mode, you can’t use session-level features, so transaction-level advisory locks are your only option.
In my opinion, transaction-level locks are easier to deal with. True, they can’t be released on demand. But this means you don’t have to worry about releasing them - it’s automatic whenever the transaction is committed or rolled back.
# Shared vs Exclusive
Both flavors come in exclusive and shared modes. Multiple sessions can hold a shared lock on the same key at the same time, but as long as any shared lock is held, exclusive lock requests on that key are denied. An exclusive lock, in turn, blocks all other requests on the key, shared and exclusive alike.
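A sketch of the semantics, using the session-level functions and an arbitrary key of 7:

```sql
-- Session A:
SELECT pg_try_advisory_lock_shared(7);  -- true: shared lock acquired

-- Session B:
SELECT pg_try_advisory_lock_shared(7);  -- true: shared locks coexist
SELECT pg_try_advisory_lock(7);         -- false: exclusive is denied
                                        -- while shared holders remain
```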
# To wait or not to wait
When the lock is acquired for a key, the SQL statement acquiring it returns successfully. What happens when the lock can’t be acquired? It depends on which function you use to request it. You can:
- wait until it’s released: this is done by using `pg_advisory_lock`, which blocks until the lock becomes available.
- immediately get `false` back: this is done by using `pg_try_advisory_lock`, which returns `true` if the lock was acquired and `false` otherwise.
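Both behaviors, sketched with an arbitrary key of 42:

```sql
-- Blocks until the lock on key 42 becomes available (returns void):
SELECT pg_advisory_lock(42);

-- Returns immediately: true if the lock was acquired, false otherwise:
SELECT pg_try_advisory_lock(42);
```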
# The failure path
A distributed locking system without a lock expiry seems unusable - what happens if clients die, does the lock just hang around forever, blocking others working on that resource?
No, and that’s the charm of it. Postgres takes care of cleaning up after clients that disconnect. Transaction-level locks in particular are released once the connection is deemed dead, based on the TCP keepalive settings of the database server.
Limitations: this locking mode assumes one central database, on which it relies for keeping tabs on locks. That central database is a single point of failure.
# In action
Let’s see what these locks look like in Postgres. Using transaction-level locks for this example means I have to begin a transaction to use the lock (the key, 1, is arbitrary; any 64-bit integer works):
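```sql
BEGIN;

SELECT pg_try_advisory_xact_lock(1);

--  pg_try_advisory_xact_lock
-- ----------------------------
--  t
-- (1 row)
```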
The statement returned `true`, so the lock is acquired. It will be held until the transaction is committed or rolled back. We can see it in the `pg_locks` system view:
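```sql
SELECT locktype, objid, mode, granted
FROM pg_locks
WHERE locktype = 'advisory';

--  locktype | objid |     mode      | granted
-- ----------+-------+---------------+---------
--  advisory |     1 | ExclusiveLock | t
-- (1 row)
```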
If I try to acquire the lock on the same resource (on key 1) from another process, it will return `false`:
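```sql
-- in a second session:
BEGIN;

SELECT pg_try_advisory_xact_lock(1);

--  pg_try_advisory_xact_lock
-- ----------------------------
--  f
-- (1 row)
```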
When the transaction is committed, the lock is automatically released:
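```sql
-- back in the first session:
COMMIT;

-- the advisory lock no longer shows up:
SELECT count(*) FROM pg_locks WHERE locktype = 'advisory';

--  count
-- -------
--      0
-- (1 row)
```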
# Implementation example
In this example, we’ll assume that we have to solve the following challenge:
- there is a deployment which consists of one or more workers
- the workers operate on some shared resources
- the deployment has to be scalable by adding more workers at any time
- we want to avoid concurrent work on the same resource
- the workers have to distribute the workload among themselves automatically
In addition, we assume that there is a central database to which each worker has access.
We will use transaction-level advisory locks. The function `pg_try_advisory_xact_lock` has the following characteristics:
- it will not wait for the lock to become available: it will either obtain the lock immediately and return `true`, or return `false` otherwise.
- if acquired, the lock is automatically released at the end of the current transaction and cannot be released explicitly.
At the highest level, the solution is just to iterate over the resources and try to do work on them. The underlying implementation will take care of balancing the workload.
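Sketched in Python, that top level could look like the following; `get_resources` and `work_on_resource` are names made up for this example, not a real API:

```python
def main_loop():
    # Each worker runs the same loop; the advisory locks decide who
    # actually gets to work on which resource.
    for resource_id in get_resources():
        work_on_resource(resource_id)
```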
We have to implement the `work_on_resource` function so that it respects the advisory locks:
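```python
def work_on_resource(resource_id):
    # A sketch: `connection` is assumed to be a global psycopg2
    # connection, and do_work() stands in for the actual processing.
    with connection:  # one transaction per resource; commits on success
        if not try_to_lock_shared_resource(resource_id):
            return  # another worker holds the lock; skip this resource
        do_work(resource_id)
```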
The `try_to_lock_shared_resource` function tries to acquire a lock on the resource. This is where we construct the key that we use for the lock.
The key has to be a 64-bit integer, so if the resource ID is not an int, we have to cast it to one. We can use a hash function for this.
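Here is a sketch of that function; `APP_NAMESPACE` is a made-up constant, and `hashlib` stands in for a deterministic hash (see the note below on why the built-in `hash()` is not suitable):

```python
import hashlib

APP_NAMESPACE = "myapp"  # hypothetical; anything unique to the application

def try_to_lock_shared_resource(resource_id):
    # Derive a deterministic 64-bit signed integer from the namespaced
    # resource ID; Postgres advisory lock keys are signed bigints.
    raw = f"{APP_NAMESPACE}:{resource_id}".encode()
    digest = hashlib.sha256(raw).digest()
    lock_key = int.from_bytes(digest[:8], byteorder="big", signed=True)
    return acquire_advisory_lock(lock_key)
```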
The built-in `hash()` function returns different hashes when invoked with the same string in different Python interpreters, because they are salted with an unpredictable random value.
In addition to the identifier of the resource, we include a namespace in the key, to make sure that other applications can still lock the same resource; it’s only the workers of the same application that we want to restrict to one at a time.
There is a slight caveat here: when generating the lock key for the resource, hash collisions can occur. Since we are using 64-bit integers, the chance of a hash collision is minimal. Whether it’s acceptable or not depends on your use case. See a great post about calculating the chance of hash collisions here.
Finally, the function that executes the lock statement:
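```python
def acquire_advisory_lock(lock_key):
    # A sketch assuming a global psycopg2 `connection`. The statement
    # must run inside the transaction that wraps the work, because the
    # lock is released when that transaction ends.
    with connection.cursor() as cursor:
        cursor.execute("SELECT pg_try_advisory_xact_lock(%s)", (lock_key,))
        return cursor.fetchone()[0]
```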
🎉 Voilà! Using this pattern, the deployment can be scaled up to multiple workers, and they will distribute the work automatically. If a worker tries to acquire a resource and can’t do so, it just moves on to the next resource. I think this is a simple but powerful pattern, and I’m glad I found out about it - thanks to all the authors who have written about it on the internet.
# Disclaimer
1) This solution is good for the use case described above. Note that it’s nothing like Redlock, because the purpose was mostly efficiency; correctness was secondary, so we didn’t need the complexity of multiple servers keeping tabs on locks and checking for a majority to acquire a lock. Always seek the advice of your senior engineers or other qualified professionals before committing to this.
2) This solution is a kind of mutex and might not be considered a true distributed locking mechanism in the strictest sense. Martin Kleppmann notes:
“It’s important to remember that a lock in a distributed system is not like a mutex in a multi-threaded application. It’s a more complicated beast, due to the problem that different nodes and the network can all fail independently in various ways.”
However, in the case of transaction-level advisory locks, node and network failures are taken care of by the Postgres server: transactions are cleaned up when the client disconnects and the connection is considered dead according to the TCP settings on the server (`tcp_keepalives_idle`, `tcp_keepalives_interval` and `tcp_keepalives_count`).
# Resources
- The different advisory locking functions available in Postgres: Advisory Lock Functions in the official PG docs
- Introduction to advisory locks in the official docs
- A Python library for advisory locks: github/level12/pals
- Two great posts about the topic: by Igor Šarčević and by Hashrocket
- Martin Kleppmann’s great post on distributed locking
- Some background on how Postgres cleans up connections, on Stackexchange
# Sidenote on Python’s built-in `hash` function
Consider these calls to `hash`, returning different results at each invocation:
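```console
$ python3 -c 'print(hash("resource-1"))'
$ python3 -c 'print(hash("resource-1"))'
# each invocation prints a different number: the hash is salted with a
# random value chosen at interpreter startup
```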
Compare them to the following: setting the seed to a constant makes the hash consistent:
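```console
$ PYTHONHASHSEED=0 python3 -c 'print(hash("resource-1"))'
$ PYTHONHASHSEED=0 python3 -c 'print(hash("resource-1"))'
# both invocations print the same number, because PYTHONHASHSEED=0
# disables hash randomization
```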
All this is due to a vulnerability explained here.