POSTGRESQL LOCKING
Postgresql provides 3 different types of lock such as table-level locks, row-level locks, and advisory locks.
Some locks are acquired implicitly by PostgreSQL when, for example, an SQL statement is run. Locks can also be explicitly acquired by users.
Some locks conflict other locks. For example, when an ACCESS EXCLUSIVE lock is acquired by a transaction, a second transaction cannot acquire any locks, since ACCESS EXCLUSIVE conflicts all the other type of locks. So, the second transaction should wait in a queue for the first one to release the lock.
A transaction never conflicts itself. That is a transaction can hold ACCESS EXCLUSIVE and ACCESS SHARE locks at the same time.
Non-conflicting locks can be held by multiple transactions concurrently.
Locks are released when the transaction is committed or rollbacked.
TABLE-LEVEL LOCKS
This types of locks are automatically acquired by PostgreSQL.
ACCESS SHARE
A SELECT command acquires this type of lock. In general, any query that only reads but not modifies data will have hold this lock.
ROW SHARE
SELECT FOR UPDATE and SELECT FOR SHARE commands acquire this lock.
ROW EXCLUSIVE
INSERT, UPDATE, DELETE commands acquire this lock. In general, any query that modifies data will hold this lock.
SHARE UPDATE EXCLUSIVE
Acquired by VACUUM, ANALYZE, CREATE INDEX CONCURRENTLY and some forms of ALTER TABLE commands.
SHARE
CREATE INDEX(without concurrently) acquires this lock.
SHARE ROW EXCLUSIVE
PostgreSQL does not automatically acquire this type of lock.
EXCLUSIVE
PostgreSQL does not automatically acquire this type of lock.
ACCESS EXCLUSIVE
Acquired by the ALTER TABLE, DROP TABLE, TRUNCATE, REINDEX, CLUSTER, VACUUM FULL, and REFRESH MATERIALIZED VIEW (without CONCURRENTLY) commands. This is also the default lock mode for LOCK TABLE statements that do not specify a mode explicitly.
ROW-LEVEL LOCKS
There are two types of row-level locks such as shared and exclusive. Exclusive locks are automatically acquired when rows are updated or deleted. Row-level locks do not block data querying, only block other transactions which modifies the same rows.
FOR UPDATE
This SQL command acquires an exclusive row-level lock without actually modifying.
select * from table where id > 10 and id < 15 FOR UPDATE;
So, another transaction cannot hold this type of lock concurrently on the rows whose id are between 10 and 15. But it can read.
Also, check out FOR UPDATE SKIP LOCK
FOR SHARE
This SQL command acquires a shared row-level lock.
select * from table where id > 10 and id < 15 FOR SHARE;
So, a second transaction can acquire the same type of lock, but cannot modify or delete the rows selected, or acquire any exclusive locks on them.
DEADLOCKS
Explicit locks usage can increase the probability of deadlocks. For example, when two transactions hold locks that the other wants, then the transaction will wait for each other, and this is a deadlock. PostgreSQL detects this and aborts one of the transactions.
Example:
The first transaction updates like:
UPDATE users set name='John' where id = 2;
And the second transaction concurrently updates like:
UPDATE users set name='John' where id = 8;
UPDATE users set name='John' where id = 2;
And the first transaction wants to update like:
UPDATE users set name='John' where id = 8;
So, here the first transaction holds a row-level lock on the row id=2, while the second transaction holds a row-level lock on the row(id=8) and waits for the lock on row(id=2) to be released. However, the first transaction also waits for the lock on row(id=8) to be released. Thus, the two transactions wait for each other.
Other Notes
In PostgreSQL, each table (or relation) stores rows of data as an array of 8-KB pages, or blocks.
PostgreSQL uses a query planner/optimizer to determine the most efficient way to execute each query. In order to do so, it accounts for a number of factors, including whether or not the data in question has been indexed, as well as internal statistics about the database, like the number of rows in each table.
When a query involves updating or deleting data, PostgreSQL uses multi-version concurrency control (MVCC) to ensure that data remains accessible and consistent in high-concurrency environments. Each transaction operates on its own snapshot of the database at that point in time, so that read queries won’t block write queries, and vice versa.
In order to speed up queries, PostgreSQL uses a certain portion of the database server’s memory as a shared buffer cache (128MB by default), to store recently accessed blocks in memory. When data is updated or deleted, PostgreSQL will note the change in the write-ahead log (WAL), update the page in memory, and mark it as “dirty.” PostgreSQL periodically runs checkpoint processes to flush these dirty pages from memory to disk, to ensure that data is up to date, not only in memory but also on disk.
Questions:
- What is MVCC? Why do we need it and how is it implemented in Postgresql?
- Why is isolation necessary?
- What is a transaction?
- What are the lock levels? What are the differences?
- Can a transaction conflict itself, hold locks of conflicting modes?
- Why do we need application-controlled locking where MVCC is already presented?
- Does a row-level lock block query operation?
- When does an acquired lock get released?
- What is a deadlock, can Postgresql prevent them?
- Is it possible for an update operation to cause a deadlock?
- Do two select quires run in parallel block each other? Can select quires to acquire locks?
- What is implicit locking, how are they acquired? What do you acquire explicit locks?