TL;DR — The One-Minute Version
Isolation levels are like privacy settings for your database transactions. READ UNCOMMITTED = everyone can see your drafts. READ COMMITTED = people see your work only after you publish. REPEATABLE READ = you get a frozen snapshot — changes by others are invisible to you. SERIALIZABLE = transactions take turns, as if they ran one at a time.
Every time two users touch the same data at the same time, the database has to make a choice: how much should one user see of the other's unfinished work? That choice is the isolation level (the "I" in ACID). It determines whether Transaction A can see data that Transaction B changed but hasn't committed yet. The SQL standard defines four levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE. Each level prevents progressively more anomalies but costs progressively more performance. Set it too low and you'll read garbage data that was never committed. Set it too high and your throughput drops to a crawl because every transaction waits in line.
The trick is that most of your queries need the default (READ COMMITTED), and only a few critical paths — payments, inventory decrements, seat bookings — need the heavy stuff. Stripe runs SERIALIZABLE on every payment transaction. They'd rather retry a failed transaction than charge someone twice. Booking.com uses optimistic locking with version columns for hotel availability — no heavy locks, just a version check at write time. (Optimistic locking means you DON'T lock the row when you read it. Instead, you include a version number in your WHERE clause when you UPDATE. If someone else changed the row between your read and your write, the version won't match, your UPDATE affects 0 rows, and you know to retry. It's "optimistic" because you assume conflicts are rare, and for most workloads they are.)
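The version-check pattern is easy to sketch. Here's a minimal version in Python with sqlite3 standing in for the real database; the `rooms` table and its columns are illustrative, not Booking.com's actual schema:

```python
import sqlite3

# Toy schema: one room type with a stock count and a version column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rooms (id INTEGER PRIMARY KEY, free INTEGER, version INTEGER)")
conn.execute("INSERT INTO rooms VALUES (1, 5, 0)")
conn.commit()

def book_room(conn, room_id):
    """Optimistic booking: no lock on read, version check at write time."""
    free, version = conn.execute(
        "SELECT free, version FROM rooms WHERE id = ?", (room_id,)
    ).fetchone()
    if free <= 0:
        return False                      # sold out
    # The UPDATE only matches if nobody bumped the version since our read.
    cur = conn.execute(
        "UPDATE rooms SET free = free - 1, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (room_id, version),
    )
    conn.commit()
    return cur.rowcount == 1              # 0 rows touched => lost the race, retry

print(book_room(conn, 1))  # True: uncontended write succeeds
```

A caller that gets `False` because of a version mismatch simply re-reads and tries again — no locks are ever held across the read.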
You can see your database's current isolation level right now. Open a terminal and run this:
-- PostgreSQL: check your default isolation level
SHOW transaction_isolation;
-- Output: read committed (that's the default)
-- MySQL: check your default isolation level
SELECT @@transaction_isolation;
-- Output: REPEATABLE-READ (MySQL defaults to a HIGHER level than Postgres!)
-- See your current transaction's unique ID
SELECT txid_current(); -- PostgreSQL assigns every transaction a sequential number
Don't have PostgreSQL installed? docker run --name pg-demo -e POSTGRES_PASSWORD=demo -p 5432:5432 -d postgres:16 gives you a running instance in 10 seconds. Then open two terminals: psql -h localhost -U postgres in each.
The Scenario — 10,000 People, 100 Tickets
It's Friday at 10:00 AM. A popular band just announced a concert. 100 tickets go live. 10,000 fans hit "Buy" within the same second. Your job is to build the ticket system. Let's think about what happens if you get this wrong.
Two fans click "Buy" at the same instant. Both read the available column: 100. Both decrement it to 99. Both commit. Two tickets sold, but only one was subtracted from inventory. Repeat this 10,000 times and you've recorded 500 "sold" tickets against 100 real ones. 400 people get charged with no ticket. Customer support explodes. Lawyers call.
Let's set up the exact table so you can reproduce this yourself. Run this in either terminal:
-- Create a simple tickets table for our flash sale
CREATE TABLE tickets (
id SERIAL PRIMARY KEY,
event_name TEXT NOT NULL,
available INT NOT NULL CHECK (available >= 0),
price NUMERIC(10,2) NOT NULL
);
-- Insert our concert: 100 tickets at $75 each
INSERT INTO tickets (event_name, available, price)
VALUES ('Summer Rooftop Concert', 100, 75.00);
-- Verify it's there
SELECT * FROM tickets;
-- id | event_name | available | price
-- ----+------------------------+-----------+-------
-- 1 | Summer Rooftop Concert | 100 | 75.00
This is called a lost update: two transactions read the same row, both compute a new value based on what they read, and then both write. The second write overwrites the first, and the first transaction's change is silently lost. It's one of the most common concurrency bugs in web applications. (The fix: either use SELECT FOR UPDATE to lock the row before reading, or use UPDATE tickets SET available = available - 1 so the decrement is atomic — the database does the math, not your application.) Two people bought tickets, but the count only went down by one. The second write silently erased the first purchase. In a flash sale with 10,000 concurrent users, this can happen hundreds of times. You end up with 500 "successful" purchases for 100 tickets.
The numbers are real. Taylor Swift's Eras Tour had 14 million people in the Ticketmaster queue for about 2 million tickets. At that scale, even a 0.01% lost update rate means 1,400 double-sold seats. The resulting congressional hearing wasn't about slow pages — it was about people who were charged for seats that had already been sold to someone else.
What if you wrote UPDATE tickets SET available = available - 1 WHERE id = 1 instead of reading first, then writing? Would that fix the problem? (Spoiler: partly. The decrement itself becomes atomic, but what if you need to check availability before decrementing? That's where isolation levels come in.)
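To see why the atomic form helps, fold the check into the UPDATE itself so the guard and the decrement happen in one statement. A sketch in Python with sqlite3 standing in for any SQL engine, with the table trimmed to the columns that matter:

```python
import sqlite3

# Minimal stand-in for the tickets table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, available INTEGER)")
conn.execute("INSERT INTO tickets VALUES (1, 2)")   # pretend only 2 tickets remain
conn.commit()

def buy(conn):
    # Check AND decrement in one statement: the availability guard rides in the
    # WHERE clause, so there is no window between "read 100" and "write 99".
    cur = conn.execute(
        "UPDATE tickets SET available = available - 1 "
        "WHERE id = 1 AND available > 0"
    )
    conn.commit()
    return cur.rowcount == 1   # 0 rows matched => sold out, nothing decremented

print(buy(conn), buy(conn), buy(conn))   # True True False: third buyer is refused
```

The rowcount tells you whether your purchase actually happened, which covers the simple "is there stock?" check. Multi-row invariants still need the isolation levels discussed below.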
First Attempt — "Just Use Transactions"
The first thing every developer tries: wrap it in a transaction. BEGIN, do the work, COMMIT. Surely that fixes everything, right?
Let's test it. Open both terminals and follow along. We'll use PostgreSQL's default isolation level (READ COMMITTED) and see if plain transactions protect us.
-- Reset the counter first (run in either terminal)
UPDATE tickets SET available = 100 WHERE id = 1;
-- === TERMINAL A === (Alice)
-- Step 1: Start a transaction
BEGIN;
-- Step 2: Read the current ticket count
SELECT available FROM tickets WHERE id = 1;
-- You'll see: 100
-- PAUSE HERE. Go to Terminal B and run its Step 1 and Step 2.
-- Come back when Terminal B also shows 100.
-- Step 3: Decrement based on what we read
UPDATE tickets SET available = 99 WHERE id = 1;
-- Step 4: Commit
COMMIT;
-- Step 5: Check the result
SELECT available FROM tickets WHERE id = 1;
-- You'll see: 99. After Bob's COMMIT it will STILL be 99 — his update gets lost.
-- === TERMINAL B === (Bob)
-- Step 1: Start a transaction
BEGIN;
-- Step 2: Read the current ticket count
SELECT available FROM tickets WHERE id = 1;
-- You'll see: 100 (same as Alice — that's the problem!)
-- PAUSE HERE. Go back to Terminal A and run Steps 3 & 4 (UPDATE + COMMIT).
-- Then come back here.
-- Step 3: Decrement based on what we read (we think it's still 100)
UPDATE tickets SET available = 99 WHERE id = 1;
-- NOTE: This UPDATE blocks briefly until Alice's COMMIT finishes.
-- But then it goes through — setting available to 99 AGAIN.
-- Step 4: Commit
COMMIT;
-- Step 5: Check the result
SELECT available FROM tickets WHERE id = 1;
-- 99 — We both bought a ticket, but only one was counted!
- Both Alice and Bob started transactions and read available = 100.
- Alice updated the row to 99 and committed. The database now has available = 99.
- Bob's UPDATE was briefly blocked while Alice's transaction held the row lock. Once Alice committed, the lock was released and Bob's UPDATE went through.
- But Bob's UPDATE says SET available = 99 — he computed this from his stale read of 100. He doesn't know Alice already decremented it.
- The result: two tickets sold, but available only dropped from 100 to 99. One ticket vanished from inventory.
The transaction (BEGIN/COMMIT) protected each individual operation from being half-done. But it did not prevent the two transactions from stepping on each other's toes. That's because READ COMMITTED — the default level — only guarantees you won't read uncommitted data. (Under READ COMMITTED, each statement within a transaction sees only data that was committed BEFORE that statement started executing. Run the same SELECT twice within a transaction and you might get different results if another transaction committed changes in between. It prevents dirty reads but allows non-repeatable reads and phantom reads.) It says nothing about two transactions reading the same row and then both writing to it.
So "just use transactions" is not wrong — it's incomplete. Transactions give you atomicity, the "A" in ACID: all or nothing. If your transaction has three SQL statements and the server crashes after the second one, PostgreSQL rolls back ALL of them on restart — you never get a half-done state. But atomicity says nothing about whether OTHER transactions can interfere while yours is running; the level of isolation between concurrent transactions depends on the isolation level you're running at. And the default, READ COMMITTED, is designed for speed, not safety. It assumes conflicts are rare and your application code handles the edge cases. For a flash sale with 10,000 concurrent writes to the same row? Those edge cases become the common case.
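The interleaving above can be replayed in plain Python, with the "row" as shared state and each step mirroring one terminal command (a toy simulation, not real SQL):

```python
# Read-modify-write interleaving: both "transactions" read before either writes.
row = {"available": 100}

alice_read = row["available"]        # Alice: SELECT available  -> 100
bob_read = row["available"]          # Bob:   SELECT available  -> 100 (allowed!)

row["available"] = alice_read - 1    # Alice: UPDATE ... SET available = 99; COMMIT
row["available"] = bob_read - 1      # Bob:   UPDATE ... SET available = 99; COMMIT

print(row["available"])              # 99 -- two sales, one decrement: lost update

# The atomic form leaves no window: the new value is computed from current state.
row = {"available": 100}
for _ in range(2):                   # two buyers
    row["available"] -= 1            # UPDATE ... SET available = available - 1
print(row["available"])              # 98 -- both sales counted
```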
Where It Breaks — The Four Anomalies You Can Reproduce
The SQL standard defines four bad things that can happen when transactions overlap. They range from "obviously wrong" to "subtly catastrophic." Each higher isolation level prevents more of them. Let's reproduce every single one so you see them happen in your terminals, not just read about them in a textbook.
Let's go through each one. For every anomaly, you'll get the exact SQL to run in your two terminals so you can see the problem live.
Anomaly 1: Dirty Read
A dirty read happens when Transaction B reads data that Transaction A wrote but hasn't committed yet. If A later rolls back, B just acted on data that NEVER EXISTED. It's called "dirty" because the data hasn't been cleaned up (committed) yet. Imagine approving a $10,000 loan because you read the applicant's account balance as $50,000 — but that balance was from an uncommitted transaction that gets rolled back. The applicant actually has $500.
Here's a twist: even if you ask for READ UNCOMMITTED, PostgreSQL silently treats it as READ COMMITTED. PostgreSQL's developers decided dirty reads are so dangerous that they refuse to implement them, period. MySQL's InnoDB does allow them. The demo below shows what would happen in MySQL, and what PostgreSQL prevents.
-- Reset first
UPDATE tickets SET available = 100 WHERE id = 1;
-- === TERMINAL A ===
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; -- PG ignores this, treats as READ COMMITTED
UPDATE tickets SET available = 50 WHERE id = 1; -- Changed but NOT committed
-- Don't commit yet! Switch to Terminal B.
-- === TERMINAL B ===
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT available FROM tickets WHERE id = 1;
-- PostgreSQL returns: 100 (the committed value — it REFUSES to show you the dirty 50)
COMMIT;
-- === TERMINAL A ===
ROLLBACK; -- We changed our mind. The 50 was never real.
-- If B had read 50, it would have acted on fantasy data.
-- PostgreSQL protected you.
-- In MySQL/InnoDB, dirty reads actually happen at READ UNCOMMITTED
-- === TERMINAL A ===
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
UPDATE tickets SET available = 50 WHERE id = 1;
-- Don't commit!
-- === TERMINAL B ===
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
SELECT available FROM tickets WHERE id = 1;
-- MySQL returns: 50 ← DIRTY READ! This data was never committed.
COMMIT;
-- === TERMINAL A ===
ROLLBACK; -- The 50 never happened.
-- But Terminal B already read 50 and maybe made decisions based on it.
-- That's why almost nobody uses READ UNCOMMITTED in production.
Anomaly 2: Non-Repeatable Read
This is the one you'll actually hit in production. You read a row, do some logic, then read the same row again — and the value changed because someone else committed an update in between. It's called "non-repeatable" because the same SELECT gives you a different answer each time you run it within the same transaction.
This is exactly the flash sale bug from Section 2. Let's reproduce it explicitly.
-- Reset
UPDATE tickets SET available = 100 WHERE id = 1;
-- === TERMINAL A ===
BEGIN;
SELECT available FROM tickets WHERE id = 1;
-- Returns: 100
-- PAUSE. Go to Terminal B.
-- === TERMINAL B === (run all of this)
BEGIN;
UPDATE tickets SET available = 80 WHERE id = 1;
COMMIT; -- Bob's update is now committed.
-- === TERMINAL A === (come back here)
SELECT available FROM tickets WHERE id = 1;
-- Returns: 80 ← DIFFERENT from your first read!
-- Same transaction, same SELECT, different result.
-- That's a non-repeatable read.
COMMIT;
At READ COMMITTED — the default in PostgreSQL — each SELECT sees a fresh snapshot of all data committed at the instant that statement begins executing. If someone committed a change between your two SELECTs, you'll see it. For casual reads (like showing a user's profile) this is fine. For the flash sale? It's a disaster, because your application reads 100, computes 99, and writes 99 — not knowing someone changed it to 80 in between.
The fix: run SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; right after BEGIN in Terminal A — both SELECTs will return 100.
Anomaly 3: Phantom Read
A phantom read is sneakier than a non-repeatable read. A non-repeatable read is about a single row changing; a phantom read is about the SET OF ROWS changing. Instead of an existing row changing, a new row appears (or an existing one disappears) between two queries. You count the rows matching a condition — get 5. Someone inserts a matching row. You count again — get 6. The new row is the "phantom." This matters for aggregations like COUNT and SUM, and for business logic that depends on how many rows match a condition.
This matters for queries like "how many tickets are available across all events" or "how many orders did we process today." Let's set up a multi-event table to demonstrate:
-- Setup: add more events
INSERT INTO tickets (event_name, available, price) VALUES
('Jazz Night', 50, 45.00),
('Comedy Show', 75, 30.00);
-- === TERMINAL A ===
BEGIN;
-- How many events have tickets priced over $40?
SELECT COUNT(*) FROM tickets WHERE price > 40.00;
-- Returns: 2 (the concert at $75 and jazz at $45)
-- PAUSE. Go to Terminal B.
-- === TERMINAL B ===
INSERT INTO tickets (event_name, available, price) VALUES ('VIP Gala', 20, 200.00);
-- No transaction needed — it auto-commits.
-- === TERMINAL A === (come back)
SELECT COUNT(*) FROM tickets WHERE price > 40.00;
-- Returns: 3 ← PHANTOM! A new row appeared that matches our WHERE.
-- If you're generating an invoice or report inside this transaction,
-- the totals at the top won't match the line items at the bottom.
COMMIT;
Anomaly 4: Serialization Anomaly (Write Skew)
This is the subtlest and most dangerous anomaly. Two transactions both read overlapping data, then each writes based on what it read. Neither transaction's individual read or write is wrong. But the combination of their writes creates a state that couldn't have happened if they ran one at a time. This is called write skew: the two transactions read the same data, make different decisions based on it, and write non-conflicting rows. Neither violates any constraint individually, but together they produce an invalid state.
Classic example: a hospital requires at least one doctor on-call at all times. Two doctors are on-call. Both check: "Is there more than one doctor on-call?" Both see: yes (2). Both submit "take me off-call." Both transactions succeed — because they're updating different rows (each doctor's own record). Result: zero doctors on-call. Nobody violated any constraint individually, but the combined result is illegal.
-- Setup: two doctors on-call
CREATE TABLE doctors (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
on_call BOOLEAN NOT NULL DEFAULT true
);
INSERT INTO doctors (name, on_call) VALUES ('Dr. Alice', true), ('Dr. Bob', true);
-- === TERMINAL A === (Dr. Alice wants to go home)
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- "Are there at least 2 doctors on-call? If yes, I can leave."
SELECT COUNT(*) FROM doctors WHERE on_call = true;
-- Returns: 2 — safe to leave!
-- PAUSE. Go to Terminal B.
-- === TERMINAL B === (Dr. Bob ALSO wants to go home)
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM doctors WHERE on_call = true;
-- Returns: 2 — also looks safe!
UPDATE doctors SET on_call = false WHERE name = 'Dr. Bob';
COMMIT; -- Bob is off-call now.
-- === TERMINAL A === (come back)
UPDATE doctors SET on_call = false WHERE name = 'Dr. Alice';
COMMIT; -- Alice is off-call too.
-- Check the damage:
SELECT * FROM doctors WHERE on_call = true;
-- Returns: 0 rows. Nobody is on-call. Hospital policy violated.
-- REPEATABLE READ didn't catch this because each txn modified a DIFFERENT row.
The key insight: write skew can't be prevented by row-level locks because the two transactions modify different rows. The database doesn't detect a conflict because there's no physical overlap. Only SERIALIZABLE isolation catches this. SERIALIZABLE — the highest level — guarantees that the result of running concurrent transactions is identical to SOME serial (one-at-a-time) order. PostgreSQL implements it with Serializable Snapshot Isolation (SSI), which tracks read and write dependencies between transactions; if it detects a cycle that could produce an anomaly, it aborts one transaction with a "serialization failure" error, and your application must be ready to retry. (Stripe wraps every payment in SERIALIZABLE and retries up to 3 times on conflict.) At SERIALIZABLE, PostgreSQL detects that our two transactions read overlapping data and wrote based on conflicting assumptions, and it aborts one of them.
-- Reset
UPDATE doctors SET on_call = true;
-- === TERMINAL A ===
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM doctors WHERE on_call = true;
-- Returns: 2
-- PAUSE. Go to Terminal B.
-- === TERMINAL B ===
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM doctors WHERE on_call = true;
-- Returns: 2
UPDATE doctors SET on_call = false WHERE name = 'Dr. Bob';
COMMIT; -- Succeeds.
-- === TERMINAL A === (come back)
UPDATE doctors SET on_call = false WHERE name = 'Dr. Alice';
COMMIT;
-- ERROR: could not serialize access due to read/write dependencies
-- DETAIL: Reason code: Canceled on identification as a pivot...
-- PostgreSQL aborted Alice's transaction. She must retry.
-- When she retries, she'll see only 1 doctor on-call and knows she can't leave.
-- The hospital is safe.
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Serialization Anomaly |
|---|---|---|---|---|
| READ UNCOMMITTED | Possible* | Possible | Possible | Possible |
| READ COMMITTED | Prevented | Possible | Possible | Possible |
| REPEATABLE READ | Prevented | Prevented | Standard: Possible** | Possible |
| SERIALIZABLE | Prevented | Prevented | Prevented | Prevented |
*PostgreSQL silently upgrades READ UNCOMMITTED to READ COMMITTED. **PostgreSQL also prevents phantoms at REPEATABLE READ (goes beyond the SQL standard).
The Breakthrough — Pick the Right Level for the Job
Here's the insight that separates juniors from seniors: not everything needs SERIALIZABLE. And not everything is safe at READ COMMITTED. The real skill is knowing which isolation level to use where.
Think of it like door locks in a building. The lobby has a simple latch — anyone can walk in. Your office has a key lock. The server room has a badge reader plus a deadbolt. And the vault has biometric scanning plus a time-locked combination. You don't put vault-level security on the lobby — nobody would get in. And you don't put a simple latch on the vault — everything would get stolen. Same with isolation levels: the cost of safety goes up at each level, and most of your data doesn't need the highest tier.
Let's check your database's default right now, and try switching levels:
-- Check your current default
SHOW transaction_isolation;
-- PostgreSQL: "read committed"
-- This is what every new transaction uses unless you say otherwise.
-- Change it for ONE transaction only:
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... your critical SQL here ...
COMMIT;
-- The NEXT transaction goes back to the default. No permanent change.
-- Change the default for your entire session:
SET default_transaction_isolation = 'serializable';
-- Now EVERY transaction in this connection uses SERIALIZABLE.
-- Useful for payment microservices that need it for every query.
-- Change the server-wide default (postgresql.conf):
-- default_transaction_isolation = 'read committed'
-- You almost NEVER change this. 99% of connections should use the default.
-- See your current transaction ID (useful for debugging):
SELECT txid_current();
-- Returns something like: 12847562
-- Every transaction gets a unique, sequential number.
The PostgreSQL vs MySQL Difference
This trips up a lot of developers who switch between databases. The two most popular open-source databases made different default choices:
PostgreSQL:
- Default: READ COMMITTED — the fastest safe option.
- READ UNCOMMITTED: Silently treated as READ COMMITTED. PostgreSQL's team decided dirty reads are too dangerous to allow, even if you ask for them. "We know better."
- REPEATABLE READ: Uses Snapshot Isolation (SI) — each transaction sees a consistent snapshot of the database as it existed when the transaction started, implemented via MVCC (Multi-Version Concurrency Control). Instead of locking rows, PostgreSQL keeps old versions around until no transaction needs them, so readers never block writers and writers never block readers. (The cost: storage overhead for old row versions, cleaned up by periodic VACUUM.) This goes BEYOND the SQL standard — it also prevents phantom reads, not just non-repeatable reads.
- SERIALIZABLE: Uses SSI (Serializable Snapshot Isolation), a technique from Michael Cahill's 2008 PhD thesis, implemented in PostgreSQL 9.1. Instead of locking everything, SSI extends snapshot isolation by tracking read/write dependencies between transactions; if it detects a potential cycle that could produce a non-serializable result, it proactively aborts one transaction with ERROR 40001, which can simply retry. Most transactions succeed without waiting — much more efficient than traditional two-phase locking.
- Performance cost of SERIALIZABLE: ~10-30% throughput reduction under contention. Under low contention, almost no difference.
MySQL (InnoDB):
- Default: REPEATABLE READ — one step higher than PostgreSQL's default.
- READ UNCOMMITTED: Actually allows dirty reads (unlike PostgreSQL). You can legitimately read uncommitted data.
- REPEATABLE READ: Uses MVCC for consistent reads, but implemented differently from PostgreSQL — InnoDB keeps old row versions in the undo log (rollback segment) rather than in the main table, so the table doesn't bloat but the undo log needs its own management. At this level InnoDB prevents phantoms for plain SELECTs but does NOT prevent them for locking reads (SELECT FOR UPDATE).
- SERIALIZABLE: Uses gap locks — InnoDB locks not just existing rows but the "gaps" between index entries. If your query reads rows with id 5, 10, and 15, it locks the ranges (5-10) and (10-15) too, preventing any INSERT into them. This prevents phantom reads but causes more lock contention and deadlocks than PostgreSQL's SSI approach: guaranteed safety, but potentially lower throughput under high write concurrency.
- Why the higher default? MySQL's replication historically depended on consistent statement-based replay. REPEATABLE READ made replication safer by default.
This means the exact same application code behaves differently on PostgreSQL vs MySQL. If you develop on MySQL (REPEATABLE READ default) and deploy on PostgreSQL (READ COMMITTED default), you might introduce non-repeatable read bugs that never appeared in testing. Always check SHOW transaction_isolation; when you set up a new database.
Why does Stripe run SERIALIZABLE for payments rather than SELECT FOR UPDATE at READ COMMITTED? (Hint: SELECT FOR UPDATE prevents lost updates on a specific row, but it can't prevent write skew — the doctor on-call problem. Payments involve multi-row constraints like "don't charge more than the account balance" and "don't issue duplicate refunds." SERIALIZABLE is the only level that catches ALL of these automatically, without the developer having to think about every possible race condition.)
| Isolation Level | Throughput (relative) | Retry Rate | Use When |
|---|---|---|---|
| READ COMMITTED | 100% (baseline) | ~0% | Most reads, non-critical writes |
| REPEATABLE READ | ~95% | ~1-2% under contention | Reports, exports, batch jobs |
| SERIALIZABLE | ~70-90% | ~5-15% under high contention | Payments, bookings, safety-critical |
The retry rate is the key number. At SERIALIZABLE, PostgreSQL will occasionally abort a transaction with error code 40001 (serialization failure). Your application must catch this and retry the entire transaction. This is not a bug — it's the database saying "I detected a potential anomaly and prevented it by killing your transaction. Try again." Stripe's codebase has retry wrappers around every database call for exactly this reason.
# The standard retry pattern for SERIALIZABLE transactions
# (Python + psycopg2, but the pattern is the same in any language)
import psycopg2
import time
def run_serializable(conn, sql_func, max_retries=3):
    """Execute sql_func inside a SERIALIZABLE transaction, retrying on conflict."""
    for attempt in range(max_retries):
        try:
            with conn:  # commits on success, rolls back on exception
                with conn.cursor() as cur:
                    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                    return sql_func(cur)  # your business logic here
        except psycopg2.errors.SerializationFailure:
            if attempt == max_retries - 1:
                raise  # all retries exhausted, surface the error
            time.sleep(0.01 * (2 ** attempt))  # exponential backoff: 10ms, 20ms, 40ms
            # No explicit rollback needed: the `with conn` block already rolled back.
# Usage:
def book_ticket(cur):
cur.execute("SELECT available FROM tickets WHERE id = 1")
available = cur.fetchone()[0]
if available <= 0:
raise Exception("Sold out!")
cur.execute("UPDATE tickets SET available = available - 1 WHERE id = 1")
return "Ticket booked!"
# This will retry up to 3 times on serialization conflict
result = run_serializable(conn, book_ticket)
Now you know what the four levels are, what anomalies they prevent, and when to use each one. But we've only covered half the story. Isolation levels are the automatic protection the database gives you. In Sections 6+, we'll dive into explicit locking — SELECT FOR UPDATE, SELECT FOR UPDATE SKIP LOCKED, advisory locks, deadlock detection — the tools you use when you want to take manual control of concurrency instead of relying on the database to figure it out.
How It Works — Locking Strategies
Isolation levels are the automatic safety net. But sometimes you need to take the wheel yourself. You know exactly which row is going to be contested, and you want to lock it before anyone else touches it. That's explicit locking — manually telling the database to lock specific rows (or tables) rather than relying on the isolation level to handle concurrency automatically. Explicit locks give you fine-grained control: you decide which rows to lock, when to lock them, and how long to hold them. The trade-off is that you must think about lock ordering and deadlocks yourself — the database won't do it for you. There are four main strategies, each with different trade-offs.
A classic trap: a job-queue worker runs SELECT * FROM jobs WHERE status = 'pending' LIMIT 1, then updates the row to status = 'processing'. Two workers grab the same job. How would you change the query so each worker gets a unique job without blocking the others? (Hint: there's a locking clause designed exactly for this — look at SKIP LOCKED below.)
Pessimistic Locking — SELECT FOR UPDATE
Think of pessimistic locking as the "nuclear option." Before you even read a row, you tell the database: "Lock this row. Nobody else touches it until I'm done." It's called pessimistic because you assume the worst — that someone else IS going to try to change this row right now — so you lock it preemptively. (The opposite is optimistic locking, where you assume conflicts are rare and only check at the end. Pessimistic is safer but slower; optimistic is faster but requires retry logic.)
Here's the magic keyword: FOR UPDATE. You add it to the end of your SELECT statement, and the database locks every row returned by that query until your transaction commits or rolls back.
-- Reset first
UPDATE tickets SET available = 100 WHERE id = 1;
-- Step 1: Begin and lock the row
BEGIN;
SELECT * FROM tickets WHERE id = 1 FOR UPDATE;
-- id | event_name | available | price
-- ----+------------------------+-----------+-------
-- 1 | Summer Rooftop Concert | 100 | 75.00
-- The row is now LOCKED. Go to Terminal B.
-- (After Terminal B tries and blocks...)
-- Step 2: Decrement and commit
UPDATE tickets SET available = available - 1 WHERE id = 1;
COMMIT;
-- Terminal B unblocks now and sees available = 99
-- Step 1: Try to lock the SAME row
BEGIN;
SELECT * FROM tickets WHERE id = 1 FOR UPDATE;
-- ⏳ BLOCKS HERE. The cursor just hangs.
-- Terminal A holds the lock. We wait.
-- (After Terminal A commits, this query returns:)
-- id | event_name | available | price
-- ----+------------------------+-----------+-------
-- 1 | Summer Rooftop Concert | 99 | 75.00
-- We see 99 (Alice's update), NOT 100. No lost update!
-- Step 2: Decrement from the CORRECT value
UPDATE tickets SET available = available - 1 WHERE id = 1;
COMMIT;
-- Final: available = 98. Both tickets properly counted!
-- While Terminal A holds the lock, open a THIRD terminal and run:
SELECT locktype, relation::regclass, mode, granted, pid
FROM pg_locks
WHERE relation = 'tickets'::regclass;
-- You'll see something like:
-- locktype | relation | mode | granted | pid
-- ----------+----------+------------------+---------+------
-- relation | tickets | RowShareLock | t | 1234 ← Terminal A (holds it)
-- relation | tickets | RowShareLock | t | 1235 ← Terminal B (waiting)
-- tuple | tickets | ExclusiveLock | f | 1235 ← The actual wait
-- "granted = f" means Terminal B is WAITING for that lock.
-- As soon as Terminal A commits, granted flips to "t".
The catch: with 10,000 users all hitting the same row with FOR UPDATE, every single transaction queues up behind the one before it. If each transaction takes 5ms to complete, that's 50 seconds for all 10K users. Most of them will time out. Your flash sale becomes a flash failure. We need something smarter.
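Spelled out, the back-of-the-envelope math above looks like this (the 5 ms per transaction is an assumed figure, not a measurement):

```python
# Serialized lock queue: every FOR UPDATE on the same row waits its turn.
users = 10_000
tx_ms = 5                              # assumed lock hold time per transaction

# The last fan in line waits behind everyone ahead of them.
last_fan_wait_s = users * tx_ms / 1000
avg_wait_s = last_fan_wait_s / 2       # the average fan sits mid-queue

print(last_fan_wait_s, avg_wait_s)     # 50.0 25.0 -- most HTTP clients give up long before
```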
SKIP LOCKED — The Flash Sale Solution
Here's where it gets clever. What if, instead of waiting for a locked row, you just skip it and grab the next available one? That's exactly what SKIP LOCKEDA PostgreSQL clause (also supported by MySQL 8.0+ and Oracle) that tells the database: "If the row I want is already locked by another transaction, don't wait for it — just pretend it doesn't exist and move on to the next matching row." This is incredibly powerful for job queues and flash sales because workers never block each other. Added in PostgreSQL 9.5. does. Instead of forming a queue, each user instantly grabs whichever row isn't already taken.
The idea comes from Arpit BhayaniA well-known system design educator and engineer. His deep dives on database internals, particularly on pessimistic locking and SKIP LOCKED for flash sale systems, are widely referenced in the engineering community. He benchmarked the difference and found SKIP LOCKED to be roughly 10× faster than plain FOR UPDATE under high concurrency.'s deep dive on flash sale systems. His benchmarks tell the whole story: plain FOR UPDATE took 1.7 seconds for 100 concurrent bookings. Add SKIP LOCKED and it drops to 147ms. That's a 10× speedup with a two-word change.
But to use SKIP LOCKED properly, you need individual rows to lock — not a single counter. Let's restructure our tickets table so each ticket is its own row:
-- Individual tickets (one row per seat) instead of a counter
CREATE TABLE seats (
id SERIAL PRIMARY KEY,
event_id INT NOT NULL,
seat_num TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'available'
CHECK (status IN ('available', 'booked')),
booked_by TEXT
);
-- Create 100 seats for our concert
INSERT INTO seats (event_id, seat_num)
SELECT 1, 'SEAT-' || generate_series(1, 100);
-- Verify
SELECT count(*) FROM seats WHERE status = 'available';
-- 100 seats ready to go
-- TERMINAL A — Step 1: Grab one available seat (lock it)
BEGIN;
SELECT id, seat_num FROM seats
WHERE status = 'available'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- id | seat_num
-- ----+----------
-- 1 | SEAT-1
-- SEAT-1 is locked. Go to Terminal B immediately.
-- (After checking Terminal B got SEAT-2, not SEAT-1...)
-- Step 2: Book it
UPDATE seats SET status = 'booked', booked_by = 'Alice' WHERE id = 1;
COMMIT;
-- TERMINAL B — Step 1: Grab one available seat (skip locked rows)
BEGIN;
SELECT id, seat_num FROM seats
WHERE status = 'available'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- id | seat_num
-- ----+----------
-- 2 | SEAT-2
-- NOT SEAT-1! That one is locked by Alice, so we SKIPPED it.
-- No waiting. No blocking. Instant response.
-- Step 2: Book it
UPDATE seats SET status = 'booked', booked_by = 'Bob' WHERE id = 2;
COMMIT;
-- Both Alice and Bob got different seats, zero waiting.
# Benchmark: 100 concurrent connections, each booking one seat
# Save this as book_seat.sql:
# BEGIN;
# UPDATE seats SET status = 'booked'
# WHERE id = (
#     SELECT id FROM seats WHERE status = 'available'
#     ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED
# );
# COMMIT;
# Run with pgbench (built into PostgreSQL):
pgbench -c 100 -j 4 -t 1 -f book_seat.sql mydb
# Compare: WITHOUT SKIP LOCKED (just FOR UPDATE):
# ~1,700 ms for 100 bookings (everyone waits in line)
# WITH SKIP LOCKED:
# ~147 ms for 100 bookings (everyone grabs a different row)
# That's a 10× speedup.
This is also the classic job-queue pattern: SELECT * FROM jobs WHERE status = 'pending' LIMIT 1 FOR UPDATE SKIP LOCKED; lets multiple workers pull jobs simultaneously without stepping on each other. The popular pg_cron extension uses this pattern internally.
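The skip-don't-wait semantics are easy to model outside the database. Here's a minimal in-process sketch — a Python analogy using non-blocking thread locks, not real SKIP LOCKED — showing why two workers never collide:

```python
import threading

class JobQueue:
    """In-process analogy of SELECT ... FOR UPDATE SKIP LOCKED."""
    def __init__(self, job_ids):
        self.locks = {jid: threading.Lock() for jid in job_ids}
        self.done = set()

    def claim_next(self):
        # Walk jobs in order; skip any job whose lock we can't take instantly.
        for jid, lock in self.locks.items():
            if jid in self.done:
                continue
            if lock.acquire(blocking=False):   # SKIP LOCKED: never wait
                return jid                     # caller now "holds the row"
        return None                            # everything locked or done

    def finish(self, jid):
        self.done.add(jid)
        self.locks[jid].release()              # COMMIT releases the lock

queue = JobQueue([1, 2, 3])
a = queue.claim_next()   # worker A claims job 1
b = queue.claim_next()   # worker B skips job 1 (still locked) and claims job 2
print(a, b)              # 1 2
```

The key move is `acquire(blocking=False)`: like SKIP LOCKED, the caller either gets the row immediately or moves on, never queuing.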
Optimistic Locking — Version Columns
What if you don't want to lock anything at all? The optimisticCalled "optimistic" because you assume conflicts are rare. You read the data without any lock, do your processing, and only check for conflicts when you write. If someone else modified the row in the meantime, your write fails and you retry. This works beautifully when most transactions don't actually conflict — which is the case for many real-world workloads. approach says: "Most of the time, nobody else is touching this row. So don't bother locking. Just read the data, do your work, and when you write, check that nothing changed since you read it."
The trick is a version column (sometimes called a lock_versionThe name Rails uses for its built-in optimistic locking column. When you add a lock_version integer column to any ActiveRecord model, Rails automatically includes AND lock_version = ... in every UPDATE and increments it on success. If the version doesn't match (someone else updated first), Rails raises ActiveRecord::StaleObjectError and you can rescue it and retry. Django has a similar concept via third-party packages like django-concurrency. in Rails). Every time you update a row, you increment the version. When writing, your WHERE clause includes the version you read. If someone changed it, your UPDATE matches zero rows — and you know to retry.
-- Add a version column to products
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
stock INT NOT NULL CHECK (stock >= 0),
version INT NOT NULL DEFAULT 1
);
INSERT INTO products (name, stock) VALUES ('Limited Edition Sneakers', 50);
-- Application reads the product (no locks!)
SELECT id, stock, version FROM products WHERE id = 1;
-- id | stock | version
-- ----+-------+---------
-- 1 | 50 | 1
-- Application tries to decrement stock
-- Notice: WHERE includes version = 1
UPDATE products
SET stock = stock - 1, version = version + 1
WHERE id = 1 AND version = 1;
-- UPDATE 1 ← success! One row updated.
-- If another transaction already changed version to 2:
UPDATE products
SET stock = stock - 1, version = version + 1
WHERE id = 1 AND version = 1;
-- UPDATE 0 ← ZERO rows updated. Version mismatch!
-- The application knows: "someone else updated first, I need to retry."
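The retry loop lives in application code. Here's a minimal sketch of the read-check-retry cycle, using sqlite3 as a stand-in for PostgreSQL (the products table mirrors the one above; the three-attempt policy is an assumption, not a standard):

```python
import sqlite3

def buy_one(conn, product_id, max_retries=3):
    """Optimistic decrement: no lock on read; retry on version mismatch."""
    for _ in range(max_retries):
        stock, version = conn.execute(
            "SELECT stock, version FROM products WHERE id = ?",
            (product_id,),
        ).fetchone()                          # plain read — no lock taken
        if stock <= 0:
            return "sold out"
        cur = conn.execute(
            "UPDATE products SET stock = stock - 1, version = version + 1 "
            "WHERE id = ? AND version = ?",   # the version check
            (product_id, version),
        )
        conn.commit()
        if cur.rowcount == 1:                 # UPDATE 1 → we won the race
            return "ok"
        # UPDATE 0 → someone else updated first; loop re-reads and retries
    return "conflict"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, stock INT, version INT)")
conn.execute("INSERT INTO products VALUES (1, 50, 1)")
print(buy_one(conn, 1))   # ok
```

Note that the conflict check is just `rowcount`: zero rows updated means the version moved underneath you, and the loop simply re-reads the fresh version and tries again.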
When to use it: Optimistic locking shines when contention is low — lots of people reading, but rarely two people updating the same row at the same time. Booking.com uses this for hotel room availability. Millions of people search and view rooms (reads), but only a tiny fraction actually click "Book" at the same instant for the same room. The retry rate is under 0.1%.
When NOT to use it: High contention scenarios like flash sales. If 10,000 people all try to update the same product's stock at the same millisecond, optimistic locking means 9,999 of them fail and retry, then 9,998 fail and retry again — an avalanche of wasted work. For that, use pessimistic locking or SKIP LOCKED.
MVCC — Multi-Version Concurrency Control
Here's the one you don't choose — it's always running. MVCCMulti-Version Concurrency Control. The mechanism that lets readers and writers operate simultaneously without blocking each other. Instead of locking a row when someone reads it, the database keeps multiple versions of the same row. Each transaction sees the version that was current when it started (or when its statement started, depending on isolation level). Old versions are cleaned up later by VACUUM (PostgreSQL) or purge threads (MySQL/InnoDB). is how PostgreSQL and MySQL handle concurrent access without making every read wait for every write. The core idea: every time someone updates a row, the database doesn't overwrite the old data. It creates a new version of the row and keeps the old one around for any transactions that still need it.
Think of it like Google Docs version history. When you edit a document, the old versions aren't deleted — they're preserved. Anyone who opened the document before your edit still sees the old version until they refresh. That's MVCC in a nutshell.
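The visibility rule itself fits in a few lines. This toy model keeps two versions of our tickets row and decides which one a given snapshot sees from xmin/xmax alone (real PostgreSQL also checks whether those transactions committed, which this sketch ignores):

```python
# Toy MVCC: each row version records the txid that created it (xmin)
# and the txid that replaced it (xmax; None = still the live version).
versions = [
    {"xmin": 12847, "xmax": 12850, "available": 100},  # old version
    {"xmin": 12850, "xmax": None,  "available": 99},   # current version
]

def visible(version, snapshot_txid):
    """Visible if created at/before the snapshot and not yet replaced then."""
    created_before = version["xmin"] <= snapshot_txid
    not_yet_replaced = version["xmax"] is None or version["xmax"] > snapshot_txid
    return created_before and not_yet_replaced

def read(snapshot_txid):
    return next(v["available"] for v in versions if visible(v, snapshot_txid))

print(read(12848))  # a snapshot taken before the update still sees 100
print(read(12851))  # a snapshot taken after the update sees 99
```

Two snapshots, two answers, zero locks — that's the whole trick: readers pick a version instead of waiting for writers.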
-- See the hidden MVCC columns in PostgreSQL
-- Every row has these, you just don't normally see them:
SELECT xmin, xmax, ctid, * FROM tickets WHERE id = 1;
-- xmin | xmax | ctid | id | event_name | available | price
-- -------+------+-------+----+------------------------+-----------+-------
-- 12847 | 0 | (0,1) | 1 | Summer Rooftop Concert | 100 | 75.00
-- xmin = the transaction ID that CREATED this row version
-- xmax = the transaction ID that DELETED/REPLACED this version (0 = still alive)
-- ctid = physical location on disk (page 0, tuple 1)
-- Now update the row:
UPDATE tickets SET available = 99 WHERE id = 1;
-- Check again:
SELECT xmin, xmax, ctid, * FROM tickets WHERE id = 1;
-- xmin | xmax | ctid | id | event_name | available | price
-- -------+------+-------+----+------------------------+-----------+-------
-- 12850 | 0 | (0,2) | 1 | Summer Rooftop Concert | 99 | 75.00
-- Notice: xmin changed (new transaction created this version)
-- Notice: ctid changed from (0,1) to (0,2) — NEW physical row!
-- The old version at (0,1) is now a "dead tuple" — it has xmax set.
-- See how many dead tuples exist:
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'tickets';
-- relname | n_live_tup | n_dead_tup
-- ---------+------------+------------
-- tickets | 1 | 1 ← one dead version lingering
-- VACUUM will clean this up.
(The practical takeaway: long-running transactions keep old versions alive and stop VACUUM from cleaning them up. If dead tuples pile up, check how old your row versions are with SELECT max(age(xmin)) FROM your_table; and kill long-running transactions.)
Going Deeper — Advanced Locking Concepts
The four strategies from Section 6 cover 90% of what you'll need. But production databases throw curveballs. Deadlocks happen at 3 AM. MySQL locks invisible gaps between index entries. Sometimes you need locks that aren't tied to any row at all. This section covers the advanced concepts that separate someone who can use databases from someone who can debug them under pressure.
A deadlockWhen two or more transactions form a circular wait: A holds a lock that B needs, and B holds a lock that A needs. Neither can proceed. Without intervention, they'd wait forever. All modern databases detect deadlocks automatically and kill one transaction (the "victim") so the other can proceed. PostgreSQL checks for deadlocks every deadlock_timeout interval (default: 1 second). is the database equivalent of two people meeting in a narrow hallway — each waiting for the other to step aside, neither willing to move first. Transaction A locks row 1 and needs row 2. Transaction B locks row 2 and needs row 1. Neither can proceed. They're stuck.
-- Terminal A:
BEGIN;
UPDATE tickets SET price = 80.00 WHERE id = 1; -- locks row 1
-- Now go to Terminal B...
-- Terminal B:
BEGIN;
UPDATE tickets SET price = 90.00 WHERE id = 2; -- locks row 2
UPDATE tickets SET price = 85.00 WHERE id = 1; -- BLOCKS (A has row 1)
-- Now go back to Terminal A...
-- Terminal A:
UPDATE tickets SET price = 95.00 WHERE id = 2; -- DEADLOCK!
-- ERROR: deadlock detected
-- DETAIL: Process 1234 waits for ShareLock on transaction 12850;
-- blocked by process 1235.
-- Process 1235 waits for ShareLock on transaction 12849;
-- blocked by process 1234.
-- PostgreSQL killed Terminal A's transaction. Terminal B proceeds.
-- Find deadlocked queries:
SELECT pid, state, wait_event_type, query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
-- Set detection timeout (how long before PG checks for deadlocks):
SET deadlock_timeout = '1s'; -- default is 1 second
The fix is simple in theory: always lock rows in the same order. If every transaction locks row 1 before row 2, no circular wait can form. In practice, this means sorting your operations by primary key before executing them.
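You can check your own code for opposite-order acquisition before it bites. A toy checker — this is just the ordering rule applied to two transactions' lock lists, not how the database detects deadlocks:

```python
def would_deadlock(tx_a_order, tx_b_order):
    """True if the two transactions take some pair of locks in
    opposite orders — the precondition for a circular wait."""
    pos_a = {lock: i for i, lock in enumerate(tx_a_order)}
    pos_b = {lock: i for i, lock in enumerate(tx_b_order)}
    shared = set(pos_a) & set(pos_b)
    for x in shared:
        for y in shared:
            # A takes x before y, but B takes y before x → circular wait
            if pos_a[x] < pos_a[y] and pos_b[x] > pos_b[y]:
                return True
    return False

print(would_deadlock([1, 2], [2, 1]))                   # True — opposite orders
print(would_deadlock(sorted([2, 1]), sorted([1, 2])))   # False — sorted = same order
```

Sorting both lists by primary key removes every inversion, which is exactly why "always lock in ascending PK order" works.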
MySQL has a locking concept that doesn't exist in PostgreSQL: gap locksAn InnoDB-specific lock that locks the "gap" between two index entries, preventing any INSERT into that range. If your index has entries at 100 and 200, a gap lock locks the space between them, so nobody can insert a row with a value of 150 while your transaction is running. Gap locks are used at REPEATABLE READ and SERIALIZABLE to prevent phantom reads. PostgreSQL uses SSI (Serializable Snapshot Isolation) instead, which achieves the same goal without locking gaps.. Instead of just locking existing rows, MySQL also locks the spaces between index entries. This prevents other transactions from inserting new rows into those gaps — which is how InnoDB prevents phantom readsWhen a transaction re-runs a range query (like SELECT WHERE amount BETWEEN 100 AND 200) and gets different rows the second time because another transaction inserted a new row into that range. Gap locks prevent this by locking the range itself, not just the existing rows in it..
-- MySQL only (gap locks don't exist in PostgreSQL)
-- This query locks ALL rows with amount between 100-200
-- AND the gaps between them:
BEGIN;
SELECT * FROM orders
WHERE amount BETWEEN 100 AND 200
FOR UPDATE;
-- In another session, try inserting into the gap:
INSERT INTO orders (amount) VALUES (150);
-- ⏳ BLOCKS! The gap between 100 and 200 is locked.
-- This can cause surprising deadlocks in MySQL
-- when two transactions lock overlapping ranges.
-- See InnoDB lock info:
SELECT * FROM performance_schema.data_locks
WHERE LOCK_TYPE = 'RECORD';
Gap locks are one of the biggest differences between MySQL and PostgreSQL internals. If you're used to PostgreSQL and switch to MySQL, unexpected blocking from gap locks is one of the first things that trips people up.
Sometimes you need to lock a concept, not a row. "Only one server should run database migrations at a time." "Only one worker should warm the cache." "Only one cron job should execute this batch." There's no specific row to lock. Enter advisory locksLocks that the database manages for you, but that have no connection to any particular row or table. You pick an arbitrary number as the lock ID, acquire it, and release it. The database tracks who holds which advisory lock. Other sessions that try to acquire the same lock ID will block (or fail, if you use the try_ variant). They're "advisory" because the database doesn't enforce them automatically — your application code decides what they mean. — locks that are stored in the database but don't correspond to any table or row.
-- Acquire an advisory lock with ID 12345
-- (the number is arbitrary — your app decides what it means)
SELECT pg_advisory_lock(12345);
-- Returns immediately. You now hold lock #12345.
-- In another terminal:
SELECT pg_advisory_lock(12345);
-- ⏳ BLOCKS until the first session releases it.
-- Non-blocking version (returns true/false instead of waiting):
SELECT pg_try_advisory_lock(12345);
-- Returns FALSE if someone else holds it. Your app can decide what to do.
-- Release it:
SELECT pg_advisory_unlock(12345);
-- Session-level advisory lock (auto-released when connection closes):
SELECT pg_advisory_lock(42);
-- Even if your transaction rolls back, this lock stays until disconnect.
-- Transaction-level advisory lock (released on COMMIT/ROLLBACK):
SELECT pg_advisory_xact_lock(42);
-- See all advisory locks held right now:
SELECT * FROM pg_locks WHERE locktype = 'advisory';
Real uses: Running database migrations (Rails uses advisory locks so two servers don't run migrations simultaneously). Distributed cron jobs (only one instance executes the job). Cache warming (prevent thundering herd). The lock lives in PostgreSQL's shared memory, so it's fast — microseconds to acquire, no disk I/O.
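The try-lock semantics are worth internalizing. Here's a minimal in-process model of the pg_try_advisory_lock / pg_advisory_unlock pattern — a Python analogy, not the real shared-memory implementation:

```python
import threading

class AdvisoryLocks:
    """In-process model of advisory locks: keyed by an arbitrary
    integer the application chooses, not tied to any row or table."""
    def __init__(self):
        self._held = {}                  # lock_id -> owner
        self._guard = threading.Lock()

    def try_lock(self, lock_id, owner):
        with self._guard:
            if lock_id in self._held:
                return False             # someone holds it — don't wait
            self._held[lock_id] = owner
            return True

    def unlock(self, lock_id, owner):
        with self._guard:
            if self._held.get(lock_id) == owner:
                del self._held[lock_id]

locks = AdvisoryLocks()
print(locks.try_lock(12345, "server-1"))  # True  — server-1 runs the migration
print(locks.try_lock(12345, "server-2"))  # False — server-2 backs off
locks.unlock(12345, "server-1")
print(locks.try_lock(12345, "server-2"))  # True  — free again
```

Just as with pg_try_advisory_lock, the "meaning" of lock 12345 lives entirely in your application — the lock manager only tracks who holds which number.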
All the locking strategies above are practical tools. Two-Phase LockingA theoretical protocol that guarantees serializability (the gold standard of correctness). It has two strict rules: (1) In the "growing phase," a transaction can acquire locks but must not release any. (2) In the "shrinking phase," a transaction can release locks but must not acquire any. Once you release your first lock, you can never acquire another. This simple rule mathematically prevents all anomalies. Most databases use a variant called Strict 2PL (S2PL) where all locks are held until COMMIT, which is even stricter but simpler to implement. (2PL) is the theoretical foundation that explains why they work. It's the answer to the interview question: "How do databases guarantee serializability?"
The rule is beautifully simple: once you release your first lock, you can never acquire another one. This prevents a scenario where Transaction A releases a lock on Row 1, Transaction B grabs it and modifies it, and then Transaction A tries to read Row 1 again (seeing B's changes mid-transaction). In practice, most databases use Strict 2PL — they hold ALL locks until COMMIT, which is even simpler: you never release anything until you're completely done.
You'll never call a "2PL API" in your code. But understanding it explains why transactions hold locks until COMMIT (not until the last statement), why long transactions cause contention (they hold locks longer), and why the database can guarantee that SERIALIZABLE really means serializable.
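The growing/shrinking rule is mechanical enough to enforce in a few lines. A toy transaction object that raises on a 2PL violation (illustrative only — real lock managers live inside the database):

```python
class TwoPhaseTx:
    """Toy transaction enforcing the 2PL rule: once any lock is
    released, no further lock may ever be acquired."""
    def __init__(self):
        self.held = set()
        self.shrinking = False

    def acquire(self, lock):
        if self.shrinking:
            raise RuntimeError("2PL violation: acquire after release")
        self.held.add(lock)              # growing phase

    def release(self, lock):
        self.shrinking = True            # shrinking phase begins, forever
        self.held.discard(lock)

tx = TwoPhaseTx()
tx.acquire("row1")
tx.acquire("row2")   # fine — still growing
tx.release("row1")   # first release flips us to shrinking
try:
    tx.acquire("row3")
except RuntimeError as e:
    print(e)         # 2PL violation: acquire after release
```

Strict 2PL is the degenerate case where `release` is only ever called at COMMIT — which is why real transactions hold everything until the end.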
Variations — PostgreSQL vs MySQL vs SQL Server
The SQL standard defines four isolation levels. But every database implements them differently. The "same" isolation level can behave very differently depending on which database you're using. This section maps out the key differences so you don't get blindsided when switching databases or deploying to a different engine.
PostgreSQL — Practical and Permissive
PostgreSQL's approach is "be practical, not pedantic." Its headline decisions:
- Default: READ COMMITTED — the fastest safe option. PostgreSQL trusts you to upgrade when you need to.
- READ UNCOMMITTED doesn't exist. If you ask for it, PostgreSQL silently gives you READ COMMITTED instead. The PostgreSQL team decided dirty reads are too dangerous to allow, even if the SQL standard says they should be optional.
- REPEATABLE READ goes beyond the standard. PostgreSQL's RR level actually prevents phantom reads too (the standard only requires preventing non-repeatable reads). This is because PostgreSQL uses Snapshot IsolationA concurrency control mechanism where each transaction gets a consistent "snapshot" of the entire database at the moment it begins. All reads within that transaction see data from that snapshot, regardless of what other transactions do. This is stronger than the SQL standard's REPEATABLE READ requirement, which only guarantees that re-reading the same row gives the same result (but doesn't prevent new rows from appearing). PostgreSQL's snapshot isolation prevents both. under the hood, which is inherently stronger.
- SERIALIZABLE uses SSI (Serializable Snapshot Isolation) — a clever algorithm that detects potential conflicts without heavy locking. Most transactions succeed without waiting. When a conflict IS detected, PostgreSQL aborts one transaction with error code 40001 and you retry.
-- PostgreSQL: READ UNCOMMITTED is a lie
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SHOW transaction_isolation;
-- Output: "read committed" ← PostgreSQL upgraded it silently!
-- Check if your transaction is SERIALIZABLE:
SELECT current_setting('transaction_isolation');
-- Serialization failures surface as rollbacks; watch the rollback rate:
SELECT xact_commit, xact_rollback
FROM pg_stat_database
WHERE datname = current_database();
-- (To confirm the cause, check your app logs for SQLSTATE 40001.)
MySQL / InnoDB — Safe by Default
MySQL made the opposite default choice from PostgreSQL, and the reasons are historical:
- Default: REPEATABLE READ — one step higher than PostgreSQL. This was originally for safe statement-based replicationAn older MySQL replication method where the master sends SQL statements (like UPDATE users SET name='Alice' WHERE id=1) to replicas, which re-execute them. This only works correctly if both master and replica see the same data during execution, which REPEATABLE READ helps guarantee. Modern MySQL prefers row-based replication (sending the actual row changes) which doesn't have this constraint, but the default stuck. — replicas need to see the same snapshot as the master to replay statements correctly.
- READ UNCOMMITTED is real. Unlike PostgreSQL, MySQL actually allows dirty reads. You'll see uncommitted data from other transactions. Some shops use this deliberately for monitoring dashboards that need speed over accuracy.
- REPEATABLE READ still allows phantoms for locking reads (SELECT ... FOR UPDATE). Plain SELECTs are phantom-free via MVCC, but locking reads can see newly inserted rows. This catches people off guard.
- SERIALIZABLE uses gap locks — physically locking ranges of the index. This is heavier than PostgreSQL's SSI and can cause more deadlocks, but it's simpler to reason about.
-- MySQL: check your default
SELECT @@transaction_isolation;
-- Output: REPEATABLE-READ (note the hyphens, not spaces)
-- MySQL: actually allows dirty reads
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- Now you can see other transactions' uncommitted work.
-- PostgreSQL would refuse to do this.
-- See InnoDB lock waits (MySQL 8.0+; the old
-- information_schema.INNODB_LOCK_WAITS view was removed):
SELECT * FROM performance_schema.data_lock_waits;
-- See gap locks specifically:
SELECT * FROM performance_schema.data_locks
WHERE LOCK_TYPE = 'RECORD' AND LOCK_MODE LIKE '%GAP%';
SQL Server — Lock-Based with Opt-In MVCC
SQL Server is the odd one out. While PostgreSQL and MySQL both use MVCC by defaultPostgreSQL and MySQL/InnoDB automatically use MVCC (multi-version concurrency control) so that readers don't block writers and writers don't block readers. SQL Server, by contrast, uses traditional lock-based concurrency by default: a reader requesting READ COMMITTED will place a shared lock on the row, blocking any writer until the read is done. This is simpler but causes more blocking under concurrent load., SQL Server uses traditional lock-based concurrency. Readers block writers and writers block readers — unless you explicitly opt in to snapshot isolation.
- Default: READ COMMITTED with locking. Shared locks are placed on rows during reads and released as soon as the statement finishes. This means writers wait for readers to finish.
- WITH (NOLOCK) is SQL Server's dirty read shortcut. It's extremely common in production for read-heavy reporting queries: SELECT * FROM orders WITH (NOLOCK). This is equivalent to READ UNCOMMITTED for that specific table reference.
- SNAPSHOT ISOLATION must be enabled at the database level: ALTER DATABASE mydb SET ALLOW_SNAPSHOT_ISOLATION ON;. Only then can transactions use SET TRANSACTION ISOLATION LEVEL SNAPSHOT;. Without enabling it, you don't get MVCC at all.
- READ_COMMITTED_SNAPSHOT is a separate setting that gives MVCC behavior to READ COMMITTED without changing the isolation level name. Many teams enable this first: ALTER DATABASE mydb SET READ_COMMITTED_SNAPSHOT ON;
-- SQL Server: the famous NOLOCK hint (dirty reads)
SELECT * FROM orders WITH (NOLOCK) WHERE status = 'pending';
-- Fast, but you might read rows that are mid-update and will be rolled back.
-- Equivalent to READ UNCOMMITTED for this table only.
-- Enable snapshot isolation (database-level, one-time):
ALTER DATABASE mydb SET ALLOW_SNAPSHOT_ISOLATION ON;
-- Use it in a transaction:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM orders; -- sees a consistent snapshot, no locks
COMMIT;
-- Better option for most apps: MVCC for READ COMMITTED:
ALTER DATABASE mydb SET READ_COMMITTED_SNAPSHOT ON;
-- Now all READ COMMITTED queries use MVCC (row versioning).
-- No code changes needed. Readers stop blocking writers.
At Scale — How Real Companies Handle This
Theory is nice. But what actually happens when you have millions of transactions per second, real money on the line, and an on-call engineer at 3 AM? Here's how four companies with very different workloads chose very different locking strategies — and why each choice was right for their situation.
Stripe — SERIALIZABLE on Every Payment
Stripe processes hundreds of millions of dollars in payments daily. Their rule is absolute: every payment transaction runs at SERIALIZABLE. No exceptions. The reasoning is straightforward — if a race condition causes a double charge, or if a refund goes through twice, real people lose real money and Stripe loses trust.
The cost? About a 10% retry rate under normal load. That means roughly 1 in 10 payment transactions gets aborted by PostgreSQL's SSISerializable Snapshot Isolation. PostgreSQL's SERIALIZABLE implementation. When SSI detects that two concurrent transactions accessed data in a pattern that could produce a non-serializable result, it aborts one of them with error code 40001. The application retries the entire transaction. SSI is "optimistic" in the sense that most transactions complete successfully — it only intervenes when it detects an actual conflict pattern. conflict detector and must be retried. Stripe has custom retry logic with exponential backoffA retry strategy where each successive retry waits longer: first retry after 10ms, second after 20ms, third after 40ms, etc. This prevents a "thundering herd" where all failed transactions retry at the same instant and conflict again. Stripe uses jitter (adding a small random delay) on top of exponential backoff so retries from different servers don't synchronize. baked into every database access layer. They consider this an acceptable trade-off: a 10% retry rate is nothing compared to the cost of a single double-charge incident.
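Exponential backoff with jitter is only a few lines. A sketch of the delay schedule — the base delay and jitter range here are illustrative assumptions, not Stripe's actual parameters:

```python
import random

def backoff_delays(base=0.01, retries=5, jitter=0.005):
    """Exponential backoff with jitter: 10ms, 20ms, 40ms, ... each
    plus a small random offset so concurrent retries don't all
    fire (and conflict) at the same instant."""
    return [base * (2 ** attempt) + random.uniform(0, jitter)
            for attempt in range(retries)]

delays = backoff_delays()
print([round(d, 3) for d in delays])
# Each delay roughly doubles: ~0.01, ~0.02, ~0.04, ~0.08, ~0.16 seconds
```

Without the jitter term, every aborted transaction would sleep exactly the same amount and retry in lockstep — recreating the original conflict. The random offset spreads them out.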
Booking.com — Optimistic Locking for Availability
Booking.com's workload is the opposite of Stripe's. Millions of people search for hotels every minute, but only a tiny fraction actually click "Book Now" at the same instant for the same room. The read-to-write ratio is enormous. Pessimistic locking would be absurd here — locking a room row every time someone views it would grind the site to a halt.
Instead, Booking.com uses optimistic locking with version columns on room inventory records. Each room has a version column. When a user books, the system does UPDATE rooms SET available = available - 1, version = version + 1 WHERE id = :room AND version = :version. If someone else booked the same room between the user's search and their click, the version won't match, and the booking fails gracefully with "Sorry, this room just became unavailable."
Their retry rate? Under 0.1%. Because contention is so low (millions searching, few booking the exact same room at the exact same millisecond), conflicts almost never happen. Optimistic locking gives them maximum throughput with negligible retries.
Airline Check-In — The SKIP LOCKED Evolution
This is the story Arpit BhayaniA system design educator known for deep technical breakdowns of real-world systems. His analysis of airline seat assignment using various locking strategies (from basic FOR UPDATE through SKIP LOCKED) is one of the most referenced examples in the database concurrency space. He benchmarked each iteration and showed the progression from 1.7 seconds to 147 milliseconds. documented — six iterations of the same seat-assignment problem, each faster than the last:
| Iteration | Strategy | 100 Concurrent | Problem |
|---|---|---|---|
| 1 | No locking (just UPDATE) | Fast | Double-booking! Multiple people get same seat |
| 2 | Application-level mutex | ~2.5s | Single point of failure, doesn't work across servers |
| 3 | SELECT FOR UPDATE |
~1.7s | Everyone queues behind one row |
| 4 | FOR UPDATE NOWAIT |
Fast failures | Too many errors — poor user experience |
| 5 | FOR UPDATE SKIP LOCKED |
~147ms | None! Each user grabs a different seat |
| 6 | SKIP LOCKED + partitioning | <100ms | Even faster with table partitioning by section |
The jump from iteration 3 to iteration 5 — from 1.7 seconds to 147 milliseconds — came from adding two words: SKIP LOCKED. That's a 10× improvement for a two-word code change. It's one of the best performance-per-character ratios you'll ever see.
Discord — READ COMMITTED Is Fine for Chat
Not every system needs strong isolation. Discord runs PostgreSQL with READ COMMITTED for their message storage. A chat message appearing 100 milliseconds late? Nobody notices. A message briefly showing an old edit before the new one? Doesn't matter. The correctness bar for a chat app is fundamentally different from a payment system.
Discord's real scaling challenges are about throughput and storage, not about isolation. They eventually moved their hot data to ScyllaDB (a distributed NoSQL database) for performance reasons, but their PostgreSQL isolation choice was never the bottleneck. This is an important lesson: don't over-engineer isolation for workloads that don't need it. READ COMMITTED is the default for a reason — it's fast, it's safe enough for most things, and it doesn't require retry logic.
Anti-Lessons — What NOT to Do
Every production incident teaches something. Sometimes the lesson is "do this." More often, the lesson is "never do that again." These three anti-patterns show up repeatedly in codebases, Stack Overflow answers, and incident post-mortems. Learn them here so you don't learn them at 3 AM.
It sounds logical: "SERIALIZABLE is the safest level. Why not use it everywhere and never worry about concurrency bugs?" Because safety has a price, and that price is steep.
At SERIALIZABLE, PostgreSQL's SSI engine tracks read/write dependencies between all active transactions. When it detects a potential conflict pattern (even if no actual anomaly occurred), it aborts one transaction. Under moderate contention, you'll see a 10% retry rate. Under heavy contention, it can hit 30-50%. Every retry means wasted CPU, wasted I/O, wasted network round-trips, and higher latency for the user.
# ❌ BAD: Setting SERIALIZABLE globally for everything
# In Django settings.py:
DATABASES = {
    'default': {
        'OPTIONS': {
            'options': '-c default_transaction_isolation=serializable'
        }
    }
}
# Now your blog post listing page, your search page, your static content
# page — ALL run at SERIALIZABLE. Every read-only query participates
# in SSI tracking. Retries everywhere. Throughput drops 30-50%.
# ✅ GOOD: Use SERIALIZABLE only where it matters
# Keep the default at READ COMMITTED (PostgreSQL's default).
# Upgrade to SERIALIZABLE only for critical operations:
import time
import psycopg2

def process_payment(conn, user_id, amount):
    """Only payment transactions need SERIALIZABLE."""
    for attempt in range(3):
        try:
            with conn:
                with conn.cursor() as cur:
                    cur.execute(
                        "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE"
                    )
                    # ... payment logic ...
                    return "Payment successful"
        except psycopg2.errors.SerializationFailure:
            conn.rollback()
            time.sleep(0.01 * (2 ** attempt))
    raise Exception("Payment failed after retries")

# Regular reads stay at READ COMMITTED — fast, no retries needed.
def list_products(conn):
    """No special isolation needed for browsing."""
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM products WHERE active = true")
        return cur.fetchall()
The rule: Use SERIALIZABLE only for transactions where getting it wrong costs money, safety, or legal liability. Everything else stays at READ COMMITTED.
After learning about SELECT FOR UPDATE, some developers add it to every query "just to be safe." This is like putting a padlock on every door in your house, including the bathroom door, the kitchen door, and the closet door. Now nobody can move through the house without waiting for keys.
FOR UPDATE is designed for one specific pattern: you intend to UPDATE the row you just read. Read-then-write. If you're just reading data to display on a page, FOR UPDATE is pure overhead — it holds a lock that blocks other writers for no reason.
-- ❌ BAD: Locking rows you're only reading
BEGIN;
SELECT * FROM products WHERE category = 'electronics' FOR UPDATE;
-- Locked 5,000 rows! Any other transaction trying to UPDATE
-- any of these rows must wait until we COMMIT.
-- Meanwhile, we're just building a product listing page...
-- We never UPDATE these rows.
COMMIT; -- Finally releases 5,000 row locks.
-- Everyone waiting on those rows can now proceed.
-- ✅ GOOD: Only lock what you intend to update
-- For read-only queries, no lock needed:
SELECT * FROM products WHERE category = 'electronics';
-- MVCC handles this. No locks. Readers never block writers.
-- For read-then-update (like buying a ticket), lock JUST that row:
BEGIN;
SELECT * FROM tickets WHERE id = 1 FOR UPDATE;
-- Lock ONE row that we WILL update.
UPDATE tickets SET available = available - 1 WHERE id = 1;
COMMIT;
-- Even better for high concurrency: SKIP LOCKED
SELECT * FROM seats WHERE available = true
LIMIT 1 FOR UPDATE SKIP LOCKED;
The rule: Only use FOR UPDATE when you're going to UPDATE or DELETE the row in the same transaction. For read-only queries, MVCC already provides a consistent view without any locking.
When a developer sees their first ERROR: deadlock detected, the instinct is "something is fundamentally wrong with my code." But deadlocks are a normal part of concurrent systems. They happen. The database detects them automatically and resolves them by killing one transaction. Your job isn't to prevent all deadlocks — it's to handle them gracefully.
Even perfectly correct code can deadlock if two transactions happen to lock rows in different orders. You can minimize deadlocks by always locking rows in a consistent order (e.g., by ascending primary key), but you can't eliminate them entirely in a system with unpredictable user behavior.
# ❌ BAD: Treating deadlocks as fatal errors
try:
    cursor.execute("UPDATE accounts SET balance = ...")
except psycopg2.errors.DeadlockDetected:
    logger.error("CRITICAL: Deadlock! Something is very wrong!")
    send_pagerduty_alert("DATABASE DEADLOCK - WAKE UP")
    raise  # Crash the request, show error to user
# This turns a routine database event into a P0 incident.

# ✅ GOOD: Deadlocks are expected — just retry
def with_retry(conn, func, max_retries=3):
    for attempt in range(max_retries):
        try:
            with conn:
                with conn.cursor() as cur:
                    return func(cur)
        except psycopg2.errors.DeadlockDetected:
            conn.rollback()
            logger.info(f"Deadlock on attempt {attempt + 1}, retrying...")
            time.sleep(0.01 * (2 ** attempt))
    logger.warning("Transaction failed after max retries")
    raise Exception("Transaction failed — please try again")

# Monitor deadlock FREQUENCY, not individual deadlocks.
# A few per hour = normal. Hundreds per minute = investigate.

# Minimize deadlocks (but don't expect zero):
# 1. Lock rows in consistent order (ascending PK)
# 2. Keep transactions short (less time holding locks)
# 3. Lower deadlock_timeout (e.g. 200ms) to detect stuck waiters faster
The rule: Monitor deadlock frequency, not individual deadlocks. A few per hour is normal in any high-concurrency system. If the rate suddenly spikes, investigate. But the fix is always detection + retry, not prevention.
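The "consistent order" advice is easy to get wrong when the same two rows can be touched in either direction (A pays B, B pays A). A minimal sketch, assuming a psycopg2-style connection and an `accounts` table — the `lock_order` and `transfer` names are ours, not from any library:

```python
def lock_order(a: int, b: int) -> tuple:
    """Return the two row ids in the order they must be locked (ascending)."""
    return tuple(sorted((a, b)))

def transfer(conn, from_id, to_id, amount):
    first, second = lock_order(from_id, to_id)
    with conn:  # commits on success, rolls back on exception
        with conn.cursor() as cur:
            # Acquire both row locks in ascending id order, regardless of
            # transfer direction — so A→B and B→A can never deadlock each other.
            cur.execute("SELECT 1 FROM accounts WHERE id = %s FOR UPDATE", (first,))
            cur.execute("SELECT 1 FROM accounts WHERE id = %s FOR UPDATE", (second,))
            cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                        (amount, from_id))
            cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                        (amount, to_id))
```

Because both directions sort to the same lock order, two concurrent opposite transfers queue instead of deadlocking. Keep the retry wrapper anyway — other transactions can still collide with yours.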
Common Mistakes — The Traps Everyone Falls Into
These aren't hypothetical. Every single one of these mistakes has caused real production incidents at real companies. The scary part: most of them are silent. Your app works fine in development, passes all tests, then starts producing wrong numbers under production traffic. Here are six mistakes that catch even experienced engineers off guard.
What happens: You write a transaction in PostgreSQL (default: READ COMMITTED) and it works perfectly. Then someone deploys the same code on MySQL (default: REPEATABLE READ) and suddenly reads inside the transaction are stale — they're seeing a frozen snapshot, not the latest data. Or vice versa: you relied on MySQL's snapshot isolation, then migrated to Postgres and suddenly each statement inside your transaction sees other transactions' freshly committed changes.
Why it's dangerous: Neither database throws an error. The queries succeed. The results are just wrong in subtle ways — stale inventory counts, double-charges, phantom records appearing in reports. You won't catch it until a customer complains or an accountant finds a discrepancy.
-- ALWAYS check your database's default before writing transactions
-- PostgreSQL:
SHOW transaction_isolation; -- read committed
-- MySQL:
SELECT @@transaction_isolation; -- REPEATABLE-READ
-- Rule: never assume. Check. Then decide if the default
-- is safe enough for YOUR specific query.
The fix: Set the isolation level explicitly on critical transactions. Don't rely on whatever the database happens to default to.
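A minimal sketch of "explicit, not default", assuming psycopg2. The helper names (`begin_with_isolation`, `fetch_invoice_total`) and the `invoices` table are ours for illustration; `BEGIN ISOLATION LEVEL ...` is PostgreSQL syntax — MySQL instead uses `SET TRANSACTION ISOLATION LEVEL ...` immediately before `START TRANSACTION`:

```python
# The four levels defined by the SQL standard.
STANDARD_LEVELS = {
    "READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE",
}

def begin_with_isolation(level: str) -> str:
    """Build an explicit BEGIN so the server default never decides for us."""
    level = level.upper()
    if level not in STANDARD_LEVELS:
        raise ValueError(f"unknown isolation level: {level}")
    return f"BEGIN ISOLATION LEVEL {level}"

def fetch_invoice_total(conn, invoice_id):
    with conn.cursor() as cur:
        # Pin the level for THIS transaction, not session-wide.
        cur.execute(begin_with_isolation("repeatable read"))
        cur.execute("SELECT total FROM invoices WHERE id = %s", (invoice_id,))
        row = cur.fetchone()
    conn.commit()
    return row
```

Now the critical path behaves identically on any PostgreSQL installation, whatever `default_transaction_isolation` happens to be.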
What happens: You set SERIALIZABLE or REPEATABLE READ on a critical transaction. PostgreSQL detects a conflict and throws ERROR: could not serialize access (error code 40001). Your code doesn't catch this specific error, so the entire request crashes with a 500. Users see "Internal Server Error" and retry manually, hammering the database even harder.
Why it's dangerous: SERIALIZABLE is designed to abort conflicting transactions. That's the whole point — it prevents anomalies by saying "sorry, try again." But if your application doesn't retry automatically, you've turned a safety feature into a reliability nightmare.
import psycopg2
import time

def run_with_retry(conn, query, params, max_retries=3):
    # Set the level while no transaction is open; it applies to every
    # transaction this connection starts afterwards.
    conn.set_session(isolation_level='SERIALIZABLE')
    for attempt in range(max_retries):
        try:
            with conn:
                with conn.cursor() as cur:
                    cur.execute(query, params)
                    return cur.fetchall()
        except psycopg2.errors.SerializationFailure:
            conn.rollback()
            if attempt == max_retries - 1:
                raise
            # Exponential backoff before retry
            time.sleep(0.01 * (2 ** attempt))

# Key: catch SerializationFailure (40001), NOT generic Exception
The fix: Every SERIALIZABLE or REPEATABLE READ transaction needs a retry loop with exponential backoff. This isn't optional — it's part of the contract when you use higher isolation levels.
What happens: You open a transaction, lock some rows with SELECT FOR UPDATE, then call an external API (payment gateway, email service, webhook). That API takes 5 seconds to respond. For those 5 seconds, every other transaction that needs those rows is blocked — sitting idle, holding a database connection, waiting for your lock to release.
Why it's dangerous: Under load, this snowballs. 100 concurrent users trying to buy tickets. The first one locks the row and calls Stripe. The other 99 queue up, each holding a database connection. Your connection pool (usually 20-50 connections) is exhausted in seconds. New requests can't even get a connection. The entire application grinds to a halt — not because the database is slow, but because one transaction is holding locks while waiting for a network call.
The fix: Reserve the row (mark it as "processing"), commit the transaction, then call the external API, then open a new transaction to finalize. Keep the lock-holding window as short as physically possible.
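A hedged sketch of that reserve → commit → call → finalize shape, assuming psycopg2 and a `tickets` table with a `status` column; `charge_payment` is a placeholder for your payment client, and `book_ticket`/`next_status` are illustrative names, not a known API:

```python
def next_status(payment_ok: bool) -> str:
    """Status written by the finalize transaction."""
    return "sold" if payment_ok else "available"

def book_ticket(conn, ticket_id, user_id, charge_payment):
    # Txn 1: claim the row, then COMMIT — releasing the lock —
    # BEFORE any network call happens.
    with conn:
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE tickets SET status = 'processing', user_id = %s "
                "WHERE id = %s AND status = 'available'",
                (user_id, ticket_id))
            if cur.rowcount == 0:
                return "sold_out"
    # No database locks are held while we wait on the gateway.
    ok = charge_payment(user_id, ticket_id)
    # Txn 2: finalize on success, release the seat on failure.
    with conn:
        with conn.cursor() as cur:
            cur.execute("UPDATE tickets SET status = %s WHERE id = %s",
                        (next_status(ok), ticket_id))
    return "booked" if ok else "payment_failed"
```

In production you'd also add a background sweep that returns stale `'processing'` rows to `'available'`, in case the process dies between the two transactions.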
What happens: Your application slows down under load. CPU is fine, memory is fine, disk I/O is fine. But requests are timing out. The problem is invisible unless you check: transactions are waiting on locks held by other transactions, forming a queue that grows exponentially.
Why it's dangerous: Lock contention doesn't show up in your typical application metrics (CPU, memory, error rate). You need to look inside the database.
-- Find who's blocking whom RIGHT NOW
SELECT
    blocked.pid   AS blocked_pid,
    blocked.query AS blocked_query,
    blocking.pid  AS blocking_pid,
    blocking.query AS blocking_query,
    NOW() - blocked.query_start AS waiting_time
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid AND NOT bl.granted
JOIN pg_locks gl ON gl.pid != blocked.pid
    AND gl.granted  -- the blocker actually HOLDS the lock
    AND gl.locktype = bl.locktype
    AND gl.database IS NOT DISTINCT FROM bl.database
    AND gl.relation IS NOT DISTINCT FROM bl.relation
JOIN pg_stat_activity blocking ON blocking.pid = gl.pid
WHERE NOT bl.granted
ORDER BY waiting_time DESC;
The fix: Run this query on a dashboard (Grafana, Datadog) and set alerts when any transaction waits longer than 5 seconds. Make it part of your standard monitoring — right alongside CPU and memory.
What happens: PostgreSQL's default deadlock_timeout is 1 second. That means the deadlock detector doesn't even start checking until a transaction has been waiting for 1 full second. In a high-traffic system (thousands of transactions per second), a 1-second wait is an eternity. Locks pile up, connections are consumed, and by the time PostgreSQL detects the deadlock, your application is already partially frozen.
Why it's dangerous: Deadlock detection isn't free — it takes CPU. PostgreSQL uses the timeout to avoid checking too aggressively. But the default of 1s is tuned for general-purpose workloads, not high-concurrency ones.
-- Check current setting
SHOW deadlock_timeout; -- default: 1s
-- For high-concurrency OLTP, consider lowering it
ALTER SYSTEM SET deadlock_timeout = '200ms';
SELECT pg_reload_conf();
-- Trade-off: lower timeout = faster deadlock detection
-- but more CPU spent on deadlock checking.
-- 200ms is a good starting point for busy OLTP systems.
The fix: Tune deadlock_timeout based on your workload. For high-concurrency OLTP (ticketing, e-commerce), 200ms is common. For analytics workloads with long-running queries, the default 1s is fine.
What happens: SQL Server developers learn that WITH (NOLOCK) makes queries faster. So they sprinkle it everywhere — reports, dashboards, even transactional queries. NOLOCK is equivalent to READ UNCOMMITTED: you can read rows that another transaction is actively modifying and hasn't committed yet. If that transaction rolls back, you just read data that never existed.
Why it's dangerous: NOLOCK doesn't just show uncommitted data. It can also cause skipped rows and duplicate rows during page splits. Your "fast" report might show a customer balance of $0 because it read mid-update, or count 1,002 rows in a 1,000-row table because of a page split during the scan.
-- ❌ BAD: NOLOCK on financial data
SELECT SUM(balance) FROM accounts WITH (NOLOCK);
-- Can read mid-transfer balances: money appears to vanish or duplicate
-- ✅ BETTER: Use READ COMMITTED SNAPSHOT (SQL Server's MVCC equivalent)
ALTER DATABASE MyApp SET READ_COMMITTED_SNAPSHOT ON;
-- Now reads get a consistent snapshot without blocking writers
-- No NOLOCK needed, no dirty reads, no page-split weirdness
-- NOLOCK is ONLY acceptable for:
-- 1. Rough estimates (dashboard "approximately 1M users")
-- 2. Data that never changes (lookup tables, enum references)
-- NEVER use it for financial, inventory, or booking data.
The fix: Enable READ_COMMITTED_SNAPSHOT on your SQL Server database. This gives you PostgreSQL-style MVCC — readers don't block writers, writers don't block readers — without any of the dirty-read risks of NOLOCK.
Interview Playbook — Design the Locking Strategy
This is a classic system design interview question because it tests whether you understand the trade-offs between correctness and performance. There's no single right answer — the interviewer wants to see you reason through options, identify failure modes, and pick the right tool for each sub-problem.
Key idea: Use SELECT FOR UPDATE to lock the seat row before booking it.
BEGIN;
-- Lock the specific seat row
SELECT * FROM seats WHERE id = 42 AND status = 'available' FOR UPDATE;
-- If we got the row, mark it as booked
UPDATE seats SET status = 'booked', user_id = 7 WHERE id = 42;
COMMIT;
Score: 5/10. This prevents double-booking for a single seat, but under 10,000 concurrent users, the FOR UPDATE creates a queue on every popular seat. Users experience long waits. No mention of retries, timeouts, or how to handle the 9,900 users who won't get a seat.
Key idea: Separate browsing (no locks) from booking (optimistic locking with version column).
-- Browsing: no locks at all, just READ COMMITTED
SELECT id, status FROM seats WHERE event_id = 1;
-- Booking: optimistic locking
UPDATE seats
SET status = 'booked', user_id = 7, version = version + 1
WHERE id = 42 AND status = 'available' AND version = 3;
-- If 0 rows updated → seat was taken, show "sorry" immediately
Score: 7/10. Great separation of concerns: browsing is fast (no locks), booking is safe (version check prevents double-booking). The losers find out immediately instead of waiting in a lock queue. Missing: what happens for the payment step? If payment fails after marking the seat as "booked," you have a ghost reservation.
Key idea: Three-tier locking: optimistic for browsing, SKIP LOCKED for seat assignment, SERIALIZABLE for payment.
-- TIER 1: Browsing — no locks, cached reads, eventual consistency OK
SELECT id, status FROM seats WHERE event_id = 1;
-- Served from a read replica or Redis cache. Slight staleness is fine.
-- TIER 2: Seat assignment — SKIP LOCKED (no waiting, no queueing)
BEGIN;
SELECT id FROM seats
WHERE event_id = 1 AND status = 'available'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- Got a seat? Reserve it with a 5-minute TTL
UPDATE seats SET status = 'reserved', user_id = 7,
reserved_until = NOW() + INTERVAL '5 minutes'
WHERE id = <selected_id>;
COMMIT;
-- TIER 3: Payment — SERIALIZABLE (correctness over speed)
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE seats SET status = 'booked'
WHERE id = <selected_id> AND user_id = 7
AND status = 'reserved'
AND reserved_until > NOW();
-- Process payment via Stripe (idempotency key!)
COMMIT;
-- On serialization failure: retry with backoff
-- On payment failure: seat auto-expires via reserved_until TTL
Score: 9/10. This answer demonstrates production-grade thinking:
- SKIP LOCKED means 10,000 concurrent users don't queue up — each one instantly grabs the next available seat or gets "sold out." Zero wait time.
- Reserved + TTL handles payment failures gracefully — if someone abandons checkout, the seat auto-releases after 5 minutes.
- SERIALIZABLE on payment ensures you never charge someone for a seat that expired or was double-assigned.
- Idempotency keys prevent double-charges if the retry loop re-sends a payment request.
The final 1 point comes from discussing monitoring: pg_locks dashboard, alerts on lock wait time > 1 second, and pgbench load tests before launch.
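The idempotency-key point deserves one concrete detail: the key must be stable across retries of the same logical attempt, yet unique per reservation. A minimal sketch (the function name and the `booking:` naming scheme are ours; check your payment provider's docs for how the key is actually submitted):

```python
import uuid

def payment_idempotency_key(user_id: int, seat_id: int, reserved_until: str) -> str:
    # uuid5 is deterministic: the same reservation always produces the
    # same key, so a request retried after a network timeout deduplicates
    # at the gateway instead of charging twice. A new reservation (new
    # reserved_until) gets a fresh key, so the user CAN be charged again
    # for a genuinely new attempt.
    name = f"booking:{user_id}:{seat_id}:{reserved_until}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, name))
```

The key is derived, not stored, so the retry loop doesn't need extra bookkeeping to resend the same value.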
Practice Exercises — Break Your Database (Safely)
Reading about isolation levels is one thing. Seeing them break is what actually builds intuition. Every exercise below uses two terminal windows so you can watch the conflict happen in real time. You'll need a local PostgreSQL or MySQL instance — don't run these on production.
MySQL supports READ UNCOMMITTED (unlike PostgreSQL, which silently upgrades it). Open two terminals connected to the same MySQL database and prove that Transaction B can see data that Transaction A hasn't committed yet.
- Set both sessions to SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
- In Terminal A: BEGIN; UPDATE accounts SET balance = 0 WHERE id = 1; (don't commit yet)
- In Terminal B: SELECT balance FROM accounts WHERE id = 1; — you'll see 0 even though A hasn't committed
- In Terminal A: ROLLBACK; — the balance reverts, but B already acted on phantom data
-- Setup
CREATE TABLE accounts (id INT PRIMARY KEY, balance DECIMAL(10,2));
INSERT INTO accounts VALUES (1, 1000.00);
-- Terminal A:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN;
UPDATE accounts SET balance = 0 WHERE id = 1;
-- DON'T commit yet!
-- Terminal B (while A is still open):
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT balance FROM accounts WHERE id = 1;
-- Result: 0.00 ← dirty read! A hasn't committed!
-- Terminal A:
ROLLBACK; -- balance goes back to 1000.00
-- But B already "saw" 0.00 and may have acted on it
Use PostgreSQL at READ COMMITTED level. Show that running the same SELECT COUNT(*) twice inside one transaction can return different numbers if another transaction inserts a row in between.
- Terminal A: BEGIN; SELECT COUNT(*) FROM orders WHERE status = 'pending';
- Terminal B: INSERT INTO orders (status) VALUES ('pending'); COMMIT;
- Terminal A: SELECT COUNT(*) FROM orders WHERE status = 'pending'; — the count increases!
- Then repeat the experiment at REPEATABLE READ. The count stays the same (PostgreSQL takes a snapshot).
-- Setup
CREATE TABLE orders (id SERIAL PRIMARY KEY, status TEXT);
INSERT INTO orders (status) VALUES ('pending'), ('pending'), ('pending');
-- Terminal A (READ COMMITTED — the default):
BEGIN;
SELECT COUNT(*) FROM orders WHERE status = 'pending';
-- Result: 3
-- Terminal B:
INSERT INTO orders (status) VALUES ('pending');
COMMIT;
-- Terminal A (same transaction):
SELECT COUNT(*) FROM orders WHERE status = 'pending';
-- Result: 4 ← PHANTOM! A new row appeared mid-transaction!
COMMIT;
-- Now try REPEATABLE READ:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM orders WHERE status = 'pending';
-- (Terminal B inserts another row and commits)
SELECT COUNT(*) FROM orders WHERE status = 'pending';
-- Same count! PostgreSQL's snapshot isolates you from phantoms.
Create a tickets table with 100 rows. Write two pgbench scripts — one that uses FOR UPDATE (queue-style) and one that uses FOR UPDATE SKIP LOCKED (skip-style). Run both at 50 concurrent connections and compare throughput (transactions per second) and latency (average ms per transaction).
- Setup: CREATE TABLE tickets (id SERIAL PRIMARY KEY, status TEXT DEFAULT 'available'); then insert 100 rows
- Script A (for-update.sql): BEGIN; SELECT id FROM tickets WHERE status='available' LIMIT 1 FOR UPDATE; UPDATE tickets SET status='sold' WHERE id=:id; COMMIT;
- Script B (skip-locked.sql): same, but add SKIP LOCKED after FOR UPDATE
- Run: pgbench -c 50 -j 4 -t 100 -f for-update.sql mydb
- Expect: SKIP LOCKED has 3-10x higher TPS because blocked transactions skip instead of waiting
#!/bin/bash
# Setup
psql mydb -c "DROP TABLE IF EXISTS tickets;"
psql mydb -c "CREATE TABLE tickets (id SERIAL PRIMARY KEY, status TEXT DEFAULT 'available');"
psql mydb -c "INSERT INTO tickets (status) SELECT 'available' FROM generate_series(1, 100);"

# Test 1: FOR UPDATE (queue-style)
# \gset (replaces the semicolon) stores the selected id in pgbench variable :id
cat > /tmp/for-update.sql <<'EOF'
BEGIN;
SELECT id FROM tickets WHERE status = 'available' ORDER BY id LIMIT 1 FOR UPDATE \gset
UPDATE tickets SET status = 'sold' WHERE id = :id;
COMMIT;
EOF
psql mydb -c "UPDATE tickets SET status = 'available';"
pgbench -c 50 -j 4 -t 100 -f /tmp/for-update.sql mydb

# Test 2: SKIP LOCKED (skip-style)
cat > /tmp/skip-locked.sql <<'EOF'
BEGIN;
SELECT id FROM tickets WHERE status = 'available' ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED \gset
UPDATE tickets SET status = 'sold' WHERE id = :id;
COMMIT;
EOF
psql mydb -c "UPDATE tickets SET status = 'available';"
pgbench -c 50 -j 4 -t 100 -f /tmp/skip-locked.sql mydb

# Compare the "tps" (transactions per second) output.
# SKIP LOCKED will be significantly faster under contention.
# Note: 50 clients x 100 transactions will sell out 100 tickets early;
# once no row matches, \gset aborts that client — which is itself a
# useful demonstration of "sold out" under contention.
Add a version column to a products table. Write a Python (or any language) function that reads a product, simulates processing time, then updates with a version check. Open two terminals and run the function simultaneously to trigger a version conflict.
- Schema: id, name, stock INT, version INT DEFAULT 1
- Read: SELECT stock, version FROM products WHERE id = 1;
- Sleep 2 seconds (simulates processing)
- Write: UPDATE products SET stock = stock - 1, version = version + 1 WHERE id = 1 AND version = <read_version>;
- Check rows affected: if 0, print "Conflict! Retrying..." and re-read
import psycopg2, time

def buy_product(conn, product_id):
    max_retries = 3
    for attempt in range(max_retries):
        with conn.cursor() as cur:
            # Step 1: Read current state (no locks)
            cur.execute(
                "SELECT stock, version FROM products WHERE id = %s",
                (product_id,)
            )
            stock, version = cur.fetchone()
            print(f"  Read: stock={stock}, version={version}")
            if stock <= 0:
                print("  Out of stock!")
                return False
            # Step 2: Simulate processing time
            time.sleep(2)
            # Step 3: Write with version check
            cur.execute(
                """UPDATE products
                   SET stock = stock - 1, version = version + 1
                   WHERE id = %s AND version = %s""",
                (product_id, version)
            )
            conn.commit()
            if cur.rowcount == 1:
                print(f"  ✓ Purchased! New version={version + 1}")
                return True
            else:
                print(f"  ✗ Version conflict (attempt {attempt + 1})")
                # Re-read and retry
    print("  Failed after max retries")
    return False

# Run this in two separate terminals simultaneously:
conn = psycopg2.connect("dbname=mydb")
conn.autocommit = False
buy_product(conn, 1)
Create two rows (A and B). In Terminal 1, lock row A then try to lock row B. In Terminal 2, lock row B then try to lock row A. Watch PostgreSQL detect the deadlock and abort one of the transactions. Then fix it by locking rows in consistent order.
- Terminal 1: BEGIN; UPDATE accounts SET balance = balance - 10 WHERE id = 1; (holds lock on row 1)
- Terminal 2: BEGIN; UPDATE accounts SET balance = balance - 10 WHERE id = 2; (holds lock on row 2)
- Terminal 1: UPDATE accounts SET balance = balance + 10 WHERE id = 2; (blocked — waiting for row 2)
- Terminal 2: UPDATE accounts SET balance = balance + 10 WHERE id = 1; (DEADLOCK detected!)
- Fix: always lock the lower ID first. Both terminals should lock id=1 before id=2.
-- Setup
CREATE TABLE accounts (id INT PRIMARY KEY, balance DECIMAL(10,2));
INSERT INTO accounts VALUES (1, 500.00), (2, 500.00);
-- ❌ CAUSES DEADLOCK (opposite lock order):
-- Terminal 1: Terminal 2:
-- BEGIN; BEGIN;
-- UPDATE accounts SET UPDATE accounts SET
-- balance = balance - 10 balance = balance - 10
-- WHERE id = 1; ← locks 1 WHERE id = 2; ← locks 2
-- UPDATE accounts SET UPDATE accounts SET
-- balance = balance + 10 balance = balance + 10
-- WHERE id = 2; ← WAITS WHERE id = 1; ← DEADLOCK!
-- ✅ FIX: Always lock in ascending ID order
-- Terminal 1: Terminal 2:
-- BEGIN; BEGIN;
-- UPDATE ... WHERE id = 1; UPDATE ... WHERE id = 1; ← waits
-- UPDATE ... WHERE id = 2; (resumes after T1 commits)
-- COMMIT; UPDATE ... WHERE id = 2;
-- COMMIT;
Cheat Sheet — Quick Reference Cards
Level             Dirty  Non-Rep  Phantom
─────────────────────────────────────────
READ UNCOMMITTED    ✗       ✗        ✗
READ COMMITTED      ✓       ✗        ✗
REPEATABLE READ     ✓       ✓    PG:✓ My:✗
SERIALIZABLE        ✓       ✓        ✓

✓ = prevented, ✗ = possible
PG = PostgreSQL, My = MySQL
Pessimistic (FOR UPDATE)
  → Lock row on read
  → Best for: high contention
Optimistic (version column)
  → Check at write time
  → Best for: low contention
SKIP LOCKED
  → Skip locked rows
  → Best for: job queues, tickets
-- Who's blocking whom?
SELECT * FROM pg_stat_activity
WHERE wait_event_type = 'Lock';

-- Count waiting transactions
SELECT COUNT(*) FROM pg_locks WHERE NOT granted;

-- Kill a blocking query
SELECT pg_terminate_backend(pid);
Cause: A locks 1→2, B locks 2→1
Fix: always lock in ascending order
SHOW deadlock_timeout; → 1s
High OLTP? Set to 200ms
Retry pattern: catch DeadlockDetected,
  rollback + backoff + retry
PostgreSQL default: READ COMMITTED (per-statement)
MySQL/InnoDB default: REPEATABLE READ (snapshot)
SQL Server default: READ COMMITTED (locking, no MVCC)
  → Enable RCSI for MVCC behavior
"READ COMMITTED for 95%, SERIALIZABLE for payments."
"SKIP LOCKED for queues, FOR UPDATE for single-row."
"Optimistic when conflicts rare, pessimistic when they're not."
"Deadlocks = retry. Monitor rate."