Mental Model — The Blueprint of Your Data's House
Your schema is the blueprint of your data's house. It defines which rooms exist (tables), what goes in each room (columns and types), and which doorways connect them (foreign keys). Without a blueprint, you get a pile of bricks.
Before you write a single line of application code, someone has to answer a deceptively simple question: what shape is our data? That answer is the schema. It's the contract between your code and your database — your code says "give me all orders for user 42," and the schema guarantees that "orders" and "users" are real things with real columns and a real relationship between them.
Every company you've heard of started with this step. When the Shopify founders built their online store in 2006, one of the first things they did was sketch out tables for shops, products, orders, and customers. When Jack Dorsey built the first version of Twitter in 2006, he needed a tweets table. When GitHub launched in 2008, they needed repositories, users, and commits. The schema came before the UI, before the API, before everything.
Get it right, and your database hums along for years. Get it wrong, and you'll spend months on painful migrations — rewriting table structures while millions of rows of live data flow through them. A migration is any change to a schema that's already in production with real data: adding a column, renaming a table, changing a data type. They range from trivial (adding a nullable column takes milliseconds) to terrifying (changing a primary key type on a table with 500 million rows can take hours and require downtime), and tools like Rails migrations, Alembic (Python), and Flyway (Java) help you version-control these changes. GitHub spent months of engineering migrating primary keys from 32-bit to 64-bit integers as their busiest tables approached the 2.1 billion limit of a 32-bit key. A better upfront schema choice would have avoided it entirely.
Here's the comforting part: schema design is not magic. It follows a small set of rules that you can learn in an afternoon and apply for the rest of your career. By the end of this page, you'll be able to look at any business requirement and sketch a clean, normalized schema that won't crumble under production traffic. Normalized means organized so that each piece of information is stored in exactly one place: if a customer's address changes, you update it in ONE row, not in 500 order rows that all copied the address. The formal rules are called "normal forms" (1NF, 2NF, 3NF) and are covered below; for now, just think of normalization as "don't repeat yourself" for databases.
The Scenario — Building an E-Commerce Database from Scratch
Let's make this concrete. You're building an online store — think a simplified version of Shopify or Amazon. Your store needs to handle these things:
- Users — people who sign up, have an email and a shipping address
- Products — items for sale, each with a name, description, price, and category
- Orders — when a user buys something, you record what they bought, how much they paid, and when
- Reviews — users leave star ratings and comments on products
- Inventory — how many units of each product are in stock
This is the same problem every e-commerce company has solved. Shopify's real orders table has columns like id, shop_id, customer_id, total_price DECIMAL(10,2), financial_status, fulfillment_status, and created_at. We'll build something similar, but we'll start with the wrong approach first — because understanding why it fails teaches you more than jumping straight to the right answer.
The natural instinct — especially if you've used spreadsheets before — is to put everything into one big table. One row per order, with all the user info, product info, review info, and inventory count crammed into the same row. Let's try that approach and watch it fall apart.
First Attempt — One Giant Table
Here's the "just throw it all in one table" approach. It's how most people think about data when they first start, because it feels natural — like a spreadsheet. One row per order, with everything you need right there:
-- The "spreadsheet" approach: everything in one table
CREATE TABLE store_data (
order_id INT PRIMARY KEY,
order_date TIMESTAMP,
order_total DECIMAL(10,2),
-- Customer info (repeated on EVERY order)
customer_name VARCHAR(100),
customer_email VARCHAR(255),
customer_address TEXT,
-- Product info (repeated on EVERY order)
product_name VARCHAR(200),
product_price DECIMAL(10,2),
product_category VARCHAR(50),
-- Review info
review_stars INT,
review_comment TEXT,
-- Inventory
stock_quantity INT
);
-- Let's insert some data
INSERT INTO store_data VALUES
(1, '2024-01-15', 29.99, 'Alice Johnson', 'alice@email.com',
'123 Main St, NYC', 'Wireless Mouse', 29.99, 'Electronics', 5,
'Great mouse!', 150),
(2, '2024-01-16', 29.99, 'Bob Smith', 'bob@email.com',
'456 Oak Ave, LA', 'Wireless Mouse', 29.99, 'Electronics', 4,
'Good but battery dies fast', 149),
(3, '2024-01-16', 79.99, 'Alice Johnson', 'alice@email.com',
'123 Main St, NYC', 'Mechanical Keyboard', 79.99, 'Electronics',
NULL, NULL, 75);
Look at that data carefully. Alice has two orders. Her name, email, and address appear in both rows. The "Wireless Mouse" product name, price, and category also appear in both rows. Already, we're storing the same information multiple times. With 3 rows, this is mildly annoying. With 3 million rows, this is a catastrophe waiting to happen.
It "works" if you squint. You can query it, you can insert data, you can join... nothing. There's only one table, so there's nothing to join. But the moment you need to update Alice's address, or correct the price of the Wireless Mouse, or figure out the actual stock count — this design crumbles. Let's see exactly how.
Where It Breaks — The Three Anomalies
The one-giant-table approach has three fatal flaws. Database textbooks call them anomalies: situations where the structure of your table actively causes data corruption. There are three kinds: the update anomaly (data changed in one place but not everywhere it's duplicated), the insertion anomaly (you can't add new data without bundling in unrelated data), and the deletion anomaly (deleting one thing accidentally destroys another). All three vanish when you normalize your schema into separate tables linked by keys. These aren't theoretical risks; they corrupt real production data whenever schemas aren't designed properly.
Anomaly #1: Update Anomaly
Alice moves from "123 Main St, NYC" to "789 Pine Rd, Chicago." You run an update:
-- You update Alice's address on her latest order...
UPDATE store_data
SET customer_address = '789 Pine Rd, Chicago'
WHERE order_id = 3;
-- But order 1 STILL says "123 Main St, NYC"
-- Now Alice has TWO addresses in your database.
-- Which one is correct? Your database doesn't know.
-- The "fix" is to update ALL rows for Alice:
UPDATE store_data
SET customer_address = '789 Pine Rd, Chicago'
WHERE customer_email = 'alice@email.com';
-- But if you have 500 orders for Alice across 3 tables,
-- and a developer forgets ONE... you have corrupted data.
This is exactly what happened at a real e-commerce startup in 2019 — they stored customer addresses directly in order rows instead of a separate customers table. When customers updated their addresses, old orders kept the old address (which was intentional for shipping records), but the customer profile data was also duplicated and went out of sync. Support tickets flooded in: "Why does my account show two different addresses?"
Anomaly #2: Insertion Anomaly
You just got a new product in — a "USB-C Hub" for $49.99. You want to add it to your catalog. But in the giant table, every row is an order. There's no way to add a product without also creating an order. You'd have to insert a fake row with NULL for the customer, order date, and order total just to record that the product exists:
-- You want to add a new product to your catalog.
-- But every row needs an order_id (it's the PRIMARY KEY).
-- You're forced to create a fake order:
INSERT INTO store_data VALUES
(NULL, NULL, NULL, -- no order
NULL, NULL, NULL, -- no customer
'USB-C Hub', 49.99, 'Electronics', -- the actual product
NULL, NULL, -- no review
200); -- stock count
-- This fails! order_id is the PRIMARY KEY and can't be NULL.
-- You literally CANNOT add a product without an order.
-- That's an insertion anomaly.
The table structure is blocking you from recording perfectly valid information. A product exists in the real world, but your database can't represent it until someone buys it. That's like saying a house doesn't exist until someone moves in.
Anomaly #3: Deletion Anomaly
Bob's order (#2) was a mistake — he wants a refund, so you delete the order. But that row is the only place Bob's information lives. When you delete it, you also lose the fact that Bob exists as a customer. His email, his address — gone. Not because you wanted to delete Bob, but because his customer data was trapped inside an order row.
-- Bob wants a refund. Delete his order.
DELETE FROM store_data WHERE order_id = 2;
-- Bob's order is gone. But so is:
-- ❌ Bob's customer record (name, email, address)
-- ❌ The fact that Bob exists in your system at all
-- ❌ Bob's review of the Wireless Mouse (4 stars)
-- If Bob comes back next week to buy something,
-- your system has no idea who he is.
-- You deleted a customer by trying to delete an order.
These three anomalies are the reason normalization exists. Every rule of normalization — 1NF, 2NF, 3NF — is specifically designed to eliminate one or more of these anomalies. The fix is beautifully simple: stop cramming everything into one table.
The Breakthrough — Separate Tables, Linked by Keys
The answer is foreign keys. If your orders table has user_id = 42, that's a foreign key pointing to users.id = 42. The database enforces this link — you can't create an order for user 999 if user 999 doesn't exist. Try it: INSERT INTO orders (user_id) VALUES (999); — the database will reject it with a foreign key violation. Each piece of information is stored in exactly one place. Update it once, and everywhere that references it sees the change automatically.
The fix is to split that monster table into separate tables, each responsible for one type of thing, with each fact stored in exactly one place. This is called normalization, and it was invented by Edgar Codd at IBM in 1970. There are formal "normal forms" (1NF through 5NF), but in practice you usually aim for Third Normal Form (3NF); going beyond 3NF is rarely worth the complexity for most applications. Over fifty years later, normalization is still the standard approach used by every production database at every major tech company.
-- Each "thing" gets its own table. No duplication.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
address TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
price DECIMAL(10,2) NOT NULL, -- DECIMAL, not FLOAT! (0.1 + 0.2 = 0.3, not 0.30000000000000004)
category VARCHAR(50),
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INT NOT NULL REFERENCES users(id), -- foreign key: links to users
total DECIMAL(10,2) NOT NULL,
status VARCHAR(20) DEFAULT 'pending',
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INT NOT NULL REFERENCES orders(id), -- links to orders
product_id INT NOT NULL REFERENCES products(id), -- links to products
quantity INT NOT NULL DEFAULT 1,
unit_price DECIMAL(10,2) NOT NULL -- price AT TIME OF PURCHASE (important!)
);
CREATE TABLE reviews (
id SERIAL PRIMARY KEY,
user_id INT NOT NULL REFERENCES users(id),
product_id INT NOT NULL REFERENCES products(id),
stars INT CHECK (stars BETWEEN 1 AND 5), -- constraint: only 1-5 stars
comment TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE inventory (
product_id INT PRIMARY KEY REFERENCES products(id),
quantity INT NOT NULL DEFAULT 0,
updated_at TIMESTAMP DEFAULT NOW()
);
Look at what changed. Alice's name and address now live in one row in the users table. Her orders reference her user ID. If she moves, you update one row, and every order, review, and shipment that references her ID automatically points to her new address. No duplication, no anomalies.
Notice a subtle but critical detail in order_items: we store unit_price separately from the product's current price. Why? Because prices change. If the Wireless Mouse goes from $29.99 to $34.99 next month, Alice's old order should still show $29.99 — that's what she actually paid. Shopify does exactly this: their line_items table captures the price at the moment of purchase, independent of the product's current price.
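Here's a sketch of that pattern in action, using the tables above (and assuming the new order receives id 1):

```sql
-- Alice orders one Wireless Mouse. Copy the CURRENT price into
-- order_items so later price changes don't rewrite her history.
INSERT INTO orders (user_id, total) VALUES (1, 29.99);

INSERT INTO order_items (order_id, product_id, quantity, unit_price)
SELECT 1, p.id, 1, p.price      -- price snapshot at purchase time
FROM products p
WHERE p.name = 'Wireless Mouse';

-- Next month the product's price changes...
UPDATE products SET price = 34.99 WHERE name = 'Wireless Mouse';

-- ...but Alice's order still shows what she actually paid:
SELECT unit_price FROM order_items WHERE order_id = 1;  -- 29.99
```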
Now let's verify that the anomalies are gone:
-- ✅ Update Anomaly: GONE
-- Alice moves? Update ONE row.
UPDATE users SET address = '789 Pine Rd, Chicago' WHERE id = 1;
-- Every order, review, shipment that references user_id = 1
-- automatically sees the new address. One update, done.
-- ✅ Insertion Anomaly: GONE
-- New product? Just insert it. No fake order needed.
INSERT INTO products (name, price, category)
VALUES ('USB-C Hub', 49.99, 'Electronics');
-- The product exists in your catalog immediately.
-- ✅ Deletion Anomaly: GONE
-- Delete Bob's order? Bob still exists in the users table.
DELETE FROM orders WHERE id = 2;
-- Bob's customer record is safe. His reviews are safe.
-- Only the order is deleted.
This is the fundamental insight of schema design: store each fact in exactly one place, and link facts together with keys. It's the same principle as "Don't Repeat Yourself" in code, but applied to data. And just like DRY in code, it eliminates an entire category of bugs.
How It Works — The Building Blocks of Schema Design
Now that you understand why we separate tables, let's go deeper into the how. Schema design has six core building blocks. Master these, and you can design a clean database for any application.
1. Entity-Relationship Diagrams — Drawing Your Data Before Building It
Before you write CREATE TABLE, you draw. An ER (Entity-Relationship) diagram is a picture of your data before it becomes SQL: a visual blueprint where every rectangle is an entity (a table) and every line is a relationship between tables. Tools like dbdiagram.io, Lucidchart, and pgModeler let you draw them and export SQL. The little symbols at the ends of lines are called crow's foot notation, and they show cardinality: a single line means "one," a fork (it looks like a crow's foot) means "many," and a circle means "zero, i.e. optional." So a line from users to orders with a fork on the orders end means "one user has many orders"; a circle plus a fork means "one user has zero or many orders." You'll see this notation in every database design tool.
Real companies use ER diagrams before every major feature. When Shopify adds a new entity like "subscriptions," the database team draws the ER diagram first, reviews it with the application team, and only then writes the migration. You can draw yours for free at dbdiagram.io — type your schema in plain text and it generates the diagram automatically.
2. Normal Forms (1NF → 2NF → 3NF) — The Rules That Prevent Anomalies
Normal forms are the formal rules behind normalization. Don't let the academic names scare you — each one is just a specific rule about how to organize your columns. In practice, you need to know three:
First Normal Form (1NF) — every column holds a single value, not a list. Sounds obvious, but it's the #1 beginner mistake:
-- ❌ BAD: Multiple values crammed into one column
CREATE TABLE orders_bad (
id INT PRIMARY KEY,
customer VARCHAR(100),
products VARCHAR(500) -- "Mouse, Keyboard, Monitor" ← a LIST in one cell
);
INSERT INTO orders_bad VALUES (1, 'Alice', 'Mouse, Keyboard, Monitor');
-- Want all orders that include "Keyboard"?
SELECT * FROM orders_bad WHERE products LIKE '%Keyboard%';
-- This works... until someone orders "Keyboard Cleaner"
-- and now you get false matches. Unreliable.
-- ✅ GOOD: Each value in its own row
CREATE TABLE order_items (
order_id INT REFERENCES orders(id),
product_id INT REFERENCES products(id),
quantity INT DEFAULT 1,
PRIMARY KEY (order_id, product_id)
);
-- One row per product per order
INSERT INTO order_items VALUES (1, 101, 1); -- Mouse
INSERT INTO order_items VALUES (1, 102, 1); -- Keyboard
INSERT INTO order_items VALUES (1, 103, 1); -- Monitor
-- Want all orders with a Keyboard? Easy, exact, reliable.
SELECT o.* FROM orders o
JOIN order_items oi ON o.id = oi.order_id
WHERE oi.product_id = 102;
Second Normal Form (2NF) — every non-key column depends on the entire primary key, not just part of it. This matters when you have a composite key: a primary key made of two or more columns, like PRIMARY KEY (order_id, product_id) in the order_items table, where the combination of order_id + product_id uniquely identifies each row. 2NF says every other column must depend on BOTH parts of that key; if a column depends on only part of it, it belongs in a different table.
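To make 2NF concrete, here's a sketch of a violation (the order_items_bad table is hypothetical):

```sql
-- ❌ 2NF violation: product_name depends on product_id ALONE,
-- not on the full (order_id, product_id) composite key.
CREATE TABLE order_items_bad (
  order_id INT,
  product_id INT,
  product_name VARCHAR(200),  -- partial dependency!
  quantity INT,
  PRIMARY KEY (order_id, product_id)
);
-- The same product_name is copied into every order containing the
-- product, and renaming it must touch them all (the update anomaly again).

-- ✅ 2NF fix: product_name lives once in products; order_items keeps
-- only columns that depend on the whole key, like quantity.
```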
Third Normal Form (3NF) — no column depends on another non-key column. Classic example: storing city and zip_code in the same table. The city depends on the zip code (zip 10001 is always New York), not on the primary key. Technically, city should be in a zip_codes lookup table. In practice, most teams keep city and zip together for simplicity and accept the tiny duplication — pragmatism over purity.
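For completeness, here's what the strict-3NF version would look like (a hypothetical zip_codes lookup table):

```sql
-- Strict 3NF: city depends on zip_code, not on the row's key,
-- so it moves into its own lookup table.
CREATE TABLE zip_codes (
  zip_code VARCHAR(10) PRIMARY KEY,
  city VARCHAR(100) NOT NULL
);

CREATE TABLE addresses (
  id SERIAL PRIMARY KEY,
  street VARCHAR(200),
  zip_code VARCHAR(10) REFERENCES zip_codes(zip_code)
  -- no city column: join to zip_codes when you need it
);
```

Most teams skip this split; it's shown here so you can recognize the dependency when it matters.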
3. Foreign Keys & Constraints — Your Database's Safety Net
A schema without constraints is like a building without a fire code — it works until something goes wrong, and then everything burns down. Constraints are rules your database enforces automatically, no matter what: PRIMARY KEY (uniqueness + not null), FOREIGN KEY (must reference an existing row), UNIQUE (no duplicates), NOT NULL (can't be empty), CHECK (custom validation like CHECK (price > 0)), and DEFAULT (auto-fill a value). They run inside the database engine, so they can't be bypassed by buggy application code. That's why you add them even if your app validates too: the database catches invalid data that slips through.
-- FOREIGN KEY: Can't create an order for a non-existent user
INSERT INTO orders (user_id, total) VALUES (9999, 50.00);
-- ERROR: insert or update on table "orders" violates foreign key constraint
-- Key (user_id)=(9999) is not present in table "users".
-- UNIQUE: Can't register two users with the same email
INSERT INTO users (name, email) VALUES ('Fake Alice', 'alice@email.com');
-- ERROR: duplicate key value violates unique constraint "users_email_key"
-- CHECK: Can't give a 6-star review
INSERT INTO reviews (user_id, product_id, stars) VALUES (1, 1, 6);
-- ERROR: new row for relation "reviews" violates check constraint
-- Check constraint: stars BETWEEN 1 AND 5
-- NOT NULL: Can't create a product without a name
INSERT INTO products (name, price) VALUES (NULL, 29.99);
-- ERROR: null value in column "name" violates not-null constraint
-- DEFAULT: Status auto-fills if you don't specify it
INSERT INTO orders (user_id, total) VALUES (1, 29.99);
-- status automatically set to 'pending' — you don't need to pass it
Twitter uses VARCHAR(280) on their tweet text column, not TEXT. Why? Because VARCHAR(280) is a constraint — the database physically cannot store a tweet longer than 280 characters. If a bug in the Twitter app tried to save a 500-character tweet, the database would reject it. Using TEXT (unlimited length) would mean relying on application code alone to enforce the character limit. Constraints are your last line of defense.
4. Data Types & Why They Matter — DECIMAL vs FLOAT, VARCHAR vs TEXT
Choosing the right data type is not pedantic — it prevents real bugs. A data type is the kind of value a column can hold; the common ones are INT (whole numbers), BIGINT (big numbers, up to 9.2 quintillion), VARCHAR(n) (text up to n characters), TEXT (unlimited text), DECIMAL(p,s) (exact numbers with p total digits and s decimal places), BOOLEAN (true/false), TIMESTAMP (date + time), and UUID (128-bit unique identifier). The wrong choice causes bugs; the right choice prevents them. The most dangerous mistake in schema design is using FLOAT for money.
-- ❌ FLOAT: Stores money as a binary approximation
SELECT 0.1::FLOAT + 0.2::FLOAT;
-- Result: 0.30000000000000004 ← NOT 0.3!
-- This is not a PostgreSQL bug. It's how IEEE 754 floating point works.
-- Stripe, Shopify, every bank, every payment processor uses DECIMAL for money.
-- ✅ DECIMAL: Stores money as exact base-10 digits
SELECT 0.1::DECIMAL + 0.2::DECIMAL;
-- Result: 0.3 ← Exactly right.
-- Shopify's orders table:
-- total_price DECIMAL(10,2) ← 10 digits total, 2 after the decimal
-- This means prices from 0.00 to 99,999,999.99 — more than enough.
-- GitHub's repositories table uses BIGINT for id, not INT:
-- INT max: 2,147,483,647 (2.1 billion) — GitHub approached this in 2019
-- BIGINT max: 9,223,372,036,854,775,807 (9.2 quintillion) — safe for centuries
Here's a quick reference for the most common choices:
- Money: always DECIMAL(10,2). Never FLOAT. Never DOUBLE.
- IDs: BIGINT for new projects (future-proof). INT only if you're absolutely sure you'll never exceed 2.1 billion rows.
- Short text: VARCHAR(n) with a sensible limit. Emails: VARCHAR(255). Usernames: VARCHAR(50).
- Long text: TEXT for descriptions, comments, blog posts — no length limit.
- Yes/No: BOOLEAN. Not INT with 0/1. Not VARCHAR(3) with "yes"/"no".
- Dates/times: TIMESTAMPTZ (with timezone) in PostgreSQL. Never store dates as strings.
5. Timestamps & Timezone Handling — The Silent Source of Bugs
Timestamps cause more subtle bugs than almost any other data type. The problem: your server is in US-East, your user is in Tokyo, and your database stores 2024-01-15 14:30:00. Is that 2:30 PM in New York or 2:30 PM in Tokyo? Without a timezone, nobody knows.
-- ❌ BAD: TIMESTAMP without timezone (ambiguous)
CREATE TABLE events_bad (
id SERIAL PRIMARY KEY,
event_time TIMESTAMP -- 2024-01-15 14:30:00 — which timezone?
);
-- ✅ GOOD: TIMESTAMPTZ (stores as UTC, converts on display)
CREATE TABLE events (
id SERIAL PRIMARY KEY,
event_time TIMESTAMPTZ -- stores internally as UTC
);
-- Insert with any timezone — PostgreSQL converts to UTC
INSERT INTO events (event_time) VALUES ('2024-01-15 14:30:00+09:00'); -- Tokyo
INSERT INTO events (event_time) VALUES ('2024-01-15 00:30:00-05:00'); -- NYC (same instant!)
-- Both rows store the same UTC moment: 2024-01-15 05:30:00+00
-- Display in any timezone:
SET timezone = 'America/New_York';
SELECT event_time FROM events; -- shows 2024-01-15 00:30:00-05
SET timezone = 'Asia/Tokyo';
SELECT event_time FROM events; -- shows 2024-01-15 14:30:00+09
-- RULE: Always use TIMESTAMPTZ. Always store in UTC. Convert on display.
Every major company has learned this the hard way. GitHub stores all timestamps in UTC. Stripe's API returns all timestamps as Unix timestamps: the number of seconds since January 1, 1970, 00:00:00 UTC (the "Unix epoch"). A Unix timestamp is always UTC, so there's no timezone ambiguity: 1705329000 means 2024-01-15 14:30:00 UTC, period. Stripe returns timestamps this way ("created": 1705329000), and your application converts to local time for display. Many APIs use this format because it's unambiguous and easy to do math on (want "5 minutes ago"? subtract 300). Shopify stores created_at and updated_at on every table, both in UTC. If you take away one rule: always use TIMESTAMPTZ, always store in UTC.
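Converting between the two representations in PostgreSQL is one built-in function each way:

```sql
-- Unix timestamp → TIMESTAMPTZ
SELECT to_timestamp(1705329000);
-- 2024-01-15 14:30:00 UTC, displayed in your session's timezone

-- TIMESTAMPTZ → Unix timestamp
SELECT extract(epoch FROM TIMESTAMPTZ '2024-01-15 14:30:00+00');
-- 1705329000
```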
6. Soft Deletes & Audit Trails — Never Actually Delete Anything
When a user "deletes" their account on Stripe, their data doesn't vanish from the database. It gets a timestamp in a column called deleted_at. This is called a soft deleteInstead of DELETE FROM users WHERE id = 42;, you do UPDATE users SET deleted_at = NOW() WHERE id = 42;. The row is still in the database, but your application filters it out: SELECT * FROM users WHERE deleted_at IS NULL;. This means you can "undo" deletes, comply with audit requirements, and debug issues by looking at "deleted" records. Stripe, Shopify, GitHub, and most SaaS companies use soft deletes on critical tables. — the data looks deleted to the user, but it's still there for auditing, compliance, and "oops, undo that."
-- Soft delete pattern (used by Stripe, Shopify, GitHub)
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
deleted_at TIMESTAMPTZ -- NULL = active, non-NULL = "deleted"
);
-- "Delete" a user (they still exist in the database)
UPDATE users SET deleted_at = NOW() WHERE id = 42;
-- Your application's default query ALWAYS filters deleted rows:
SELECT * FROM users WHERE deleted_at IS NULL;
-- Admin/support can see "deleted" users for debugging:
SELECT * FROM users WHERE deleted_at IS NOT NULL;
-- "Undo" a delete (customer changed their mind):
UPDATE users SET deleted_at = NULL WHERE id = 42;
-- For heavy audit requirements, add a separate audit table:
CREATE TABLE user_audit (
id SERIAL PRIMARY KEY,
user_id INT NOT NULL,
action VARCHAR(20) NOT NULL, -- 'created', 'updated', 'deleted'
old_data JSONB, -- snapshot before change
new_data JSONB, -- snapshot after change
changed_by INT, -- who made the change
changed_at TIMESTAMPTZ DEFAULT NOW()
);
The created_at + updated_at + deleted_at trio appears on virtually every table at companies like Shopify. Rails migrations generate created_at and updated_at automatically, and soft-delete libraries add deleted_at. They're not optional — they're essential for debugging ("when was this record last changed?"), compliance ("show me the deletion timestamp for GDPR"), and data recovery ("undelete this accidentally removed product").
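One gotcha with this trio: DEFAULT NOW() fires only on INSERT, so updated_at will not refresh itself on UPDATE. Rails maintains it in application code; in plain PostgreSQL you'd use a trigger. A minimal sketch (the function and trigger names are illustrative):

```sql
-- Refresh updated_at on every UPDATE (PostgreSQL)
CREATE OR REPLACE FUNCTION touch_updated_at() RETURNS trigger AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_touch_updated_at
BEFORE UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION touch_updated_at();
```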
Going Deeper — Advanced Schema Decisions
The basics — tables, foreign keys, normal forms — get you 80% of the way. The remaining 20% involves decisions that depend on your specific situation: how big is your system? Do you need distributed IDs? Do you store semi-structured data? How do you change the schema without downtime? These four topics come up in every real production system.
Every row needs a unique identifier — a primary key: the column (or combination of columns) that uniquely identifies each row in a table. No two rows can have the same primary key, and it's automatically indexed for fast lookups. The two main options are auto-incrementing integers (1, 2, 3...) and UUIDs (Universally Unique Identifiers): random 128-bit numbers, typically displayed as 36-character strings like 550e8400-e29b-41d4-a716-446655440000. The chance of generating two identical UUIDs is so astronomically small that it's treated as zero, and a UUID can be generated on any machine without coordinating with a central database, which is why distributed systems love them. The downsides: 16 bytes vs 4 for INT, and they're harder to type and debug. The choice matters more than you'd think.
-- Auto-increment: the database assigns 1, 2, 3, 4...
CREATE TABLE users (
id SERIAL PRIMARY KEY, -- PostgreSQL
-- id INT AUTO_INCREMENT PRIMARY KEY -- MySQL
name VARCHAR(100)
);
INSERT INTO users (name) VALUES ('Alice'); -- id = 1
INSERT INTO users (name) VALUES ('Bob'); -- id = 2
-- PROS:
-- ✅ Small (4 bytes INT, 8 bytes BIGINT) — faster indexes
-- ✅ Sequential — great for range queries and B-tree performance
-- ✅ Human-readable — easy to debug ("check user 42")
-- ✅ Simple — no libraries needed
-- CONS:
-- ❌ Leaks information — user 1000 means you have ~1000 users
-- ❌ Single point of coordination — only ONE database generates IDs
-- ❌ Can't pre-generate — you need the database to get an ID
-- ❌ Merging databases is hell — two databases both have user #42
-- UUID: random 128-bit identifiers
CREATE EXTENSION IF NOT EXISTS "pgcrypto"; -- enables gen_random_uuid()
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(100)
);
INSERT INTO users (name) VALUES ('Alice');
-- id = '7c9e6679-7425-40de-944b-e07fc1f90ae7' (auto-generated)
INSERT INTO users (name) VALUES ('Bob');
-- id = 'f47ac10b-58cc-4372-a567-0e02b2c3d479' (different, random)
-- PROS:
-- ✅ No coordination — ANY server can generate IDs independently
-- ✅ No information leakage — can't guess how many users you have
-- ✅ Merge-safe — two databases will never collide
-- ✅ Can generate BEFORE inserting — great for distributed systems
-- CONS:
-- ❌ Larger (16 bytes vs 4/8 bytes) — bigger indexes, more RAM
-- ❌ Random — B-tree index fragmentation, slower inserts on huge tables
-- ❌ Hard to type/debug — "check user 7c9e6679..." is painful
-- NOTE: UUIDv4 is fully random; UUIDv7 (standardized in 2024) is
-- time-ordered, which fixes the B-tree fragmentation issue
Real-world choices: Twitter uses neither — they built Snowflake IDs, a 64-bit format invented at Twitter in 2010: 41 bits of timestamp (milliseconds since a custom epoch), 10 bits of machine ID, and 12 bits of sequence number. That yields time-ordered IDs (great for "sort by newest"), distributed generation across up to 1,024 machines, and 4,096 IDs per millisecond per machine; Discord and Instagram use similar schemes, and open-source implementations exist for every language. Shopify uses auto-increment BIGINT for most tables. Stripe uses string IDs with prefixes: cus_1234 for customers, pi_5678 for payment intents — the prefix tells you the type at a glance.
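Because the layout is fixed, you can pull a Snowflake-style ID apart with bit shifts. A sketch (the sample ID is made up; 1288834974657 is Twitter's custom epoch offset in milliseconds):

```sql
-- top 41 bits: timestamp | next 10: machine | low 12: sequence
SELECT to_timestamp(((1541815603606036480 >> 22) + 1288834974657) / 1000.0)
         AS created_at,
       (1541815603606036480 >> 12) & 1023 AS machine_id,
       1541815603606036480 & 4095 AS sequence;
```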
Your app has posts, photos, and videos. Users can comment on all three. Do you create three separate comment tables (post_comments, photo_comments, video_comments) with identical columns? Or one comments table that knows which type of thing it's attached to?
The second approach is called a polymorphic association: a single table (like comments) that can belong to multiple parent tables (posts, photos, videos) using a commentable_type column (which table) and a commentable_id column (which row). Ruby on Rails popularized the pattern, and it's used extensively in production by GitHub, Shopify, and Basecamp. It's pragmatic, but it technically breaks relational purity: the database can't enforce a foreign key when the target table is dynamic, so many teams add application-level checks instead.
-- One comments table for posts, photos, and videos
CREATE TABLE comments (
id SERIAL PRIMARY KEY,
user_id INT NOT NULL REFERENCES users(id),
commentable_type VARCHAR(50) NOT NULL, -- 'Post', 'Photo', or 'Video'
commentable_id BIGINT NOT NULL, -- the ID in that table
body TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Alice comments on a post (id=10)
INSERT INTO comments (user_id, commentable_type, commentable_id, body)
VALUES (1, 'Post', 10, 'Great article!');
-- Alice comments on a photo (id=55)
INSERT INTO comments (user_id, commentable_type, commentable_id, body)
VALUES (1, 'Photo', 55, 'Beautiful sunset!');
-- Get all comments on Post #10
SELECT * FROM comments
WHERE commentable_type = 'Post' AND commentable_id = 10;
-- Add an index for fast lookups
CREATE INDEX idx_comments_polymorphic
ON comments (commentable_type, commentable_id);
-- THE TRADE-OFF:
-- ✅ One table instead of three — simpler queries, simpler code
-- ❌ No foreign key constraint — the DB can't verify commentable_id exists
-- ❌ Your application must enforce referential integrity
-- Rails handles this automatically with belongs_to :commentable, polymorphic: true
The pure SQL alternative is to use separate tables or a shared parent table with table inheritanceA PostgreSQL-specific feature where child tables inherit columns from a parent table. CREATE TABLE photos () INHERITS (content); means photos get all columns from content, plus any photo-specific columns. Then your comments table has a foreign key to the content table. This preserves referential integrity while allowing polymorphism. Most teams don't use this because it's PostgreSQL-only and adds complexity.. But in practice, polymorphic associations are so convenient that they dominate real-world applications. Use them for comments, likes, tags, notifications — any feature that attaches to multiple entity types.
Sometimes you don't know the exact shape of your data ahead of time. Product attributes are the classic example: a laptop has RAM and screen size, a shirt has color and size, a book has page count and ISBN. You could create a column for every possible attribute (hundreds of columns, mostly NULL), or you could use a JSONB columnA PostgreSQL data type that stores JSON (JavaScript Object Notation) in a binary, indexed format. Unlike plain JSON (which is stored as text), JSONB is parsed once on insert and stored as a binary structure — so queries are fast. You can index individual keys with GIN indexes, query nested paths, and mix structured (normal columns) with semi-structured (JSONB) data. Shopify uses JSONB for product metafields. It's PostgreSQL's answer to "I need some NoSQL flexibility in my SQL database." to store flexible data.
-- Structured columns for common fields, JSONB for flexible attributes
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
price DECIMAL(10,2) NOT NULL,
category VARCHAR(50),
attributes JSONB DEFAULT '{}'::jsonb -- flexible key-value pairs
);
-- A laptop: has RAM, screen size, CPU
INSERT INTO products (name, price, category, attributes) VALUES
('MacBook Pro 14"', 1999.00, 'Laptops',
'{"ram_gb": 16, "screen_inches": 14.2, "cpu": "M3 Pro", "storage_gb": 512}');
-- A shirt: has color, size, material
INSERT INTO products (name, price, category, attributes) VALUES
('Classic Oxford Shirt', 49.99, 'Clothing',
'{"color": "blue", "size": "M", "material": "cotton", "fit": "regular"}');
-- Query JSONB: find all laptops with 16GB+ RAM
SELECT name, price, attributes->>'ram_gb' AS ram
FROM products
WHERE category = 'Laptops'
AND (attributes->>'ram_gb')::INT >= 16;
-- Index JSONB for fast lookups (GIN index)
CREATE INDEX idx_product_attrs ON products USING GIN (attributes);
-- Check if a key exists
SELECT * FROM products WHERE attributes ? 'color';
-- RULE OF THUMB:
-- Use normal columns for data you query/filter/sort on frequently
-- Use JSONB for data that varies per category or changes often
-- NEVER store everything in JSONB — you lose type safety and constraints
Shopify uses this exact pattern for metafields — custom attributes that store owners define for their products. The core product fields (name, price, SKU) are normal columns with constraints. The custom stuff (like "roast level" for a coffee shop or "vintage year" for a wine store) goes in a flexible JSONB column. This gives you the best of both worlds: relational structure where you need it, document-style flexibility where you don't.
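The hybrid "fixed columns plus a flexible JSON column" pattern can be demonstrated end to end with SQLite standing in for PostgreSQL (SQLite's json_extract plays the role of the ->> operator; the sample products are made up):

```python
import sqlite3

# The structured-plus-JSON hybrid, sketched with SQLite in place of
# PostgreSQL. json_extract substitutes for ->>; data is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price REAL NOT NULL,
        category TEXT,
        attributes TEXT DEFAULT '{}'   -- JSON stored as text
    )
""")
conn.executemany(
    "INSERT INTO products (name, price, category, attributes)"
    " VALUES (?, ?, ?, ?)",
    [
        ('MacBook Pro 14"', 1999.00, "Laptops",
         '{"ram_gb": 16, "screen_inches": 14.2}'),
        ("Budget Laptop", 599.00, "Laptops", '{"ram_gb": 8}'),
        ("Oxford Shirt", 49.99, "Clothing", '{"color": "blue", "size": "M"}'),
    ],
)
# Filter on a structured column AND a key inside the JSON column:
rows = conn.execute("""
    SELECT name FROM products
    WHERE category = 'Laptops'
      AND CAST(json_extract(attributes, '$.ram_gb') AS INTEGER) >= 16
""").fetchall()
assert rows == [('MacBook Pro 14"',)]
```

Note how category (queried constantly) is a real column while ram_gb (laptop-only) lives in the flexible blob, which is exactly the rule of thumb above.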
Your schema will change. Products will get new fields. Tables will need new indexes. Columns will need different types. The question isn't whether you'll need to modify your schema — it's how to do it safely on a database that's serving live traffic.
Every major framework has a migration tool that version-controls your schema changes:
# Rails (Ruby) — the gold standard for migration tooling
rails generate migration AddPhoneToUsers phone:string
rails db:migrate # apply pending migrations
rails db:rollback # undo the last migration
rails db:migrate:status # see what's applied and what's pending
# Alembic (Python / SQLAlchemy)
alembic revision --autogenerate -m "add phone column"
alembic upgrade head # apply all pending
alembic downgrade -1 # undo last
# Flyway (Java / JVM)
flyway migrate # apply pending migrations
flyway info # see current state
# Entity Framework (.NET)
dotnet ef migrations add AddPhone # generate migration
dotnet ef database update # apply it
# ALWAYS backup before a migration on production:
pg_dump -Fc mydb > backup_before_migration.dump
Some schema changes are instant, and some rewrite the entire table. Knowing the difference can save you from a 3 AM production incident:
-- ✅ SAFE (instant, no table rewrite in PostgreSQL):
ALTER TABLE users ADD COLUMN phone VARCHAR(20); -- nullable column: instant
ALTER TABLE users ADD COLUMN age INT NOT NULL DEFAULT 0; -- PostgreSQL 11+: instant!
CREATE INDEX CONCURRENTLY idx_users_email ON users(email); -- non-blocking index
-- ⚠️ DANGEROUS (rewrites entire table — can take hours on large tables):
ALTER TABLE users ALTER COLUMN name TYPE TEXT; -- type change: full rewrite
ALTER TABLE users ADD COLUMN age INT NOT NULL DEFAULT 0; -- PostgreSQL 10 and below: full rewrite!
ALTER TABLE repositories ALTER COLUMN id TYPE BIGINT; -- GitHub's 2019 migration nightmare
-- ❌ VERY DANGEROUS (exclusive lock — blocks ALL reads and writes):
ALTER TABLE users RENAME COLUMN name TO full_name; -- instant but takes ACCESS EXCLUSIVE lock
-- If any long-running query is reading the table, this waits.
-- While it waits, ALL new queries queue behind it. Cascade timeout.
-- THE SAFE MIGRATION CHECKLIST:
-- 1. Always add columns as nullable OR with a DEFAULT (PostgreSQL 11+)
-- 2. Create indexes with CONCURRENTLY to avoid blocking
-- 3. Never rename columns in production — add new, migrate data, drop old
-- 4. Always pg_dump before destructive changes
-- 5. Test migrations on a copy of production data FIRST
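Checklist item 3, the add-new / migrate / drop-old dance, can be sketched in miniature. SQLite stands in for PostgreSQL here, and the table contents and batch size are made up; the point is the shape of the three steps, not the specific SQL dialect:

```python
import sqlite3

# "Rename" users.fname to first_name without ever running a RENAME:
# expand, backfill in batches, then contract. SQLite stands in for
# PostgreSQL; data and batch size are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fname TEXT)")
conn.executemany("INSERT INTO users (fname) VALUES (?)",
                 [("alice",), ("bob",), ("carol",)])

# 1. EXPAND: add the new column as nullable (an instant change)
conn.execute("ALTER TABLE users ADD COLUMN first_name TEXT")

# 2. BACKFILL in small batches so no single statement holds locks for long
BATCH = 2
while True:
    cur = conn.execute(
        "UPDATE users SET first_name = fname "
        "WHERE id IN (SELECT id FROM users WHERE first_name IS NULL LIMIT ?)",
        (BATCH,))
    conn.commit()
    if cur.rowcount == 0:
        break

# 3. CONTRACT: once every reader uses first_name, drop the old column
#    (ALTER TABLE users DROP COLUMN fname -- omitted here so the sketch
#    also runs on older SQLite builds)
rows = conn.execute("SELECT first_name FROM users ORDER BY id").fetchall()
assert rows == [("alice",), ("bob",), ("carol",)]
```

Between steps 2 and 3 the application writes to both columns (or a trigger keeps them in sync), which is why this pattern tolerates a deploy happening mid-migration.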
Variations — Different Schemas for Different Workloads
Everything we've covered so far is OLTPOnline Transaction Processing — the kind of database work that powers your day-to-day application. Inserts, updates, deletes, and simple lookups by primary key. Think: "add item to cart," "update user address," "record a payment." OLTP schemas are normalized (3NF) to minimize redundancy and support fast writes. PostgreSQL, MySQL, and SQL Server are all OLTP-first databases. schema design — the standard approach for applications that read and write individual records. But not all workloads are the same. Analytics, reporting, and event-driven architectures need fundamentally different schema shapes. Here are the three main variations.
OLTP Schema Design — The Application Database
This is everything we've built in this page — normalized tables linked by foreign keys, optimized for fast reads and writes of individual records. When a user adds an item to their cart, your app reads the product, writes an order_item, and decrements inventory. Each operation touches a small number of rows.
-- OLTP: small, targeted queries. Fast by design.
-- "Get Alice's most recent order with items"
SELECT o.id, o.total, o.created_at, p.name, oi.quantity, oi.unit_price
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.user_id = 1
ORDER BY o.created_at DESC
LIMIT 1;
-- "Decrement inventory for product 42"
UPDATE inventory SET quantity = quantity - 1, updated_at = NOW()
WHERE product_id = 42 AND quantity > 0;
-- The `quantity > 0` check prevents overselling.
-- Characteristics of OLTP schemas:
-- ✅ Normalized (3NF) — no redundancy
-- ✅ Many tables with small rows
-- ✅ Indexes on foreign keys and lookup columns
-- ✅ Optimized for single-row or small-batch operations
-- ✅ Used by: Shopify, Stripe, GitHub (PostgreSQL), Uber (MySQL)
OLAP / Star Schema — The Analytics Database
Imagine your product manager asks: "What was our total revenue by product category, by month, for the last two years, broken down by customer country?" That query needs to scan millions of rows across multiple tables. Your OLTP schema can answer it, but it'll take minutes. For analytics, companies use a different schema shape called a star schemaA denormalized schema optimized for analytical queries. At the center is a "fact table" (e.g., fact_sales) containing measurements (revenue, quantity, timestamps). Around it are "dimension tables" (products, customers, dates) containing descriptive attributes. It's called "star" because the ER diagram looks like a star with the fact table at the center. Data warehouses like Snowflake, BigQuery, and Redshift all use star schemas..
-- STAR SCHEMA: one fact table at the center, dimension tables around it
-- Fact table: one row per sale event (the measurements).
-- (In practice, create the dimension tables below first so the REFERENCES resolve.)
CREATE TABLE fact_sales (
sale_id BIGINT PRIMARY KEY,
date_key INT REFERENCES dim_date(date_key),
product_key INT REFERENCES dim_product(product_key),
customer_key INT REFERENCES dim_customer(customer_key),
quantity INT,
revenue DECIMAL(12,2),
discount DECIMAL(10,2)
);
-- Dimension: date (pre-computed calendar table)
CREATE TABLE dim_date (
date_key INT PRIMARY KEY, -- 20240115
full_date DATE,
year INT, month INT, day INT,
quarter INT,
day_of_week VARCHAR(10),
is_weekend BOOLEAN,
is_holiday BOOLEAN
);
-- Dimension: product (denormalized — category embedded, not a separate table)
CREATE TABLE dim_product (
product_key INT PRIMARY KEY,
name VARCHAR(200),
category VARCHAR(50),
subcategory VARCHAR(50),
brand VARCHAR(100)
);
-- Dimension: customer (denormalized — address embedded directly)
CREATE TABLE dim_customer (
customer_key INT PRIMARY KEY,
name VARCHAR(100),
country VARCHAR(50),
city VARCHAR(100),
segment VARCHAR(50) -- 'Enterprise', 'SMB', 'Consumer'
);
-- The analytics query that would be slow on OLTP is FAST here:
SELECT dp.category, dd.year, dd.month, SUM(fs.revenue) AS total_revenue
FROM fact_sales fs
JOIN dim_product dp ON fs.product_key = dp.product_key
JOIN dim_date dd ON fs.date_key = dd.date_key
WHERE dd.year IN (2023, 2024)
GROUP BY dp.category, dd.year, dd.month
ORDER BY dd.year, dd.month, total_revenue DESC;
Key difference from OLTP: star schemas are intentionally denormalized. The customer's country is stored directly in dim_customer, not in a separate countries table. That duplication is fine — analytics databases optimize for read speed, not write consistency. Snowflake, BigQuery, Amazon Redshift, and ClickHouse are all built for star schemas.
Event Sourcing Schema — Storing What Happened, Not What Is
Traditional schemas store current state: "Alice has $150 in her account." Event sourcing stores every change that ever happened: "Alice deposited $100, then deposited $50." You compute the current state by replaying the events. This sounds like more work (because it is), but it gives you a complete audit trail and the ability to reconstruct the state at any point in time.
-- EVENT SOURCING: store events, derive state
-- The event log — append-only, NEVER update or delete
CREATE TABLE account_events (
event_id BIGSERIAL PRIMARY KEY,
account_id UUID NOT NULL,
event_type VARCHAR(50) NOT NULL, -- 'deposit', 'withdrawal', 'transfer'
amount DECIMAL(12,2) NOT NULL,
metadata JSONB DEFAULT '{}', -- extra context (ATM location, etc.)
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Alice opens an account and makes transactions
INSERT INTO account_events (account_id, event_type, amount) VALUES
('a1b2c3d4-...', 'account_opened', 0.00),
('a1b2c3d4-...', 'deposit', 100.00),
('a1b2c3d4-...', 'deposit', 50.00),
('a1b2c3d4-...', 'withdrawal', 30.00);
-- Compute current balance by replaying events
SELECT account_id,
SUM(CASE
WHEN event_type = 'deposit' THEN amount
WHEN event_type = 'withdrawal' THEN -amount
ELSE 0
END) AS balance
FROM account_events
WHERE account_id = 'a1b2c3d4-...'
GROUP BY account_id;
-- balance = 120.00 (0 + 100 + 50 - 30)
-- "What was Alice's balance on Jan 15?"
-- Just replay events up to that date!
SELECT SUM(CASE WHEN event_type = 'deposit' THEN amount
WHEN event_type = 'withdrawal' THEN -amount ELSE 0 END)
FROM account_events
WHERE account_id = 'a1b2c3d4-...' AND created_at <= '2024-01-15';
-- WHO USES THIS:
-- ✅ Banks and payment processors (complete audit trail required by law)
-- ✅ Stripe (every charge, refund, dispute is an event)
-- ✅ Event-driven systems (Kafka, EventStoreDB)
-- ❌ NOT for simple CRUD apps — overkill for a blog or todo list
Event sourcing is not a replacement for OLTP — it's a complement. Many systems use event sourcing for the core domain (payments, banking) and traditional OLTP for everything else (user profiles, settings). The events are the source of truth, and materialized views or read models provide fast query access.
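The "events are the source of truth, read models serve queries" split can be sketched in plain Python: fold the append-only log into a balances dictionary that plays the role of a materialized view. Amounts are integer cents (per the float warning elsewhere on this page), and all the data is made up:

```python
# Event log as source of truth; a derived dict as the read model.
# Amounts in integer cents; all data illustrative.
events = [
    {"account": "a1", "type": "deposit",    "cents": 10_000},
    {"account": "a1", "type": "deposit",    "cents": 5_000},
    {"account": "a1", "type": "withdrawal", "cents": 3_000},
    {"account": "b2", "type": "deposit",    "cents": 7_500},
]

SIGN = {"deposit": +1, "withdrawal": -1}

def build_read_model(log: list) -> dict:
    """Replay the full log. The log is append-only and never edited."""
    balances = {}
    for e in log:
        balances[e["account"]] = (
            balances.get(e["account"], 0) + SIGN[e["type"]] * e["cents"])
    return balances

balances = build_read_model(events)
assert balances == {"a1": 12_000, "b2": 7_500}   # $120.00 and $75.00

# New facts are appended, never updated in place; the read model is
# then re-derived (or, in a real system, incrementally updated).
events.append({"account": "a1", "type": "withdrawal", "cents": 2_000})
assert build_read_model(events)["a1"] == 10_000
```

A real system wouldn't replay billions of events per query; it would keep the read model continuously updated (or snapshot periodically), but the derivation is conceptually this fold.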
At Scale — How Big Companies Actually Model Data
Theory is great, but the real test of a schema is whether it survives 10 million users, 500 million rows, and a team of 200 engineers all writing migrations at the same time. Let's look at four companies that faced exactly that — and how their schema decisions shaped everything.
Instagram — Sharded PostgreSQL with User-ID-Based Routing
Instagram serves over 2 billion monthly active users, yet their core storage is still PostgreSQLAn open-source relational database known for its correctness, extensibility, and rich feature set. It supports JSONB, full-text search, GIS data, and hundreds of extensions. Instagram, Shopify, GitLab, Supabase, and Notion all run on PostgreSQL. — not some exotic NoSQL system. The secret? Sharding by user_id.
Every photo, like, comment, and follow relationship is keyed to the user who created it. When you upload a photo, Instagram hashes your user_id to determine which PostgreSQL shard stores it. That means all of your data lives on the same machine — so loading your profile, your feed, and your notifications doesn't require cross-shard joins.
Their IDs are also clever: Instagram uses a variant of Snowflake IDsA 64-bit ID format invented by Twitter that encodes the creation timestamp, machine ID, and a sequence number into a single integer. It's sortable by time (newer IDs are always larger), globally unique without a central coordinator, and fits in a BIGINT column. Instagram, Discord, and many other companies use their own Snowflake variants. — 64-bit integers that encode the timestamp, shard ID, and a per-shard sequence number. This means IDs are time-sortable (newer photos always have higher IDs) and you can extract the shard from the ID itself, so routing is instant.
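"You can extract the shard from the ID itself" is worth seeing concretely. Instagram's engineering blog described a layout of 41 timestamp bits, 13 shard bits, and 10 sequence bits; the epoch and helper names below are illustrative assumptions, not their actual code:

```python
# Sketch of an Instagram-style ID, assuming the 41/13/10 bit layout from
# their engineering blog. EPOCH_MS and the function names are illustrative.
EPOCH_MS = 1_300_000_000_000

def make_id(ts_ms: int, shard_id: int, seq: int) -> int:
    assert 0 <= shard_id < 8192 and 0 <= seq < 1024
    return ((ts_ms - EPOCH_MS) << 23) | (shard_id << 10) | seq

def shard_of(photo_id: int) -> int:
    # No lookup table, no coordinator: the route is baked into the ID.
    return (photo_id >> 10) & 0x1FFF   # the 13-bit shard field

pid = make_id(1_700_000_000_000, shard_id=1234, seq=7)
assert shard_of(pid) == 1234
```

Routing a query for any photo is then a couple of bit operations rather than a network round-trip to a directory service.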
Shopify — Per-Merchant Schema with Vitess
Shopify hosts millions of online stores, and every one of them has products, orders, customers, and inventory. The schema for each store is identical — same tables, same columns, same constraints. But the data is completely isolated. Merchant A can never accidentally see Merchant B's orders.
Under the hood, Shopify uses VitessAn open-source database clustering system for horizontal scaling of MySQL. Originally built at YouTube in 2010 to handle their massive growth, it sits between your application and MySQL, routing queries to the right shard automatically. Shopify, Slack, HubSpot, and Square all use Vitess in production. It handles connection pooling, query routing, and schema management across thousands of shards. (originally built at YouTube) to shard MySQL across thousands of machines. The shard key is the shop_id. Every query includes the shop context, so Vitess routes it to the correct shard without the application needing to know which physical database holds the data.
The brilliant part? Each merchant's schema is exactly the same, but each lives in its own virtual keyspace. When Shopify adds a new column (like fulfillment_status on orders), they roll out the migration one shard at a time over days — not a single big-bang deployment.
Slack — Denormalized for Speed at 18 Billion Messages
Slack's core product is messaging, and messages have a specific access pattern: you almost always read them in order, within a single channel, for a single workspace. Slack exploits this by storing messages in a heavily denormalizedIntentionally duplicating data across tables to avoid JOINs at read time. The opposite of normalization. You trade disk space and write complexity for much faster reads. Common in high-throughput systems where reads vastly outnumber writes — Slack reads 10-100x more than it writes. format, keyed by (workspace_id, channel_id, message_ts).
The message row stores the author's display name and avatar URL directly, rather than joining to a users table. Yes, this duplicates data. Yes, it means updating a username requires a backfill. But it means loading 50 messages in a channel is one simple query with zero JOINs — and at 18 billion messages, eliminating JOINs matters more than saving a few bytes of duplication.
Their schema evolved over time. Early Slack was more normalized. As they grew past 10 billion messages, they denormalized aggressively — a textbook example of the principle that the right schema at 1 million rows is not the right schema at 10 billion rows.
Airbnb — Availability Calendars & Dynamic Pricing
Airbnb's listing schema is deceptively complex. A listing isn't just a title and a price — it has an availability calendar (which dates are open, blocked, or booked), dynamic pricing (price changes per night based on demand, season, and local events), and location data (latitude, longitude, neighborhood, city).
Their calendar table has one row per listing per date — so a listing available for the next 365 days creates 365 rows. With millions of listings, that's billions of calendar rows. They partition this table by date range and shard by listing_id, so checking "is this listing available on March 15-18?" hits a single shard and scans only the relevant date partition.
The pricing model is even more interesting. Rather than storing a single price_per_night on the listing, Airbnb stores price overrides per date (like "New Year's Eve is $500/night instead of $150"). The schema supports a default_price on the listing, plus a price_overrides table keyed by (listing_id, date). The app queries both and picks the override if one exists.
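The default-price-plus-overrides lookup described above reduces to a small function. This is a sketch under my own naming (nightly_prices, default_cents, an overrides dict keyed by date); prices are integer cents and the sample values are invented:

```python
from datetime import date, timedelta

# Default price plus per-date overrides, as described above. Names and
# sample values are illustrative; prices are integer cents.
def nightly_prices(default_cents: int, overrides: dict,
                   check_in: date, check_out: date) -> list:
    """One price per night in [check_in, check_out): override wins."""
    nights = (check_out - check_in).days
    return [
        overrides.get(check_in + timedelta(days=i), default_cents)
        for i in range(nights)
    ]

overrides = {date(2024, 12, 31): 50_000}  # New Year's Eve: $500/night
prices = nightly_prices(15_000, overrides,
                        date(2024, 12, 30), date(2025, 1, 2))
assert prices == [15_000, 50_000, 15_000]  # Dec 30, Dec 31, Jan 1
assert sum(prices) == 80_000               # $800 for the three nights
```

In SQL terms this is a LEFT JOIN from the generated date range to price_overrides with COALESCE(override_price, default_price); the Python version just makes the precedence rule obvious.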
Anti-Lessons — Advice That Sounds Smart but Hurts
Some of the most popular schema advice on the internet is technically correct but practically harmful. These are the ideas that sound great in a textbook and fail miserably at scale. If someone tells you any of these without nuance, be suspicious.
Full normalization is academically elegant: no duplicated data, no update anomalies, every fact stored exactly once. In practice, queries that join 8 tables to display a single page are slow. Really slow. Especially at scale.
Slack learned this the hard way. Their early, fully normalized schema required 3-4 JOINs to render a single message (user table for display name, workspace table for settings, channel table for permissions). At 1 million messages, fine. At 18 billion, those JOINs became their biggest bottleneck. They denormalized — storing the display name directly on the message row — and read latency dropped by 60%.
Entity-Attribute-Value (EAV)A schema pattern where instead of columns, you store data as rows of (entity_id, attribute_name, attribute_value). Example: instead of a "color" column on products, you'd have a row like (product_id=1, key="color", value="blue"). It's infinitely flexible — you can add any attribute without a migration — but querying is a nightmare because you need to PIVOT rows back into columns. is a pattern where instead of adding columns to a table, you store everything as key-value rows: (entity_id, key, value). It sounds brilliant — infinite flexibility without schema migrations! But in practice, it turns every simple query into a multi-JOIN pivot nightmare.
WordPress's wp_options and wp_postmeta tables are the most infamous example. Want to find all posts with a custom field "color" equal to "blue" that also have "size" equal to "large"? That's a self-join on wp_postmeta for every attribute — and it gets slower with every custom field you add.
-- EAV: "flexible" but miserable to query
-- Find products that are blue AND large
SELECT p.id, p.name
FROM products p
JOIN product_meta pm1 ON p.id = pm1.product_id
AND pm1.meta_key = 'color' AND pm1.meta_value = 'blue'
JOIN product_meta pm2 ON p.id = pm2.product_id
AND pm2.meta_key = 'size' AND pm2.meta_value = 'large';
-- One self-join PER attribute. 5 attributes = 5 JOINs. Slow.
-- Better alternative: JSONB column
SELECT id, name FROM products
WHERE attributes @> '{"color": "blue", "size": "large"}'::jsonb;
-- One clause. One GIN index. Fast.
Schema-on-readThe idea that you don't enforce any structure when writing data — you just store raw documents (JSON, BSON) and validate the shape when you read it. MongoDB, early Elasticsearch, and data lakes like S3+Athena use this approach. It sounds flexible, but it means your application code becomes responsible for handling every possible shape of data, including corrupt or inconsistent documents. was MongoDB's selling point in the early 2010s: "No schemas! Just throw JSON in and figure it out later!" And for prototyping, it's genuinely fast. You can start building without designing tables upfront.
But "figure it out later" always arrives, and it arrives painfully. Teams using schema-on-read consistently discovered that after 6-12 months, their data was full of inconsistencies — some documents had email, others had Email, others had emailAddress. No constraints meant duplicate records, missing required fields, and data that silently broke downstream consumers.
MongoDB themselves acknowledged this by adding schema validation in version 3.6 (2017). By version 5.0, they added stricter JSON Schema validation. Most mature MongoDB deployments today enforce schemas — which means they've effectively re-invented constraints, just with more ceremony.
Common Mistakes — Schema Sins We've All Committed
These are the mistakes that show up in code reviews every single week. Each one seems harmless until it blows up in production — and by then, fixing it requires a painful migration on a table with 200 million rows.
We covered this in Section 4, but it deserves its own mistake entry because it's that common. FLOAT and DOUBLE store numbers as binary approximations using IEEE 754. That means 0.1 + 0.2 = 0.30000000000000004. For scientific calculations, this is fine. For money, it's a lawsuit waiting to happen.
The fix: use DECIMAL(10,2) for currency. Alternatively, store money as integers in the smallest unit (cents) — Stripe stores all amounts as integers. $19.99 becomes 1999.
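A quick demonstration of why the integer-cents advice works, runnable in any Python REPL (the prices are made up):

```python
# Binary floats cannot represent most decimal fractions exactly...
assert 0.1 + 0.2 != 0.3
assert 0.1 + 0.2 == 0.30000000000000004   # the classic IEEE 754 surprise

# ...but integer cents are exact, with no drift no matter how you total.
price_cents = 1999               # $19.99 stored Stripe-style
subtotal = 3 * price_cents
assert subtotal == 5997          # exactly $59.97

# Format only at the display edge, never during arithmetic:
assert f"${subtotal // 100}.{subtotal % 100:02d}" == "$59.97"
```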
Some teams remove foreign keys because "they slow down inserts." And yes, a foreign key check does add a tiny overhead on writes — the database has to verify the referenced row exists. But the cost of not having them is orphaned data: orders pointing to deleted users, reviews referencing products that don't exist, line items with no parent order.
Once orphaned data creeps in, it's nearly impossible to clean up. You can't just add the foreign key back because the constraint would fail on existing bad rows. You'd need to find and fix (or delete) every orphan first — a project that can take weeks on a large dataset.
The fix: keep your foreign keys. If insert overhead genuinely matters, use deferred constraints (DEFERRABLE INITIALLY DEFERRED in PostgreSQL) which check at transaction commit instead of per-statement.
Using TEXT with no constraints for emails and phone numbers seems convenient — no length limits, no format checks, just store whatever the user sends. But then you end up with "alice@email", "123", "not an email", and empty strings all in the same column. Good luck sending a password reset email to those.
The fix: VARCHAR(255) for emails with a CHECK constraint or a regex pattern. Use UNIQUE on emails to prevent duplicates. For phones, use VARCHAR(20) and store in E.164 format (+14155551234). Validate at the database level, not just the application — because application bugs happen.
You'll need timestamps. Maybe not today, maybe not next month, but someday a support ticket will come in asking "when was this order created?" or a bug investigation will need "when was this record last changed?" Without timestamps, the answer is "we have no idea."
The fix: add created_at TIMESTAMPTZ NOT NULL DEFAULT now() and updated_at TIMESTAMPTZ NOT NULL DEFAULT now() to every table. Use a trigger or application middleware to auto-update updated_at on every modification. This is cheap — two columns, 16 bytes per row — and invaluable for debugging.
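The trigger half of this advice can be demonstrated with SQLite standing in for PostgreSQL (PostgreSQL would use a BEFORE UPDATE trigger with a plpgsql function instead; the 1970 sentinel timestamp exists only so the test can observe the change):

```python
import sqlite3

# Auto-updating updated_at via a trigger. SQLite stands in for PostgreSQL;
# the sentinel timestamp is purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TRIGGER users_touch AFTER UPDATE ON users
    BEGIN
        UPDATE users SET updated_at = datetime('now') WHERE id = NEW.id;
    END;
""")
conn.execute(
    "INSERT INTO users (name, updated_at) VALUES ('alice', '1970-01-01 00:00:00')")
conn.execute("UPDATE users SET name = 'Alice' WHERE id = 1")

row = conn.execute("SELECT name, updated_at FROM users WHERE id = 1").fetchone()
assert row[0] == "Alice"
assert row[1] != "1970-01-01 00:00:00"   # the trigger touched updated_at
```

The application never has to remember to set updated_at; the database does it on every write path, including ad-hoc fixes run from a console.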
A single migration that renames 4 columns, changes 2 data types, adds 3 tables, and drops an index is a recipe for disaster. If step 3 fails, what state is your database in? Can you roll back safely? Usually not — because some of the changes were already applied and they can't be undone (like dropping a column).
The fix: one logical change per migration. "Add a phone column to users" = one migration. "Rename fname to first_name" = another migration. Small migrations are easy to review, easy to test, and easy to roll back. GitHub deploys hundreds of small migrations per week, not one giant one per quarter.
MySQL's utf8 charset is a lie. It only supports 3-byte characters, which means it cannot store emojis (which require 4 bytes). If a user writes a review containing a thumbs-up emoji and your table uses utf8, MySQL silently truncates the text at the emoji. Your user's review just got cut in half.
This has caused real production outages. In 2013, several major Rails applications discovered that user-generated content was being silently corrupted because MySQL's default utf8 couldn't handle the emoji explosion. The fix was migrating to utf8mb4 — MySQL's actual UTF-8 implementation that supports the full Unicode range.
The fix: use the utf8mb4 charset and utf8mb4_unicode_ci collation. In PostgreSQL, this isn't an issue — its UTF8 encoding supports full 4-byte Unicode by default. When creating a MySQL database: CREATE DATABASE myapp CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
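The 3-byte versus 4-byte distinction is easy to see by encoding a few characters (the sample review text is made up):

```python
# Why MySQL's 3-byte "utf8" breaks on emoji: most characters encode to
# 1-3 bytes in UTF-8, but emoji need 4.
assert len("e".encode("utf-8")) == 1
assert len("\u00e9".encode("utf-8")) == 2   # é: fits in MySQL utf8
assert len("\u4e2d".encode("utf-8")) == 3   # 中: still fits
assert len("\U0001f44d".encode("utf-8")) == 4   # thumbs-up: needs utf8mb4

# A review that MySQL's old utf8 would silently truncate at the emoji:
review = "Great product \U0001f44d would buy again"
assert max(len(ch.encode("utf-8")) for ch in review) == 4
```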
Interview Playbook — "Design the Schema for Twitter"
This is one of the most common system design interview questions. The interviewer isn't looking for the "one right answer" — they want to see your thought process: how you identify entities, define relationships, choose data types, and make trade-offs. Here's how a junior, mid-level, and senior engineer would each approach it.
The junior answer focuses on getting the core entities right. Don't overthink it — just identify the obvious tables and relationships.
-- Core entities: users, tweets, follows
CREATE TABLE users (
id BIGINT PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
display_name VARCHAR(100),
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE tweets (
id BIGINT PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id),
content VARCHAR(280) NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE follows (
follower_id BIGINT NOT NULL REFERENCES users(id),
followed_id BIGINT NOT NULL REFERENCES users(id),
created_at TIMESTAMPTZ DEFAULT now(),
PRIMARY KEY (follower_id, followed_id)
);
This covers the basics: users can tweet, and users can follow other users. The follows table uses a composite primary key to prevent duplicate follows. A VARCHAR(280) enforces the character limit at the database level. This alone would pass a junior-level interview.
The mid-level answer extends the junior schema with likes, retweets, replies, and proper indexing — the features that make Twitter actually useful.
-- Tweets now support replies and retweets
CREATE TABLE tweets (
id BIGINT PRIMARY KEY, -- Snowflake ID (time-sortable)
user_id BIGINT NOT NULL REFERENCES users(id),
content VARCHAR(280) NOT NULL,
reply_to_id BIGINT REFERENCES tweets(id), -- NULL = original tweet
retweet_of_id BIGINT REFERENCES tweets(id), -- NULL = not a retweet
like_count INT DEFAULT 0, -- denormalized counter
retweet_count INT DEFAULT 0,
reply_count INT DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT now()
);
-- Likes: who liked which tweet
CREATE TABLE likes (
user_id BIGINT NOT NULL REFERENCES users(id),
tweet_id BIGINT NOT NULL REFERENCES tweets(id),
created_at TIMESTAMPTZ DEFAULT now(),
PRIMARY KEY (user_id, tweet_id)
);
-- Indexes for common queries
CREATE INDEX idx_tweets_user ON tweets(user_id, created_at DESC);
CREATE INDEX idx_tweets_reply ON tweets(reply_to_id) WHERE reply_to_id IS NOT NULL;
CREATE INDEX idx_follows_followed ON follows(followed_id); -- "who follows me?"
Key improvements: Snowflake IDs for time-sortable primary keys. Denormalized counters (like_count) to avoid COUNT(*) queries on every tweet render. Partial indexes (the WHERE reply_to_id IS NOT NULL) to save space on indexes that only apply to a subset of rows.
The senior answer addresses the hard problems: timeline generation, fan-out strategy, and sharding. This is where schema design meets system architecture.
-- Timeline: pre-computed feed per user (fan-out on write)
CREATE TABLE user_timeline (
user_id BIGINT NOT NULL,
tweet_id BIGINT NOT NULL,
author_id BIGINT NOT NULL, -- denormalized: avoid JOIN
created_at TIMESTAMPTZ NOT NULL, -- denormalized: sort without JOIN
PRIMARY KEY (user_id, tweet_id)
) PARTITION BY HASH (user_id); -- shard-friendly
-- When @elonmusk (100M followers) tweets:
-- DON'T fan-out to 100M timeline rows (too slow).
-- Instead, celebrities use fan-out-on-read:
-- their tweets are fetched at read time and merged with
-- the pre-computed timeline.
-- Shard key: user_id for user_timeline, tweet_id for tweets.
-- This means "show my timeline" hits ONE shard,
-- but "show all replies to tweet X" might hit multiple.
-- Trade-off: optimize for the most common query.
The senior answer shows you understand that schema design doesn't exist in a vacuum. The right schema depends on access patterns (timelines are read-heavy), scale (100M follower fan-out is different from 500), and operational concerns (how do you shard? how do you migrate?). The interviewer wants to see you reason through these trade-offs, not recite a perfect CREATE TABLE.
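The hybrid fan-out described in the senior answer can be sketched as a read-time merge. Because Snowflake-style IDs are time-sortable, merging streams by descending ID is merging by recency; the data and helper names here are illustrative:

```python
import heapq

# Fan-out-on-write timeline merged at read time with celebrity tweets
# (fan-out-on-read). Tweet IDs are time-sortable, so ordering by ID is
# ordering by recency. All data and names are illustrative.
precomputed = [907, 905, 901]      # already fanned out, newest first
celebrity_tweets = {
    "celeb_a": [906, 899],         # fetched at read time, newest first
    "celeb_b": [903],
}

def read_timeline(precomputed: list, celebrity_tweets: dict, limit: int = 5):
    streams = [precomputed] + list(celebrity_tweets.values())
    # heapq.merge wants ascending input; negate IDs to merge newest-first.
    merged = heapq.merge(*([-t for t in s] for s in streams))
    return [-t for _, t in zip(range(limit), merged)]

assert read_timeline(precomputed, celebrity_tweets) == [907, 906, 905, 903, 901]
```

The expensive part (writing one row per follower) is skipped only for the handful of accounts where it would explode, while everyone else keeps the cheap single-shard read.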
Practice Exercises — Build Muscle Memory
Reading about schema design is helpful. Doing it is what makes you good. These exercises progress from straightforward table design to real-world migration challenges.
Design the schema for a blogging platform. You need: users (name, email, bio), posts (title, body, status: draft/published), comments (nested — comments can reply to other comments), and tags (many-to-many with posts). Include appropriate constraints, indexes, and timestamps.
- Nested comments need a parent_comment_id that references the same table (self-referencing foreign key).
- Tags are many-to-many: you'll need a junction table (post_tags) with a composite primary key.
- Post status should be an ENUM or CHECK constraint — not free text.
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
bio TEXT,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
author_id BIGINT NOT NULL REFERENCES users(id),
title VARCHAR(300) NOT NULL,
body TEXT NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'draft'
CHECK (status IN ('draft', 'published', 'archived')),
published_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE comments (
id BIGSERIAL PRIMARY KEY,
post_id BIGINT NOT NULL REFERENCES posts(id),
author_id BIGINT NOT NULL REFERENCES users(id),
parent_comment_id BIGINT REFERENCES comments(id), -- self-ref for nesting
body TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE tags (
id SERIAL PRIMARY KEY,
name VARCHAR(50) UNIQUE NOT NULL
);
CREATE TABLE post_tags (
post_id BIGINT NOT NULL REFERENCES posts(id),
tag_id INT NOT NULL REFERENCES tags(id),
PRIMARY KEY (post_id, tag_id)
);
CREATE INDEX idx_posts_author ON posts(author_id, created_at DESC);
CREATE INDEX idx_comments_post ON comments(post_id, created_at);
CREATE INDEX idx_comments_parent ON comments(parent_comment_id)
WHERE parent_comment_id IS NOT NULL;
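With the self-referencing parent_comment_id in place, a recursive CTE can read an entire thread back in display order. A sketch against the comments table above (the post id 42 is a placeholder):

```sql
-- Fetch one post's full comment tree in threaded display order.
WITH RECURSIVE thread AS (
    -- anchor: top-level comments on the post
    SELECT id, body, 1 AS depth,
           ARRAY[id] AS path          -- materialized ancestry, used as sort key
    FROM comments
    WHERE post_id = 42 AND parent_comment_id IS NULL
    UNION ALL
    -- recurse: replies to anything already in the working set
    SELECT c.id, c.body, t.depth + 1,
           t.path || c.id
    FROM comments c
    JOIN thread t ON c.parent_comment_id = t.id
)
SELECT id, depth, body
FROM thread
ORDER BY path;   -- siblings sort by id; replies nest under their parents
```

Sorting by the accumulated path array is what keeps each reply directly under its parent, which is exactly the shape a threaded UI needs.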
You're given this denormalized table with massive duplication. Break it into properly normalized tables (3NF) with appropriate foreign keys:
CREATE TABLE flat_orders (
order_id INT, order_date DATE, customer_name VARCHAR(100),
customer_email VARCHAR(255), customer_city VARCHAR(100),
product_name VARCHAR(200), product_price DECIMAL(10,2),
product_category VARCHAR(50), quantity INT, total DECIMAL(10,2)
);
- Identify three entities: customers, products, and orders.
- An order can have multiple products — you'll need an order_items junction table.
- Store unit_price on order_items (not just a FK to products) — because prices change over time, and the order should record what the customer actually paid.
CREATE TABLE customers (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
city VARCHAR(100)
);
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
price DECIMAL(10,2) NOT NULL,
category VARCHAR(50)
);
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL REFERENCES customers(id),
order_date TIMESTAMPTZ DEFAULT now(),
total DECIMAL(10,2) NOT NULL
);
CREATE TABLE order_items (
id BIGSERIAL PRIMARY KEY,
order_id BIGINT NOT NULL REFERENCES orders(id),
product_id BIGINT NOT NULL REFERENCES products(id),
quantity INT NOT NULL CHECK (quantity > 0),
unit_price DECIMAL(10,2) NOT NULL -- snapshot of price at purchase time
);
Take any of the schemas above and add: (1) soft deletes — records are never truly deleted, just marked as deleted with a timestamp, and (2) an audit trail — a separate table that logs every INSERT, UPDATE, and DELETE with the old and new values. Bonus: use a PostgreSQL trigger to populate the audit table automatically.
- Soft delete: add deleted_at TIMESTAMPTZ (NULL = active, non-NULL = deleted).
- Update all queries to filter: WHERE deleted_at IS NULL.
- Audit table columns: table_name, record_id, action (INSERT/UPDATE/DELETE), old_data JSONB, new_data JSONB, changed_at, changed_by.
-- Soft delete: add to every table
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMPTZ;
ALTER TABLE posts ADD COLUMN deleted_at TIMESTAMPTZ;
-- "Delete" a user (soft)
UPDATE users SET deleted_at = now() WHERE id = 42;
-- All queries must respect soft deletes
SELECT * FROM users WHERE deleted_at IS NULL;
-- Audit trail table
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
table_name VARCHAR(100) NOT NULL,
record_id BIGINT NOT NULL,
action VARCHAR(10) NOT NULL CHECK (action IN ('INSERT','UPDATE','DELETE')),
old_data JSONB,
new_data JSONB,
changed_at TIMESTAMPTZ DEFAULT now(),
changed_by VARCHAR(100)
);
-- PostgreSQL trigger to auto-log changes
CREATE OR REPLACE FUNCTION audit_trigger() RETURNS TRIGGER AS $$
BEGIN
INSERT INTO audit_log (table_name, record_id, action, old_data, new_data)
VALUES (
TG_TABLE_NAME,
COALESCE(NEW.id, OLD.id),  -- assumes every audited table has an id column
TG_OP,
CASE WHEN TG_OP != 'INSERT' THEN to_jsonb(OLD) END,
CASE WHEN TG_OP != 'DELETE' THEN to_jsonb(NEW) END
);
RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;
-- Attach to any table
CREATE TRIGGER users_audit AFTER INSERT OR UPDATE OR DELETE
ON users FOR EACH ROW EXECUTE FUNCTION audit_trigger();
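One wrinkle worth adding: soft deletes quietly break plain UNIQUE constraints, because a soft-deleted user still occupies their email address. A partial unique index restricts uniqueness to live rows. In the sketch below, users_email_key is PostgreSQL's default name for the UNIQUE constraint from the earlier CREATE TABLE; verify yours with \d users first:

```sql
-- Replace the table-wide UNIQUE with uniqueness among live rows only
ALTER TABLE users DROP CONSTRAINT users_email_key;

CREATE UNIQUE INDEX users_email_live
    ON users (email)
    WHERE deleted_at IS NULL;   -- deleted rows no longer block re-registration
```

With this in place, someone can delete their account and sign up again with the same email, while two live accounts still can't share one.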
Design a star schemaA data warehouse design where a central "fact" table (containing measurements like revenue, quantity, clicks) is surrounded by "dimension" tables (containing descriptive attributes like product name, customer city, date). Called a "star" because the diagram looks like a star with the fact table in the center. Optimized for analytical queries (GROUP BY, SUM, AVG) rather than transactional workloads. for an e-commerce analytics warehouse. The business wants to answer: "What was total revenue by product category, by month, by customer city?" Design a fact table and at least 3 dimension tables.
- Fact table: fact_sales with foreign keys to each dimension + numeric measures (revenue, quantity, discount).
- Dimensions: dim_product (name, category, brand), dim_customer (name, city, segment), dim_date (date, month, quarter, year, day_of_week).
- The date dimension is pre-populated with one row per day for the next 10 years.
-- Dimension: dates (pre-populated, one row per day)
CREATE TABLE dim_date (
date_key INT PRIMARY KEY, -- YYYYMMDD format: 20240315
full_date DATE NOT NULL UNIQUE,
day_of_week VARCHAR(10), -- 'Monday'
month INT, quarter INT, year INT,
is_weekend BOOLEAN
);
-- Dimension: products
CREATE TABLE dim_product (
product_key SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
category VARCHAR(50),
brand VARCHAR(100),
unit_cost DECIMAL(10,2)
);
-- Dimension: customers
CREATE TABLE dim_customer (
customer_key SERIAL PRIMARY KEY,
name VARCHAR(100),
city VARCHAR(100),
country VARCHAR(50),
segment VARCHAR(30) -- 'retail', 'wholesale', 'enterprise'
);
-- Fact: sales (one row per order line item)
CREATE TABLE fact_sales (
sale_id BIGSERIAL PRIMARY KEY,
date_key INT NOT NULL REFERENCES dim_date(date_key),
product_key INT NOT NULL REFERENCES dim_product(product_key),
customer_key INT NOT NULL REFERENCES dim_customer(customer_key),
quantity INT NOT NULL,
revenue DECIMAL(12,2) NOT NULL,
discount DECIMAL(10,2) DEFAULT 0
);
-- The analytics query the business wants:
SELECT dp.category, dd.year, dd.month, dc.city,
SUM(fs.revenue) AS total_revenue
FROM fact_sales fs
JOIN dim_product dp ON fs.product_key = dp.product_key
JOIN dim_date dd ON fs.date_key = dd.date_key
JOIN dim_customer dc ON fs.customer_key = dc.customer_key
GROUP BY dp.category, dd.year, dd.month, dc.city
ORDER BY total_revenue DESC;
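The pre-populated date dimension the hint mentions can be generated in a single statement. A sketch covering 2024 through 2033 (the range is an arbitrary choice):

```sql
-- Fill dim_date with one row per day using generate_series
INSERT INTO dim_date (date_key, full_date, day_of_week, month, quarter, year, is_weekend)
SELECT to_char(d, 'YYYYMMDD')::INT,            -- 20240315-style surrogate key
       d::DATE,
       to_char(d, 'FMDay'),                    -- 'Monday' (FM strips padding)
       EXTRACT(MONTH   FROM d)::INT,
       EXTRACT(QUARTER FROM d)::INT,
       EXTRACT(YEAR    FROM d)::INT,
       EXTRACT(ISODOW  FROM d) IN (6, 7)       -- Saturday or Sunday
FROM generate_series('2024-01-01'::DATE, '2033-12-31'::DATE, interval '1 day') AS d;
```

Ten years is only ~3,650 rows, which is why warehouses happily pre-build the whole calendar up front.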
Your users table needs a new phone column that's required (NOT NULL). But the table has 50 million rows and you can't take the app offline. Write a safe migration plan: add column, backfill, set constraint, deploy. What order? What could go wrong?
- You can't add a NOT NULL column without a default — existing rows would violate the constraint.
- Step 1: Add column as nullable. Step 2: Backfill in batches. Step 3: Add NOT NULL constraint.
- Backfill in batches of 10K-50K rows to avoid locking the table for minutes.
-- Step 1: Add column as NULLABLE (instant, no table rewrite)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Step 2: Deploy app code that writes phone on new signups
-- (both old and new code must handle phone = NULL gracefully)
-- Step 3: Backfill in batches (not one giant UPDATE that locks every row)
UPDATE users SET phone = 'UNKNOWN'  -- placeholder; backfill real data if a source exists
WHERE id BETWEEN 1 AND 50000 AND phone IS NULL;
-- repeat for the next 50K, and the next... (script this in a loop)
-- Step 4: Verify no NULLs remain
SELECT COUNT(*) FROM users WHERE phone IS NULL;
-- Must be 0 before proceeding
-- Step 5: Add the NOT NULL constraint without a long lock.
-- A plain SET NOT NULL would scan the whole table under ACCESS EXCLUSIVE;
-- instead, add a CHECK constraint as NOT VALID, then validate it:
ALTER TABLE users ADD CONSTRAINT users_phone_nn
CHECK (phone IS NOT NULL) NOT VALID;
-- NOT VALID: constraint created instantly, enforced only on new writes
ALTER TABLE users VALIDATE CONSTRAINT users_phone_nn;
-- VALIDATE scans existing rows under SHARE UPDATE EXCLUSIVE (not ACCESS
-- EXCLUSIVE): reads and writes keep flowing during the scan.
-- On PostgreSQL 12+, a follow-up ALTER TABLE users ALTER COLUMN phone
-- SET NOT NULL is then instant — it's proven by the validated CHECK,
-- which you can drop afterwards.
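The "script this in a loop" step can also live server-side. A sketch as a PostgreSQL 11+ procedure (the procedure name and batch size are choices, and walking by id range assumes ids without enormous gaps):

```sql
-- Batched backfill as a stored procedure; each batch commits separately
CREATE PROCEDURE backfill_phone(batch_size INT DEFAULT 50000)
LANGUAGE plpgsql AS $$
DECLARE
    last_id BIGINT := 0;
    max_id  BIGINT;
BEGIN
    SELECT COALESCE(max(id), 0) INTO max_id FROM users;
    WHILE last_id < max_id LOOP
        UPDATE users SET phone = 'UNKNOWN'     -- placeholder, as in Step 3
        WHERE id > last_id AND id <= last_id + batch_size
          AND phone IS NULL;
        last_id := last_id + batch_size;
        COMMIT;   -- procedures may commit: row locks release between batches
    END LOOP;
END;
$$;

CALL backfill_phone();   -- run outside an explicit transaction
```

The per-batch COMMIT is the whole trick: no single transaction ever holds locks on more than one batch of rows.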
Cheat Sheet — Pin This to Your Wall
Types:
Money → DECIMAL(10,2)
IDs → BIGINT or UUID
Short text → VARCHAR(n)
Long text → TEXT
Boolean → BOOLEAN
Dates → TIMESTAMPTZ
Flexible → JSONB

Every table:
id BIGSERIAL PK
created_at TIMESTAMPTZ
updated_at TIMESTAMPTZ
deleted_at TIMESTAMPTZ (soft delete)
-- Always: NOT NULL where possible
-- Always: UNIQUE on natural keys

Normalization:
1NF → No repeating groups
2NF → No partial-key deps
3NF → No transitive deps
-- Target 3NF for writes
-- Denormalize for reads
-- Materialized views bridge both

Indexes:
WHERE col = ? → B-tree index
WHERE col IN (?) → B-tree index
ORDER BY col → include in index
JSONB @> → GIN index
Full-text → GIN + tsvector
Partial → WHERE condition

Safe migrations:
1. Add column (nullable)
2. Deploy app (handle NULL)
3. Backfill in batches
4. Add NOT NULL (NOT VALID)
5. VALIDATE CONSTRAINT
-- NEVER: rename + add in one step
-- NEVER: one giant UPDATE

Pitfalls:
FLOAT for money → DECIMAL
No foreign keys → orphan data
No timestamps → blind debugging
utf8 in MySQL → use utf8mb4
EAV tables → use JSONB
Giant single table → normalize it
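The first pitfall on that list is easy to see directly in psql:

```sql
-- FLOAT stores binary approximations; NUMERIC/DECIMAL stores exact decimals
SELECT 0.1::FLOAT8  + 0.2::FLOAT8  AS float_sum,   -- not 0.3 (rounding error)
       0.1::NUMERIC + 0.2::NUMERIC AS exact_sum;   -- exactly 0.3
```

A cent of drift per row is invisible in testing and a reconciliation nightmare at a million orders, which is why money columns are DECIMAL from day one.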
Connected Topics — Where to Go Next
Schema design connects to almost every other topic in system design. Here's where to deepen your knowledge.