TL;DR โ Picking a Database in Plain English
- The 5-question framework that narrows any database decision to 1โ3 candidates
- Why workload shape is the first filter and why it eliminates most exotic choices immediately
- When to default to Postgres and when that instinct breaks down
- How consistency needs, scale ceiling, ops complexity, and cost model each trim the shortlist further
Picking a database is mostly about answering 5 questions in order: (1) What's the workload shape? (2) What consistency do you need? (3) What scale are you betting on? (4) What ops complexity can your team handle? (5) What's your cost model? The right answer is almost always the simplest tool that satisfies all five.
Start with workload shape โ is your data fundamentally relational (tables, foreign keys, joins), document-oriented (variable-shape JSON), key-value (fast single-key lookups), graph (traversals across relationships), time-series (append-only measurements), or search (full-text + faceting)? This one question eliminates most of the field immediately. Then ask about consistency: can reads ever see slightly stale data, or must every read see the latest write? Then ask about scale: are you building for 500 users or 50 million, and is your data 10 GB or 10 TB? Consistency and scale together tell you whether a single-node database is fine or whether you need distributed infrastructure. The last two questions โ ops complexity and cost model โ are tiebreakers, not primary drivers.
For most apps โ especially early-stage products โ the answer is Postgres. It handles relational data, JSONB for semi-structured fields, full-text search, geospatial queries, and time-series data with extensions. It's battle-tested over 30+ years, runs everywhere, and every engineer knows it. The temptation to reach for a specialized database early is strong, but it usually means adding operational complexity before you've even confirmed you need it. The right question is almost always: "Have I really exhausted what Postgres can do here?"
The default breaks when you hit a real ceiling, not an imagined one. You need a document store when your schema genuinely cannot be known in advance and you'd spend more time wrestling with Postgres's schema migrations than building features. You need a time-series database when you're ingesting millions of metrics per second and Postgres's row-per-event model creates bloat and slow range queries. You need a graph database when your core queries are multi-hop traversals โ "friends of friends who also bought X" โ and SQL JOINs across adjacency tables become exponentially expensive. The pattern: measure first, specialize second.
Why You Need This โ Two Teams, Two Expensive Mistakes
Bad database choices don't usually kill a product in week one. They kill it in month 18, when the team is spending two-thirds of every sprint fighting the database instead of shipping features. Here are two real-world patterns that show up constantly.
Story 1 โ "We Need Flexibility" (MongoDB on a Financial App)
A team building a B2B financial platform chose MongoDB early on because they wanted "schema flexibility." The data โ users, accounts, transactions, invoices โ was actually highly relational. But NoSQL was trendy, they liked JSON, and they figured they could always add structure later.
Two years in, the pain was severe. Cross-document transactions were possible but awkward โ MongoDB added multi-document ACID transactions in v4.0, but they carry a performance cost and the team's app logic had already grown to assume a document-per-record model that fought transactions at every turn. Reporting queries that felt natural in SQL ("total revenue by customer, grouped by quarter, filtered by invoice status") required complex aggregation pipelines. New engineers joining the team kept wanting to add foreign-key-like references between documents, then writing application-level join logic to compensate. The schema wasn't flexible at all โ it had just moved its rigidity from the database layer into the application code, where it was invisible and hard to enforce.
The team spent 18 months migrating to Postgres. Had they answered the 5 questions at the start โ workload shape: relational; consistency: strong ACID needed for money โ they would have picked Postgres on day one.
Story 2 โ "We'll Scale With Postgres" (IoT Firehose at 100K Writes/sec)
A second team built an IoT platform ingesting sensor telemetry from industrial equipment โ temperature, pressure, vibration, measured at 10-second intervals across 200,000 devices. That's roughly 20,000 rows per second, each with a timestamp, device ID, metric name, and a float value. They started on Postgres because they knew it well.
At 6 months, the database held 3 billion rows. SELECT avg(temperature) FROM telemetry WHERE device_id = 42 AND ts > now() - interval '7 days' โ a query that sounds simple โ took 4 seconds even with a (device_id, ts) index, because Postgres's MVCC model and general-purpose storage format aren't optimized for append-only, time-ordered float columns. The team spent 6 months adding connection poolers, partitioning tables by week, tuning autovacuum, and manually managing old partition drops. It helped, but the query was still 400 ms at peak.
They migrated to TimescaleDB. That same 7-day query dropped to 18 ms. The columnar compression shrank storage 8ร. Had they answered the 5 questions upfront โ workload shape: time-series; scale: billions of rows, millions/day; consistency: eventual is fine for sensor data โ they would have landed on a TSDB on day one.
Your B2B SaaS product needs strong ACID guarantees, complex joins across customers/orders/invoices, and analytics dashboards for account managers. You're at 500 customers today. Which database do you pick? Now fast-forward: you reach 50,000 customers. Your Postgres primary is at 80% CPU. Your biggest analytics query scans 200 million rows and takes 45 seconds. Do you still keep the same database? Or does the answer change โ and if so, which question in the 5-question framework is now pointing you somewhere new?
The 5-Question Framework
Five questions, asked in this exact order, form a structured decision path. Each one narrows your options significantly. By the end, 1โ3 candidates should remain โ and the right choice is almost always the simplest one among them.
Q1 โ Workload Shape
This is the first filter and usually the strongest one. Ask: what is the fundamental structure of your data?
- Relational โ entities with relationships, foreign keys, joins needed: Postgres, MySQL, Aurora
- Document โ variable-shape JSON records, schema evolves per record: MongoDB, DynamoDB, Firestore
- Key-value โ fast single-key reads/writes, no complex queries: Redis, Memcached, DynamoDB
- Graph โ entities connected by relationships, traversal queries: Neo4j, Neptune
- Time-series โ append-only measurements with timestamps, range aggregations: TimescaleDB, InfluxDB, Prometheus
- Search โ full-text queries, faceted filtering, relevance ranking: Elasticsearch, OpenSearch, MeiliSearch
Why this question first? Because the workload shape determines the physical data model โ how bytes are laid out on disk. A column-store optimized for time-range scans is physically incapable of fast key-value lookups, and vice versa. Picking a DB that fights your data model means fighting the storage engine at every query.
Q2 โ Consistency Needs
After workload shape, ask: when two clients read the same data, how fresh must that data be?
- Strong / ACID โ every read sees the latest committed write. Non-negotiable for financial systems, inventory, healthcare records.
- Eventual โ replicas converge over time (seconds to minutes). Fine for social feeds, analytics counters, search indexes.
- Tunable โ per-query consistency level. Cassandra's
QUORUMvsONE; DynamoDB's strongly consistent reads. Flexible but adds complexity. - Monotonic read โ you never see an older version after reading a newer one, but you might read slightly stale data. Good middle ground for user-facing apps.
Why this matters: strong consistency requires coordination between replicas (quorum reads, consensus writes). That coordination costs latency. If your app can tolerate eventual consistency, you unlock databases that are far simpler to scale horizontally โ but the trade-off is real bugs if your code assumes freshness it doesn't actually have.
Q3 โ Scale Ceiling
Be honest about your growth trajectory. Where do you realistically land in 3 years?
- Small (< 100 GB, single region) โ almost any database works. Optimize for developer familiarity and simplicity.
- Medium (100 GBโ5 TB, single region) โ Postgres with read replicas handles most of this. Add connection pooling (PgBouncer) and proper indexes before reaching for anything exotic.
- Large (5 TB+, single region) โ consider partitioning, sharding, or a distributed SQL engine (CockroachDB, YugabyteDB).
- Global (multi-region, low-latency reads worldwide) โ distributed NewSQL (Spanner, CockroachDB), globally replicated caches (Redis Global), or regional Postgres with replication + routing logic.
Common mistake: picking a globally-distributed database for a 5,000-user app because "we might go global." Distributed systems add latency, operational overhead, and debugging complexity. Scale when you must; not before.
Q4 โ Operational Complexity
The best database for your workload is useless if your team can't operate it reliably. Ask: how much operational complexity can we absorb?
- Managed vs self-hosted โ RDS / Aurora / Atlas handle upgrades, backups, and HA for you. Self-hosted gives full control but requires expertise.
- Team experience โ a team that knows Postgres deeply will outperform a team struggling with a "better" database they barely understand.
- On-call burden โ distributed databases (Cassandra, CockroachDB) have failure modes that require 3 AM debugging by people who understand consensus, replication lag, and node decomissioning. Is your team ready for that?
- Ecosystem maturity โ observability tooling, ORMs, migration frameworks, managed hosting options. Postgres wins this by a wide margin over almost everything else.
Q5 โ Cost Model
Cost is a tiebreaker, not a primary driver โ but it can flip a close decision. Understand what you're actually paying for:
- Per-node โ you pay for compute/memory regardless of query volume. Postgres, MySQL, Cassandra. Predictable but idle capacity wastes money.
- Per-query โ you pay for what you use. BigQuery, Athena. Great for infrequent large scans; can be shocking if queries are poorly written.
- Per-GB stored + per-read/write unit โ DynamoDB's model. Predictable at known access patterns; expensive if you scan large tables.
- License โ MongoDB's SSPL license and Oracle's pricing are real concerns at enterprise scale. Postgres is fully open-source (PostgreSQL License).
At small scale, cost differences are negligible. At 10 TB+ or high QPS, a 3ร cost difference between databases matters. Model your actual access patterns against each pricing model before committing.
Core Concepts โ Terms You'll See Everywhere
Database discussions are dense with acronyms. Here are the six terms you'll encounter most often โ each explained in plain English first, then with the precise technical meaning.
OLTP โ Online Transaction Processing
Imagine a cashier at a grocery store ringing up items โ many small, fast operations happening simultaneously across thousands of customers. That's OLTP in a nutshell: lots of small reads and writes, each touching a handful of rows, each needing to complete in milliseconds.
Technical definition: workloads characterized by high concurrency, short-duration transactions, point lookups (fetch one row by ID), small updates (increment a counter, update a status), and ACID requirements. Your e-commerce checkout, banking transfer, and user authentication are all OLTP workloads.
Why it matters: OLTP databases optimize for low latency per query and high concurrency โ they use B-tree indexes for fast lookups and row-oriented storage so that fetching a single row reads minimal data from disk.
OLAP โ Online Analytical Processing
Now imagine an analyst pulling up a dashboard to see "total revenue by region, by product category, for Q3 last year" โ one big, slow query that touches millions of rows but runs infrequently. That's OLAP: few queries, but each one scans enormous amounts of data.
Technical definition: workloads characterized by low concurrency, long-duration scans, aggregations across millions of rows (SUM, AVG, COUNT GROUP BY), and read-only access patterns. BI dashboards, data warehouses, and reporting systems are all OLAP workloads.
Why it matters: OLAP databases use columnar storage โ reading only the columns your query needs โ and massively parallel query execution. BigQuery, Redshift, Snowflake, ClickHouse, and DuckDB are OLAP databases.
HTAP โ Hybrid Transactional/Analytical Processing
The obvious question after OLTP vs OLAP: can one system do both? HTAP systems try to answer yes โ handle your transactional writes AND serve analytical queries on fresh data, without replication lag or separate ETL pipelines.
Technical definition: a database architecture that serves both OLTP and OLAP workloads from the same system, typically by maintaining both row-oriented storage (for OLTP) and columnar storage (for OLAP) on the same data, kept in sync automatically.
Real examples: TiDB (open-source HTAP from PingCAP), SingleStore, YugabyteDB with read replicas. Google Spanner recently added HTAP capabilities.
Honest caveat: HTAP systems are more complex to operate and often underperform dedicated OLTP or OLAP databases at the extremes. Use them when the freshness guarantee (analytics on data seconds old, not hours old) is a genuine business requirement โ not just convenience.
Polyglot Persistence
The idea is simple: use the right database for each job, even if that means using five different databases in one application. Your user profiles in Postgres, your session tokens in Redis, your product catalog in Elasticsearch, your event log in Kafka/S3, your recommendation graph in Neo4j.
Technical definition: an architectural pattern where different components of an application use different data storage technologies, each chosen to best fit that component's access patterns.
The trade-off: each additional database adds operational overhead (monitoring, backups, upgrades, access control), introduces potential consistency boundaries between systems, and adds to onboarding complexity for new engineers. Polyglot persistence is a deliberate tool โ not the default. Start with one database and only add a second when a measured bottleneck demands it.
CAP Theorem / PACELC
Imagine your database is spread across two cities and the cable between them gets cut. You have to choose: refuse the request (boring but correct), or answer with possibly-old data (helpful but maybe wrong). That's the heart of the CAP theorem. Formally: a distributed database can only guarantee two of three properties โ Consistency (every read sees the latest write), Availability (every request gets a response), and Partition tolerance (the system keeps working if network links between nodes fail). Since network partitions are unavoidable in real distributed systems, every distributed DB is really choosing between CA and CP behavior when a partition occurs.
PACELC extends this: even without a partition, there's a trade-off between latency and consistency. A system that achieves strong consistency requires coordination between replicas โ that coordination takes time (adds latency). A system that answers immediately may return slightly stale data. PACELC captures both trade-offs: during a Partition โ CA vs CP; Else โ Latency vs Consistency.
Practical takeaway: don't pick a database based on CAP alone. CAP tells you what happens during a network failure (rare). PACELC's latency vs consistency trade-off affects every single request.
Vendor Lock-in
Vendor lock-in happens when you use database-specific features so deeply that migrating away becomes extremely painful. Once you've built your application around DynamoDB's single-table design pattern, or used dozens of proprietary Firestore collection group queries, or relied on Aurora-specific parallel query features, the "exit cost" is enormous.
Why it matters for DB selection: proprietary managed databases often have the best performance and simplest operations, but they tie your infrastructure to one cloud vendor. Open standards (Postgres-compatible APIs, MySQL-compatible protocols) give you portability โ you can move from RDS to Aurora to Supabase to self-hosted without rewriting application code.
The honest answer: some lock-in is acceptable if the productivity and performance gains are real. The risk to manage is lock-in to a pricing model that becomes punitive at scale, or lock-in to a vendor that gets acquired or deprecated.
Mapping Workload to Database Family
Most applications fall into one of six well-known workload patterns. Each pattern has a family of databases built specifically for it โ databases whose storage engine, query planner, and replication model are optimized for exactly that kind of work. Match the pattern, and queries are fast, storage is efficient, and the data model feels natural. Fight the pattern, and you spend months tuning a database that was never designed for your use case.
Transactional Records โ Relational (Postgres, MySQL)
Any time your data has well-defined entities with relationships between them โ users have orders, orders have line items, line items reference products โ you have a relational workload. The data model is inherently tabular and the queries need to join across those tables.
Why relational databases win here: B-tree indexes make point lookups and range scans fast; the query planner optimizes join order automatically; foreign keys enforce referential integrity at the storage layer; MVCC enables high read/write concurrency without locking.
When to consider moving: when write throughput exceeds ~50K rows/sec sustained on a single node (this varies enormously by row size and query complexity), or when a single table exceeds ~1 TB and query performance degrades despite good indexing.
Variable-Shape Documents โ Document Store (MongoDB, DynamoDB)
Some domains genuinely have variable structure. A CMS where every article type has different metadata fields. A product catalog where a TV has HDMI ports and a shirt has a size chart. A user profile system where optional fields vary by feature flags. In these cases, a rigid relational schema fights you at every schema migration.
Why document stores win here: each document is a self-contained JSON blob โ no schema migration needed to add a new field to some documents. Nested objects and arrays map naturally to the app's object model. Horizontal scaling via sharding is built-in.
Honest warning: "variable-shape" is often used as an excuse for not designing the data model properly. Before choosing a document store, ask: do your queries actually need the flexibility, or is the schema just 95% consistent with 5% variation? Postgres's JSONB column handles the 95/5 case elegantly without abandoning relational structure.
Hot Key-Value โ Redis, Memcached
Sessions, rate-limit counters, leaderboard scores, feature flags, ephemeral locks โ all of these share one property: you access them by a single known key, you need sub-millisecond response times, and you don't need joins or complex queries.
Why Redis wins here: all data lives in RAM (disk persistence is optional), access is O(1) by key, and Redis supports rich data structures (sorted sets for leaderboards, streams for queues, pub/sub for notifications) that make common patterns trivial to implement.
When not to use Redis as primary storage: data that must survive crashes (use a durable store for source of truth, Redis for the cache/hot layer), and data that's too large to fit in RAM economically.
Time-Series โ TimescaleDB, InfluxDB, Prometheus
If your data is append-only measurements โ "temperature sensor 42 read 23.4ยฐC at 14:32:07" โ repeated billions of times, you have a time-series workload. The key insight: this data is never updated in place. It's always written once and then read as a range scan (last 24 hours, last 7 days).
Why TSDBs win here: they automatically partition data by time, apply columnar compression per time chunk (time-series data is highly compressible โ adjacent timestamps differ by milliseconds, adjacent float values differ by tiny amounts), and answer time-range aggregations using chunk-level metadata to skip irrelevant partitions entirely.
The result: TimescaleDB query benchmarks typically show 100โ1000ร speedup over vanilla Postgres for time-range aggregations on large datasets, plus 90%+ storage compression. These aren't marginal gains โ they're the difference between a usable and an unusable system.
Full-Text Search โ Elasticsearch, OpenSearch, MeiliSearch
When users search your app โ typing "blue waterproof hiking boots size 10" โ they expect results that match semantics, not just exact strings. Facets ("filter by brand: Nike"), highlighting ("here's where your search term appears"), and relevance scoring ("this result matches 4 of 5 terms; this one matches 2") make search feel like search.
Why search engines win here: they build an inverted index โ a structure that maps every word to every document containing it. Querying "boots" returns all matching documents in O(1) regardless of corpus size. SQL's LIKE '%boots%' scans every row sequentially: O(n).
Use Postgres FTS for simple cases: pg_trgm and built-in full-text search handle simple text matching well. Switch to Elasticsearch/OpenSearch when you need faceted filtering, real-time indexing at high volume, or semantic/vector search.
Graph โ Neo4j, Amazon Neptune
Some problems are fundamentally about connections. Fraud detection: "is this new account connected to any known fraudulent account within 3 hops?" Social networks: "suggest friends based on mutual connections." Recommendation engines: "users who bought X also bought Y, and those users also liked Z โ what does that tell me about you?"
Why graph databases win here: in a relational database, finding all nodes within N hops requires N self-joins โ each additional hop multiplies query cost. A graph database stores relationships as first-class objects with direct physical pointers, so a 3-hop traversal is 3 pointer dereferences per node, not 3 full table scans.
When graph DBs are overkill: if your "graph" queries are simple parent-child relationships (one level deep) with low edge counts, a recursive CTE in Postgres handles it cleanly. Graph databases pay off when traversals go many hops deep across millions of edges.
Roughly 90% of applications fit cleanly into one of the six patterns above. The remaining 10% are genuinely exotic workloads โ columnar OLAP at petabyte scale (BigQuery, Redshift, Snowflake), vector similarity search for ML embeddings (Pinecone, Weaviate, pgvector), ultra-low-latency financial tick data (kdb+), or spatial geospatial analytics (PostGIS, SpatiaLite). These get dedicated pages. For now: identify which of the six patterns your workload matches, and you're 90% of the way to the right answer.
When Postgres Is Enough
Postgres is the most underestimated database in existence. Engineers constantly reach for specialized tools โ a Redis cache, a MongoDB document store, an Elasticsearch search index โ before asking whether Postgres, properly used, could handle the same job with less operational overhead. The right question is often not "which exotic database do I need?" but "have I actually tried what Postgres can do?"
Four Reasons to Default to Postgres
Maturity โ 30+ Years of Battle Testing
Postgres has been in production since 1996. It has survived decades of use in banking, healthcare, telecom, government systems, and Fortune 500 backends. This isn't just marketing โ maturity means edge cases have been found and fixed, failure modes are well-documented, and the community knowledge base is enormous.
When you hit a weird behavior in a newer database, there may be no Stack Overflow answer, no blog post, no one in your on-call rotation who's seen it before. When you hit a weird Postgres behavior, there's almost certainly a 10-year-old mailing list thread with the exact answer.
Ecosystem โ Drivers, ORMs, Hosting Everywhere
Every programming language has a mature Postgres driver. Every ORM (SQLAlchemy, Hibernate, Prisma, ActiveRecord, GORM) has first-class Postgres support. Every hosting platform (AWS RDS, Google Cloud SQL, Azure Database, Supabase, Neon, Railway) offers managed Postgres. Every monitoring tool understands Postgres query plans.
This ecosystem depth means you're never the first person to do what you're trying to do. Schema migration tools, connection poolers, read replica libraries, logical replication setups โ all of this is solved, documented, and maintained by thousands of contributors.
Extensions โ One Database for Many Workloads
Postgres's extension system lets you install specialized storage engines and query types without changing databases:
- pgvector โ store and query ML embedding vectors; power semantic search and recommendation systems. Used in production by Supabase and many AI startups.
- TimescaleDB โ automatic time-partitioned hypertables + columnar compression for time-series workloads. 10โ1000ร faster than vanilla Postgres for time-range queries.
- PostGIS โ full geospatial support: store geometries, query "all restaurants within 2 km," compute polygon intersections. Industry standard for spatial data.
- pg_trgm + built-in FTS โ trigram indexes for fuzzy string matching; full-text search with ranking, stemming, and language support for many simple search use cases.
- Citus โ horizontal sharding that turns Postgres into a distributed database, acquired by Microsoft and available on Azure.
Operational Simplicity โ One Database to Rule Them
Every database you add to your stack multiplies operational overhead: monitoring dashboards, backup procedures, upgrade runbooks, access control policies, connection limits, failover procedures, oncall alert rules. Running one database means running one set of all of these.
With Postgres, you get all your data in one place โ no data sync pipelines, no eventual consistency between systems, no cross-system transactions. A complex query that joins user profiles, order history, product metadata, and recent reviews is a single SQL statement. In a polyglot persistence setup, that same query requires 4 separate API calls, application-level join logic, and careful cache invalidation.
Operational simplicity compounds over time: fewer systems to monitor, fewer incidents to respond to, and faster onboarding for new engineers who already know SQL.
"Use Postgres for everything" is a useful default, not an absolute rule. Postgres has genuine limits that you'll eventually hit if you push hard enough:
- Horizontal write scale โ Postgres has one writable primary. When your write throughput exceeds what one machine can handle (typically 50Kโ200K writes/sec depending on workload), you need either sharding (Citus), logical replication with application-level routing, or a purpose-built distributed SQL engine. Attempting to scale Postgres writes horizontally with application sharding creates significant complexity.
- OLAP at scale โ Postgres's row-oriented storage is inefficient for large analytical scans. At multi-TB analytical workloads (scanning 100M+ rows per query regularly), columnar OLAP databases (ClickHouse, BigQuery, Redshift) will be 10โ100ร faster and cheaper.
- Specialized models โ pure graph traversals, vector similarity search at scale (>100M vectors), and ultra-low-latency (<1 ms) caching all have purpose-built tools that outperform Postgres extensions at the extremes.
The key discipline: verify with measurement before switching. Many teams switch away from Postgres because of imagined future scale, not measured current bottlenecks. Run benchmarks on realistic data volumes. Fix indexes first. Try connection pooling. Profile slow queries. Only reach for a specialized database when Postgres is provably the bottleneck.
Consistency & Trade-offs โ The CAP Triangle
Imagine you have two copies of your database โ one in New York, one in London. A user in Tokyo writes a record. Moments later, the network cable between data centres is cut. Now another user in Tokyo asks to read that record. The database has three options: (1) block the read until the cable is fixed (consistent but unavailable), (2) answer with the old value (available but stale), or (3) refuse to answer at all. That choice โ forced by a network partition โ is what the CAP theorem formalises.
In plain English: every distributed database can guarantee at most two of three properties simultaneously โ Consistency (every read sees the latest write), Availability (every request gets a response), and Partition Tolerance (the system keeps working even if nodes can't talk). Because network partitions in any real multi-node system are not optional, the real choice is between CP (stay consistent, sacrifice availability) and AP (stay available, sacrifice strict consistency).
The 5 Levels of Consistency
Consistency isn't binary. There's a full spectrum from "the whole world sees the same value at the same instant" down to "your read might be minutes behind." Most systems let you choose per-operation where on that spectrum you sit. Here's the ladder from strongest to weakest โ with real costs attached.
Strict Serializable
The gold standard. Every operation appears to execute in a single global order that respects real time. If your write completes at 10:00:01.000, no concurrent read anywhere in the world can see a state before that write. To pull this off, the database has to agree on a single sense of "now" across machines โ Google Spanner does it with TrueTime (GPS + atomic clocks), CockroachDB does it with hybrid logical clocks.
Why so strong? Because Spanner's distributed architecture has to coordinate acknowledgements across data centres, and it uses GPS-synchronized atomic clocks to bound the uncertainty. The cost: writes involve cross-region round-trips, adding roughly 10โ100 ms of latency. You pay that price because financial or inventory systems genuinely cannot tolerate stale reads.
Serializable (ACID)
What Postgres and most relational databases offer by default. All transactions execute as if they ran one at a time in some order โ but that order doesn't have to match wall-clock time exactly. Within a single node (or synchronous replica set) this is extremely efficient. The WHY: a single master with a write-ahead log naturally gives you serializable reads without any inter-DC coordination โ just one process, one clock, one log.
Read Committed
You only ever see data that has been committed โ no dirty reads from in-progress transactions. But two reads in the same transaction can return different values if someone commits a change between them (non-repeatable reads). This is the default isolation level for most databases including Postgres, MySQL, and Oracle. Why default here? Because it's the sweet spot: it prevents the most common anomalies (dirty reads) while adding minimal lock overhead.
Eventual Consistency
The system guarantees that eventually all replicas converge to the same value โ but a read right after a write might return the old value. Cassandra's default, DynamoDB's default, S3, DNS โ all eventual. Why choose this? Because you get dramatically higher write throughput (no coordination overhead) and near-100% availability even during network partitions. The cost: your application must be designed to tolerate stale reads โ idempotent operations, conflict-resolution logic, or "last-write-wins" semantics.
Causal Consistency
A middle ground that preserves the most important human intuition โ "I just wrote this, I should be able to read it back." Formally: if operation B happens because of operation A (B was issued after A completed, on the same session or via an explicit token), then B will always see A's effects. Unrelated operations from other sessions may still be stale. MongoDB sessions and Yugabyte's bounded-staleness reads use this model. The WHY it's cheaper: you only need to track which operations follow which (a small token per session), not synchronise clocks across every node in the world.
Scale Ceilings โ Knowing When You'll Hit the Wall
A database scale ceiling isn't about the technology being "bad" โ it's physics. A single machine has finite RAM, finite disk I/O, and finite CPU. When your working dataset exceeds RAM, every cache miss becomes a disk seek โ latency jumps from microseconds to milliseconds. When your write rate exceeds what a single WAL writer can flush, writes queue up and latency spikes. These aren't bugs; they're the cost of a simpler architecture. Knowing where those limits sit helps you choose the right database for where you're going โ not where you are today.
The 4 Ceilings That Matter
Working-Set RAM
Your "working set" is the subset of data that gets accessed frequently โ the hot rows, the recent records, the indexed columns. When the working set fits in RAM, every read is a cache hit: microseconds. When it spills to disk, each cache miss becomes a disk seek: milliseconds. That's a 1,000ร jump in latency. Why does this happen? Because databases use a buffer pool (Postgres's shared_buffers) as an in-memory page cache. Once it fills, old pages get evicted to make room for new ones โ and if the evicted page is needed again, it has to be read off disk.
The practical implication: when sizing your database instance, your RAM should comfortably hold the hot portion of your dataset โ often 20โ30% of total data, since most access patterns follow a power-law distribution.
Single-Node Write Throughput
Every write to a durable database goes through the Write-Ahead Log (WAL): the database writes the change to a sequential log file, fsyncs it to disk, then applies it to the data pages. This design is intentional โ sequential log writes are fast, and they give you crash recovery. But the fsync call is the bottleneck: each fsync blocks until the OS confirms the write hit durable storage. SSDs help (roughly 10Kโ100K fsyncs/sec), but a single Postgres node doing write-heavy OLTP typically tops out around 50Kโ100K writes/sec, depending on write size and index count.
Cross-Region Latency
If you replicate data across regions (US-East to EU-West, for example), every write that requires synchronous confirmation from the remote replica adds the round-trip time for that link โ roughly 80โ150 ms for transatlantic, 150โ200 ms for US-to-Asia. This isn't a database limitation; it's the speed of light. The WHY matters here: strong consistency across regions is physically bounded by how fast information can travel. Spanner uses TrueTime to minimise this overhead, but it can't eliminate the physics. Most cross-region read replicas are asynchronous precisely to avoid blocking writes on this latency.
Operational Complexity as a Ceiling
This one's often missed: your team's ability to operate the database is itself a ceiling. A 3-person startup running a self-managed Cassandra cluster with manual resharding is operating close to their ops ceiling โ a single late-night incident can be catastrophic. Sharding and distributed-SQL systems add deployment, monitoring, capacity planning, and failure-recovery complexity that grows super-linearly with the number of nodes. The practical ceiling here is "how many on-call engineers do you have, and how familiar are they with this system under fire?"
Read vs Write Workload Patterns
Think of a typical e-commerce product page: you load it 10,000 times a day (reads) but only update it when someone edits the description (writes). That's a 10,000:1 read-to-write ratio. Your database should be optimised to serve reads fast โ which means caching, indexing, and read replicas. Contrast this with an IoT sensor network pushing temperature readings every 10 seconds from 100,000 devices: that's ~10,000 writes per second, with dashboards querying only the latest few minutes. A read-optimised relational database would be a terrible choice here; a time-series database built for sequential appends is the obvious fit.
The 4 Workload Archetypes
Read-Heavy (90%+ reads)
The most common profile for web applications. A user lands on a product page, a blog post, or a dashboard โ hundreds or thousands of reads per write. The optimisation strategy: keep your database fast for reads by adding indexes on frequently queried columns, enabling read replicas so multiple database nodes can serve SELECT queries in parallel, and layering a cache (Redis) in front for the hottest data.
Why do read replicas help so much? Because each replica is a full copy of the primary that can serve reads independently. Three replicas give you roughly four times the read throughput (primary + 3 replicas). The catch: replicas are slightly behind the primary (replication lag, usually under 100 ms for synchronous replicas, seconds for asynchronous) โ so reads that must be fresh should still go to the primary.
Balanced (OLTP)
Online Transaction Processing โ the heartbeat of e-commerce, banking, and SaaS applications. Roughly equal reads and writes, often in small transactions that touch a handful of rows at a time. Postgres and MySQL were built for exactly this: B-tree indexes make point reads fast, the WAL makes writes durable, and MVCC lets reads and writes run concurrently without blocking each other. At higher scale, CockroachDB or Aurora add horizontal write capacity while preserving the SQL interface your team already knows.
Write-Heavy (50%+ writes)
When writes dominate, B-tree indexes become the enemy โ every write must update every index covering that table, and those updates are random I/O. Cassandra and ScyllaDB solve this with the LSM-tree (Log-Structured Merge-tree): writes go to an in-memory structure (memtable) first, then are flushed to immutable sorted files (SSTables) in a sequential batch. No random writes, no index thrashing. The trade-off: reads may need to merge data from multiple SSTables, making point reads more expensive than in a B-tree system. This is why Cassandra shines for write-heavy workloads but struggles for ad-hoc complex queries.
Append-Only (Logs, IoT, Events)
The most extreme write pattern: you never update or delete, you only append. Server logs, IoT sensor readings, user click events, financial audit trails โ all append-only. Time-series databases like TimescaleDB, InfluxDB, and Prometheus are designed specifically for this: they compress sequential timestamps aggressively (delta encoding: store differences rather than full values), automatically partition data by time window, and let you drop old partitions with a single operation instead of running expensive DELETE queries. For even higher throughput, Kafka is a write-ahead log at internet scale โ designed for millions of appends per second from many producers simultaneously.
Operational Complexity โ What Happens After You Deploy
Here is a pattern that plays out at companies of all sizes: an engineering team reads a benchmark showing Cassandra handles 500K writes/second and adds it to their stack โ only to discover six months later that resharding the cluster requires a weekend maintenance window, that monitoring multi-node replication lag requires custom dashboards their ops team hasn't built, and that their runbook for a node failure is "Google it and hope." Technical capability and operational readiness are two completely different things.
The 5 Operational Dimensions
Day-1 Setup
Getting the database running, secured, and integrated with your application. For a managed service (RDS, Cloud SQL, PlanetScale), this is a console wizard and a connection string โ under an hour. For self-hosted Postgres, you're installing, configuring pg_hba.conf for authentication, setting up SSL, and hardening the network โ a day or two. For a self-hosted Cassandra cluster, you're designing the ring topology, choosing replication factor and placement strategy per keyspace, and provisioning consistent hardware โ days to a week, often requiring a specialist.
Day-2 Operations
Everything that happens after you ship: automated backups and restore tests, version upgrades (Postgres major version upgrades require careful planning), monitoring query performance as data grows, and capacity planning before you hit limits. Managed services handle most of this automatically โ AWS RDS handles minor version patching during maintenance windows, takes daily snapshots, and alerts on disk usage. Self-hosted systems require you to build and own all of this.
Failure Recovery
What happens when a node crashes, a disk fails, or a region goes down? Managed services give you automatic failover โ RDS Multi-AZ switches to a standby replica in under 30 seconds. Self-hosted Postgres requires a tool like Patroni or repmgr plus a runbook that every on-call engineer must know cold. Self-hosted Cassandra requires understanding anti-entropy repair, hinted handoff, and the implications of a node being out of the ring for more than the grace period โ complexity that requires dedicated database engineering expertise.
Team Learning Curve & Vendor Support
SQL is the most widely known query language in software engineering. Switching to Cassandra Query Language (CQL) or DynamoDB's access-pattern-first data model requires retraining โ and CQL in particular involves mental model shifts that take months to internalize (no JOINs, primary key must encode your query patterns, partition key determines physical locality). The vendor support dimension is also real: community Postgres is excellent but enterprise tiers (AWS RDS support, Google Cloud SQL support) provide SLAs, architecture reviews, and 24/7 incident escalation. Community forums don't page someone at 3 am.
Cost Model & Budget โ How the Bill Grows
A common trap: a team evaluates two databases purely on technical merit, picks the winner, and then discovers six months later that their workload shape makes it 10ร more expensive than the alternative. Cost isn't just "how much does the instance cost per hour" โ it's about the shape of the cost curve as your usage grows. A database priced per query is cheap for a startup with 1,000 users and may become unaffordable at 1,000,000 users. A per-node database seems expensive at 1,000 users but may be a bargain at 1,000,000 users with steady load.
The 4 Cost Model Shapes
Per-Node
You pay for the instance whether it's idle or at 100% CPU. Cloud SQL db-standard-2 (2 vCPUs, 7.5 GB RAM) runs around $0.40/hr โ roughly $290/month. RDS postgres.t4g.medium is comparable. This model rewards dense, consistent workloads: if you're running at 80% utilisation 24/7, you're getting your money's worth. It penalises bursty apps: a product that gets hammered during business hours and is idle overnight still pays for the full instance. The WHY for stepping cost: when you exhaust one node, you provision another โ hence the staircase cost curve.
Per-node pricing includes predictable capacity: you know exactly what you're getting (vCPUs, RAM, IOPS) and can tune accordingly. There are no surprise bills from a traffic spike.
Per-Query / Serverless
DynamoDB on-demand charges roughly $0.125 per million read request units and $0.625 per million write request units in US East after AWS's November 2024 50% price cut (other regions can still run closer to the older $0.25 / $1.25 figures, so always check the calculator for your region). Aurora Serverless v2 scales capacity in ACU (Aurora Capacity Units) up and down within seconds. This model is ideal for variable workloads: you pay nothing for idle periods and scale seamlessly for traffic spikes. The trap: at sustained high volume, the cost curve becomes linear and often exceeds a reserved per-node instance. A rough rule of thumb: if your traffic is below roughly 50K requests/day, serverless wins; above that, per-node reserved capacity starts winning.
Per-GB Storage
BigQuery charges around $0.023/GB/month for storage, and only charges for query compute when you actually run a query ($6.25/TB scanned). Snowflake's storage pricing is similar. This model is perfectly suited for analytics and cold data: you store petabytes cheaply and query them occasionally. For hot OLTP data with thousands of queries per second, this model is expensive per-query โ it's not designed for sub-millisecond response times. The WHY: cloud warehouses are built on object storage (S3 / GCS) which is ~$0.023/GB/month versus block SSD at ~$0.11/GB/month โ they pass that storage discount to you.
License + Enterprise Support
Oracle and Microsoft SQL Server charge per-core licensing fees that can run tens of thousands of dollars per year, separate from hardware costs. These models are typically locked to enterprise contracts, include certified support SLAs, and historically come with features (row-level security, advanced partitioning, in-memory columnar engines) that weren't available in open-source alternatives. The WHY these still exist: large enterprises with compliance requirements (finance, healthcare, government) need licensed software with indemnification and contractual support. The practical implication: the license is often the single largest line item, which makes growth very expensive.
Vendor Lock-In Risk โ Speed Now vs Freedom Later
Lock-in often starts innocently. You're building on AWS, so you choose Aurora over Postgres because it's a bit faster for your workload and the auto-scaling is seamless. Then you start using Aurora's storage optimisations and I/O mode pricing. Then your ETL pipeline uses AWS Glue, which has native Aurora integration. Then your read replica is in AWS, so egress costs make moving data out expensive. A year later, migrating away from Aurora isn't just a database migration โ it's a three-month infrastructure project. This isn't a horror story; it's the normal trajectory. The lesson: understand which lock-in axes you're accepting and what they'd cost to reverse.
The 4 Lock-In Axes
Query Language Portability
ANSI SQL is the closest thing to a universal database language, but "compatible" doesn't mean "identical." Postgres-compatible databases (CockroachDB, Aurora, Neon) support most Postgres SQL features and let you point your existing application at a different backend with minimal code changes. Spanner uses a SQL dialect that looks like SQL but has unique syntax for array operations and JSON traversal โ your queries won't run on Postgres without modification. DynamoDB doesn't use SQL at all; switching means rewriting every data access layer. The more your codebase has absorbed vendor-specific query syntax, the more migration costs grow.
Schema Features & Data Types
Postgres has types that don't exist elsewhere: JSONB with GIN indexing, hstore (key-value pairs inside a column), PostGIS geography types, ARRAY columns, range types. If your schema uses Postgres-specific types heavily, migrating to MySQL or any other database requires schema redesign, not just a data dump. Spanner has interleaved tables โ a physical layout where child rows are stored adjacent to their parent rows on disk for performance. It's powerful, but it encodes Spanner's physical storage model directly into your schema design, making the schema non-portable by definition.
Operational Tooling Ecosystem
Your backup scripts, monitoring dashboards, schema migration tools, and data pipeline connectors may be built for one database's specific APIs. AWS RDS has native integrations with CloudWatch, AWS Backup, and Parameter Store. Migrating to Google Cloud SQL means rebuilding your monitoring setup, testing new backup procedures, and updating every integration. This tooling lock-in is often underestimated because it's not visible in the database itself โ it's distributed across your entire infrastructure. The deeper your ops tooling integrates with vendor-specific APIs, the higher the migration coordination cost.
Data Egress Costs
Moving data out of a cloud provider's network costs real money. AWS charges roughly $0.09/GB for data leaving its network. A 10 TB database migration costs around $900 in egress fees alone โ before the engineering time to validate the migration and cut over. For larger databases (100 TB+), egress costs can reach tens of thousands of dollars, turning a technically straightforward migration into a budget discussion. This is one of the most underappreciated forms of lock-in: you might be able to run the same SQL queries on a different cloud, but getting the data there first is expensive.
SQL vs NoSQL โ The Debate That Has the Wrong Question
When engineers argue "SQL vs NoSQL," they're usually arguing about the wrong thing. The debate frames it as two camps, but modern databases have blurred that line almost completely. PostgreSQL added JSONB โ a binary JSON column type with indexing โ back in 2014. MongoDB added full multi-document ACID transactions in 2018. The walls between camps have crumbled.
So what's the real question? It's this: which data model fits the shape of your data and your most important queries? That question doesn't care about SQL vs NoSQL labels. It cares about whether your data is naturally tabular, naturally nested, naturally a graph, or naturally a time-ordered stream โ and whether your queries need to join across entities, traverse relationships, or aggregate over time.
Below are four dimensions that actually matter when comparing databases โ not the label on the tin, but the practical behaviors that determine how painful or painless your system will be at scale.
The chart above shows most databases cluster into three zones: (1) battle-tested single-node relational (Postgres, MySQL, SQLite), (2) flexible/document stores that can distribute (MongoDB, DynamoDB, Cassandra), and (3) distributed SQL โ the newest zone โ that gives you relational rigidity at distributed scale (CockroachDB, Spanner). Where a DB lives on this chart determines what it's good at, not its NoSQL/SQL label.
The 4 Dimensions That Actually Matter
Schema Model
Relational databases enforce a schema upfront โ every row in a table has the same columns. This feels restrictive, but it's actually a guarantee: your application code can always assume column X is present and has type Y. The database becomes the enforcement layer so your application code doesn't have to be.
Document databases store each record as a JSON-like object with potentially different keys per document. This feels liberating, but the flexibility doesn't disappear โ it moves into your application code, which now has to handle missing fields, null checks, and version migrations. Neither model is inherently better; the question is where you want the complexity to live.
WHY it matters: If your entities genuinely have variable structure (a product catalog where some items have 3 attributes and others have 50), documents save real pain. If your entities are uniform (every user has an email, a created_at, and a role), a schema just enforces what you already know is true.
Query Language
SQL is 50 years old and universal for a reason: its declarative style lets you describe what data you want without specifying how to retrieve it, and the query planner figures out the execution plan. Joins, aggregations, window functions, CTEs โ all expressible in concise, readable queries.
Document query languages (MongoDB's aggregation pipeline, DynamoDB's expression syntax) are more imperative and can feel awkward for multi-entity queries. They're excellent for single-document lookups and document-scoped aggregations, but cross-document "join" logic usually pushes into application code.
WHY it matters: If your queries frequently combine data from multiple entity types (users + orders + products), SQL's join model is dramatically more natural. If most queries fetch a single document by ID or a small set by a known key, document query APIs are fine.
Transactions
A transaction is a group of operations that either all succeed or all fail together โ atomically. This is critical any time a business operation touches more than one piece of data. Transferring money between accounts means debiting one row and crediting another; if your database crashes between those two writes, you need the database to roll back both, not leave the money in limbo.
Traditional relational databases have had full ACID transactions for decades. NoSQL databases historically traded transactions for write throughput and distribution. That trade-off has narrowed: MongoDB supports multi-document transactions, DynamoDB has transactional writes for up to 25 items. But the original design philosophy still shows โ NoSQL transactions carry more overhead and behavioral caveats than native SQL transactions.
WHY it matters: Any time you have an "all-or-nothing" business rule that spans multiple records (financial transfers, inventory reservation, order placement), make sure the database's transaction semantics genuinely support it โ not just in theory but in the way your access patterns actually work.
Scaling Model
Scaling a database means handling more data or more requests. The two strategies are vertical (bigger hardware) and horizontal (more nodes). Relational databases traditionally scale vertically โ you buy a larger server. This works surprisingly far: a modern high-memory server running Postgres can handle tens of thousands of queries per second and store multiple terabytes of data.
Distributed databases (Cassandra, DynamoDB, CockroachDB) scale horizontally โ you add nodes, and the database shards data across them. This is more complex operationally, but it removes the hard ceiling of a single machine. The cost is that distributed coordination adds latency and complexity, especially for transactions.
WHY it matters: Most applications never need horizontal scaling. Vertical Postgres gets you surprisingly far. Reach for distributed databases only when you have measured evidence that vertical scaling won't get you where you need to go โ and only after evaluating read replicas, connection pooling, and query optimization as cheaper alternatives.
Many teams reported migrating "from SQL to NoSQL for scale" only to discover that sharded Postgres, read replicas, or distributed SQL (CockroachDB, Spanner) would have given them the same scale without abandoning the relational model. Before picking NoSQL for scale, benchmark your actual workload on optimized Postgres first. The real performance numbers are almost always more forgiving than the estimates.
When to Use Specialized Databases
A specialized database is one built for a single workload type: search, time-series, graphs, vector embeddings, analytics, or embedded in-process storage. Each one trades generality for performance โ it's faster and more expressive for its target workload than a general-purpose database, but it's worse or unusable for everything else.
The key rule: use specialized databases as secondary stores, not primary stores. Your primary store is where the canonical version of your data lives โ usually a relational or document database. Specialized databases sit alongside it, populated by replication or CDC, and serve specific query types the primary store can't handle efficiently. An application that runs Postgres as its primary store plus Elasticsearch for search is doing it right. An application that runs only Elasticsearch for everything is asking for problems with transactions, schema migrations, and operational complexity.
Search โ Elasticsearch / OpenSearch
Search databases are built around an inverted index โ the same structure behind a book's index. When you search "payment failed," the database doesn't scan every document; it looks up "payment" and "failed" in the index and intersects the document lists in milliseconds.
Use Elasticsearch when: keyword search with relevance ranking is a core product feature, you need faceted filtering (filter by category + price + brand simultaneously), or you're indexing semi-structured logs and need full-text queries across them. Don't use it as a primary data store โ Elasticsearch's eventual-consistency model and lack of strong transactions make it a poor choice for authoritative data.
WHY Elasticsearch over Postgres full-text search? Postgres has tsvector full-text search that's fine for basic cases. Elasticsearch wins when you need multi-language stemming, fuzzy matching, BM25 relevance scoring, nested faceting, and real-time index updates at high write volume โ all simultaneously.
Time-Series โ TimescaleDB / InfluxDB
Time-series databases are optimized for data that is always appended (never updated in place), always has a timestamp as the primary lookup dimension, and is queried with time-range filters and windowed aggregations. Think IoT sensor readings, server metrics, financial tick data, application performance data.
TimescaleDB gives you Postgres-compatible SQL plus automatic time partitioning (hypertables) and columnar compression. InfluxDB uses a custom query language (Flux) and purpose-built storage engine optimized for float-heavy append workloads.
WHY not just Postgres? At tens of millions of rows, Postgres's MVCC model accumulates dead tuple bloat on every insert, autovacuum struggles to keep up, and time-range scans across billions of rows get slow even with indexes. TimescaleDB's chunk-based architecture means a 7-day window query only reads 7 chunks, not the whole table.
Graph โ Neo4j / Neptune
Graph databases store data as nodes (entities) and edges (relationships) and traverse relationships as a native operation. When you ask "find all users who are friends-of-friends with User 42 and also purchased Product X," a graph database walks edges directly โ it doesn't need to join tables.
The classic sign you need a graph database: your SQL queries contain recursive CTEs or 3+ self-joins on an adjacency table, and they're getting slow. Each hop in a graph DB is O(1) โ you follow a pointer. Each hop in SQL is O(n) โ you do a join that scales with table size. At 4+ hops, SQL explodes; graphs stay fast.
Real use cases: social network recommendation engines, fraud detection (find all accounts connected to a known bad actor within 3 hops), knowledge graphs, and route-finding in maps or logistics networks.
Vector โ Pinecone / Qdrant
Vector databases store high-dimensional floating-point arrays (embeddings) produced by AI models and answer queries like "which stored vectors are most similar to this query vector?" This is the backbone of semantic search and RAG (Retrieval-Augmented Generation) pipelines for AI applications.
Vector databases use ANN (approximate nearest-neighbor) indexes like HNSW or IVF to make similarity search fast. A brute-force comparison of a query against 10 million 1536-dimension vectors would take seconds; ANN search does it in milliseconds.
Alternative: pgvector (Postgres extension) is a reasonable starting point for up to ~1 million vectors if you want to avoid a new infrastructure component. Purpose-built vector DBs (Pinecone, Qdrant, Weaviate) handle larger scale and offer richer filtering with less operational tuning.
OLAP โ ClickHouse / BigQuery / Snowflake
OLAP databases store data in columnar format โ all values for a column stored together. When an analytics query reads only 4 of 80 columns across 500 million rows, columnar storage reads only 4 column files, ignoring the rest. Row-oriented databases like Postgres must read every column in every row even when you only need 4.
Use OLAP databases for dashboards, business intelligence, and data warehouse queries โ anything where you're aggregating large historical datasets with GROUP BY and time-range filters. Don't use them as operational databases โ OLAP DBs are poorly suited to point lookups and frequent updates.
ClickHouse is self-hostable and extremely fast for real-time analytics. BigQuery and Snowflake are fully managed serverless options โ you pay per query, not per server, which makes costs variable but removes all ops burden.
Embedded โ SQLite / RocksDB
Embedded databases run inside the same process as your application โ no separate server, no network hop. They're used when you need local storage that's more structured than a flat file: mobile apps, desktop applications, edge devices, and test environments.
SQLite is the world's most deployed database โ it runs in every phone, every browser, every Electron app. It's a full SQL engine in a single library, with no configuration and a single-file storage model. For read-heavy workloads or applications with one writer, it's remarkably capable.
RocksDB is an embedded key-value store optimized for write-heavy workloads on SSDs. It underlies many other databases (Cassandra uses it optionally; TiKV uses it as its storage engine) โ a sign that even major distributed databases find value in a well-tuned embedded layer.
Migration Patterns โ Changing Databases Without Breaking Everything
Migrating a database in a live production system is one of the most operationally expensive things a software team can do. Not because the technology is complicated โ the challenge is that you have to keep serving live traffic while simultaneously moving the data, validating correctness, and cutting over without losing writes or returning stale reads.
Think of it like replacing the engine of a plane while it's in flight. The plane can't stop โ users are actively reading and writing your database. You have to build a parallel system, move everything over, prove it works, and then flip the switch โ all without a maintenance window that lasts more than a few seconds. Realistic timelines for production migrations at meaningful scale: three to twelve months. Plan accordingly.
The 4 Migration Strategies
Big Bang
The simplest strategy: stop all writes to the old database, copy the data to the new one, update the application's connection strings, and restart traffic. The migration window is the downtime window โ all traffic is blocked until the migration completes.
When it works: small datasets (gigabytes, not terabytes), internal tools or batch systems where a maintenance window is acceptable, or development/staging environments where downtime is free.
Why it fails at scale: copying a 2 TB database takes hours. Users don't tolerate hours of downtime. Even with fast hardware and parallel copy tools, big bang migrations for any customer-facing production system with meaningful data size are almost never acceptable.
WHY it still exists: simplicity has value. Big bang migrations are trivially reversible (switch the connection string back), require zero code changes during the migration window, and have no dual-write consistency concerns. If your dataset fits in the downtime window, big bang is often the right choice.
Dual-Write
Dual-write is the most common production migration strategy. The idea: update your application to write every new record to both the old and new databases simultaneously. Then, separately, run a backfill job to copy all existing historical data to the new database. Once the backfill completes and you've validated that both databases contain the same data, you gradually move read traffic to the new database and eventually stop writing to the old one.
The hard part: write ordering. If a record is updated in the old DB before the backfill copies it to the new DB, and then the dual-write layer writes the update to both, you might end up with the update applied twice or applied to a stale base. Robust dual-write systems use version numbers or timestamps to resolve conflicts.
WHY dual-write over big bang for production: zero downtime. Users never experience an outage. You can pause, roll back, or slow down at any point. The risk is distributed over months rather than concentrated in a single high-stakes maintenance window.
Change Data Capture (CDC)
Every database already keeps a private diary of every change it makes โ used internally for crash recovery. The clever idea here is: read that diary and stream each entry to the new database as it happens. That technique is called Change Data Capture (CDC), and the diary has a name in each database: Postgres calls it the WAL (Write-Ahead Log), MySQL calls it the binlog, MongoDB calls it the oplog. Tools like Debezium read these logs and publish change events to Kafka, which your migration pipeline consumes to populate the new database.
Key advantage: no application code changes required during migration. The application keeps writing to the old database normally; the CDC pipeline silently keeps the new database synchronized.
Complexity trade-off: CDC requires understanding the source database's change log format, handling schema changes correctly (a column rename in the old DB needs to be translated in the CDC pipeline), and managing lag (the new DB may be seconds to minutes behind the old one depending on throughput).
Strangler Fig
The strangler fig pattern โ named after a tree that slowly grows around a host and eventually replaces it โ migrates data feature by feature rather than all at once. Each time you build a new feature or refactor an old one, you store that data in the new database. Over time, the old database "strangles" as its tables empty out and all active data lives in the new system.
This is the lowest-risk migration strategy because each step is small and independently reversible. If the migration of the "user profiles" feature goes wrong, only user profiles are affected โ orders, payments, and product catalog are untouched.
WHY this works better for large legacy systems: big bang migrations of large legacy systems almost always uncover hidden dependencies, undocumented schemas, and data quality issues. The strangler fig pattern gives you time to discover and fix these one feature at a time instead of discovering them all during a single high-stakes cutover weekend.
The downside: you're operating a polyglot system โ data split across two databases โ for months or years. Application code must know which database each entity lives in. This adds cognitive overhead and can be confusing for new engineers joining mid-migration.
The most dangerous moment in any migration is the cutover โ when you stop writing to the old database and commit fully to the new one. Always have a rollback plan that can be executed in under 5 minutes (a feature flag, a connection-string environment variable, a traffic weight in your load balancer). Never commit to a cutover that can't be reversed quickly.
Cloud-Managed vs Self-Hosted โ Who Runs the Database?
Before you decide which database to use, there's a question that comes first: who's going to run it? Cloud-managed database services (RDS, Aurora, Cloud Spanner, Atlas) handle the infrastructure for you. Self-hosting means your team installs, configures, patches, monitors, and backs up the database on VMs or bare metal.
This isn't a technology decision โ it's an operations staffing decision. Running a database in production safely requires expertise in storage configuration, replication setup, backup validation, failover testing, query plan analysis, and capacity planning. The question is whether your team has that expertise and wants to apply it to database administration or to product features.
The chart above shows the intuition: managed services carry a per-unit pricing premium (you pay for the convenience), but self-hosting carries a fixed ops burden that doesn't scale as steeply. At small scale, managed wins โ the premium is small and the ops savings are large. At very large scale, the managed premium becomes significant. But the crossover point varies enormously based on how much your engineers cost versus how much the cloud premium costs in your specific region and workload.
4 Considerations
Team Size
The single most reliable rule: under 10 engineers, use managed. A team of 5 engineers trying to self-host Postgres with proper HA, backups, failover, and monitoring will spend 20โ30% of their engineering capacity on database operations. That's one full-time engineer just keeping the database alive โ capacity that won't ship features.
Managed services (RDS, Aurora, Atlas) handle automated backups, point-in-time recovery, read replica provisioning, and minor version patching. These seem like small conveniences, but each one represents hours of work per month when done manually โ and skipped maintenance has a way of becoming a 3 AM incident.
WHY the threshold is roughly 10 engineers: at that size, a team typically has at least one engineer with enough infrastructure depth to own database operations without it consuming their entire role. Below that, the ops burden competes directly with product velocity.
Compliance & Data Residency
Some industries โ healthcare (HIPAA), finance, government โ have regulations that restrict where data can physically reside or require specific audit trails that some cloud providers can't certify for. In these cases, self-hosting isn't a cost choice โ it's a compliance requirement.
That said, major cloud providers (AWS, Azure, GCP) all offer HIPAA-compliant managed database services with BAAs. Don't assume self-host is required for regulated industries โ verify with your compliance team whether specific managed offerings can satisfy your regulatory requirements before building self-host infrastructure.
Cost at Scale
At very large scale โ hundreds of terabytes, thousands of queries per second โ managed database pricing can become significant. Aurora can cost 3โ5x the equivalent EC2 + storage cost for the same workload. At that scale, teams with mature infrastructure practices often move to self-managed Postgres on optimized EC2 instances, saving millions per year.
But this math only works if you have the engineering expertise to do it safely. The engineers needed to run a large Postgres fleet safely are expensive โ senior DBAs or SREs with deep database expertise. The real calculation is: cloud premium vs. (DBA salary ร headcount ร years). Many teams find that the cloud premium is cheaper than the team it would take to match what the managed service does automatically.
Rule of thumb: unless you're spending more than ~$20K/month on managed database services and you have at least 2 engineers with deep database expertise, the math almost never favors self-hosting.
Feature Velocity
Managed database services often receive new features earlier than self-hosted equivalents because the cloud provider's team is focused exclusively on operating that database at scale. Aurora Serverless v2, Spanner autoscaling, Atlas's in-memory online index builds โ these capabilities appear in managed services months or years before they're operationally safe to run self-hosted.
If your team wants to use cutting-edge database features without deep expertise in operating them safely, managed services provide a sandbox for experimentation without the risk of misconfiguration bringing down production.
Vendor lock-in caveat: the more proprietary managed features you use (Aurora's storage engine, Spanner's external consistency API, DynamoDB's DAX caching), the harder it becomes to migrate later. Use standard SQL and standard APIs wherever possible, and treat proprietary managed features as technical debt that needs a migration story.
Anti-Patterns โ The Mistakes That Hurt for Years
Database anti-patterns aren't mistakes that show up immediately โ they're decisions that feel reasonable at the time and become painful 12 to 24 months later, when the cost of reversing them is high and the pressure to ship is real. The reason they're common is that they often come with a short-term benefit (flexibility, excitement, low initial cost) that masks the long-term tax.
Resume-Driven Development
The temptation: adopting a new, exciting database technology because the team wants to learn it or add it to their resumes. The result: a production system running a database that nobody on the team has operated at scale, chosen for career reasons rather than technical fit.
This is more common than anyone admits. Kafka for a queue that would work fine with a Postgres table. CockroachDB for a single-region app with 1,000 users. Elasticsearch as a primary store because inverted indexes sounded interesting.
The fix: explicitly separate "technologies I want to learn" from "technologies I choose for production." Personal projects, side projects, hackathons โ great places to learn new databases. Production systems with real users โ choose boring, proven technology. Boring technology has known failure modes, abundant Stack Overflow answers, and junior engineers who can operate it without months of training.
Premature Distributed
Distributed databases solve a real problem โ scaling beyond a single machine โ but they introduce significant operational complexity: split-brain scenarios, distributed transaction coordination, network partition handling, and cross-shard query routing. These problems don't exist in a single-node database.
The mistake: adopting Cassandra, CockroachDB, or sharded MongoDB because you might need to scale someday, before you have any evidence that a single Postgres instance is insufficient. A well-tuned Postgres instance on modern hardware can handle tens of thousands of queries per second and store multiple terabytes of data. Most applications never come close to that ceiling.
The fix: run a single Postgres. Add read replicas when reads need to scale. Consider partitioning when tables exceed hundreds of millions of rows. Only graduate to distributed when you have measured, reproducible evidence that Postgres can't handle the load โ not because you expect it might not someday.
"NoSQL for Flexibility"
The pitch: "Our schema is changing rapidly. With MongoDB, we don't need migrations โ we just add fields to documents." The reality: the schema doesn't disappear. It moves from the database (where it's enforced automatically) into the application code (where it's invisible and inconsistently applied).
Six months in, you have documents with 15 different shapes because the schema evolved without formal migrations. Your application has null checks, try-catch blocks, and version detection logic scattered everywhere. New engineers spend days understanding what fields exist. Reporting queries require complex $project stages to normalize the data before aggregating it.
The fix: analyze your actual data shape before choosing. If your entities genuinely have highly variable attributes that are impossible to model in a table, a document DB may be right. But most "our schema changes fast" concerns are really about schema migration velocity โ which Postgres handles fine with tools like Flyway or Alembic.
Vendor Lock-In Unaware
Vendor lock-in happens when you adopt proprietary database features โ Aurora-specific storage behaviors, DynamoDB-specific access patterns, Cosmos DB-specific consistency APIs โ without a plan for what happens if you need to migrate. The risk: your vendor raises prices, sunsets a feature, or changes terms, and you discover that migrating away would cost more than accepting the new terms.
Lock-in isn't inherently bad โ it's a trade-off. You get better performance or features in exchange for reduced negotiating leverage and higher migration cost. The problem is doing it unaware, without a conscious decision that the trade-off is worth it.
The fix: document every proprietary feature your system uses. For each one, estimate the migration cost if you had to replace it. Make that list visible to the team and revisit it annually. Use standard SQL and standard access patterns wherever performance allows, and treat proprietary features as technical debt with a known cost.
Ignoring Ops Complexity
Some databases are excellent on paper โ powerful, scalable, well-documented โ but operationally demanding in practice. Cassandra's ring topology, token management, compaction tuning, and repair operations require deep expertise to manage correctly. Elasticsearch's JVM heap tuning, shard rebalancing, and mapping explosion risks bite teams that don't have Elasticsearch specialists.
The mistake: evaluating a database purely on feature capabilities and benchmarks, without evaluating the operational overhead of running it safely in production. The database that performs best on a benchmark is worthless if your team can't operate it without constant fires.
The fix: before committing to any database, ask: "Who on our team has operated this database in production? What are the top 3 most common failure modes, and how do we recover from each?" If no one on the team can answer confidently, either use a managed service or choose a simpler database that the team genuinely understands.
Premature Optimization
The classic version: picking Cassandra "because we'll need it when we have millions of users," when the product currently has 100 users and no clear path to a million. The distributed complexity costs you now โ slower development, harder debugging, more complex queries, additional infra. The scale benefits come later, if they ever come.
The pattern is expensive because distributed systems are more complex to build on. Every feature takes longer to implement correctly. Every bug is harder to diagnose. Every new engineer takes longer to become productive. These costs are real and recurring; the "future scale" benefit is speculative.
The fix: size the database for your current scale plus roughly one order of magnitude growth. If you have 1,000 users today and can imagine 10,000 users in a year, you don't need Cassandra โ you need well-indexed Postgres. When you hit 10,000 users and can project 100,000, revisit. Build for the next milestone, not the hypothetical endpoint.
Decision Worksheet โ Apply This to Your Next Choice
Everything in this guide has been building toward a single, practical output: a framework you can actually use the next time someone asks you "which database should we pick?" The worksheet below turns the 5-question funnel into a fillable form with concrete example answers that lead you to a specific recommendation.
The goal isn't to make the decision mechanical โ database selection always requires judgment. The goal is to make sure you've thought about each dimension explicitly rather than jumping to the first database that comes to mind. A 30-minute worksheet session at the start of a project is worth far more than a 12-month migration later.
The 5-Step Process
Step 1 โ Describe the Workload Shape
Start by describing your data and queries in plain English โ not in terms of any database technology, but in terms of what the data looks like and how your application will access it.
Key questions to answer: What are your main entity types? How do they relate to each other? What are the three most common queries your application will run? What percentage of operations are reads vs writes? Does your schema change frequently, or is it stable once defined?
WHY this comes first: the workload shape determines which data model fits naturally. Relational entities with joins โ relational DB. Variable-schema documents with no join needs โ document DB. Millions of timestamped measurements โ time-series DB. This one answer eliminates most of the candidate list immediately, which is why it's first in the funnel.
Step 2 โ Define Consistency Requirements
Ask which operations must be atomic and what "wrong" data would cost you. The answer tells you how strict your consistency requirements are.
Strong consistency is required when: money changes hands, inventory is decremented, access permissions are modified, or any state change that can't be undone is recorded. Eventual consistency is acceptable when: you're reading product catalog data, displaying aggregate statistics, or serving content that's slightly stale for a few seconds.
Why this is step 2: consistency requirements are a hard constraint, not a preference. If your business logic requires atomically updating two records and one database family can't do that reliably, it's eliminated immediately regardless of how good its benchmark numbers look. Determine hard constraints before soft preferences.
Step 3 โ Estimate Scale
Write down two numbers: your expected scale at one year and your expected scale at three years. Express them in users, requests per second, and data volume in gigabytes or terabytes. Be specific โ "a lot" is not a scale estimate.
Then ask: does a single well-tuned Postgres instance handle this? If yes, use Postgres. If no, what's the specific bottleneck โ storage, write throughput, read throughput, or data volume? Only add complexity to address specific, measured bottlenecks.
WHY hard numbers matter: "we'll need to scale eventually" has led more teams into premature distribution than any other phrase. When you write "200 QPS peak" on paper, it becomes obvious that a single Postgres instance handles that easily. The specificity prevents hypothetical scale from driving real engineering decisions.
Step 4 โ Assess Team Capability
Ask honestly: who on the team has operated this database in production before? Not played with it locally โ actually run it in production, handled incidents, done major version upgrades, restored from backups, and diagnosed slow queries under load.
If the answer is nobody, factor that into your choice. Either use a managed service that handles the hard parts automatically, or choose a database your team already knows well. The learning curve tax is real โ it shows up as slower feature development, more incidents, and longer incident recovery times for the first 6โ12 months of operating an unfamiliar database.
Practical output: if you pick a database that no one knows and won't use managed services, budget explicitly for 3โ6 months of learning curve time in your project plan. Don't pretend the learning is free.
Step 5 โ Calculate Total Cost
Total cost has five components that all matter: (1) raw infrastructure cost (per-node, per-GB, per-query pricing), (2) ops labor cost (how many engineer-hours per month does operating this safely require?), (3) training cost (how long until the team is proficient?), (4) migration risk (how hard is it to move away in 2 years if circumstances change?), and (5) incident cost (what's the business cost of a 2-hour database outage?).
Many teams compare databases on infrastructure cost alone and ignore the other four components. A database that costs $200/month more than the alternative but requires zero ops labor and has a trivial migration path is often cheaper in total cost than the "cheaper" alternative that requires 20 engineer-hours per month of maintenance.
WHY migration risk matters: circumstances change โ team size, business model, scale, cloud providers. A database with a clear, well-documented migration path keeps your options open. A database built around proprietary APIs without migration tooling is a bet that your current situation is permanent.
Using the Worksheet in Practice
Fill out the worksheet as a team โ not solo. Different engineers will have different answers to steps 1โ3 based on what they know about the business, and that disagreement is valuable. If your backend engineer thinks the workload is relational and your data engineer thinks it's time-series, you probably need a more detailed discussion about what queries matter most.
Document the answers even briefly. Six months later, when someone asks "why did we pick Postgres?" you want a written answer that explains the reasoning, not a shrug. The documentation also helps you know when to revisit the decision โ if your actual scale in year 2 diverges significantly from your step 3 estimate, the worksheet tells you which assumptions have changed.
Time investment: a thorough worksheet session takes 30 to 60 minutes. That's a bargain compared to the cost of a wrong decision โ which, as the migration patterns section described, can consume 3 to 12 months of engineering time to reverse.
The worksheet doesn't replace judgment โ it structures it. There will always be cases where two databases look equally good after step 5, or where a constraint that isn't captured in the five questions matters most. In those cases, the worksheet has still done its job: it's made the trade-offs explicit so the team can have an informed debate rather than an opinion contest.
Decision Tools โ What to Use When Picking a DB
You do not have to guess your way to a database choice. There are purpose-built tools for ranking options, comparing features, estimating costs, and running honest benchmarks. Think of these six tools as a due-diligence kit: use them together in the order shown in the workflow below and you will arrive at a defensible, documented decision rather than a gut feeling.
db-engines.com
db-engines.com tracks popularity rankings for hundreds of databases based on job postings, search trends, and Stack Overflow activity. This matters because a database with a larger community is safer: more tutorials, more Stack Overflow answers, more engineers who already know it, and more likelihood of long-term support. The site also has feature-matrix comparison pages โ you can select five databases side-by-side and compare ACID support, replication model, partitioning strategy, licensing, and more. Use it as a first filter: before you read vendor docs, check whether the database has enough community momentum to be a safe bet for a multi-year project.
AWS Database Migration Service (DMS)
AWS DMS is a managed service that moves data between different database engines โ for example, from Oracle to Postgres, or MySQL to Aurora. This matters when evaluating a new database because one of the biggest switching costs is the migration itself. DMS handles the heterogeneous migration plumbing (schema conversion + data replication) so you can do a test migration without writing custom ETL scripts. It also supports ongoing replication during the cutover window, so you can keep the old database in sync while you validate the new one. When assessing whether you can actually switch databases, run a DMS trial migration on a copy of your production data to surface compatibility issues before you commit.
Cloud Pricing Calculators
Every major cloud has a pricing calculator โ AWS Pricing Calculator, Google Cloud Pricing Calculator, and Azure Pricing Calculator. For databases, the key variables are: storage size, read/write throughput (I/O operations per second), and data transfer out. Run three scenarios through the calculator: today's workload, 10ร growth, and 100ร growth. Why three? Because the cheapest option today is sometimes the most expensive at scale โ Aurora Serverless is inexpensive at low traffic but expensive under sustained load, while provisioned instances flip the math. Cloud calculators also let you model Aurora vs. Cloud SQL vs. RDS vs. BigQuery side-by-side. This is the only honest way to compare costs; vendor marketing almost never tells you the full story.
Apache Atlas / DataHub
Apache Atlas and DataHub (open-sourced by LinkedIn) are data catalog tools. A data catalog is essentially a searchable inventory of all your data assets: which tables exist, what columns they have, what the lineage is (where data came from), and who owns each dataset. Why does this matter when picking a database? Because in a polyglot environment โ where you have Postgres, Mongo, Redis, Snowflake, and Kafka all running together โ it is very easy to lose track of where your data actually lives and how it flows. Data catalogs prevent this operational chaos. Before adding a fifth database to your stack, run a catalog scan and ask: "Is this data already available in an existing store?" You may find you do not need the fifth database at all.
OpenTelemetry
OpenTelemetry is a vendor-neutral standard for collecting metrics, traces, and logs from your applications and infrastructure. It matters here because when you are comparing two databases, you need consistent observability data โ not one DB with Datadog and another with CloudWatch and a third with Prometheus, each reporting latency numbers in different ways. OpenTelemetry provides a common instrumentation layer that works across databases, ORMs, and cloud providers. In practice: instrument your application with OpenTelemetry before you start the evaluation, then your benchmark dashboards and production performance data all speak the same language. This also means your observability investment is not wasted if you switch databases โ the instrumentation travels with the application, not the database.
Benchmarking Tools โ pgbench, sysbench, YCSB
Three tools cover most benchmarking needs. pgbench ships with Postgres and runs TPC-B-like OLTP workloads (mix of reads, updates, and inserts) against any Postgres-compatible database. sysbench is more configurable: you can tune the read/write ratio, number of threads, and dataset size, and it also works against MySQL. YCSB (Yahoo Cloud Serving Benchmark) is the standard for NoSQL databases โ it defines workload templates (A = 50/50 read-write, B = 95% reads, C = read-only, D = recent reads) and can target Cassandra, MongoDB, Redis, and HBase. The important rule: always benchmark with your actual data shape and access pattern. Running YCSB workload A against Cassandra proves nothing if your real workload is 99% reads. Customize the workload templates; do not trust default results.
Common Misconceptions โ What People Get Wrong
Database choices are surrounded by persistent myths โ things that sound true, get repeated in blog posts, and quietly lead teams into expensive decisions. Each myth below has a kernel of truth that makes it believable, which is exactly why it keeps spreading. Understanding why the myth is wrong is just as valuable as knowing the correct answer.
This one survives because some NoSQL databases are faster in specific scenarios โ Redis sub-millisecond key lookups, Cassandra high-throughput appends โ and those benchmarks get amplified. But speed is always relative to a workload. PostgreSQL consistently outperforms MongoDB on workloads that fit the relational model: joining customers to orders to invoices, enforcing referential integrity, doing aggregations with GROUP BY. Why? Because Postgres was designed for exactly those access patterns; MongoDB was designed for flexible document retrieval. MongoDB can be faster for document lookups where the document holds all the data you need in one read. But for anything requiring cross-document consistency or multi-collection joins, Postgres wins โ it avoids the application-level join work that MongoDB pushes onto you. The real lesson: match the database to the workload shape. "Faster" without specifying the workload is meaningless.
Distributed SQL (CockroachDB, Spanner, YugabyteDB) is genuinely impressive engineering โ it gives you horizontal scale with ACID guarantees. But it adds a real cost that most teams underestimate: write latency increases because consensus takes network round-trips. A Postgres write on a local SSD completes in under 1 ms. The same write in CockroachDB needs a Raft quorum โ at minimum 2โ3 ms in a single region, 50โ200 ms across regions. For interactive user-facing workloads, that difference is felt. Distributed SQL also has a smaller ecosystem than Postgres: fewer extensions, fewer ORM drivers with full compatibility, fewer DBAs who have production experience with it. The right question is: do you actually need global scale with ACID? Most applications with under 50,000 writes per second, or confined to a single region, can run on Postgres with replicas and never hit that ceiling. Distributed SQL is the right upgrade path when you hit that ceiling โ not the default starting point.
Document databases like MongoDB store data without enforcing a schema at the database level โ each document can have different fields. This feels flexible, but the flexibility is an illusion. The schema does not disappear; it moves into your application code. Every time your app reads a document, it implicitly assumes a structure โ it accesses doc.user.email, expects it to be a string, and breaks if it is missing or null. That assumption is a schema; it is just undeclared and unenforced. The result: after 18 months of feature development, you have 12 different document shapes floating in the same collection, your application has conditional logic for each variant, and adding a new field requires a data migration just like SQL โ except the migration is now invisible and untested because no one documented what "the current schema" actually is. Explicit schemas (Postgres, MySQL) force you to think through your data model upfront, which is usually the right constraint, not a burden.
This myth comes from looking only at the per-node sticker price, which is legitimately higher for managed services (RDS, Cloud SQL, Aurora) than for a self-hosted VM running Postgres. But the comparison ignores the full cost. Self-hosting has hidden costs: a DBA or senior engineer's time for patching, backup validation, failover testing, capacity planning, and on-call. At a fully-loaded engineer cost of roughly $150โ300/hour, even four hours of monthly database ops time adds $600โ1,200/month to the "cheaper" self-hosted option. For small teams โ 2โ10 engineers โ managed databases routinely cost less in total than self-hosting because you do not have a dedicated DBA. The calculus flips at large scale: once you have a dedicated DB team and are running dozens of high-traffic instances, the per-node premium adds up and self-hosting wins on pure dollar cost. The right question is: "What is my ops team's real cost, and what can they maintain without it becoming a second job?"
Microservices architecture has popularized "each service owns its data store" โ and that principle is sound. But it gets stretched into "therefore we should use a different type of database for each service" which is not what separation of concerns means. Separation of concerns is about ownership boundaries, not technology diversity. Two services can own separate schemas in the same Postgres cluster and still have clean boundaries. Every additional database type you add to your stack multiplies operational burden: different backup procedures, different monitoring agents, different failure modes, different expertise required from your on-call team. A team running Postgres + Redis + Elasticsearch has three runbooks, three sets of alarms, three upgrade cycles. Add Cassandra for time-series and Mongo for user preferences and you now have five sets of everything โ and your on-call engineer at 2 AM needs to know all five. Polyglot persistence is sometimes the right call; it just has a real operational cost that needs to be measured, not assumed away.
This is the most dangerous myth because it sounds reasonable โ why over-engineer the data layer early? The problem is that migrating databases later is one of the most expensive engineering projects a company can undertake. Data migrations require dual-write periods (writing to old and new DB simultaneously), backfill jobs (copying historical data), cutover windows, rollback plans, and extensive validation. Every day of production traffic makes the migration harder: more data to copy, more edge cases to handle, more application code that has grown coupled to the old DB's quirks. Choosing poorly upfront does not just cost you the migration effort โ it costs you the months of feature velocity you lose during the migration, the bugs that surface from subtle behavioral differences between systems, and the team morale hit of spending a year "undoing" an old decision. The right time to think carefully about database choice is before the first line of application code is written, not after 500,000 rows are in production.
Real-World Disasters โ When the Wrong DB Costs Millions
These incidents are composites of real patterns seen across engineering organizations โ anonymized and generalized, but every failure mode described here has been documented in engineering post-mortems, conference talks, and blog posts. They are presented not to shame teams but to make these failure modes visceral enough that you remember them when you are in the middle of a database decision.
Incident 1 โ The MongoDB-to-Postgres Rebuild
What happened: A financial services startup chose MongoDB early because "we do not know our schema yet." Eighteen months later the application had grown to manage multi-currency ledger entries, reconciliation reports, and audit trails โ all of which required strong cross-document consistency and multi-collection joins. The team spent more engineering time working around MongoDB's consistency model than building features. They rebuilt on Postgres over six months.
Why it happened: "We do not know our schema yet" is almost never true for financial data. Money, accounts, and transactions have been modeled relationally for decades. The uncertainty was in the product features, not the data model. The relational model โ tables, foreign keys, joins โ was the right choice from day one.
Lesson: If your core entities have clear relationships (customers have accounts, accounts have transactions), model them relationally from the start. Flexibility in schema design is not a reason to avoid SQL โ Postgres's JSONB column can hold flexible semi-structured fields while the relational skeleton enforces correctness.
Incident 2 โ Premature Cassandra
What happened: A startup with roughly 1,000 daily active users adopted Cassandra because a senior engineer had used it at a previous company (which had 10 million users). The cluster required careful capacity planning, nodetool management, compaction tuning, and tombstone monitoring โ operational knowledge that took significant time to build. Feature velocity slowed by an estimated 40โ50% for six months while the team learned Cassandra's operational model.
Why it happened: Technology decisions made by pattern-matching to a previous context rather than current requirements. Cassandra is the right tool at genuine scale โ millions of writes per second, multi-datacenter active-active, high availability for append-heavy workloads. At 1,000 DAU, a single-node Postgres handles the workload trivially.
Lesson: Scale when needed, not preemptively. Every exotic database adds operational overhead that burns engineering time. That time has a direct opportunity cost: features not built, bugs not fixed, customers not retained. The rule of thumb: exhaust Postgres (+ replicas + read scaling) before reaching for distributed infrastructure.
Incident 3 โ Polyglot Sprawl
What happened: A fintech company grew over three years from Postgres-only to seven production databases: Postgres (core ledger), MongoDB (user preferences), Redis (caching + rate limiting), Elasticsearch (transaction search), Cassandra (event logs), Snowflake (analytics), and Neo4j (fraud graph). Each addition was individually justified. The cumulative result: the on-call rotation required expertise in seven different failure modes. A Neo4j compaction stall at 2 AM caused a pager alert that three on-call engineers could not diagnose โ none of them had deep Neo4j knowledge. Consolidation back to three databases (Postgres + Redis + Snowflake) took 12 months.
Lesson: Consolidate when overhead exceeds benefit. Before adding any new database type, ask: "Is there a way to satisfy this requirement inside an existing store?" Postgres handles full-text search, geospatial, JSONB, time-series (with TimescaleDB), and graph-adjacent queries well enough for most applications. The friction of adding a fourth database should be high and deliberate.
Incident 4 โ Vendor Lock-in Trap
What happened: A team built their core transaction platform on Google Cloud Spanner, attracted by the global ACID guarantees and fully-managed operations. Two years later, a significant Spanner pricing increase (triggered by a new billing model for processing units) made the total cost untenable. The migration to an open-source NewSQL alternative took 18 months โ not because the data was hard to move, but because the Spanner-specific query features, IAM integration, and session pooling patterns were baked into dozens of microservices.
Why it happened: The team assessed the technical fit carefully but did not formally assess portability. Questions they did not ask upfront: "What is our egress cost if we want to leave?" "Does the client library we use have an equivalent for the alternative?" "What Spanner-specific features are we depending on that have no portable equivalent?"
Lesson: Assess vendor lock-in risk before committing to any proprietary cloud database. For each Spanner (or DynamoDB, or CosmosDB) feature you plan to use, identify the open-source equivalent. If there is none, you are building a dependency that will be expensive to remove.
Incident 5 โ Wrong Consistency Choice
What happened: An analytics team used DynamoDB with eventually consistent reads (the default, lower cost option) to power a business intelligence dashboard that was presented in board meetings. Board members noticed that revenue figures shown live sometimes differed from figures shown five minutes earlier โ by amounts large enough to cause confusion. The root cause: eventually consistent DynamoDB reads could return data that was up to several seconds old, and the dashboard was refreshing frequently from different read replicas at different points of convergence.
Why it happened: Eventually consistent reads are the right default for many workloads โ they are cheaper and faster. But "board-meeting revenue figures" is not a workload where stale data is acceptable. The engineering team optimized for cost and latency without asking the product question: "Who reads this data, and what do they do with it?"
Lesson: Match consistency to user expectations, not engineering convenience. Whenever data is user-facing and users will compare values across time, across sessions, or in high-stakes decisions, you need consistent reads. The cost difference between eventually consistent and strongly consistent DynamoDB reads is small; the business cost of a confused board presentation is not.
Best Practices โ Eight Rules for Picking Well
These eight rules are distilled from the failure patterns in Section 21 and the decision framework we have built across this page. They are not commandments โ every rule has exceptions โ but they are the right defaults for most teams. Violating any one of them should require a deliberate, documented reason, not just momentum or habit.
Rule 1 โ Default to Postgres
Postgres is the right default because it is battle-tested (30+ years), runs everywhere (cloud, on-prem, local), handles more workload types than most engineers realize (JSONB, full-text search, geospatial, time-series with TimescaleDB), and has the largest support ecosystem of any open-source database. The question is never "should we use Postgres?" but "have we found a specific, measured limitation that Postgres cannot solve?" If you have not found that limitation, Postgres is the answer. This rule has saved hundreds of teams from premature complexity.
Rule 2 โ Match Data Model to Query Patterns
The most common database performance problem is not the database itself โ it is a mismatch between how data is stored and how it is queried. Storing time-series events in a relational table with individual rows per event then running range-aggregate queries is slow not because SQL is slow but because the row model is wrong for that access pattern. Before designing your schema, write out your top 10 most frequent queries. Then design your data model to make those queries fast. This is the principle behind column-oriented databases (fast scans), document stores (full-document reads), and time-series databases (range aggregates). The model follows the query, not the other way around.
Rule 3 โ Plan Migration Paths
Every database decision is temporary โ business needs change, pricing models change, better tools emerge. Before committing to a database, spend one hour answering: "How do we leave?" What is the standard export format? Does the client library have equivalents elsewhere? What features are proprietary and have no open-source substitute? This is not pessimism; it is risk management. A database that is hard to leave is not just a technical constraint โ it is a negotiating disadvantage with the vendor. Portable foundations (open protocols, standard SQL, documented export formats) give you leverage. Proprietary features give the vendor leverage.
Rule 4 โ Managed Over Self-Host for Small Teams
Self-hosting a database means you own: initial setup, OS patching, database version upgrades, backup configuration and testing, replication setup, failover testing, capacity planning, monitoring, and on-call runbooks. For a team of 2โ5 engineers focused on product, that is a meaningful slice of engineering time every month. Managed services (RDS, Cloud SQL, Aurora, Neon, PlanetScale) handle all of that. The per-node cost is higher, but the total cost โ including the opportunity cost of engineering time โ is almost always lower for small teams. Re-evaluate when you have a dedicated DBA or a database team; at that point the math changes.
Rule 5 โ Measure Before Deciding
Vendor benchmarks are designed by vendors to make their product look good. They use optimal hardware, optimal workload shapes, and optimal configuration settings. Your production workload will not look like that. The only benchmark that matters is: your data shape, your query mix, your access pattern, your volume, on hardware similar to what you will run in production. This does not have to be exhaustive โ a two-day benchmark with pgbench or YCSB on a realistic data sample is enough to surface the 10ร performance cliffs that matter. Run benchmarks before the architecture review, not after the contract is signed.
Rule 6 โ Calculate Real TCO
Total Cost of Ownership for a database has four components: (1) Infrastructure cost โ node hours, storage, I/O, network egress; (2) Operations cost โ engineer time for maintenance, upgrades, on-call; (3) Migration risk โ the cost of the work you will need to do if you have to switch later; (4) Opportunity cost โ features not built while engineers maintain infrastructure. Run these four numbers for each candidate before deciding. A "cheap" database with high ops cost is often more expensive over two years than a "expensive" managed service with near-zero ops cost.
Rule 7 โ Limit Polyglot Sprawl
A useful heuristic: most teams should run no more than 2โ3 database types in production. Postgres for relational and semi-structured data, Redis for caching and ephemeral state, and optionally one analytics store (Snowflake, BigQuery, ClickHouse) for OLAP. Anything beyond that requires a specific, measured justification. The friction of adding a fourth database type should feel heavy โ not because the technology is bad, but because you are choosing to multiply your operational surface area. This heuristic is not a hard limit; a graph-heavy application may genuinely need Neo4j as a fourth store. But "it would be neat" is not enough. "We measured the cost of doing this in Postgres and it was 10ร slower" is enough.
Rule 8 โ Document the Decision
Write an Architecture Decision Record (ADR) for every significant database choice. It should answer four questions: (1) What did we choose? (2) What alternatives did we consider? (3) Why did we make this choice? (4) What conditions would make us reconsider? An ADR takes 30 minutes to write and pays dividends for years โ future engineers joining the team understand immediately why MySQL and not Postgres, or Cassandra and not DynamoDB. More importantly, it forces the team to articulate the reasoning, which often surfaces assumptions that are worth challenging before the decision is locked in.
FAQ โ The Questions Everyone Actually Asks
These are the questions that come up in every system design interview, every Slack architecture thread, and every database evaluation. Each answer is written to give you a defensible, honest answer โ not a "it depends" non-answer, and not a vendor-aligned recommendation.
Postgres. Almost always. Here is why: Postgres runs the most workload types of any single database โ relational joins and ACID transactions (the core), JSONB for semi-structured data, full-text search via tsvector, geospatial queries via PostGIS, time-series via TimescaleDB, and vector similarity search via pgvector. Deep Postgres knowledge transfers to almost everything โ Aurora is Postgres-compatible, Neon is Postgres-compatible, YugabyteDB YSQL is Postgres-compatible. Understanding Postgres's MVCC (how it handles concurrent reads and writes without locks), its WAL (write-ahead log for crash recovery), and its query planner (how it chooses indexes and join strategies) gives you mental models that apply directly to every other relational and distributed database. If you are a backend engineer who deeply understands one database, make it Postgres.
MongoDB genuinely wins in a narrow set of conditions: (1) Variable schema across rows โ your data truly varies in structure from record to record, not just "we might add fields later" but "half the records have a completely different shape than the other half." Content management systems with heterogeneous content types are a real example. (2) Flexible queries against semi-structured data โ you need to query inside nested JSON objects in ways that would require many indexed JSON columns in Postgres. Before reaching for MongoDB, though, check Postgres JSONB first โ it supports indexing inside JSON documents, queries against nested fields, and operators for containment and path traversal. Many workloads that "need MongoDB" are actually fine with Postgres JSONB, and keeping them in Postgres means they can JOIN against relational data without application-level joins. MongoDB wins when you have truly variable, richly nested documents and your primary access pattern is whole-document reads by a single identifier or a few indexed fields.
AWS's recommendations are shaped by AWS's business model. AWS makes more money on DynamoDB (fully proprietary, no way to run it outside AWS) and Aurora (higher margin than RDS) than on standard RDS Postgres. Their documentation and well-architected framework emphasize these products accordingly. That does not make them bad choices โ DynamoDB is genuinely excellent for certain workloads and Aurora is a solid Postgres-compatible managed service. But read AWS recommendations knowing their incentive: to keep you on AWS products, ideally proprietary ones. For portability: RDS Postgres or Aurora Postgres are reasonable choices because they use standard Postgres wire protocol and can be migrated. For AWS-native simplicity: DynamoDB is excellent if you understand its data model constraints (no cross-partition transactions without extra work, limited query flexibility compared to SQL) and you accept that migrating away later is a significant project. The right question is always: "What does MY workload need?" โ not "What does AWS suggest?"
Only when you have measured pain in a single database โ not imagined pain. The microservices "each service owns its data store" principle is about ownership and deployment independence, not about requiring different database types. Two services can own separate schemas in the same Postgres cluster and still be independently deployable. Each additional database type you add doubles (or more) the operational complexity: two sets of backup procedures, two monitoring systems, two upgrade cycles, two knowledge domains your on-call engineer needs at 3 AM. Add a third and it triples. The right trigger for a second database type is a concrete, measured performance or capability constraint: "We need sub-millisecond cache reads that Postgres cannot provide" โ add Redis. "We need full-text search with fuzzy matching across 10 million documents in under 100ms" โ add Elasticsearch. "We need to cache session tokens and expire them automatically" โ Redis again. The trigger is never "this service is different, so it deserves a different database."
You need sharding or distributed SQL when two conditions are both true: (1) Write throughput exceeds your single primary's capacity, and (2) you have exhausted vertical scaling and read replicas. A modern Postgres primary on a large cloud instance (96 vCPUs, 768 GB RAM, NVMe SSD) can handle roughly 50,000โ100,000+ writes per second depending on transaction complexity. Most applications โ even well-funded startups at significant scale โ never exceed this. Read replicas handle read scaling (add more replicas, route read traffic to them). Connection poolers (PgBouncer, pgpool) handle connection overhead. Only after vertical scaling has hit a real ceiling, and reads cannot be offloaded further, does distributed SQL or manual sharding make sense. The common mistake is reaching for horizontal distribution early because "we might need it someday." You probably will not need it, and the operational cost of distributed SQL is real and immediate.
A practical five-step evaluation: (1) Read the "when to use" page from the official docs and the "known limitations" page โ these tell you the honest boundaries of the product. (2) Check db-engines.com for popularity trend and feature comparison against your current stack. (3) Benchmark with YCSB or pgbench using your data shape โ generate a realistic dataset (right row count, right column distribution, right access pattern) and measure P50, P99, and P999 latency. (4) Test the failure modes: kill a node, partition the network, fill up disk. Watch how the database behaves and how easy it is to recover. (5) Run the ops checklist: how do I back up? How do I restore? How do I upgrade the version? How do I add capacity? If any of these takes more than an hour to answer from documentation, the operational overhead is real. Vendor benchmarks are propaganda; your own measurements are data.
Vector databases store numerical embeddings (dense vectors) and answer approximate nearest-neighbor queries โ "find the 10 most similar documents to this embedding." The use case is AI retrieval-augmented generation (RAG): you embed your documents, store the embeddings, and at query time find the most semantically similar context to pass to a language model. Dedicated vector databases (Pinecone, Weaviate, Qdrant, Milvus) are purpose-built for this and optimize for embedding storage and ANN search performance. However, Postgres + pgvector handles most cases. pgvector adds a vector column type and HNSW/IVFFlat index support directly inside Postgres โ you get vector search and your relational data in one place, with standard SQL for filtering. For most teams building RAG applications, pgvector is sufficient and removes an entire database from your stack. Dedicated vector databases become necessary at very high vector counts (billions of vectors) or when you need sub-10ms ANN search across hundreds of millions of embeddings under high concurrency. Start with pgvector; graduate to dedicated vector DB only if you hit its limits.
Postgres remains the safest default going into 2026 โ and the gap has widened, not narrowed, as Postgres extensions (pgvector, TimescaleDB, PostGIS, pg_partman) have closed the gap with specialized databases for many workloads. Postgres is 30+ years old, completely open source (BSD license, no CLA), runs on every cloud, has the best documentation of any open-source database, and is known by more engineers than any alternative. For global ACID at large scale, Google Cloud Spanner is the safest choice if you are committed to GCP โ it has the longest production track record of any globally distributed database. For the emerging AI/embedding workload, Postgres + pgvector is safer than any dedicated vector database which may not exist in its current form in five years. The general principle: safety comes from maturity, ecosystem size, and portability โ not from novelty or marketing claims about "next-generation" architecture.