Glossary

CAP Theorem

The CAP theorem states a distributed data store can guarantee at most two of three properties: Consistency (all reads return the most recent write), Availability (all requests receive a response), and Partition tolerance (the system operates despite network partitions).

Explanation

A network partition is a split where some nodes can't communicate with others. In any real distributed system on separate machines or data centers, partitions are inevitable. Therefore, every distributed system must tolerate partitions — P is not optional. Given P is required, the real choice during a partition is between C and A: Consistency (refuse requests that might return stale data — wait until the partition heals) or Availability (continue serving requests but potentially returning stale data). This yields the two practical categories: CP systems (consistent, may be unavailable during partitions — Zookeeper, Etcd, HBase) and AP systems (always available, eventually consistent — Cassandra, DynamoDB, CouchDB). Traditional SQL databases are primarily designed for single-node use. With synchronous replication (all replicas must commit before acknowledging a write), they're effectively CP. With async replication (read replicas), reads are AP — replicas may be stale. Beyond the binary: CAP is a useful framework but oversimplified. PACELC extends it: even when there's no partition (E = else), there's a latency (L) vs. consistency (C) trade-off. Systems like Cassandra tune consistency per-request: consistency.one gives AP behavior; consistency.quorum gives CP behavior.

Code Example

javascript

// CAP in practice: Cassandra tunable consistency

const cassandra = require('cassandra-driver');
const client = new cassandra.Client({
  contactPoints: ['node1', 'node2', 'node3'],
  localDataCenter: 'datacenter1',
});

// AP behavior: read from 1 replica (fastest, may be stale)
await client.execute(
  'SELECT * FROM users WHERE id = ?', [userId],
  { consistency: cassandra.types.consistencies.one }
);

// CP behavior: read from quorum of replicas (consistent, slower)
await client.execute(
  'SELECT balance FROM accounts WHERE id = ?', [accountId],
  { consistency: cassandra.types.consistencies.quorum }
);

// Write with quorum consistency (most replicas must acknowledge)
await client.execute(
  'UPDATE accounts SET balance = ? WHERE id = ?',
  [newBalance, accountId],
  { consistency: cassandra.types.consistencies.quorum }
);

// Database selection by CAP property:
// CP: PostgreSQL (synchronous replication), Zookeeper, Etcd
// AP: DynamoDB, Cassandra, CouchDB
// Choose based on: does your use case require strong consistency?

Why It Matters for Engineers

The CAP theorem is foundational vocabulary for distributed systems. 'What consistency model does your database use?' is a standard system design interview question. Understanding CP vs. AP helps you choose the right database: financial systems need CP; social features can tolerate AP. CAP also explains why distributed systems behave differently from single-node systems: stale reads, write conflicts, and availability during failures are all CAP trade-offs in action. Understanding the theorem frames the entire landscape of distributed systems design.

Learn This In Practice

Go deeper with the full module on Beyond Vibe Code.

Systems Design Fundamentals → →