CAP Theorem & PACELC

In distributed systems, we often have to make hard choices about data consistency and availability, especially when the network is unreliable.

1. The CAP Theorem

The CAP Theorem states that a distributed system can provide at most two of the following three guarantees at the same time:

Consistency (C): Every read receives the most recent write or an error.
Availability (A): Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
Partition Tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

The CP vs AP Trade-off

Since network partitions cannot be ruled out in a distributed system, P is effectively mandatory. This leaves a choice between C and A, and that choice only has to be made while a partition is in progress:

CP (Consistency + Partition Tolerance): If a partition occurs, the system will stop responding to some requests to ensure data consistency.
AP (Availability + Partition Tolerance): If a partition occurs, nodes will continue responding with the data they have, even if it's stale.
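
The difference between the two modes can be sketched with a toy replica. This is a minimal illustration, not how any real database implements it: the names Replica, Mode, Unavailable, and demo are hypothetical, and the partition is simulated with a simple boolean flag.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    CP = "CP"
    AP = "AP"


class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it holds the latest write."""


@dataclass
class Replica:
    mode: Mode
    value: str = "v1"          # last value this replica has seen
    partitioned: bool = False  # True while cut off from the other replicas

    def read(self) -> str:
        if self.partitioned and self.mode is Mode.CP:
            # CP: refuse to answer rather than risk returning stale data.
            raise Unavailable("cannot reach peers; refusing a possibly stale read")
        # AP (or no partition): answer with whatever is stored locally.
        return self.value


def demo() -> dict:
    cp, ap = Replica(Mode.CP), Replica(Mode.AP)
    # Simulated partition: a newer write ("v2") lands elsewhere and never arrives here.
    cp.partitioned = ap.partitioned = True
    results = {"ap": ap.read()}  # stale, but the request succeeds
    try:
        results["cp"] = cp.read()
    except Unavailable:
        results["cp"] = "error"  # consistent, but the request fails
    return results
```

Calling demo() shows the AP replica returning the stale value "v1" while the CP replica returns an error. Neither outcome is a bug; each is the price of the guarantee the other one gives up.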

2. Beyond CAP: The PACELC Theorem

CAP only describes system behavior during a network partition. PACELC extends CAP by describing behavior during normal operation (when there is no partition).

(P) If there is a Partition, choose between (A) Availability and (C) Consistency.
(E) Else (no partition), choose between (L) Latency and (C) Consistency.

PACELC highlights that even when everything is running smoothly, we still have to choose between keeping data consistent (which takes time for synchronization) and responding quickly (latency).
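
The latency side of this trade-off can be illustrated with a small calculation. The sketch below assumes a write is sent to all replicas in parallel and returns once a chosen number of them have acknowledged it. The function name write_latency_ms and the round-trip times are made up for illustration, and local processing time is ignored.

```python
def write_latency_ms(replica_rtts_ms: list[float], acks_required: int) -> float:
    """Time a write waits before returning to the client.

    Replicas are contacted in parallel, so waiting for k acknowledgements
    costs as much as the k-th fastest round trip.
    """
    if not 0 <= acks_required <= len(replica_rtts_ms):
        raise ValueError("acks_required must be between 0 and the replica count")
    if acks_required == 0:
        return 0.0  # fully asynchronous replication: do not wait at all
    return float(sorted(replica_rtts_ms)[acks_required - 1])


# Hypothetical round trips: same rack, same region, another continent.
rtts = [2, 15, 80]

fast_but_weak = write_latency_ms(rtts, 0)    # favors latency (EL)
majority = write_latency_ms(rtts, 2)         # waits for 2 of 3 replicas
slow_but_strong = write_latency_ms(rtts, 3)  # favors consistency (EC)
```

With these assumed numbers, waiting for every replica costs 80 ms per write, waiting for a majority costs 15 ms, and waiting for none costs nothing but leaves replicas temporarily out of date. No partition is involved; the cost comes purely from synchronization.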

Knowledge Check

In a distributed system, why is Partition Tolerance (P) usually mandatory?

Because network failures are inevitable in large-scale systems.
