Distributed Systems: Architecture, Coordination, and Consensus

Distributed Systems: Architecture, Coordination, and Consensus

Verified Sources
May 19, 2026

At its core, a distributed system consists of multiple autonomous machines that communicate over a network to coordinate actions and share resources . Unlike centralized configurations where a single system acts as the sole compute node, distributed systems leverage parallelism to achieve scale, fault tolerance, and low latency.

However, moving from a single-node design to a distributed paradigm introduces fundamental engineering challenges. The system must coordinate state changes across nodes without relying on a shared physical memory or a highly precise global physical clock. Instead, nodes must rely on asynchronous message passing, which is prone to arbitrary delays, package loss, and complete connection drops.

Core Characteristics of Distributed Systems

  1. Concurrency of Components: System components execute code simultaneously on separate hardware instances.
  2. Lack of a Global Clock: Physical clocks drift over time, making it impossible to rely on wall-clock timestamps alone for ordering sequential actions without margin of error.
  3. Independent Failures: Individual nodes can fail (crash, reboot, or run out of memory) while the rest of the system remains operational.

The Fallacies of Distributed Computing

A major source of engineering bugs stems from the "Fallacies of Distributed Computing"—a set of common, false assumptions developers make:

  • The network is reliable.
  • network latency is zero.
  • Bandwidth is infinite.
  • The network is secure.
  • Topology does not change.
  • There is one administrator.
  • Transport cost is zero.
  • The network is homogeneous.

When a network partition occurs, these fallacies are laid bare, forcing system architectures to build robust fault-tolerant schemes.

Footnotes

  1. Designing Data-Intensive Applications - Martin Kleppmann's authoritative book on distributed data systems design.

Distributed Systems Explained | System Design Interview Basics

Theoretical Foundations: CAP & PACELC Theorems

To build predictable systems, engineers must navigate formal theoretical constraints. The most prominent of these is the CAP theorem . Formally stated:

C (Linearizability)A (Availability)    No partitions possibleC \text{ (Linearizability)} \land A \text{ (Availability)} \implies \text{No partitions possible}

Because physical networks are inherently prone to partitions (PP), a distributed database must choose one of two options when a partition occurs:

  1. Consistency (CPCP): The system rejects incoming read or write operations to prevent stale or conflicting data states across disconnected partitions. This prioritizes linearizability.
  2. Availability (APAP): The system accepts read or write operations on any accessible node, prioritizing responsiveness at the cost of temporary data divergence. The system later reconciles states via eventual consistency mechanisms.

Expanding CAP: The PACELC Theorem

The CAP theorem only describes system behavior during an active network partition. The PACELC theorem extends this by describing trade-offs under normal execution:

If P (Partition), choose AC;Else (E), choose L (Latency)C (Consistency)\text{If } \mathbf{P} \text{ (Partition), choose } \mathbf{A} \lor \mathbf{C}; \quad \text{Else } (\mathbf{E}), \text{ choose } \mathbf{L} \text{ (Latency)} \lor \mathbf{C} \text{ (Consistency)}

SystemPartition BehaviorNormal Operation BehaviorArchetype
MongoDBConsistent (CC)Consistent (CC)PC/EC
CassandraAvailable (AA)Latency (LL)PA/EL
SpannerConsistent (CC)Consistent (CC)PC/EC

Footnotes

  1. CAP Theorem Introduction - Overview of Brewer's CAP Theorem constraints in database design.

The 'CA' System Myth

Many designers claim their database is a 'CA' system because it operates on highly reliable enterprise hardware. However, a CA system is mathematically impossible across a real network because physical connections can always fail. A network partition (PP) is a physical reality, not an architectural choice.

Replication Protocol Latency Trade-offs

Average write latency (ms) vs replication consensus depth (N=3, N=5, N=9)

Raft Consensus Leader Election Workflow

  1. 1
    Step 1

    All nodes start in the Follower state. Each node maintains a randomized election timeout (typically 150ms - 300ms). If a follower fails to receive heartbeats from an active leader before the timeout expires, it assumes the leader has failed and transitions to the Candidate state .

    Footnotes

    1. Raft Consensus Algorithm - The original paper and visualization resource for the Raft consensus protocol.

  2. 2
    Step 2

    The follower increments its current epoch term, votes for itself, and broadcasts RequestVote Remote Procedure Calls (RPCs) to all other nodes in the network cluster. This initiates a new election phase.

  3. 3
    Step 3

    Nodes receive the RequestVote RPC and grant their vote if the candidate's log is at least as up-to-date as the receiver's own log, and the receiver hasn't already voted in this term. The candidate waits for a quorum: N/2+1\lfloor N/2 \rfloor + 1 votes.

  4. 4
    Step 4

    Once a candidate secures a quorum of votes, it ascends to the Leader state. It immediately begins broadcasting AppendEntries RPCs (heartbeats) to all peer nodes to establish authority and prevent new elections.

Preventing Split-Brain Scenarios

By enforcing a strict quorum size of N/2+1\lfloor N/2 \rfloor + 1, the system prevents split-brain scenarios where two independent partitions of nodes both elect their own leader simultaneously. Two overlapping majorities cannot exist in a system of size NN.

Consistent hashing maps both data keys and cluster nodes to a continuous circular coordinate space called a hash ring.

Key Benefits:

  • Minimizes data movement when nodes are added or removed (only K/nK/n keys relocated, where KK is total keys and nn is node count).
  • Utilizes virtual nodes (vnodes) to prevent hotspotting and achieve uniform load distribution across heterogeneous hardware.

Advanced Distributed System Edge Cases

Knowledge Check

Question 1 of 3
Q1Single choice

Under the PACELC theorem, what does a system like Apache Cassandra prioritize in the absence of network partitions (the 'Else' condition) if it is configured for low latency?

Explore Related Topics

1

Algorithms: Foundations, Analysis, and Design Paradigms

Algorithms are finite, well‑defined procedures that transform inputs into outputs, and their study emphasizes correctness, efficiency, and scalability.

  • Good algorithms exhibit finiteness, definiteness, clear I/O, effectiveness, and reasonable time/space usage.
  • Asymptotic analysis uses OO, Θ\Theta, Ω\Omega; e.g., merge sort T(n)=2T(n/2)+O(n)T(n)=2T(n/2)+O(n) gives O(nlogn)O(n\log n), binary search yields O(logn)O(\log n).
  • Design paradigms—divide‑and‑conquer, greedy, dynamic programming, backtracking, branch‑and‑bound—fit different problem structures.
  • Standard analysis steps: define input size nn, count dominant operations, derive growth, examine best/average/worst cases, and assess space.
  • Core algorithm families include searching (linear, binary), sorting (merge sort), graph traversal (BFS/DFS), and shortest‑path (Dijkstra).
2

Data Communication Components: Various Connection Topology, Protocols and Standards

Data communication fundamentals are presented, detailing the five essential components, common physical and logical topologies, protocol layering (OSI and TCP/IP), and the standards bodies that ensure interoperability.

  • Core components: message, sender, receiver, transmission medium, protocol; transmission modes include simplex, half‑duplex, and full‑duplex.
  • Topologies: bus, star, ring, mesh, tree, hybrid—each balancing cost, fault tolerance, scalability, and complexity.
  • Protocols define syntax, semantics, and timing; OSI (7 layers) and TCP/IP (4 layers) use key protocols such as IP, TCP, UDP, HTTP.
  • Standards from ISO, ITU‑T, IEEE (e.g., 802.3 Ethernet, 802.11 Wi‑Fi) and IETF guarantee vendor‑independent communication.
  • Design guidance: align requirements with appropriate topology, media, protocol stack, and verify compliance with relevant standards.
3

Systems Programming: Processes, Memory, Concurrency, and Operating-System Interfaces