Scaling an Event Ticketing System

Jun 19, 2026

An event ticketing system is one of the most challenging distributed systems to design. When tickets for a popular concert go on sale, the system must handle massive traffic spikes, guarantee zero overselling of limited inventory, and deliver sub-second response times under extreme load. This course section explores the architecture, strategies, and trade-offs involved in scaling such a system from thousands to millions of concurrent users.

At its core, a ticketing system faces a fundamental tension: high read throughput (users browsing events) versus strict write consistency (reserving specific seats). The CAP theorem tells us that a distributed system cannot guarantee consistency, availability, and partition tolerance all at once; during a network partition it must give up either consistency or availability. In ticketing, consistency is non-negotiable: you cannot sell the same seat to two people.

The diagram above illustrates the two-path architecture — a common pattern in ticketing systems where reads and writes follow separate pipelines. Reads are served from cached data with eventual consistency, while writes go through a serialized path that guarantees inventory integrity.

Key Scalability Challenges

| Challenge | Description | Typical Scale |
|---|---|---|
| Flash Traffic | 10x-100x normal load in seconds | 100K+ concurrent users |
| Inventory Integrity | Prevent double-booking of seats | Zero oversell tolerance |
| Seat Selection Consistency | Lock seats during user decision window | 5–15 min hold timers |
| Payment Latency | 3rd-party gateway delays | 2–30 sec per transaction |
| Search & Discovery | Faceted event search under load | Millions of event records |

Footnotes

  1. Gilbert, S. & Lynch, N. "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services" — ACM SIGACT News, 2002. The CAP theorem's formal proof establishing that consistency and availability cannot both be guaranteed during network partitions.

  2. Kleppmann, M. "Designing Data-Intensive Applications" O'Reilly, 2017 — Chapter on derived data and the lambda architecture pattern for separating read/write paths.


Core Architectural Components

1. Load Balancing & Traffic Management

At the edge, a Global Server Load Balancer routes users to the nearest data center. Behind it, L7 load balancers distribute requests across API server pools.

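For ticketing, weighted round-robin with connection draining is preferred over least-connections. The reason: during a sale, new servers spinning up shouldn't receive reservation requests until their local caches are warm. Least-connections does the opposite, because a freshly started server has the fewest open connections and therefore attracts the most new traffic. With weighted round-robin, a new server can start at a low weight that is raised as it warms up.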
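The warm-up idea can be sketched in a few lines. This is a minimal illustration, not a production balancer: the `Backend` class, the linear ramp, and the 60-second warm-up window are all assumptions made for the example, and the naive schedule expansion ignores the smoothing that real load balancers apply.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int  # relative share of requests per scheduling round


def warm_up_weight(full_weight: int, seconds_since_start: float, warm_up_seconds: float) -> int:
    """Ramp a new server's weight linearly from 1 up to full_weight."""
    if seconds_since_start >= warm_up_seconds:
        return full_weight
    return max(1, int(full_weight * seconds_since_start / warm_up_seconds))


def build_schedule(backends: list[Backend]) -> list[str]:
    """Expand weights into one round of a (naive) weighted round-robin schedule."""
    schedule: list[str] = []
    for backend in backends:
        schedule.extend([backend.name] * backend.weight)
    return schedule


# Two warm servers at full weight, one server started 6 seconds ago.
backends = [
    Backend("warm-1", 10),
    Backend("warm-2", 10),
    Backend("new-1", warm_up_weight(10, seconds_since_start=6, warm_up_seconds=60)),
]
schedule = build_schedule(backends)
print(schedule.count("new-1"), "of", len(schedule), "requests per round go to the new server")
```

As the new server's cache warms, recomputing its weight and rebuilding the schedule gradually raises its share of traffic.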

2. Caching Strategy

The read path for ticketing is overwhelmingly dominant — 95%+ of requests are users browsing events, viewing seat maps, or checking availability. A multi-tier caching strategy is essential:

  • L1 — Browser Cache: Static event data, images, and venue maps served with Cache-Control headers
  • L2 — CDN Edge Cache: Event listings, pricing tiers cached at edge nodes; TTL set to 30–60s during live sales
  • L3 — Application Cache (Redis): Seat availability maps, event metadata; updated on every reservation via cache invalidation
  • L4 — Database Query Cache: Frequent queries for event lists, venue details

During a high-demand onsale event, cache hit ratios must stay above 90% for the system to survive. Every cache miss that reaches the database adds latency and load.
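To see why the hit ratio matters so much, it helps to compute the traffic that falls through to the database. The sketch below assumes a hypothetical 100,000 read requests per second; that figure is illustrative and does not come from the course material.

```python
def db_requests_per_sec(total_rps: float, hit_ratio: float) -> float:
    """Requests per second that miss every cache tier and reach the database."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_rps * (1.0 - hit_ratio)


# Hypothetical read load of 100,000 requests/sec at several hit ratios.
for ratio in (0.90, 0.95, 0.99):
    print(f"hit ratio {ratio:.0%}: about {db_requests_per_sec(100_000, ratio):,.0f} req/s reach the DB")
```

Moving from a 90% to a 99% hit ratio cuts database load by a factor of ten, which is why small drops in hit ratio during an onsale are so dangerous.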

3. The Inventory Problem — Preventing Overselling

This is the hardest problem in ticketing systems. You have $N$ seats for an event and $10N$ concurrent buyers. Solutions include:

Pessimistic Locking (Row-Level)

Traditional approach using SELECT ... FOR UPDATE. Simple but creates severe lock contention under high concurrency.

Optimistic Concurrency Control

Use version columns: UPDATE seats SET version=version+1, status='reserved' WHERE id=X AND version=V. If zero rows affected, the seat was already taken — retry or fail.
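The version-column check can be demonstrated end to end with an in-memory SQLite database. SQLite stands in for the production database here purely so the sketch is runnable; the `seats` schema is an assumption based on the UPDATE statement above.

```python
import sqlite3


def reserve_seat(conn: sqlite3.Connection, seat_id: int, expected_version: int) -> bool:
    """Reserve the seat only if its version still matches the one we read."""
    cur = conn.execute(
        "UPDATE seats SET version = version + 1, status = 'reserved' "
        "WHERE id = ? AND version = ?",
        (seat_id, expected_version),
    )
    conn.commit()
    # Zero affected rows means another buyer changed the row first.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seats (id INTEGER PRIMARY KEY, status TEXT, version INTEGER)")
conn.execute("INSERT INTO seats VALUES (1, 'available', 0)")
conn.commit()

# Two buyers both read the seat at version 0, then both try to reserve it.
first = reserve_seat(conn, seat_id=1, expected_version=0)
second = reserve_seat(conn, seat_id=1, expected_version=0)
print(first, second)  # True False
```

The losing buyer learns about the conflict from the row count alone, without holding any lock while deciding.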

Distributed Locking (Redis-based)

Use SETNX (SET if Not eXists) to atomically claim a seat in Redis, then persist to database asynchronously. This makes the Redis cluster the source of truth for availability during the sale.

Queue-Based Reservation

All reservation requests enter a message queue. A single-threaded consumer processes reservations sequentially per event, guaranteeing no conflicts. This is the approach used by large-scale systems.

$$\text{Throughput}_{\text{queue}} = \min\left(\frac{N_{\text{consumers}}}{t_{\text{process}}}, \frac{N_{\text{partitions}}}{t_{\text{partition}}}\right)$$

Where $N_{\text{consumers}}$ is the number of consumer instances, $t_{\text{process}}$ is the processing time per reservation, and $N_{\text{partitions}}$ is the Kafka partition count (one per event).
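The serialization guarantee of the queue-based approach can be sketched with Python's standard library. An in-process `queue.Queue` and a single thread stand in for one Kafka partition and its consumer; the seat IDs and user IDs are made up for the example.

```python
import queue
import threading


def run_consumer(requests: queue.Queue, seats: dict, results: list) -> None:
    """Process reservation requests one at a time, in arrival order."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: stop the consumer
            break
        user_id, seat_id = item
        if seat_id in seats and seats[seat_id] is None:
            seats[seat_id] = user_id
            results.append((user_id, seat_id, "reserved"))
        else:
            results.append((user_id, seat_id, "rejected"))


requests: queue.Queue = queue.Queue()
seats = {"A1": None, "A2": None}  # seat_id -> holder (None means available)
results: list = []

consumer = threading.Thread(target=run_consumer, args=(requests, seats, results))
consumer.start()

# Any number of producers may enqueue concurrently; only the consumer touches `seats`.
for request in [("u1", "A1"), ("u2", "A1"), ("u3", "A2")]:
    requests.put(request)
requests.put(None)
consumer.join()

print(results)
```

Because exactly one consumer mutates the seat map for an event, no locking is needed inside the reservation logic itself. The cost is that per-event throughput is bounded by that single consumer.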

Footnotes

  1. Nginx documentation on load balancing algorithms and connection draining: https://docs.nginx.com/nginx/admin-guide/load-balancer/

  2. Redis Labs case study on ticketing system caching patterns — "Achieving 99.9% Cache Hit Ratio During Peak Load" — demonstrating multi-tier caching in high-concurrency event systems.

  3. Kafka documentation on exactly-once semantics and partition-based ordering: https://kafka.apache.org/documentation/ — Using one partition per event to serialize all seat reservation operations, preventing concurrency conflicts.

The Overselling Nightmare

Overselling is unrecoverable in ticketing. You cannot ask a customer to give up their seat after purchase. Every architectural decision must prioritize inventory integrity over availability. During flash sales, it is better to reject requests (fail closed) than to sell the same seat twice (fail open).

Ticket Reservation Flow at Scale

  1. The user selects a seat on the interactive seat map. The frontend sends a POST /api/events/{id}/seats/{seatId}/hold request to the API Gateway. The request includes the user's session token and an idempotency key to handle network retries safely.

  2. The API Gateway checks rate limits, typically 10 requests/second per user during onsales. Bots and scrapers are filtered using CAPTCHA, browser fingerprinting, and token bucket algorithms. Malicious traffic is rejected before reaching the reservation service.

  3. The Reservation Service attempts to acquire a Redis distributed lock on the seat using SET seat:{eventId}:{seatId} <userId> NX EX 600 (10-minute TTL). If the lock succeeds, the seat is temporarily held. If it fails (key already exists), return 409 Conflict: the seat is held by another user.

  4. After acquiring the Redis lock, the service writes the reservation to the primary PostgreSQL database with status HELD. An outbox pattern is used: the reservation and an event record are written in a single transaction. A CDC (Change Data Capture) pipeline picks up the outbox entry and publishes it to Kafka.

  5. The user is redirected to payment. A 10–15 minute countdown timer starts. The payment service calls the 3rd-party payment gateway (Stripe/PayPal). On success, the seat status becomes CONFIRMED. On failure or timeout, a scheduled job releases the hold and deletes the Redis lock, returning the seat to the available pool.

  6. Once the reservation is confirmed, the system invalidates the Redis availability cache for that event (DEL availability:{eventId}). A WebSocket push notification updates the seat map for all connected users in real-time. The final state is: Database = CONFIRMED, Redis Lock = expired naturally, Cache = stale (invalidated), User = notified via email + push.
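The outbox write in step 4 can be sketched as follows. SQLite stands in for PostgreSQL so the example runs anywhere, and the `reservations` and `outbox` table layouts are assumptions. The CDC relay that would publish outbox rows to Kafka is not shown.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (seat_id TEXT PRIMARY KEY, user_id TEXT, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, event_type TEXT, payload TEXT)"
)


def hold_with_outbox(conn: sqlite3.Connection, seat_id: str, user_id: str) -> bool:
    """Write the HELD reservation and its outbox record in one transaction."""
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute(
                "INSERT INTO reservations VALUES (?, ?, 'HELD')", (seat_id, user_id)
            )
            conn.execute(
                "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                ("SEAT_HELD", json.dumps({"seatId": seat_id, "userId": user_id})),
            )
        return True
    except sqlite3.IntegrityError:
        # The seat already has a reservation row, so neither table was changed.
        return False


ok = hold_with_outbox(conn, "A1", "u1")
duplicate = hold_with_outbox(conn, "A1", "u2")
print(ok, duplicate)  # True False
```

Because both inserts share a transaction, a downstream consumer never sees a SEAT_HELD event without a matching reservation row, and never misses an event for a committed reservation.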

Handling Flash Sales — The Queue-Based Approach

The most robust pattern for extreme scale is virtual queuing, made famous by systems like Queue-it and used by Ticketmaster during onsales. Instead of letting all users hit the reservation service simultaneously, the system places them in a waiting room and admits them at a controlled rate.

The key parameters for virtual queuing:

| Parameter | Description | Typical Value |
|---|---|---|
| Drain Rate | Users let through per second | 50–200/s |
| Batch Size | Users released per interval | 100–500 |
| Hold Timer | Time user has to complete purchase | 10–15 min |
| Queue TTL | Max wait time before session expires | 2–4 hours |
| Heartbeat Interval | Keep-alive polling from waiting room | 15–30s |

The drain rate $r$ is calculated based on the reservation service capacity $C$ and average purchase time $T$:

$$r = \frac{C}{T} \times \text{safety\_factor}$$

For example, if the reservation cluster can handle 5,000 concurrent sessions and the average purchase takes 5 minutes, then:

$$r = \frac{5000}{300\text{ s}} \times 0.8 = 13.3 \text{ users/sec}$$

This is deliberately conservative — a safety factor of 0.8 prevents the system from becoming saturated.
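The same calculation as a small helper, useful for trying other capacities and purchase times. The function and parameter names are illustrative, not part of any real queueing product.

```python
def drain_rate(capacity_sessions: int, avg_purchase_seconds: float, safety_factor: float = 0.8) -> float:
    """Users per second the waiting room may release: r = C / T * safety_factor."""
    if avg_purchase_seconds <= 0:
        raise ValueError("avg_purchase_seconds must be positive")
    return capacity_sessions / avg_purchase_seconds * safety_factor


rate = drain_rate(capacity_sessions=5000, avg_purchase_seconds=300)
print(round(rate, 1))  # 13.3
```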

Footnotes

  1. Queue-it virtual waiting room technology — used by Ticketmaster and other major ticketing platforms to manage flash sale traffic. See: https://queue-it.com/

Queue Position Optimization

Pre-sort users in the virtual queue by session quality score. Users with verified accounts, payment methods on file, and no bot signals get a higher priority within their position band. This reduces abandonments in the purchase flow and increases revenue per onsale.
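One way to read "higher priority within their position band" is a two-level sort: first by band, then by score inside the band. The sketch below assumes a fixed `band_size` and a numeric `quality_score` in which higher is better. Both are hypothetical choices, since the text does not specify how bands or scores are defined.

```python
from dataclasses import dataclass


@dataclass
class QueuedUser:
    user_id: str
    position: int         # arrival order in the queue, 0-based
    quality_score: float  # higher means more trusted (assumed scale)


def order_for_release(users: list[QueuedUser], band_size: int) -> list[QueuedUser]:
    """Keep bands in arrival order; inside a band, release high scores first."""
    return sorted(
        users,
        key=lambda u: (u.position // band_size, -u.quality_score, u.position),
    )


queue_snapshot = [
    QueuedUser("a", 0, 0.20),
    QueuedUser("b", 1, 0.90),
    QueuedUser("c", 2, 0.50),
    QueuedUser("d", 3, 0.95),
]
ordered = order_for_release(queue_snapshot, band_size=2)
print([u.user_id for u in ordered])  # ['b', 'a', 'd', 'c']
```

Banding limits how far anyone can be pushed back: a low-score user loses at most `band_size - 1` places, so arrival order still dominates overall.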

Scaling Lifecycle of a Ticketing System

Monolith Phase

Stage 1

Single server running a Rails/Django app with a relational DB. Handles 100–1,000 concurrent users. Pessimistic locking (SELECT FOR UPDATE) works fine at this scale. Suitable for local venues and small events.

Horizontal Read Scaling

Stage 2

Add read replicas, Redis cache layer, and CDN for static assets. Introduce a load balancer. This extends capacity to ~10,000 concurrent users. Inventory integrity still handled by primary DB with row locks.

Service Decomposition

Stage 3

Split into microservices: Event Service, Inventory Service, Reservation Service, Payment Service, Notification Service. Introduce Kafka for async event-driven communication. Capacity: ~50,000 concurrent users.

Distributed Inventory

Stage 4

Move inventory availability to Redis as source of truth during sales. Use Redis Cluster for sharding by event_id. Implement queue-based reservation. Add virtual waiting room for onsales. Capacity: 100K–500K concurrent users.

Global Multi-Region

Stage 5

Deploy across multiple cloud regions with active-passive failover. Implement CRDT-based caches for read availability. Use dedicated queue drain clusters per region. Capacity: 1M+ concurrent users globally.

Figure: Concurrent User Capacity by Scaling Stage (maximum concurrent users handled at each architectural stage).

Database Architecture & Sharding Strategy

The database layer is where consistency meets scale. For a ticketing system, we use a hybrid approach — different database technologies for different access patterns:

Primary Database: PostgreSQL (OLTP)

PostgreSQL with serializable isolation level for all reservation and payment operations. Key optimizations:

  • Partition events by date range — old events are archived to cold storage
  • Index heavily on (event_id, seat_status) for availability queries
  • Use advisory locks for event-level operations: pg_advisory_lock(event_id) serializes all seat modifications for a given event in one DB node

Sharding Strategy

For global scale, shard by event_id using hash-based sharding:

$$\text{shard} = \text{hash}(\text{event\_id}) \bmod N_{\text{shards}}$$

This ensures all seats for a single event live on the same shard, eliminating cross-shard transactions for the common case of "reserve seat for event X." The shard count $N_{\text{shards}}$ should be set to 2× the number of physical DB nodes to allow for future resharding via consistent hashing.
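A minimal routing function for the formula above might look like this. SHA-256 is an arbitrary choice of stable hash for the sketch. The text does not name a hash function, and the only requirement is that every application server computes the same value for the same `event_id`.

```python
import hashlib


def shard_for_event(event_id: int, num_shards: int) -> int:
    """Map an event to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(str(event_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# Every seat operation for event 42 is routed to the same shard.
print(shard_for_event(42, num_shards=16))
```

A cryptographic digest is used rather than Python's built-in `hash()` because the built-in is randomized per process for strings and bytes, so it is not suitable for routing decisions shared across servers.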

Event Sourcing for Audit Trail

Every state transition in the ticketing lifecycle is stored as an event record:

| Event Type | Description | Example |
|---|---|---|
| SEAT_HELD | User selected seat, timer started | {seatId, userId, expiresAt} |
| SEAT_RELEASED | Hold expired or user abandoned | {seatId, reason} |
| PAYMENT_INITIATED | Payment gateway called | {seatId, amount, gatewayRef} |
| SEAT_CONFIRMED | Payment successful | {seatId, orderId, confirmationCode} |
| SEAT_REFUNDED | Refund processed | {seatId, refundAmount, reason} |

This event log enables perfect auditability — critical for the ticketing industry where disputes, fraud detection, and regulatory compliance require a complete history.
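With an event log like this, the current state of any seat can be rebuilt by replaying its events in order. The sketch below uses the event type names from the table. The resulting status names AVAILABLE and REFUNDED, and the dictionary shape of each event, are assumptions for illustration.

```python
# Seat status implied by each event type. PAYMENT_INITIATED is deliberately
# absent because it does not change the seat's status.
STATUS_BY_EVENT_TYPE = {
    "SEAT_HELD": "HELD",
    "SEAT_RELEASED": "AVAILABLE",
    "SEAT_CONFIRMED": "CONFIRMED",
    "SEAT_REFUNDED": "REFUNDED",
}


def current_status(events: list[dict]) -> dict[str, str]:
    """Replay the log in order and return the latest status per seat."""
    seats: dict[str, str] = {}
    for event in events:
        new_status = STATUS_BY_EVENT_TYPE.get(event["type"])
        if new_status is not None:
            seats[event["seatId"]] = new_status
    return seats


log = [
    {"type": "SEAT_HELD", "seatId": "A1", "userId": "u1"},
    {"type": "SEAT_HELD", "seatId": "A2", "userId": "u2"},
    {"type": "PAYMENT_INITIATED", "seatId": "A1", "amount": 120},
    {"type": "SEAT_RELEASED", "seatId": "A2", "reason": "hold expired"},
    {"type": "SEAT_CONFIRMED", "seatId": "A1", "orderId": "o-1"},
]
print(current_status(log))  # {'A1': 'CONFIRMED', 'A2': 'AVAILABLE'}
```

Replaying a prefix of the log answers "what was the state at time t", which is what dispute resolution typically needs.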

Footnotes

  1. Vogels, W. "Eventually Consistent" — ACM Queue, 2009. Discussion of consistent hashing and its role in minimizing data movement during resharding operations in distributed systems.

Edge Cases & Advanced Topics

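```python
import redis

# Connection details are placeholders; the source elides the cluster configuration.
r = redis.Redis(host="localhost", port=6379)


def publish_reservation_event(event_id: int, seat_id: str, user_id: str) -> None:
    """Hypothetical hook: hand the hold to a queue so a worker persists it to the DB."""


def hold_seat(event_id: int, seat_id: str, user_id: str) -> bool:
    """Try to hold a seat for 10 minutes; return True if this user now holds it."""
    lock_key = f"seat:{event_id}:{seat_id}"
    # NX = only set if the key does not exist, EX = TTL in seconds
    acquired = r.set(lock_key, user_id, nx=True, ex=600)
    if acquired:
        # Persist to the database asynchronously
        publish_reservation_event(event_id, seat_id, user_id)
        return True
    return False
```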

Key Concepts in Ticketing System Scalability

Virtual Waiting Room

A controlled queue that releases users at a rate the reservation system can handle. Prevents thundering herd during flash sales. Used by Ticketmaster and implemented by services like Queue-it.

Real-World Capacity Planning

Based on industry data from large-scale ticketing platforms:

| Metric | Estimate | Source |
|---|---|---|
| Taylor Swift Eras Tour onsale | 3.5M verified fans queued | Ticketmaster |
| Peak concurrent users during onsale | 500K–2M | Industry reports |
| Average tickets per transaction | 2–4 | Industry average |
| Abandonment rate after hold | 15–25% | Payment analytics |
| Bot traffic during popular onsales | 60–80% of total requests | Bot management firms |
| Revenue loss from bot scalping | $15B+ annually (US) | Research estimates |

Capacity planning must account for the worst-case concurrent write throughput:

$$W_{\text{peak}} = \frac{U_{\text{concurrent}} \times R_{\text{reservation\_rate}}}{1 - A_{\text{abandonment}}}$$

Where $U_{\text{concurrent}}$ is peak concurrent users, $R_{\text{reservation\_rate}}$ is the fraction of users attempting to reserve per second (~5%), and $A_{\text{abandonment}}$ is the hold abandonment rate. For 1M concurrent users:

$$W_{\text{peak}} = \frac{1{,}000{,}000 \times 0.05}{1 - 0.20} = 62{,}500 \text{ reservations/sec (attempted)}$$

With a virtual queue draining at 50 users/sec and 80% converting to reservations:

$$W_{\text{actual}} = 50 \times 0.80 = 40 \text{ reservations/sec}$$

This is why virtual queuing is essential: it reduces the write load from 62,500/sec to 40/sec, a reduction of roughly 1,500× (62,500 / 40 ≈ 1,560).
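The two write-rate estimates can be wrapped in helpers to explore other scenarios. The function names are illustrative, and the inputs below are the same example values used in the formulas above.

```python
def peak_write_rate(concurrent_users: int, reservation_rate: float, abandonment: float) -> float:
    """Attempted reservations/sec without a queue: U * R / (1 - A)."""
    if not 0.0 <= abandonment < 1.0:
        raise ValueError("abandonment must be in [0, 1)")
    return concurrent_users * reservation_rate / (1.0 - abandonment)


def queued_write_rate(drain_rate_per_sec: float, conversion: float) -> float:
    """Reservations/sec when a virtual queue meters admissions."""
    return drain_rate_per_sec * conversion


w_peak = peak_write_rate(1_000_000, reservation_rate=0.05, abandonment=0.20)
w_actual = queued_write_rate(50, conversion=0.80)
print(f"unqueued: {w_peak:,.0f}/s, queued: {w_actual:,.0f}/s, ratio: {w_peak / w_actual:,.0f}x")
```

Raising the drain rate to the 200/s upper end of the earlier parameter table still leaves the queued write load orders of magnitude below the unqueued peak.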

Footnotes

  1. Ticketmaster Verified Fan program data and industry reports on bot traffic during major onsales — including analysis of the Taylor Swift Eras Tour onsale incident and subsequent congressional testimony.

Don't Forget the Payment Gateway

The payment gateway is often the bottleneck in a ticketing system. Third-party gateways (Stripe, Adyen) have rate limits — typically 100–500 req/sec per merchant account. Request rate limit increases well in advance of major onsales. Account for 3–5 second HTTP timeouts with retry logic and circuit breakers. A payment gateway outage at minute 5 of a 10-minute hold window is catastrophic.
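A circuit breaker for gateway calls can be sketched as a small state holder. This is a simplified illustration, assuming a consecutive-failure threshold and a fixed cool-down. The class, its parameters, and the injectable `clock` are hypothetical, and production libraries add features such as rolling windows and a limit on trial requests.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down has passed."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Open: block until the cool-down elapses, then allow a trial request.
        return self.clock() - self.opened_at >= self.reset_after_seconds

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open, or re-open after a failed trial


breaker = CircuitBreaker(failure_threshold=3, reset_after_seconds=30.0)
if breaker.allow_request():
    # Call the gateway here, then report the outcome to the breaker.
    breaker.record_success()
```

While the breaker is open, the payment service can fail fast and tell the user to retry, rather than letting requests pile up behind 3–5 second timeouts as hold timers run down.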

Pre-warm Everything

Before a major onsale, pre-warm all caches. Load event data, seat maps, and pricing into Redis and CDN. Scale up API servers, reservation consumers, and DB read replicas 30 minutes before the onsale time. Enable autoscaling with a minimum floor; don't scale to zero and wait for scale-up during a sale. Pre-allocate the Redis connection pool to the maximum expected concurrency.

