Coursify

System Design for Software Engineers

CDNs: Delivering Content at the Edge

A Content Delivery Network (CDN) is a geographically distributed group of servers that work together to deliver internet content quickly. A CDN speeds up the transfer of the assets needed to load a page, including HTML pages, JavaScript files, stylesheets, images, and videos.

Without a CDN, every user in the world must fetch content from your origin server, potentially thousands of miles away. With a CDN, users fetch content from an Edge Location much closer to them, drastically reducing latency.

How a CDN Works

  1. User Request: A user in Tokyo requests an image from example.com.
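  2. DNS Routing: DNS resolution for example.com directs the user to a nearby CDN edge server, either by returning a region-specific IP address (GeoDNS) or by returning an Anycast IP address that the network routes to the closest edge.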
  3. Cache Hit: If the edge server has the image in its cache, it serves it immediately.
  4. Cache Miss: If the edge server doesn't have the image, it fetches it from the Origin Server (your main server), saves it locally for future requests, and serves it to the user.
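The hit and miss paths in steps 3 and 4 can be sketched as a small pull-through cache. This is a minimal illustration, not how any particular CDN is implemented: EdgeServer, fake_origin, and origin_calls are hypothetical names, and the origin is a plain function standing in for an HTTP request.

```python
from typing import Callable, Dict, List, Tuple


class EdgeServer:
    """Toy edge server: serves from a local cache, falls back to the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}

    def handle(self, path: str) -> Tuple[str, bytes]:
        # Step 3: cache hit, serve straight from local storage.
        if path in self._cache:
            return "HIT", self._cache[path]
        # Step 4: cache miss, fetch from the origin, store, then serve.
        body = self._fetch_from_origin(path)
        self._cache[path] = body
        return "MISS", body


origin_calls: List[str] = []


def fake_origin(path: str) -> bytes:
    # Hypothetical stand-in for an HTTP request to the origin server.
    origin_calls.append(path)
    return f"contents of {path}".encode()


edge = EdgeServer(fake_origin)
first = edge.handle("/logo.png")   # first request goes to the origin
second = edge.handle("/logo.png")  # second request is served by the edge
print(first[0], second[0], len(origin_calls))  # MISS HIT 1
```

The origin is contacted once, no matter how many later requests arrive for the same path at this edge.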

Push vs. Pull CDNs

  • Pull CDN: The CDN pulls content from your origin server only when a user requests it for the first time. This is the most common approach and is very easy to set up.
  • Push CDN: You proactively push content to the CDN servers whenever it changes. This is useful for large files or content that doesn't change often but needs to be available instantly globally.
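The difference between the two models comes down to who triggers the copy and when. The sketch below assumes a toy Edge class with an in-memory store; push, pull, and the file names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


class Edge:
    """Toy edge server with an in-memory store."""

    def __init__(self, region: str) -> None:
        self.region = region
        self.store: Dict[str, bytes] = {}


def push(edges: List[Edge], path: str, body: bytes) -> None:
    """Push model: the publisher uploads to every edge ahead of any request."""
    for edge in edges:
        edge.store[path] = body


def pull(edge: Edge, path: str, origin: Dict[str, bytes]) -> bytes:
    """Pull model: an edge copies from the origin on its first request."""
    if path not in edge.store:
        edge.store[path] = origin[path]
    return edge.store[path]


origin = {"/app.css": b"body{}", "/installer.bin": b"v1"}
tokyo, london = Edge("tokyo"), Edge("london")

# Pushed file: present on every edge before anyone asks for it.
push([tokyo, london], "/installer.bin", origin["/installer.bin"])

# Pulled file: present only on the edge that has received a request for it.
pull(tokyo, "/app.css", origin)

print("/installer.bin" in london.store)  # True
print("/app.css" in london.store)        # False
```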

Caching Strategies and Invalidation

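The effectiveness of a CDN depends largely on its Cache Hit Ratio, the share of requests an edge server can answer from its cache without contacting the origin.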

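  • Query String Caching: Should image.jpg?v=1 and image.jpg?v=2 be treated as the same file or different ones? The answer depends on whether the CDN includes the query string in its cache key. Including it lets ?v=2 act as a new version of the file; ignoring it raises the hit ratio when query parameters don't change the response.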
  • Cache Invalidation (Purging): When you update a file on your origin, you need to tell the CDN to delete the old version from its thousands of edge servers. This is called "purging" or "invalidating" the cache.
  • TTL (Time To Live): Just like DNS records, cached assets have a TTL that tells the edge server how long it may keep serving a file before checking the origin for an update.
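The three ideas above can be sketched together in a few lines. This is a simplified model under stated assumptions: cache_key and TtlCache are hypothetical names, the clock is injectable so expiry can be shown without waiting, and an expired entry is simply dropped (real edges often revalidate with the origin instead of discarding the file).

```python
import time
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urlsplit


def cache_key(url: str, include_query: bool) -> str:
    """Build the key an edge uses to decide whether two URLs are the same object."""
    parts = urlsplit(url)
    key = parts.netloc + parts.path
    if include_query and parts.query:
        key += "?" + parts.query
    return key


class TtlCache:
    """Toy edge cache with per-entry expiry and explicit purge."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expires_at, body)
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def put(self, key: str, body: bytes) -> None:
        self._entries[key] = (self._clock() + self._ttl, body)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            # TTL elapsed: drop the entry so the caller goes back to the origin.
            del self._entries[key]
            return None
        return body

    def purge(self, key: str) -> None:
        # Explicit invalidation, independent of the TTL.
        self._entries.pop(key, None)


now = [0.0]  # fake clock, in seconds
cache = TtlCache(ttl_seconds=60, clock=lambda: now[0])

v1 = cache_key("https://example.com/image.jpg?v=1", include_query=True)
v2 = cache_key("https://example.com/image.jpg?v=2", include_query=True)
print(v1 == v2)  # False: with the query string in the key, ?v=2 is a new object

cache.put(v1, b"old image")
now[0] = 30.0
print(cache.get(v1))  # b'old image' (still within the 60 s TTL)
now[0] = 61.0
print(cache.get(v1))  # None (expired)
```

With include_query=False both URLs map to the key example.com/image.jpg, so the edge would serve the same cached bytes for either version until the entry expires or is purged.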

The Lifecycle of a CDN Request

  1. The user enters https://example.com/logo.png in their browser. The browser performs a DNS lookup.
  2. The DNS resolution returns an IP address for a CDN edge server that is geographically close to the user (e.g., a server in London for a user in the UK).
  3. The London edge server receives the request and checks its local storage (SSD/RAM) for logo.png.
  4. If the file is missing (Cache Miss), the edge server establishes a high-speed connection to your origin server in San Francisco, downloads the file, and stores it.
  5. The edge server sends the file to the user. Future users in London requesting the same file will now experience a 'Cache Hit' and receive the file instantly.
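To make the "geographically close" step concrete, the sketch below picks the nearest edge by great-circle distance and estimates the best-case round trip over fiber. Everything here is an assumption for illustration: the edge names, the approximate city coordinates, and the rule itself, since real CDNs typically route on resolver location, network topology, and measured latency rather than raw distance. The fiber speed of about 200,000 km/s (roughly two thirds of the speed of light in vacuum) gives a lower bound only; real round trips are longer.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0
FIBER_SPEED_KM_PER_S = 200_000.0  # approximate signal speed in optical fiber


def great_circle_km(a: Coord, b: Coord) -> float:
    """Haversine distance between two points on the Earth's surface."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_edge(user: Coord, edges: Dict[str, Coord]) -> str:
    """Return the name of the edge closest to the user."""
    return min(edges, key=lambda name: great_circle_km(user, edges[name]))


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip propagation delay over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


# Hypothetical edge locations with approximate coordinates.
EDGES: Dict[str, Coord] = {
    "london": (51.51, -0.13),
    "tokyo": (35.68, 139.69),
    "san_francisco": (37.77, -122.42),
}
user_in_manchester: Coord = (53.48, -2.24)

chosen = nearest_edge(user_in_manchester, EDGES)
to_edge = great_circle_km(user_in_manchester, EDGES[chosen])
to_origin = great_circle_km(user_in_manchester, EDGES["san_francisco"])

print(chosen)
print(f"edge:   {min_round_trip_ms(to_edge):.1f} ms minimum round trip")
print(f"origin: {min_round_trip_ms(to_origin):.1f} ms minimum round trip")
```

The gap between the two printed figures is the physical floor on what a cache hit at the edge can save per round trip, before any server or network overhead.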

Benefits of Using a CDN

  • Lower Latency: Content travels a shorter physical distance.
  • Reduced Origin Load: Your main server doesn't have to serve static assets, freeing up CPU/RAM for dynamic API requests.
  • Scalability: CDNs are built to handle massive traffic spikes (e.g., a viral video or a product launch) that would crash a single origin server.
  • DDoS Protection: CDNs can absorb large-scale attacks at the edge before they ever reach your infrastructure.
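The reduced origin load can be estimated from the cache hit ratio, since only misses are forwarded to the origin. The figures below are made-up inputs, not measurements, and the function names are hypothetical.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of requests answered from the edge cache."""
    total = hits + misses
    return hits / total if total else 0.0


def origin_requests_per_second(edge_rps: float, hit_ratio: float) -> float:
    """Only cache misses are forwarded to the origin."""
    return edge_rps * (1.0 - hit_ratio)


# Assumed sample: 9,500 hits and 500 misses observed at the edge.
ratio = cache_hit_ratio(hits=9_500, misses=500)

# Assumed traffic spike: 20,000 requests per second arriving at the edge.
origin_rps = origin_requests_per_second(edge_rps=20_000, hit_ratio=ratio)

print(f"hit ratio {ratio:.0%}, origin sees about {origin_rps:.0f} req/s")
# hit ratio 95%, origin sees about 1000 req/s
```

Under these assumptions, raising the hit ratio from 95% to 99% would cut origin traffic from about 1,000 to about 200 requests per second, which is why small hit-ratio gains matter at high volume.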

Common Mistakes

  • Caching Dynamic Content: Accidentally caching a user's private dashboard or an API response with sensitive data. Always check that such responses carry a Cache-Control header such as private or no-store, so shared caches like a CDN will not store them.
  • Forgotten Purges: Updating a CSS file but forgetting to purge the CDN, leading to users seeing a broken layout with old styles.
  • Ignoring Costs: While CDNs save server costs, high-bandwidth egress from a CDN can become expensive if not monitored.
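A quick way to guard against the first mistake is to test the Cache-Control values your application emits. The sketch below is a deliberately conservative check written for illustration; parse_cache_control and edge_may_cache are hypothetical helpers, the comma split does not handle quoted directive values, and actual CDNs apply their own (often configurable) rules on top of the header.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into lowercase directives and optional values."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def edge_may_cache(header: str) -> bool:
    """Conservative check: may a shared cache (such as a CDN edge) store this?"""
    d = parse_cache_control(header)
    # 'private' and 'no-store' both forbid storage by a shared cache.
    if "no-store" in d or "private" in d:
        return False
    # Require an explicit positive lifetime; for shared caches,
    # s-maxage overrides max-age when both are present.
    lifetime = d.get("s-maxage") or d.get("max-age")
    return lifetime is not None and lifetime.isdigit() and int(lifetime) > 0


print(edge_may_cache("public, max-age=31536000"))  # True: long-lived static asset
print(edge_may_cache("private, max-age=600"))      # False: per-user dashboard
print(edge_may_cache("no-store"))                  # False: sensitive API response
print(edge_may_cache(""))                          # False: no explicit lifetime
```

Treating a missing header as "do not cache" is a design choice of this sketch: it errs on the side of a lower hit ratio rather than risking one user's data being served to another.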

Recap

  • CDNs use Edge Locations to serve content closer to the user.
  • Pull CDNs are automatic; Push CDNs are manual/proactive.
  • Cache Invalidation is the process of removing stale content from the edge.
  • CDNs provide both performance and security (DDoS mitigation).
