Coursify

System Design for Software Engineers

Case Study: Designing a URL Shortener (TinyURL)

Case Study: Designing a URL Shortener (TinyURL)

In this first case study, we will design a service like TinyURL or Bitly. This is a classic system design interview question that covers many of the fundamentals we've learned: scaling, hashing, and database selection.

The Requirements

  1. Functional: Given a long URL, return a short, unique alias. When a user clicks the short URL, redirect them to the original long URL.
  2. Non-Functional: High availability (the links must always work), low latency (redirection should be instant), and scalability (handle billions of URLs).

Hashing and Uniqueness

The core of the system is converting a long URL into a short string (e.g., https://tiny.url/abc123).

  • Base 62 Encoding: We use characters [a-z, A-Z, 0-9] which gives us 62 possible characters for each position.
  • Length: A 7-character string gives us 6273.562^7 \approx 3.5 trillion unique combinations, more than enough for our needs.
  • Key Generation Service (KGS): Instead of hashing the URL at runtime (which can lead to collisions), we can have a separate service that pre-generates billions of unique 7-character strings and stores them in a database. When a request comes in, we just "grab" one.

The URL Shortening Workflow

  1. 1
    Step 1

    A user submits https://very-long-domain.com/some/deep/path/page.html to our API.

  2. 2
    Step 2

    The App Server asks the Key Generation Service for the next available 7-character key (e.g., 8xK9p2z).

  3. 3
    Step 3

    The App Server saves the mapping { '8xK9p2z': 'https://very-long-domain.com/...' } into our primary database and also into a Redis Cache for fast access.

  4. 4
    Step 4

    The server returns https://tiny.url/8xK9p2z to the user. Total time: < 50ms.

  5. 5
    Step 5

    A different user clicks the short link. The Load Balancer sends the request to an App Server, which first checks Redis. Finding the hit, it returns a 302 Redirect to the original long URL.

Database Selection

  • Type: A NoSQL Key-Value store or a simple Relational database (like PostgreSQL) both work well. Since we don't have complex relationships, a NoSQL store like DynamoDB or Cassandra is often easier to scale.
  • Structure:
    • short_url (Primary Key, String)
    • original_url (String)
    • created_at (Timestamp)
    • user_id (Optional, for analytics)

Scaling and Optimization

  1. Caching: 20% of the URLs will likely get 80% of the traffic. Use a Redis cache to store the most popular mappings, drastically reducing DB load.
  2. DB Sharding: Shard the database by the short_url hash to handle billions of rows across multiple servers.
  3. Clean-up Service: Use a background worker to delete URLs that haven't been accessed in years to save space.

Common Mistakes

  • Using MD5/SHA Hashing: MD5 produces a long string. If you truncate it to 7 characters, you will get collisions (two different long URLs having the same short URL). Using a KGS is safer.
  • Blocking KGS: If the KGS is a single point of failure, your whole system stops. Ensure the KGS is highly available and pre-allocates keys in memory for speed.
  • Missing Redirection Code: Forgetting that for SEO and analytics, you should use 301 (Permanent) or 302 (Found) redirects appropriately.

Recap

  • Use Base 62 for short, readable, unique IDs.
  • A Key Generation Service prevents collisions and simplifies the logic.
  • Redis Caching is vital for the 80/20 traffic distribution.
  • Choose a database that scales horizontally (NoSQL).

Knowledge Check

Question 1 of 3
Q1Single choice

Why is a Key Generation Service (KGS) preferred over hashing the long URL directly?

Case Study: Designing a URL Shortener (TinyURL) | System Design for Software Engineers | Coursify