Big Tech - System Design Products

Here are some system design problems based on Big Tech products.

  1. Social Media (high read, low write)

    1. Users should be able to create posts featuring photos, videos, and a simple caption.
    2. Users should be able to follow other users.
    3. Users should be able to see a chronological feed of posts from the users they follow.

      image.png

    4. Live comments (use Redis pub/sub to fan out live comments, Kafka for tracking)

      1. Viewers can post comments on a live video feed. (easy)
      2. Viewers can see new comments being posted while they are watching the live video. (hard)
      3. Viewers can see comments made before they joined the live feed. (API cursor)

      image.png
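The "API cursor" idea for loading comments made before a viewer joined can be sketched as below. This is a minimal in-memory sketch, not a definitive design: `Comment`, `fetch_older_comments`, and the use of an increasing `comment_id` as the cursor are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Comment:
    comment_id: int  # assumed to increase over time, so it can act as the cursor
    text: str


def fetch_older_comments(
    comments: List[Comment], cursor: Optional[int], limit: int = 2
) -> Tuple[List[Comment], Optional[int]]:
    """Return up to `limit` comments older than `cursor` (newest first) and the next cursor."""
    older = [c for c in comments if cursor is None or c.comment_id < cursor]
    older.sort(key=lambda c: c.comment_id, reverse=True)
    page = older[:limit]
    # A next cursor is only returned when more history remains.
    next_cursor = page[-1].comment_id if len(older) > limit else None
    return page, next_cursor


history = [Comment(i, f"comment {i}") for i in range(1, 6)]
page1, cursor1 = fetch_older_comments(history, cursor=None)
page2, cursor2 = fetch_older_comments(history, cursor=cursor1)
page3, cursor3 = fetch_older_comments(history, cursor=cursor2)
```

A cursor keeps pages stable while new comments keep arriving, which an offset-based page would not.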

    5. Like posts (write-behind solution; shard the counter, e.g. post1Likes_1, post1Likes_2, post1Likes_3)

      1. Users can like and unlike a post by toggling the like button.
        1. Avoid computing the count by querying like rows for a post_id on every read; even an indexed lookup costs O(log N).
        2. Solution 1: pre-computed like_count field in database
        3. Solution 2: Use redis counter
      2. Both the post owner and viewers should be able to see the total like count for post.

      image.png
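The sharded counter with write-behind mentioned above can be sketched as follows. It is an in-memory stand-in for Redis counters; `ShardedLikeCounter`, `flush`, and the `db` dictionary are hypothetical names used only for illustration.

```python
import random
from typing import Dict, List, Set


class ShardedLikeCounter:
    """Like counter for one post, split across shards to spread write load."""

    def __init__(self, post_id: str, num_shards: int = 3) -> None:
        self.post_id = post_id
        self.shards: List[int] = [0] * num_shards  # e.g. post1Likes_1..post1Likes_3
        self.liked_by: Set[str] = set()            # makes the toggle idempotent per user

    def toggle(self, user_id: str) -> bool:
        """Like if not yet liked, otherwise unlike. Returns the new liked state."""
        shard = random.randrange(len(self.shards))
        if user_id in self.liked_by:
            self.liked_by.remove(user_id)
            self.shards[shard] -= 1  # a single shard may go negative; only the sum matters
            return False
        self.liked_by.add(user_id)
        self.shards[shard] += 1
        return True

    def total(self) -> int:
        return sum(self.shards)

    def flush(self, db: Dict[str, int]) -> None:
        """Write-behind step: persist the pre-computed like_count in one write."""
        db[self.post_id] = self.total()


counter = ShardedLikeCounter("post1")
counter.toggle("alice")
counter.toggle("bob")
counter.toggle("alice")  # alice unlikes
database: Dict[str, int] = {}
counter.flush(database)
```

Readers sum the shards (or read the flushed `like_count`), so a hot post does not funnel every like into a single key.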

    6. Search posts (Ingestion service)

      1. Users should be able to create and like posts.
      2. Users should be able to search posts by keyword.
        • Inverted index: post + like
      3. Users should be able to get search results sorted by recency or like count. (Redis sorted set → Keep order)
        • Client sends text query
        • API Gateway forwards to Search Service
        • Search Service pulls from Redis indexes:
          • Keyword → [PostIds]
          • Likes Index → score
          • Creation Index → timestamp
        • Combine (e.g., weighted ranking)
        • Return sorted results to user
      4. Optimization
        1. Cache search results in the CDN first.
        2. Run the ingestion and like services as background jobs.
        3. Use cold and warm storage tiers for the index.

      image.png
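The search flow above (keyword → post IDs, then ordering by likes or creation time) can be sketched with a small in-memory inverted index. The notes place these indexes in Redis; `PostSearchIndex` and its methods are hypothetical names for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Post:
    post_id: int
    text: str
    created_at: int  # Unix timestamp
    likes: int = 0


class PostSearchIndex:
    def __init__(self) -> None:
        self.posts: Dict[int, Post] = {}
        self.keyword_to_ids: Dict[str, Set[int]] = defaultdict(set)

    def ingest(self, post: Post) -> None:
        """Ingestion step: tokenize the text and update the inverted index."""
        self.posts[post.post_id] = post
        for word in post.text.lower().split():
            self.keyword_to_ids[word].add(post.post_id)

    def like(self, post_id: int) -> None:
        self.posts[post_id].likes += 1

    def search(self, keyword: str, sort_by: str = "recency") -> List[int]:
        """Return matching post IDs, newest first or most liked first."""
        ids = self.keyword_to_ids.get(keyword.lower(), set())
        if sort_by == "recency":
            key = lambda p: p.created_at
        else:
            key = lambda p: p.likes
        matches = sorted((self.posts[i] for i in ids), key=key, reverse=True)
        return [p.post_id for p in matches]


index = PostSearchIndex()
index.ingest(Post(1, "Redis cache", created_at=10))
index.ingest(Post(2, "Redis streams", created_at=20))
index.ingest(Post(3, "Kafka streams", created_at=30))
index.like(1)
index.like(1)
index.like(2)
```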

  2. YouTube (high read, long write)

    1. Users can upload videos.
      1. Upload the video; a transcoding service produces 1080 HD, 720 HD, … renditions ⇒ adaptive bitrate streaming protocol.
      2. Transcode each chunk as soon as it has been uploaded successfully.
    2. Users can watch (stream) videos.

      1. The client will fetch the VideoMetadata, which will have a URL pointing to the manifest file in S3.
      2. The client will download the manifest file.
      3. The client will choose a format based on network conditions / user settings. The client retrieves the URL for this segment in its chosen format from the manifest file.
      4. The client will download the first segment; each time the user seeks within the video, the client finds the chunk for the current position.

      image.png
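The client-side steps above (pick a format from network conditions, then locate the segment for a playback position) can be sketched as below. The manifest layout, the bitrates, and the URL template are assumptions for illustration and do not follow any specific manifest format.

```python
from typing import Any, Dict

# Hypothetical, simplified manifest: rendition name -> required bandwidth in kbps.
MANIFEST: Dict[str, Any] = {
    "segment_seconds": 4,
    "renditions": {"1080p": 5000, "720p": 2500, "360p": 800},
    "url_template": "{rendition}/segment_{index}.ts",
}


def choose_rendition(manifest: Dict[str, Any], bandwidth_kbps: float) -> str:
    """Pick the highest-bitrate rendition that fits the measured bandwidth."""
    renditions = manifest["renditions"]
    fitting = {name: kbps for name, kbps in renditions.items() if kbps <= bandwidth_kbps}
    if not fitting:
        # Nothing fits, so fall back to the lowest bitrate available.
        return min(renditions, key=renditions.get)
    return max(fitting, key=fitting.get)


def segment_url(manifest: Dict[str, Any], rendition: str, position_seconds: float) -> str:
    """Map a playback or seek position to the segment that contains it."""
    index = int(position_seconds // manifest["segment_seconds"])
    return manifest["url_template"].format(rendition=rendition, index=index)


rendition = choose_rendition(MANIFEST, bandwidth_kbps=3000)
first_segment = segment_url(MANIFEST, rendition, position_seconds=0)
after_seek = segment_url(MANIFEST, rendition, position_seconds=9.5)
```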

    3. Top-k video

      1. Clients should be able to query the top K videos for all-time (up to a max of 1k results).
      2. Clients should be able to query tumbling windows of 1 {hour, day, month} and all-time (up to a max of 1k results) ⇒ Prefix sum.
      3. Idea: Batch processing in view consumer
        • Top-K Cron → Query DB → Compute Top-K → Write to Redis
        • Finds the Top-K most viewed videos by:
          • last 1 hour
          • last 24 hours
          • overall

      image.png

      Feature          | Likes System         | YouTube Top-K Views
      Update frequency | Real-time (per like) | Batch (per 5–15 min)
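The batch Top-K idea with prefix sums can be sketched as follows. It assumes view counts are already aggregated into hourly buckets per video; `build_prefix`, `window_views`, and `top_k` are hypothetical helper names.

```python
import heapq
from itertools import accumulate
from typing import Dict, List, Tuple


def build_prefix(hourly_views: List[int]) -> List[int]:
    """prefix[i] is the total number of views in hours [0, i)."""
    return [0] + list(accumulate(hourly_views))


def window_views(prefix: List[int], start_hour: int, end_hour: int) -> int:
    """Views in the half-open window [start_hour, end_hour), in O(1)."""
    return prefix[end_hour] - prefix[start_hour]


def top_k(
    hourly_by_video: Dict[str, List[int]], start_hour: int, end_hour: int, k: int
) -> List[Tuple[str, int]]:
    """Compute the K most viewed videos for one window, as a cron job might."""
    totals = {
        video: window_views(build_prefix(hours), start_hour, end_hour)
        for video, hours in hourly_by_video.items()
    }
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


views = {"a": [5, 1, 1], "b": [1, 1, 9], "c": [2, 2, 2]}
last_hour = top_k(views, start_hour=2, end_hour=3, k=2)
all_time = top_k(views, start_hour=0, end_hour=3, k=2)
```

In the flow from the notes, the cron job would write lists like `last_hour` and `all_time` to Redis so that reads never touch the database.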
    4. Likes and search video
  3. Chat App (high read, high write) ⇒ WebSocket

    1. Users should be able to start group chats with multiple participants (limit 100).
    2. Users should be able to send/receive messages.
    3. Users should be able to receive messages sent while they are not online (up to 30 days).
    4. Users should be able to send/receive media in their messages.
    5. User can receive notifications when offline.

      image.png

      • If you need high throughput and decoupled services ⇒ use a queue; otherwise do not use one.
      • The Notification service integrates with external push notification providers like Firebase Cloud Messaging (FCM) and Apple Push Notification Service (APNS) to deliver messages as push notifications to offline users.
      • If User B is online: Server A forwards the message to Server B, which delivers it to User B via their open WebSocket connection.
      • If User B is offline: Server A sends the message to the notification service, which triggers a push notification to notify User B of the new message.
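The online/offline delivery rule above can be sketched with an in-memory router. Real WebSocket connections and FCM/APNS calls are replaced by plain lists; `ChatRouter` and its fields are hypothetical names used only to show the branching.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (sender, text)


class ChatRouter:
    def __init__(self) -> None:
        self.connections: Dict[str, List[Message]] = {}  # stand-in for open WebSockets
        self.inbox: Dict[str, List[Message]] = defaultdict(list)  # messages kept for offline users
        self.push_notifications: List[Tuple[str, str]] = []  # stand-in for FCM/APNS

    def connect(self, user_id: str) -> List[Message]:
        """Open a connection and deliver anything received while offline."""
        socket: List[Message] = []
        socket.extend(self.inbox.pop(user_id, []))
        self.connections[user_id] = socket
        return socket

    def disconnect(self, user_id: str) -> None:
        self.connections.pop(user_id, None)

    def send(self, sender: str, recipient: str, text: str) -> str:
        socket = self.connections.get(recipient)
        if socket is not None:
            socket.append((sender, text))  # online: push over the open connection
            return "delivered"
        self.inbox[recipient].append((sender, text))  # offline: store for later
        self.push_notifications.append((recipient, f"New message from {sender}"))
        return "queued"


router = ChatRouter()
status_offline = router.send("alice", "bob", "hi")
bob_socket = router.connect("bob")
status_online = router.send("alice", "bob", "are you there?")
```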
  4. Live stream/Zoom call (high read, high write) ⇒ WebSocket

    1. Users can make 1:1 calls (P2P).

      image.png

    2. Users can make group calls (client-server).

      1. Use an SFU (Selective Forwarding Unit): like pub/sub, it forwards 1 producer's stream → N subscribers.

        image.png

  5. Ride-hailing (low read, high write)

    1. Riders

      1. Riders should be able to input a start location and a destination and get a fare estimate.
      2. Riders should be able to request a ride based on the estimated fare.
      3. Upon request, riders should be matched with a driver who is nearby and available.
      4. Users can search for nearby places (restaurants, gyms, salons, etc.) using their current location

      image.png
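Matching a rider with a nearby, available driver can be sketched with a brute-force distance scan. A production system would typically use a geospatial index instead; `match_driver`, the 5 km radius, and the driver dictionary shape are assumptions for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points on Earth."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def match_driver(
    rider: Tuple[float, float], drivers: List[Dict], max_km: float = 5.0
) -> Optional[str]:
    """Return the ID of the closest available driver within max_km, or None."""
    candidates = [
        (haversine_km(rider[0], rider[1], d["lat"], d["lon"]), d["id"])
        for d in drivers
        if d["available"]
    ]
    in_range = [c for c in candidates if c[0] <= max_km]
    return min(in_range)[1] if in_range else None


drivers = [
    {"id": "d1", "lat": 10.010, "lon": 106.0, "available": True},
    {"id": "d2", "lat": 10.001, "lon": 106.0, "available": False},  # closest, but busy
    {"id": "d3", "lat": 10.020, "lon": 106.0, "available": True},
]
matched = match_driver((10.0, 106.0), drivers)
```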

    2. Drivers
      1. Drivers should be able to accept/decline a request and navigate to pickup/drop-off.
      2. Driver tracking: Once a rider is matched with a driver, the rider should be able to track the driver’s location and view the estimated time of arrival (ETA).
    3. Ratings and Payments
      1. Ratings: Both riders and drivers should have the ability to rate each other after a ride is completed.
      2. Payments: The user should be able to complete the payment after the ride is completed.
  6. Booking System (high read, low write) (Yelp is the search side of booking)

    1. Searching
      1. The system should prioritize availability for searching & viewing events.
      2. The system should be scalable and able to handle high throughput in the form of popular events (10 million users, one event)
      3. The system is read heavy, and thus needs to be able to support high read throughput (100:1)
      4. The system should have low latency search (< 500ms)
    2. Booking
      1. The system should prioritize consistency for booking events (no double booking)
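The "no double booking" requirement can be sketched as a check-and-set done under a lock. This is a single-process stand-in for what would normally be a database transaction or conditional update; `SeatInventory` is a hypothetical name.

```python
import threading
from typing import Dict, List, Optional


class SeatInventory:
    def __init__(self, seats: List[str]) -> None:
        self._owner: Dict[str, Optional[str]] = {seat: None for seat in seats}
        self._lock = threading.Lock()

    def book(self, seat: str, user: str) -> bool:
        """Atomically book a seat; fails if it is unknown or already taken."""
        with self._lock:
            if seat not in self._owner or self._owner[seat] is not None:
                return False
            self._owner[seat] = user
            return True

    def owner(self, seat: str) -> Optional[str]:
        with self._lock:
            return self._owner.get(seat)


inventory = SeatInventory(["A1"])
results: List[bool] = []


def try_book(user: str) -> None:
    results.append(inventory.book("A1", user))


threads = [threading.Thread(target=try_book, args=(f"user{i}",)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The check and the write happen inside one critical section, so of the 20 concurrent attempts exactly one can succeed.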
    3. Local query

      1. Customers should be able to query the availability of items deliverable within 1 hour, by location (i.e., the effective availability is the union of the inventory of all nearby DCs).
      2. Customers should be able to order multiple items at the same time.

      image.png

      • Idea: Sharding by region-id

      image.png

Add these 6

  1. Search engine (high read, high write)

    1. Handle billions of queries per day: Users worldwide constantly submit searches, and the system has to process them all.
    2. Return relevant results in under a second: Users expect near-instant feedback.
    3. Provide features like autocomplete and spell correction: The experience should guide users even when queries aren’t perfect.
    4. Continuously update the index: As new web pages are created, they need to appear in results quickly.

      image.png
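The autocomplete feature can be sketched with a sorted term list and binary search. Real systems often use a trie plus popularity ranking; `autocomplete` and the sample terms are assumptions for illustration.

```python
import bisect
from typing import List


def autocomplete(sorted_terms: List[str], prefix: str, limit: int = 5) -> List[str]:
    """Return up to `limit` terms starting with `prefix` from a sorted list."""
    start = bisect.bisect_left(sorted_terms, prefix)  # first term >= prefix
    suggestions: List[str] = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix) or len(suggestions) == limit:
            break  # sorted order means all matches are contiguous
        suggestions.append(term)
    return suggestions


terms = sorted(["system", "syntax", "sync", "search", "redis"])
suggestions = autocomplete(terms, "sy")
```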

  2. Rate limiter / Key-Value / Web Crawler

    2.1. Rate limiting (high read)

    1. The system should identify clients by user ID, IP address, or API key to apply appropriate limits.
    2. The system should limit HTTP requests based on configurable rules (e.g., 100 API requests per minute per user).
    3. When limits are exceeded, the system should reject requests with HTTP 429 and include helpful headers (rate limit remaining, reset time).

      image.png
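The three requirements above can be sketched with a fixed-window counter. The `X-RateLimit-*` header names follow a common convention rather than a formal standard, and `FixedWindowRateLimiter` is a hypothetical name; a distributed setup would keep the counters in a shared store such as Redis.

```python
from typing import Dict, Tuple


class FixedWindowRateLimiter:
    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window = window_seconds
        # (client_id, window_start) -> request count; old windows are never purged here.
        self.counts: Dict[Tuple[str, int], int] = {}

    def check(self, client_id: str, now: float) -> Tuple[int, Dict[str, str]]:
        """Return (HTTP status, headers) for one request from client_id at time `now`."""
        window_start = int(now // self.window) * self.window
        reset_at = window_start + self.window
        key = (client_id, window_start)
        count = self.counts.get(key, 0)
        if count >= self.limit:
            return 429, {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": str(reset_at)}
        self.counts[key] = count + 1
        remaining = self.limit - count - 1
        return 200, {"X-RateLimit-Remaining": str(remaining), "X-RateLimit-Reset": str(reset_at)}


limiter = FixedWindowRateLimiter(limit=2, window_seconds=60)
first = limiter.check("user-1", now=0)
second = limiter.check("user-1", now=1)
third = limiter.check("user-1", now=2)        # over the limit in this window
next_window = limiter.check("user-1", now=61)  # counter starts over
```

A fixed window allows bursts at window boundaries; a sliding window or token bucket trades a little more state for smoother limits.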

    2.2. Key-Value (high read)

    1. Users should be able to set, get, and delete key-value pairs.
    2. Users should be able to configure the expiration time for key-value pairs.
    3. Data should be evicted according to Least Recently Used (LRU) policy.

    image.png
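A single-node version of the store (set/get/delete, expiration, LRU eviction) can be sketched with `collections.OrderedDict`. `LruTtlStore` is a hypothetical name, and time is passed in explicitly to keep the example deterministic.

```python
from collections import OrderedDict
from typing import Any, Optional, Tuple


class LruTtlStore:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # key -> (value, expires_at or None); the order tracks recency, oldest first.
        self.data: "OrderedDict[str, Tuple[Any, Optional[float]]]" = OrderedDict()

    def set(self, key: str, value: Any, now: float, ttl: Optional[float] = None) -> None:
        expires_at = now + ttl if ttl is not None else None
        self.data[key] = (value, expires_at)
        self.data.move_to_end(key)          # mark as most recently used
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used key

    def get(self, key: str, now: float) -> Any:
        item = self.data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and now >= expires_at:
            del self.data[key]              # lazy expiration on read
            return None
        self.data.move_to_end(key)
        return value

    def delete(self, key: str) -> bool:
        return self.data.pop(key, None) is not None


store = LruTtlStore(capacity=2)
store.set("a", 1, now=0)
store.set("b", 2, now=0)
store.get("a", now=1)      # "a" becomes most recently used
store.set("c", 3, now=2)   # evicts "b"
```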

    2.3. Web Crawler (high read, high write)

    1. Crawl the web starting from a given set of seed URLs.
    2. Extract text data from each web page and store the text for later processing.

    image.png
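The two crawler requirements can be sketched as a breadth-first traversal with a frontier queue and a seen set. Fetching is injected as a function so the sketch runs without network access; the regex-based link and text extraction is a simplification, and the `a.test` pages are made-up sample data.

```python
import re
from collections import deque
from typing import Callable, Dict, List, Optional
from urllib.parse import urljoin

LINK_RE = re.compile(r'href="([^"]+)"')
TAG_RE = re.compile(r"<[^>]+>")


def crawl(
    seeds: List[str], fetch: Callable[[str], Optional[str]], max_pages: int = 100
) -> Dict[str, str]:
    """Crawl breadth-first from the seed URLs and return {url: extracted text}."""
    frontier = deque(seeds)
    seen = set(seeds)            # avoids fetching the same URL twice
    texts: Dict[str, str] = {}
    while frontier and len(texts) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:         # failed fetch: skip the page
            continue
        texts[url] = " ".join(TAG_RE.sub(" ", html).split())
        for href in LINK_RE.findall(html):
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return texts


fake_web = {
    "http://a.test/": '<p>Home</p><a href="/b">b</a><a href="/c">c</a>',
    "http://a.test/b": '<p>Page B</p><a href="/">home</a>',
}
pages = crawl(["http://a.test/"], fake_web.get)
```

A real crawler would also respect robots.txt, rate-limit per host, and persist the frontier, all of which are omitted here.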

  3. Notification service + Job service (email/SMS/push) (high read, low write)

    1. Users should be able to schedule jobs to be executed immediately, at a future date, or on a recurring schedule (e.g., “every day at 10:00 AM”).
    2. Users should be able to monitor the status of their jobs.

    image.png

    • Message sent now: add it to the queue directly.
    • Scheduled message: a scheduler service adds it to the queue when its fire time arrives.

    image.png
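The scheduler-to-queue handoff can be sketched with a min-heap ordered by fire time. `JobScheduler`, `tick`, and the status strings are hypothetical; a real service would persist jobs and poll a database or a delayed queue instead.

```python
import heapq
import itertools
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class JobScheduler:
    def __init__(self) -> None:
        # Heap entries: (run_at, sequence, job_id, repeat_every).
        self.heap: List[Tuple[float, int, str, Optional[float]]] = []
        self.queue: Deque[str] = deque()   # stand-in for the worker queue
        self.status: Dict[str, str] = {}
        self._seq = itertools.count()      # tie-breaker so heap entries stay comparable

    def schedule(self, job_id: str, run_at: float, every: Optional[float] = None) -> None:
        heapq.heappush(self.heap, (run_at, next(self._seq), job_id, every))
        self.status[job_id] = "scheduled"

    def tick(self, now: float) -> None:
        """Move every job whose fire time has arrived onto the queue."""
        while self.heap and self.heap[0][0] <= now:
            run_at, _, job_id, every = heapq.heappop(self.heap)
            self.queue.append(job_id)
            self.status[job_id] = "queued"
            if every is not None:          # recurring job: schedule the next run
                heapq.heappush(self.heap, (run_at + every, next(self._seq), job_id, every))


scheduler = JobScheduler()
scheduler.schedule("send-now", run_at=0)
scheduler.schedule("send-later", run_at=100)
scheduler.schedule("daily-report", run_at=10, every=50)
scheduler.tick(now=10)
queued_at_10 = list(scheduler.queue)
status_at_10 = dict(scheduler.status)
scheduler.tick(now=100)
queued_at_100 = list(scheduler.queue)
```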

  4. Payment System (low read, low write)

    1. Merchants should be able to initiate payment requests (charge a customer for a specific amount).
    2. Users should be able to pay for products with credit/debit cards.
    3. Merchants should be able to view status updates for payments (e.g., pending, success, failed).

    image.png

    • Case: do not store PII or card_user_info in the system ⇒ delegate it to the card network (e.g., Visa).

    image.png
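Payment creation and status tracking can be sketched as a small state machine with an idempotency key, so that a retried request does not charge twice. Only an opaque card token is stored, in line with the note about not keeping card data; `PaymentService`, the key format, and the token value are hypothetical.

```python
from typing import Dict, Set


class PaymentService:
    # Allowed status transitions; "success" and "failed" are terminal.
    ALLOWED: Dict[str, Set[str]] = {
        "pending": {"success", "failed"},
        "success": set(),
        "failed": set(),
    }

    def __init__(self) -> None:
        self.payments: Dict[str, Dict] = {}
        self.by_idempotency_key: Dict[str, str] = {}

    def create(self, idempotency_key: str, merchant_id: str, amount_cents: int, card_token: str) -> str:
        """Create a payment once per idempotency key and return its ID."""
        if idempotency_key in self.by_idempotency_key:
            return self.by_idempotency_key[idempotency_key]  # retry: same payment
        payment_id = f"pay_{len(self.payments) + 1}"
        self.payments[payment_id] = {
            "merchant_id": merchant_id,
            "amount_cents": amount_cents,
            "card_token": card_token,  # opaque token from the processor, not the card number
            "status": "pending",
        }
        self.by_idempotency_key[idempotency_key] = payment_id
        return payment_id

    def update_status(self, payment_id: str, new_status: str) -> None:
        current = self.payments[payment_id]["status"]
        if new_status not in self.ALLOWED[current]:
            raise ValueError(f"cannot move from {current} to {new_status}")
        self.payments[payment_id]["status"] = new_status

    def status(self, payment_id: str) -> str:
        return self.payments[payment_id]["status"]


service = PaymentService()
payment_id = service.create("order-42", "merchant-1", 1999, card_token="tok_abc")
retry_id = service.create("order-42", "merchant-1", 1999, card_token="tok_abc")
service.update_status(payment_id, "success")
```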

  5. Trading System (high read, low write)

    1. Users can see live prices of stocks.
    2. Users can manage orders for stocks (market / limit orders, create / cancel orders).

    image.png
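Creating, matching, and cancelling limit orders can be sketched with two heaps and price-time priority. This is a toy single-symbol book with integer prices; `OrderBook`, its method names, and the lazy-cancel approach are assumptions for illustration, and market orders and live price feeds are left out.

```python
import heapq
import itertools
from typing import Dict, List, Set, Tuple

Trade = Tuple[str, int, int]  # (resting order id, price, quantity)


class OrderBook:
    def __init__(self) -> None:
        self.bids: List[Tuple[int, int, Dict]] = []  # keyed by -price: highest bid first
        self.asks: List[Tuple[int, int, Dict]] = []  # keyed by price: lowest ask first
        self.cancelled: Set[str] = set()
        self._seq = itertools.count()                # earlier orders win at equal price

    def place_limit(self, order_id: str, side: str, price: int, qty: int) -> List[Trade]:
        """Match against the opposite side, then rest any unfilled quantity."""
        trades: List[Trade] = []
        own, opposite = (self.bids, self.asks) if side == "buy" else (self.asks, self.bids)
        while qty > 0 and opposite:
            key, _, resting = opposite[0]
            if resting["id"] in self.cancelled:      # drop cancelled orders lazily
                heapq.heappop(opposite)
                continue
            best = key if side == "buy" else -key
            crosses = best <= price if side == "buy" else best >= price
            if not crosses:
                break
            fill = min(qty, resting["qty"])
            trades.append((resting["id"], best, fill))
            qty -= fill
            resting["qty"] -= fill
            if resting["qty"] == 0:
                heapq.heappop(opposite)
        if qty > 0:
            key = -price if side == "buy" else price
            heapq.heappush(own, (key, next(self._seq), {"id": order_id, "qty": qty}))
        return trades

    def cancel(self, order_id: str) -> None:
        self.cancelled.add(order_id)


book = OrderBook()
book.place_limit("s1", "sell", price=101, qty=5)
book.place_limit("s2", "sell", price=100, qty=5)
trades_b1 = book.place_limit("b1", "buy", price=100, qty=3)
trades_b2 = book.place_limit("b2", "buy", price=101, qty=4)
book.cancel("s1")
trades_b3 = book.place_limit("b3", "buy", price=105, qty=1)
```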

  6. Leaderboard/Auction System (1 flow High-read, 1 flow High-write)

    1. Users should be able to post an item for auction with a starting price and end date.
    2. Users should be able to bid on an item; a bid is accepted only if it is higher than the current highest bid.
    3. Users should be able to view an auction, including the current highest bid.
    4. Top-N Queries: Display the top N players (e.g., top 10, top 100) on the leaderboard and update it in real time.
    5. Player's Own Rank: Allow a player to query their current rank without scanning the entire leaderboard.
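Bid acceptance and rank lookup can be sketched as below. The leaderboard keeps a sorted list and uses binary search, standing in for a Redis sorted set; `place_bid`, `Leaderboard`, and the choice to initialise `highest_bid` with the starting price are assumptions for illustration.

```python
import bisect
from typing import Dict, List, Tuple


def place_bid(auction: Dict, bidder: str, amount: int, now: float) -> bool:
    """Accept a bid only if the auction is open and the amount beats the current highest bid."""
    if now >= auction["ends_at"] or amount <= auction["highest_bid"]:
        return False
    auction["highest_bid"] = amount
    auction["highest_bidder"] = bidder
    return True


class Leaderboard:
    def __init__(self) -> None:
        self.scores: Dict[str, int] = {}
        self.sorted_keys: List[Tuple[int, str]] = []  # (-score, player), best first

    def set_score(self, player: str, score: int) -> None:
        if player in self.scores:
            old_key = (-self.scores[player], player)
            self.sorted_keys.pop(bisect.bisect_left(self.sorted_keys, old_key))
        self.scores[player] = score
        bisect.insort(self.sorted_keys, (-score, player))

    def top(self, n: int) -> List[Tuple[str, int]]:
        return [(player, -neg_score) for neg_score, player in self.sorted_keys[:n]]

    def rank(self, player: str) -> int:
        """1-based rank found by binary search, without scanning the whole board."""
        key = (-self.scores[player], player)
        return bisect.bisect_left(self.sorted_keys, key) + 1


auction = {"highest_bid": 100, "highest_bidder": None, "ends_at": 1000}
bid_results = [
    place_bid(auction, "u1", 100, now=1),     # not higher than the starting price
    place_bid(auction, "u1", 120, now=2),
    place_bid(auction, "u2", 110, now=3),     # lower than the current highest bid
    place_bid(auction, "u2", 130, now=1000),  # auction already ended
]

board = Leaderboard()
board.set_score("ann", 50)
board.set_score("bob", 70)
board.set_score("cat", 60)
```

The list insert here is linear in the worst case; a skip list or balanced tree (as used by sorted-set implementations) keeps updates logarithmic as well.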

    image.png

November 18, 2025