System Design of a News Feed
Motivation
As a software engineer, I'm deeply interested in how large-scale systems operate behind the scenes. While coding is critical, understanding system architecture is just as important — especially for building scalable, reliable applications.
To explore this further, I started reading System Design Interview – An Insider's Guide by Alex Xu. One chapter focuses on building a news feed system, a core part of most social media platforms. I decided not to just take notes but to write my own walkthrough of how such a system could be designed — step by step, improving iteratively. This helps me reason deeply, challenge assumptions, and learn more effectively.
Introduction
Social media platforms like Facebook, Instagram, and LinkedIn all display a news feed — a continuous stream of content posted by friends, followers, or connections. These posts can be text, images, videos, or links, and are often presented in various formats, priorities, or languages.
The central challenge is this:
How do we deliver fresh, personalized content to millions of users in near real-time — efficiently and reliably?
In this write-up, I’ll walk through designing such a system from the ground up, refining it along the way. I’ll start with a basic version, point out flaws, and gradually improve each component to handle scale, latency, and complexity — just like we'd do in a real system design discussion.
Requirements
I’m not going to mimic the interview-style requirement gathering from the book. Instead, I’ll define a basic set of assumptions:
- The system should support 50 million users
- Users can create posts (text, images, videos)
- Users have followers, and a post should appear in the followers' news feed
- The news feed should show recent content in reverse chronological order (a weighted or ranked ordering is possible, but out of scope here)
- We’ll prioritize freshness over relevance (i.e., no ranking based on engagement or affinity for now)
- The system should support both mobile and web clients
- Latency should be low, especially when fetching feeds
Step 1: The Naive Approach — Start Simple
Let’s say we start small. One user creates a post. Their followers see that post in their feed. Sounds straightforward.
Components:
- A Web Server to handle requests
- A Posts Database to store posts
- A Users Database to store users
- Two simple APIs:
  - POST /users/{id}/posts to create a post
  - GET /users/{id}/feed to fetch the feed
In this setup:
- The POST request inserts the post into the Posts DB
- The GET request:
  - Retrieves the list of users that the requester follows
  - Fetches recent posts from those users
  - Sorts them in reverse chronological order
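To make the naive flow concrete, here is a minimal sketch in Python with an in-memory SQLite database. The schema and helper names (follows, posts, create_post, get_feed) are my own illustration, not something prescribed by the book.

```python
# Naive read path (Step 1): one insert for writes, one big join for reads.
# Schema and names are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER,
                        content TEXT, created_at TEXT);
""")

def create_post(author_id: int, content: str) -> None:
    # POST /users/{id}/posts: a single insert into the Posts DB.
    conn.execute(
        "INSERT INTO posts (author_id, content, created_at) "
        "VALUES (?, ?, datetime('now'))",
        (author_id, content),
    )

def get_feed(user_id: int, limit: int = 50) -> list[tuple]:
    # GET /users/{id}/feed: join follows -> posts, then sort by recency.
    # This join is exactly what becomes expensive at scale.
    return conn.execute(
        """
        SELECT p.id, p.author_id, p.content, p.created_at
        FROM posts p
        JOIN follows f ON f.followee_id = p.author_id
        WHERE f.follower_id = ?
        ORDER BY p.created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()
```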
Problem?
It works — but only at small scale. Once the number of users, followers, and posts grows, this system starts falling apart:
- One Web Server becomes a bottleneck
- Queries to fetch feeds involve multiple joins and large reads from the database
- Latency increases and reliability decreases
Step 2: Add Scale and Redundancy
To address the reliability and scalability issues:
- Add a Load Balancer in front of the Web Servers
- Horizontally scale Web Servers to handle concurrent traffic
This improves uptime and balances the load across machines.
Still Not Enough
Even with the web tier scaled out, the database becomes a hotspot. Every request to fetch a feed results in:
- Reading follower lists
- Fetching hundreds of posts
- Sorting and filtering them
Each of these actions adds latency.
Step 3: Introduce Caching — Reduce DB Load
We now introduce caching to speed things up.
What to cache?
- Feed Cache: for each user, store the list of recent post IDs they should see (as <follower_id, post_id> pairs)
- Post Cache: for each post ID, store the post content (text, media URLs, etc.)
- Users Cache: for each user ID, store relevant user information
  - User metadata, preferences, mute/block lists
  - Useful when generating customized feeds or applying filters
This changes the POST /users/{id}/posts flow:
- Get the user's followers from the Users table (this lookup can itself be cached, which we'll come back to later)
- Insert the new post ID into each follower's Feed Cache
- Store the post content in the Post Cache
This changes the GET /users/{user_id}/feed flow:
- Fetch the list of post IDs from the Feed Cache
- Retrieve post content from the Post Cache
- Retrieve user information from the Users Cache
- Apply per-user filtering (mutes, blocks, preferences) and assemble the feed to return
- On any cache miss, fall back to the DB and repopulate the cache
This significantly reduces DB hits and improves latency.
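To illustrate the cached read path, here is a minimal cache-aside sketch in Python. The in-memory dicts stand in for the Feed, Post, and Users caches, and the load_*_from_db callables are hypothetical DB helpers, not APIs from the book.

```python
# Cache-aside read path for GET /users/{user_id}/feed (Step 3).
from typing import Callable

feed_cache: dict[int, list[int]] = {}   # user_id -> recent post IDs
post_cache: dict[int, dict] = {}        # post_id -> post content
user_cache: dict[int, dict] = {}        # user_id -> metadata, mute/block lists

def cache_aside(cache: dict, key, load_from_db: Callable):
    # Serve from cache if present; otherwise hit the DB and repopulate.
    if key not in cache:
        cache[key] = load_from_db(key)
    return cache[key]

def get_feed(user_id: int,
             load_feed_from_db: Callable[[int], list[int]],
             load_post_from_db: Callable[[int], dict],
             load_user_from_db: Callable[[int], dict]) -> list[dict]:
    post_ids = cache_aside(feed_cache, user_id, load_feed_from_db)
    viewer = cache_aside(user_cache, user_id, load_user_from_db)
    posts = [cache_aside(post_cache, pid, load_post_from_db) for pid in post_ids]
    # Per-user filtering: drop posts from authors the viewer has muted or blocked.
    hidden = set(viewer.get("blocked", [])) | set(viewer.get("muted", []))
    return [p for p in posts if p.get("author_id") not in hidden]
```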
But Now the Problem Is...
How do we populate the feed cache? Especially when a user with thousands of followers posts something new?
If we do it synchronously during the post creation — i.e., write to DB and then update thousands of feed caches — it becomes slow and error-prone.
Step 4: Decouple with Asynchronous Workers
To solve this, we decouple the write flow using a Message Queue and Worker system.
Write Flow (Improved):
- User creates a post → Web Server accepts it
- Post is stored in the Posts DB
- A job is pushed to a Message Queue
- A Feed Worker picks up the job:
  - Gets the list of followers
  - Updates each follower's Feed Cache with the new post ID
Now, post creation is fast and responsive, and feed updates happen asynchronously in the background.
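As a rough illustration of this decoupling, here is a Python sketch that uses an in-process queue.Queue as a stand-in for a real message broker (Kafka, RabbitMQ, SQS, etc.). The follower lookup and feed cache are hypothetical, as in the earlier sketches.

```python
# Decoupled write path (Step 4): fast enqueue on the request path,
# fan-out work done by a background worker.
import queue
import threading

fanout_jobs: queue.Queue = queue.Queue()
feed_cache: dict[int, list[int]] = {}   # user_id -> recent post IDs
MAX_FEED_LEN = 500                      # keep only the newest N entries per user

def create_post(author_id: int, post_id: int) -> None:
    # Web server path: persist the post (omitted here), then enqueue a
    # fan-out job and return immediately so the request stays fast.
    fanout_jobs.put({"author_id": author_id, "post_id": post_id})

def feed_worker(get_followers) -> None:
    # Background worker: consume jobs and push the new post ID into each
    # follower's Feed Cache.
    while True:
        job = fanout_jobs.get()
        for follower_id in get_followers(job["author_id"]):
            feed = feed_cache.setdefault(follower_id, [])
            feed.insert(0, job["post_id"])
            del feed[MAX_FEED_LEN:]      # bound the cached feed size
        fanout_jobs.task_done()

# Placeholder follower lookup; a real system would query the Users store.
threading.Thread(target=feed_worker,
                 args=(lambda author_id: [],),
                 daemon=True).start()
```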
Step 5: Fan-Out Strategy — Push vs Pull
At this point, we’re using a Fan-out-on-write strategy, also known as the Push model.
Push Model:
- When a user posts, the system pushes the post to their followers’ feed caches
- Reads are fast, because the feed is precomputed
But there’s a catch...
This model doesn't scale well for celebrity users with millions of followers. Updating that many feed caches takes time — even asynchronously.
Alternative: Pull Model (Fan-out-on-read)
- Posts are stored once
- When a user requests their feed, we dynamically fetch recent posts from all the people they follow
This shifts the load to read time and avoids huge fan-out writes.
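Here is a rough sketch of that read-time merge, assuming each followee's recent posts are already available newest-first (from the DB or a per-author cache); recent_posts_by_author is a hypothetical lookup, not an API from the book.

```python
# Fan-out-on-read (pull model): merge each followee's recent posts at
# request time, newest-first.
import heapq
import itertools

def pull_feed(followee_ids: list[int],
              recent_posts_by_author,
              limit: int = 50) -> list[dict]:
    # Each author's posts are assumed to be sorted newest-first, so a
    # k-way merge yields a globally sorted feed without re-sorting everything.
    per_author = (recent_posts_by_author(fid) for fid in followee_ids)
    merged = heapq.merge(*per_author,
                         key=lambda post: post["created_at"],
                         reverse=True)
    return list(itertools.islice(merged, limit))
```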
What’s better?
Neither is perfect. In practice, we use a hybrid approach:
- Use Push model for regular users (less fan-out load)
- Use Pull model for high-follower accounts (e.g., celebrities); a sketch of this hybrid decision follows below
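Sketched below is one way the write path could choose between the two models; the threshold and helper names are assumptions for illustration, not values from the book.

```python
# Hybrid fan-out: push for regular users, skip the fan-out for
# high-follower accounts and let readers pull their posts instead.
CELEBRITY_THRESHOLD = 100_000   # illustrative cut-off

def on_new_post(author_id: int, post_id: int,
                follower_count, enqueue_fanout_job) -> None:
    if follower_count(author_id) < CELEBRITY_THRESHOLD:
        # Regular user: fan out on write (push model, Step 4).
        enqueue_fanout_job(author_id, post_id)
    # Celebrity: do nothing here; GET /feed merges their recent posts
    # at read time (pull model, previous sketch).
```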
💡 In this design, I’m focusing on the Push model for its performance benefits in most common scenarios.
Step 6: Further Optimizations and Peripheral System Considerations
To further strengthen the design, we can introduce:
1. Graph DB for Follow Relationships
- Storing followers/following in a Graph DB or Redis improves traversal speed
- Makes it efficient to fetch all followers of a user when pushing new posts
2. Retries and Dead-letter Queues
- If the Feed Worker fails, we retry the job
- If it keeps failing, move it to a Dead Letter Queue for inspection (a sketch follows after this list)
3. Rate Limiting
- Prevent abusive users or clients from overloading the system
4. Content Delivery Network (CDN)
- Serve static assets like images and videos efficiently
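As an example of item 2 above, here is a minimal retry-with-dead-letter-queue sketch wrapped around the fan-out job from Step 4. The retry count and in-process queues are illustrative stand-ins for a real broker's retry and DLQ features.

```python
# Retries and dead-letter handling for the feed fan-out worker.
import queue

MAX_RETRIES = 3
fanout_jobs: queue.Queue = queue.Queue()
dead_letter_queue: queue.Queue = queue.Queue()

def process_with_retries(handle_job) -> None:
    while True:
        job = fanout_jobs.get()
        attempts = job.get("attempts", 0)
        try:
            handle_job(job)                      # e.g. the feed fan-out
        except Exception:
            if attempts + 1 < MAX_RETRIES:
                job["attempts"] = attempts + 1
                fanout_jobs.put(job)             # re-enqueue for another try
            else:
                dead_letter_queue.put(job)       # park for manual inspection
        finally:
            fanout_jobs.task_done()
```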
Summary
Designing a news feed system is more than just storing posts and showing them in a list. It involves constant iteration, trade-offs, and scaling challenges.
We started with a simple architecture, pointed out its flaws, and gradually improved it:
- ✅ Naive DB-centric design
- ✅ Horizontal scaling with load balancer
- ✅ Caching for reads
- ✅ Async processing with queues and workers
- ✅ Push vs Pull model trade-off
- ✅ Optimizations for scale and performance
This is my personal walkthrough of designing a scalable news feed system, step by step, inspired by concepts from System Design Interview – An Insider’s Guide.
Hope this helps anyone else trying to think through large-scale system architecture!
Credit: This post was cleaned up and organized from my rough notes with help from an LLM — but the thought process, steps, and structure reflect how I personally reason through system design.