⚡ Pulse

Scalable Social Feed Platform - A production-style system design project demonstrating distributed systems concepts, event-driven architecture, and cloud-native design patterns.

📋 Table of Contents

Overview
System Architecture
Key Features
Technology Stack
Quick Start
API Documentation
System Design Deep Dive
Performance & Scaling
Tradeoffs & Design Decisions

🎯 Overview

Pulse is a social feed platform built to demonstrate real-world system design concepts used by companies like Twitter, Instagram, and Facebook. It showcases:

Hybrid Push-Pull Timeline Architecture
Event-Driven Fan-out Pattern
Celebrity Problem Solution
Multi-tier Caching Strategy
Graceful Degradation
Cloud-Native Design

Problem Statement

Modern social platforms must serve low-latency personalized feeds while handling:

High read traffic (millions of timeline requests/sec)
Write amplification (one post to millions of timelines)
Hot users (celebrities with millions of followers)
Service failures without downtime

Pulse demonstrates production-grade solutions to these challenges.

🏗️ System Architecture

┌─────────────┐
│   Client    │
│   (Web UI)  │
└──────┬──────┘
       │
       ▼
┌────────────────────────────────────────┐
│         FastAPI Application            │
│  ┌─────────┬──────────┬──────────────┐ │
│  │  Auth   │  Users   │    Posts     │ │
│  │ Service │ Service  │   Service    │ │
│  └─────────┴──────────┴──────────────┘ │
│  ┌──────────────────────────────────┐  │
│  │      Timeline Service            │  │
│  │  (Hybrid Push-Pull Strategy)     │  │
│  └──────────────────────────────────┘  │
└───┬──────────────┬──────────────┬──────┘
    │              │              │
    ▼              ▼              ▼
┌─────────┐  ┌──────────┐  ┌──────────┐
│ Redis   │  │   SQS    │  │PostgreSQL│
│ (Cache) │  │ (Queue)  │  │   (DB)   │
└─────────┘  └────┬─────┘  └──────────┘
                  │
                  ▼
           ┌──────────────┐
           │  Fan-out     │
           │   Worker     │
           └──────────────┘

Data Flow

Write Path (Normal User):

User creates post, saved to PostgreSQL
Event published to SQS queue
Fan-out worker consumes event
Post pushed to all follower timelines in Redis

Write Path (Celebrity):

User creates post, saved to PostgreSQL
No fan-out (write amplification avoided)

Read Path:

Request timeline, check Redis cache
If cache hit, use the cached timeline entries
If cache miss, fall back to the database
Pull celebrity posts at read time (always fresh)
Merge, sort by timestamp, and return
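The merge step of the read path can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Post` tuple, the `merge_timeline` name, and the default `limit` are assumptions made for the example. It assumes both inputs are already sorted newest first.

```python
import heapq
from typing import NamedTuple


class Post(NamedTuple):
    """Hypothetical post record used only for this sketch."""
    post_id: int
    author_id: int
    created_at: float  # Unix timestamp


def merge_timeline(cached: list[Post], celebrity: list[Post], limit: int = 20) -> list[Post]:
    """Merge two newest-first lists into one newest-first page of at most `limit` posts."""
    # heapq.merge streams both inputs lazily, so only `limit` posts are materialized.
    merged = heapq.merge(cached, celebrity, key=lambda p: p.created_at, reverse=True)
    seen: set[int] = set()
    page: list[Post] = []
    for post in merged:
        if post.post_id in seen:
            continue  # skip a post that appears in both sources
        seen.add(post.post_id)
        page.append(post)
        if len(page) == limit:
            break
    return page
```

Because the merge is lazy, the cost depends on the page size rather than on the total number of celebrity posts fetched.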

✨ Key Features

1. Hybrid Timeline Architecture

Push Model: Fan-out for normal users (fast reads)
Pull Model: On-demand for celebrities (prevents write amplification)
Combined: fast cached reads for most users without celebrity-scale write amplification

2. Celebrity Detection

Automatic threshold detection (100K+ followers)
Dynamic flag update
Separate handling logic
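A minimal sketch of threshold-based detection, assuming the 100K cutoff stated above. The `User` dataclass and the function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass

CELEBRITY_THRESHOLD = 100_000  # "100K+ followers" from the feature list above


@dataclass
class User:
    """Hypothetical user record for illustration."""
    user_id: int
    follower_count: int = 0
    is_celebrity: bool = False


def refresh_celebrity_flag(user: User) -> bool:
    """Recompute the flag from the follower count; return True if it changed."""
    should_be_celebrity = user.follower_count >= CELEBRITY_THRESHOLD
    changed = should_be_celebrity != user.is_celebrity
    user.is_celebrity = should_be_celebrity
    return changed


def fanout_strategy(user: User) -> str:
    """Pick the write path: 'pull' skips fan-out, 'push' fans out to followers."""
    return "pull" if user.is_celebrity else "push"
```

Calling `refresh_celebrity_flag` after each follow or unfollow would keep the flag dynamic; the return value tells the caller when the user has crossed the threshold in either direction.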

3. Event-Driven Fan-out

Asynchronous processing via SQS
Retry logic and idempotency
Decoupled architecture
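Queues with at-least-once delivery can hand the same event to a worker more than once, which is why retries and idempotency go together. The sketch below shows one way to make a fan-out handler safe to replay. It keeps everything in memory, and the `FanoutHandler` class and the event field names (`post_id`, `author_id`, `created_at`) are assumptions for illustration, not the project's actual schema.

```python
from collections import defaultdict


class FanoutHandler:
    """In-memory stand-in for a fan-out worker, for illustration only."""

    def __init__(self, followers: dict[int, list[int]]):
        self.followers = followers  # author_id -> follower ids
        self.timelines: dict[int, dict[int, float]] = defaultdict(dict)  # user_id -> {post_id: timestamp}
        self.processed: set[tuple[int, int]] = set()  # events already fully handled

    def handle(self, event: dict) -> int:
        """Fan one post event out to followers; return the number of new timeline entries."""
        key = (event["author_id"], event["post_id"])
        if key in self.processed:
            return 0  # duplicate delivery: nothing to do
        written = 0
        for follower_id in self.followers.get(event["author_id"], []):
            timeline = self.timelines[follower_id]
            if event["post_id"] not in timeline:
                written += 1
            # Keyed by post_id, so replaying this write cannot create a duplicate entry.
            timeline[event["post_id"]] = event["created_at"]
        # Mark as processed only after all writes succeed, so a crash leads to a safe retry.
        self.processed.add(key)
        return written
```

The per-timeline write is keyed by `post_id`, mirroring how adding an existing member to a sorted set only updates its score, so a retry after a partial failure converges to the same state.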

4. Graceful Degradation

Redis down? Fall back to database
SQS unavailable? Direct timeline write
Core read and write paths stay available in both cases
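The queue fallback can be sketched as a thin wrapper around the publish call. This is an assumption about the shape of the logic, not the project's implementation: `dispatch_post_event`, `publish`, and `direct_write` are hypothetical names, and the two callables stand in for the SQS publisher and the direct timeline writer.

```python
import logging
from typing import Callable

logger = logging.getLogger("pulse.sketch")  # hypothetical logger name


def dispatch_post_event(
    event: dict,
    publish: Callable[[dict], None],
    direct_write: Callable[[dict], None],
) -> str:
    """Try the queue first; if publishing fails, write timelines synchronously."""
    try:
        publish(event)
        return "queued"
    except Exception:  # a real service would catch its queue client's specific errors
        logger.warning("queue unavailable, writing timelines directly", exc_info=True)
        direct_write(event)
        return "direct"
```

The direct path trades request latency for availability: the post still reaches follower timelines, but the API call does the fan-out work itself.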

5. Production-Ready Code

Type hints and validation (Pydantic)
Proper error handling
Logging and observability
Security (JWT, password hashing)

🛠️ Technology Stack

Backend

FastAPI - Modern Python web framework
SQLAlchemy - ORM for database operations
PostgreSQL - Primary data store
Redis - Timeline cache and sorted sets
Pydantic - Data validation

Infrastructure

Docker - Containerization
Docker Compose - Local orchestration
AWS SQS - Message queue
AWS RDS - Managed PostgreSQL
AWS EC2 - Compute

Frontend

Vanilla JavaScript - Simple, dependency-free UI
HTML/CSS - Modern responsive design

🚀 Quick Start

Prerequisites

Docker & Docker Compose
Python 3.11+
AWS Account (optional for cloud features)

Setup

# Clone the repository and enter it
git clone <repository-url>
cd Pulse

# Run automated setup
chmod +x scripts/setup_local.sh
./scripts/setup_local.sh

# Start the API
uvicorn services.main:app --reload

# In another terminal, start the worker
python -m workers.fanout_worker

# Open the UI (macOS; on Linux use xdg-open, or open the file in a browser)
open ui/index.html

Demo Users: alice, bob, celebrity_user (password: password123)

📚 API Documentation

Core Endpoints

Authentication

POST /auth/signup - Register new user
POST /auth/login - Login and receive JWT token

Posts

POST /posts - Create a new post
GET /timeline - Get personalized timeline

Social

POST /users/follow/{user_id} - Follow a user
GET /users/{user_id}/followers - Get user followers

System

GET /system/health - Health check
GET /system/metrics - System metrics

Interactive Documentation: http://localhost:8000/docs

🎓 System Design Deep Dive

The Celebrity Problem

Problem: When a celebrity with 10M followers posts, push-only fan-out triggers 10M timeline writes and overloads the datastore.

Traditional Approach (Push Only):

Write amplification: 1 write becomes 10M writes
Slow: Takes minutes to fan out
Expensive: High compute and storage costs

Our Solution (Hybrid):

Push for normal users (< 100K followers)
- Fast reads (cached)
- Acceptable write cost
Pull for celebrities (100K+ followers)
- No fan-out
- Fetched at read time
- Fresh and accurate

Timeline Architecture

Sorted Sets in Redis:

Key: "timeline:{user_id}"
Value: Sorted Set
  - Member: post_id
  - Score: timestamp

Why Sorted Sets?

O(log N) insertion
O(log N + M) range queries (M = number of posts returned)
Automatic ordering by timestamp score
Compact entries (only post IDs are stored)

Fan-out Strategy

Decision Matrix:

User Type    | Follower Count | Strategy      | Write Cost | Read Cost
-------------|----------------|---------------|------------|----------
Normal       | < 100K         | Push (Fan-out)| Medium     | Low (cached)
Celebrity    | ≥ 100K         | Pull (On-read)| Low        | Medium

Caching Strategy

L1 Cache (Redis):

Timeline sorted sets
Size-limited to 1000 posts per timeline
Eviction: ZREMRANGEBYRANK trims the oldest entries once a timeline exceeds the limit
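A sketch of the size-limited timeline, using the redis-py method names `zadd`, `zremrangebyrank`, and `zrevrange`. The function names and the `InMemorySortedSets` class are hypothetical: the class is a tiny stand-in that implements only those three calls so the example runs without a Redis server. With a real redis-py client, members come back as bytes unless `decode_responses=True` is set.

```python
MAX_TIMELINE_SIZE = 1000  # per-timeline cap from the list above


def push_to_timeline(client, user_id, post_id, timestamp, max_size=MAX_TIMELINE_SIZE):
    """Add a post to a user's timeline and trim it to the newest `max_size` entries."""
    key = f"timeline:{user_id}"
    client.zadd(key, {str(post_id): timestamp})
    # Ranks ascend by score, so rank 0 is the oldest post. Removing ranks
    # 0 .. -(max_size + 1) keeps exactly the newest max_size members.
    client.zremrangebyrank(key, 0, -(max_size + 1))


def read_timeline(client, user_id, limit=20):
    """Return up to `limit` post IDs, newest first."""
    return client.zrevrange(f"timeline:{user_id}", 0, limit - 1)


class InMemorySortedSets:
    """Stand-in for a Redis client; supports only what this sketch needs."""

    def __init__(self):
        self._data = {}

    def _ordered(self, key):
        # Ascending by score, ties broken by member, like Redis.
        return sorted(self._data.get(key, {}).items(), key=lambda kv: (kv[1], kv[0]))

    def zadd(self, key, mapping):
        self._data.setdefault(key, {}).update(mapping)

    def zremrangebyrank(self, key, start, stop):
        ordered = self._ordered(key)
        n = len(ordered)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if stop < start:
            return  # range is empty, e.g. the set is still under the cap
        for member, _ in ordered[start:stop + 1]:
            del self._data[key][member]

    def zrevrange(self, key, start, stop):
        # Non-negative indexes only, which is all read_timeline uses.
        members = [m for m, _ in reversed(self._ordered(key))]
        return members[start:stop + 1]
```

Trimming on every write keeps memory bounded per user regardless of how many accounts they follow.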

Cache Invalidation:

On Follow: Clear follower's timeline
On Unfollow: Clear follower's timeline
On Post Delete: Remove from all timelines (async)

Fallback Chain:

Request -> Redis Cache -> PostgreSQL -> Return
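The fallback chain can be sketched as a read-through helper. The names `load_timeline_ids`, `cache_read`, and `db_read` are hypothetical, and the built-in `ConnectionError` is a stand-in for whatever exception the cache client actually raises (redis-py, for instance, has its own exception hierarchy).

```python
from typing import Callable, Optional


def load_timeline_ids(
    user_id: int,
    cache_read: Callable[[int, int], Optional[list]],
    db_read: Callable[[int, int], list],
    limit: int = 20,
):
    """Return (post_ids, source), where source is 'cache' or 'database'."""
    try:
        cached = cache_read(user_id, limit)
    except ConnectionError:  # stand-in for the cache client's connection error
        cached = None
    if cached:
        return cached, "cache"
    # Cache miss, empty timeline, or cache outage: serve from the database.
    return db_read(user_id, limit), "database"
```

This sketch treats an empty cached timeline as a miss, which is a simplification: a production version might cache an explicit "empty" marker to avoid repeated database reads for users with no posts to show.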

📊 Performance & Scaling

Scaling Strategies

Horizontal Scaling:

Load Balancer
    ├── API Server 1
    ├── API Server 2
    └── API Server 3
         ↓
    (Stateless, share Redis & DB)

Database Scaling:

Read replicas for timeline queries
Sharding by user_id
Connection pooling

Cache Scaling:

Redis Cluster (16,384 hash slots)
Shard by user_id hash
Read replicas for hot keys
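To make the sharding concrete: Redis Cluster maps each key to a slot by taking CRC16 of the key modulo 16384, and hashes only the text inside the first non-empty `{...}` hash tag when one is present. The sketch below reproduces that rule with the standard library. The function name `key_hash_slot` is an assumption, and `binascii.crc_hqx` with an initial value of 0 is used here as the CRC16 (XMODEM) variant.

```python
import binascii

HASH_SLOTS = 16384


def key_hash_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot for a key, honoring hash tags."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag narrows what gets hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS
```

Keys that share a hash tag land in the same slot, so a design that needed multi-key operations per user could name keys with the user ID inside braces.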

Worker Scaling:

Multiple workers reading from SQS
Auto-scaling based on queue depth
Dead letter queue for failures
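Scaling on queue depth reduces to a small calculation: how many workers are needed to drain the current backlog within a target time. The sketch below is illustrative only. The function name, the throughput figure, and the min/max bounds are assumptions, not measured values from this project.

```python
import math


def desired_workers(
    queue_depth: int,
    msgs_per_worker_per_sec: int,
    target_drain_seconds: int,
    min_workers: int = 1,
    max_workers: int = 20,
) -> int:
    """Return the worker count needed to drain the backlog within the target time."""
    if msgs_per_worker_per_sec <= 0 or target_drain_seconds <= 0:
        raise ValueError("throughput and drain target must be positive")
    capacity_per_worker = msgs_per_worker_per_sec * target_drain_seconds
    needed = math.ceil(queue_depth / capacity_per_worker)
    # Clamp so the fleet never scales to zero or beyond the configured ceiling.
    return max(min_workers, min(max_workers, needed))
```

For example, with a backlog of 9,000 messages, 50 messages per second per worker, and a 60-second drain target, each worker covers 3,000 messages, so three workers are needed.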

Bottlenecks & Mitigations

Bottleneck      | Symptom            | Solution
----------------|--------------------|----------------------------------
DB Writes       | Slow post creation | Write buffer, async commits
Redis Memory    | OOM errors         | TTL, size limits, eviction policy
Fan-out Lag     | Stale timelines    | More workers, batch writes
Celebrity Reads | Slow timelines     | Separate cache, CDN

⚖️ Tradeoffs & Design Decisions

1. Eventual Consistency

Decision: Timelines are eventually consistent

Why:

CAP Theorem: Choose Availability + Partition Tolerance
Acceptable for social feeds (not financial data)
Enables horizontal scaling

Tradeoff:

Pro: High availability
Pro: Better performance
Con: Brief inconsistency (seconds)

2. Push vs. Pull

Decision: Hybrid approach based on follower count

Why:

Combines fast reads for most users with bounded write cost for large accounts
Optimizes for common case (normal users)
Handles edge case (celebrities)

Tradeoff:

Pro: Balanced write/read costs
Pro: Scalable
Con: More complexity

3. Redis for Timelines

Decision: Use Redis sorted sets for timeline cache

Why:

In-memory results in very fast reads
Sorted sets are natural fit for timelines
Simple data structure

Tradeoff:

Pro: Sub-50ms reads
Pro: Simple implementation
Con: Memory cost
Con: Data loss on crash (acceptable, since timelines can be rebuilt from PostgreSQL)

4. SQS for Events

Decision: Use message queue for fan-out

Why:

Decouples API from worker
Built-in retry and DLQ
Managed service (no queue infrastructure to operate)

Tradeoff:

Pro: Reliability
Pro: Scalability
Con: AWS dependency
Con: Fan-out latency (seconds)

5. Simple Pagination

Decision: Simple offset/limit pagination

Why:

Sufficient for demonstration
Easy to understand

Production would use:

Cursor-based pagination
Prevents skipped/duplicate posts
Better for real-time feeds
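The difference between the two pagination styles shows up when a new post arrives between page requests. The sketch below pages over an in-memory list using the last item seen as the cursor. The function name and the `(created_at, post_id)` tuple shape are assumptions for illustration.

```python
from typing import Optional

Item = tuple[float, int]  # (created_at, post_id), hypothetical shape


def page_by_cursor(posts: list[Item], limit: int, cursor: Optional[Item] = None):
    """Return (page, next_cursor) from a newest-first list.

    The cursor is the last item of the previous page; only strictly older
    items are returned, so inserts at the head cannot shift later pages.
    """
    if cursor is not None:
        posts = [p for p in posts if p < cursor]
    page = posts[:limit]
    next_cursor = page[-1] if len(page) == limit else None
    return page, next_cursor
```

With offset/limit, a post inserted at the head pushes every item down by one, so the second page repeats the last item of the first. In SQL, the same cursor idea is usually expressed as a keyset condition on `(created_at, id)` combined with a matching descending `ORDER BY`.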

📄 License

MIT License - See LICENSE file for details

Built with FastAPI, PostgreSQL, Redis, AWS SQS, Docker
