dawgdevv/Real_time_reddit_analysis

🚀 Reddit Real-Time Analytics Dashboard


📈 Project Overview

Reddit Real-Time Analytics Dashboard is an interactive web application that continuously fetches live subreddit data from MongoDB and visualizes key insights such as:

  • 📊 Average Upvotes
  • 💬 Average Comments
  • 📈 Post Counts
  • 😊 Sentiment Trends
  • 🔥 Top Subreddits
  • 🏆 Highest and Lowest Sentiment Posts

Everything updates every 30 seconds, without refreshing the page!

Built using:

  • Dash (by Plotly) for the interactive dashboard
  • MongoDB as the live data source
  • Plotly Express for rich, animated graphs
  • Dash DAQ for future enhancements (knobs, indicators)
  • Python and Pandas for data processing

🎯 Features

  • ✅ Real-time Data Refresh (every 30 seconds)
  • ✅ Beautiful Dark-Themed UI (Plotly Dark Template)
  • ✅ Live Sentiment Tracking Line Graph
  • ✅ Post Sentiment Boxplots by Subreddit
  • ✅ Top Subreddits Visualization (Pie Chart)
  • ✅ Highlighted Highest and Lowest Sentiment Posts
  • ✅ Mobile and Desktop Friendly
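
In Dash, a periodic refresh like the 30-second one above is commonly wired up with a `dcc.Interval` component that triggers a callback. The following is a minimal sketch, assuming Dash 2.x; the component ids, the `summarize` and `build_app` helpers, and the in-memory `SAMPLE_STATS` list standing in for a MongoDB query are all hypothetical and not taken from `dashboard.py`.

```python
from statistics import mean

REFRESH_MS = 30_000  # 30 seconds, matching the refresh rate described above

# Hypothetical stand-in for documents that would be read from MongoDB.
SAMPLE_STATS = [
    {"subreddit": "python", "avg_upvotes": 120.0, "post_count": 40},
    {"subreddit": "datascience", "avg_upvotes": 80.0, "post_count": 10},
]


def summarize(stats):
    """Build a headline string from per-subreddit stats (unweighted mean)."""
    if not stats:
        return "No data yet"
    overall = mean(row["avg_upvotes"] for row in stats)
    return f"Average upvotes: {overall:.1f} across {len(stats)} subreddits"


def build_app(load_stats=lambda: SAMPLE_STATS):
    """Create a Dash app whose headline re-renders on every interval tick."""
    # Imported here so summarize() stays usable without Dash installed.
    from dash import Dash, Input, Output, dcc, html

    app = Dash(__name__)
    app.layout = html.Div([
        html.H3(id="headline"),
        # Fires every REFRESH_MS milliseconds and increments n_intervals.
        dcc.Interval(id="refresh", interval=REFRESH_MS, n_intervals=0),
    ])

    @app.callback(Output("headline", "children"), Input("refresh", "n_intervals"))
    def update_headline(_n_intervals):
        # Re-read the data source on each tick; no page reload is needed.
        return summarize(load_stats())

    return app
```

Starting the returned app (`app.run(...)` in recent Dash releases, `app.run_server(...)` in older ones) serves it on port 8050 by default. Passing a different `load_stats` callable is where a real MongoDB query would plug in.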

🛠️ Tech Stack

| Technology     | Usage                                   |
|----------------|-----------------------------------------|
| Dash           | For building reactive, fast dashboards  |
| Plotly Express | For interactive, beautiful plotting     |
| MongoDB Atlas  | Cloud-hosted live Reddit data storage   |
| Pandas         | Dataframe manipulation and aggregation  |
| Python         | Core backend logic                      |

📂 Project Structure

📦 reddit-realtime-dashboard
 ┣ 📜 .gitignore                  # Git ignore file for excluding unnecessary files
 ┣ 📜 README.md                   # Project documentation
 ┣ 📜 requirements.txt            # Python dependencies (including python-dotenv)
 ┣ 📜 setup_kafka.sh              # Script to set up Kafka for streaming data
 ┣ 📜 advanced_spark_processor.py # Spark processor for advanced data handling
 ┣ 📜 commands.md                 # Instructions for running various commands
 ┣ 📜 consumerr.py                # Consumer script for fetching data from Kafka
 ┣ 📜 dashboard.py                # Main dashboard application (Dash)
 ┗ 📜 producerr.py                # Producer script for sending data to Kafka

🔥 Quick Setup & Run Locally

  1. Clone this repository 🚀

         git clone https://github.com/your-username/reddit-realtime-dashboard.git
         cd reddit-realtime-dashboard

  2. Install the dependencies 📦

         pip install -r requirements.txt

  3. Run the app 🖥️

         python dashboard.py

  4. Open in browser 🌐
     Navigate to 👉 http://127.0.0.1:8050/ to view your dashboard live!

🗄️ MongoDB Collections Structure

  • subreddit_stats: contains aggregated statistics for each subreddit:
      • subreddit
      • avg_upvotes
      • avg_comments
      • avg_sentiment
      • post_count
      • processing_timestamp
  • reddit_posts: contains individual post details:
      • subreddit
      • description
      • sentiment_score
      • processing_timestamp

Note: Three different MongoDB clients are connected (team collaboration).
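
To make the two collections concrete, here is a small sketch of what documents with these fields might look like and how the "Top Subreddits" and "Highest and Lowest Sentiment Posts" views could be derived from them. The field names come from the lists above; the sample values, the value types, and the helper functions `top_subreddits` and `sentiment_extremes` are illustrative assumptions, not the project's actual code.

```python
from datetime import datetime, timezone

# Made-up sample documents; only the field names come from the README.
# With pymongo these would typically come from calls such as
# db.subreddit_stats.find() and db.reddit_posts.find().
STAMP = datetime(2025, 1, 1, tzinfo=timezone.utc)

subreddit_stats = [
    {"subreddit": "python", "avg_upvotes": 120.0, "avg_comments": 14.5,
     "avg_sentiment": 0.31, "post_count": 40, "processing_timestamp": STAMP},
    {"subreddit": "news", "avg_upvotes": 300.0, "avg_comments": 85.0,
     "avg_sentiment": -0.12, "post_count": 75, "processing_timestamp": STAMP},
    {"subreddit": "aww", "avg_upvotes": 900.0, "avg_comments": 20.0,
     "avg_sentiment": 0.80, "post_count": 12, "processing_timestamp": STAMP},
]

reddit_posts = [
    {"subreddit": "python", "description": "Loving the new release",
     "sentiment_score": 0.9, "processing_timestamp": STAMP},
    {"subreddit": "news", "description": "Terrible outage today",
     "sentiment_score": -0.8, "processing_timestamp": STAMP},
    {"subreddit": "aww", "description": "Look at this cat",
     "sentiment_score": 0.4, "processing_timestamp": STAMP},
]


def top_subreddits(stats, n=2):
    """Return the names of the n subreddits with the most posts."""
    ranked = sorted(stats, key=lambda row: row["post_count"], reverse=True)
    return [row["subreddit"] for row in ranked[:n]]


def sentiment_extremes(posts):
    """Return (highest, lowest) posts by sentiment_score."""
    highest = max(posts, key=lambda post: post["sentiment_score"])
    lowest = min(posts, key=lambda post: post["sentiment_score"])
    return highest, lowest


best, worst = sentiment_extremes(reddit_posts)
print(top_subreddits(subreddit_stats))
print(best["description"], "|", worst["description"])
```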


🎨 Dashboard Preview

Upvotes, Comments, and Posts Sentiment Tracking

Top Subreddits Sentiment Distribution

Highlighted Posts


📌 Future Enhancements

  • Add sentiment-based word clouds.
  • Introduce filter options (select subreddits, date ranges).
  • Add alerts if a subreddit sentiment becomes extremely negative.
  • Deploy the app on Render, Heroku, or AWS for public access!
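
As a rough idea of how the negative-sentiment alert could work, the sketch below filters `subreddit_stats`-style records against a threshold. The threshold value, the assumption that sentiment scores fall in the range -1 to 1, and the `negative_alerts` helper are all hypothetical.

```python
# Hypothetical cut-off; assumes avg_sentiment lies between -1 and 1.
NEGATIVE_THRESHOLD = -0.5


def negative_alerts(stats, threshold=NEGATIVE_THRESHOLD):
    """Return one alert message per subreddit at or below the threshold."""
    return [
        f"r/{row['subreddit']} sentiment is {row['avg_sentiment']:.2f}"
        for row in stats
        if row["avg_sentiment"] <= threshold
    ]


sample = [
    {"subreddit": "python", "avg_sentiment": 0.30},
    {"subreddit": "news", "avg_sentiment": -0.72},
]
print(negative_alerts(sample))
```

The returned messages could then be shown in the dashboard or sent through whichever notification channel the team prefers.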

🤝 Contributors

  • Dev Rishi Verma
  • Nishant Raj
  • Kushagra Gupta

⭐ Show Your Support!

If you like this project, don’t forget to:

  • ⭐ Star this repo
  • 🍴 Fork it
  • 🛠️ Submit PRs
  • 🔥 Share with your friends!

Built with ❤️ using Python, MongoDB, Dash, and Plotly.

