Machine Learning and Data Engineering projects including recommender systems, scalable pipelines, and evaluation-driven ML solutions.

Uttam-38/Machine_Learning_Projects-

Machine Learning Projects Portfolio

This repository contains a curated collection of end-to-end Machine Learning and Data Engineering projects that demonstrate my experience in building scalable, reproducible, and production-style ML systems.

The projects emphasize:

  • Practical machine learning algorithms
  • Data preprocessing and feature engineering
  • Model evaluation using appropriate metrics
  • Clean software engineering practices
  • Real-world datasets and problem statements

Each project is organized as an independent, runnable module with its own documentation and scripts.


🚀 Projects

🔹 1. Personalized Content Recommendation System

Folder: personalized-recsys/

A Netflix-style hybrid recommendation system that combines:

  • Collaborative Filtering (Matrix Factorization using SVD)
  • Content-Based Filtering (TF-IDF on movie metadata)
  • Hybrid ranking with cold-start handling
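One way the hybrid ranking step could work is sketched below. This is not the project's actual code: `blend_scores`, `top_k`, and the `shrinkage` parameter are hypothetical names, and the sketch assumes both models emit per-item scores already normalized to a comparable range. The idea is to weight the collaborative score by how much history a user has, so a brand-new user (cold start) is ranked purely by content similarity.

```python
def blend_scores(cf_scores, content_scores, n_interactions, shrinkage=5.0):
    """Blend collaborative and content-based scores for a single user.

    cf_scores / content_scores: dicts mapping item id -> score (assumed
    normalized to a comparable range). `shrinkage` is a hypothetical
    tuning knob: larger values trust the collaborative model more slowly.
    """
    # alpha is 0 for a user with no history and approaches 1 as history grows
    alpha = n_interactions / (n_interactions + shrinkage)
    items = set(cf_scores) | set(content_scores)
    return {
        item: alpha * cf_scores.get(item, 0.0)
        + (1.0 - alpha) * content_scores.get(item, 0.0)
        for item in items
    }


def top_k(scores, k):
    """Return the k highest-scoring item ids (ties broken by item id)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Toy scores, invented purely for illustration
cf = {"A": 0.9, "B": 0.2}
content = {"B": 0.8, "C": 0.6}

cold_user = top_k(blend_scores(cf, content, n_interactions=0), k=2)
active_user = top_k(blend_scores(cf, content, n_interactions=45), k=2)
```

With these toy inputs the cold-start user is ranked by content scores alone, while the active user's ranking is dominated by the collaborative scores.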

Key Highlights

  • End-to-end ML pipeline (data ingestion → training → evaluation)
  • Offline ranking metrics: Precision@K, Recall@K, NDCG@K, MAP@K
  • Modular, production-style Python codebase
  • Demo script to inspect real recommendations
  • Built on the MovieLens 1M dataset
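For reference, the offline ranking metrics listed above can be computed per user as in the following minimal sketch. It assumes binary relevance (an item is either relevant or not) and a recommendation list without duplicates; the function names are illustrative and may not match the ones used in `personalized-recsys/`. MAP@K is the mean of the per-user average precision.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the list divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def average_precision_at_k(recommended, relevant, k):
    """Average of precision values at each rank where a hit occurs."""
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def map_at_k(all_recommended, all_relevant, k):
    """Mean of average_precision_at_k over users (parallel lists)."""
    aps = [
        average_precision_at_k(rec, rel, k)
        for rec, rel in zip(all_recommended, all_relevant)
    ]
    return sum(aps) / len(aps) if aps else 0.0


# Toy example, invented for illustration
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
p = precision_at_k(recommended, relevant, k=4)
r = recall_at_k(recommended, relevant, k=4)
```

In this toy example two of the four recommendations are relevant, so precision is 2/4, and two of the three relevant items are retrieved, so recall is 2/3.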

➡️ See personalized-recsys/README.md for full details.


🛠 Technologies Used

  • Languages: Python
  • ML & Data: NumPy, Pandas, Scikit-learn
  • Evaluation: Ranking-based metrics
  • Engineering: Modular code, configuration-driven pipelines, Git
  • Datasets: MovieLens (GroupLens)

🎯 Purpose of This Repository

This repository serves as:

  • A technical portfolio for Machine Learning / Data Engineering internships
  • A demonstration of problem-solving and system design
  • A foundation for experimenting with scalable ML systems

Future projects will extend into:

  • Distributed systems (Spark, Kafka)
  • Graph-based ML (Neo4j)
  • Advanced ML models and pipelines

👤 Author

Uttam
Master’s Student – Data Science (Computing and Decision Analytics)
Actively seeking Machine Learning Engineer and Data Engineering Intern roles.


📌 Notes

  • Large datasets and trained models are excluded from version control.
  • Each project is self-contained and reproducible.
