View Final Project Presentation (PDF)
Knowledge Is Power is a comprehensive data-driven analysis and prediction tool that utilizes U.S. Census data to explore key societal themes. The project applies machine learning techniques and statistical analysis to uncover insights into demographics, education, and economic factors. This initiative demonstrates practical applications of data science and predictive modeling to address real-world issues through accessible and engaging visualizations.
- Veterans Data Analysis: Explore demographic breakdowns (e.g., by gender) of veteran data using interactive graphs and visual reports.
- Educational Attainment Prediction: Predict a person's likely educational degree based on income levels using supervised learning models.
- Interactive Visualizations: Dynamic, user-friendly charts and dashboards provide clear interpretation of complex data trends.
We used MATLAB's Classification Learner app to build a supervised machine learning model that predicts educational attainment from income. Using U.S. Census data, the model was trained on categories such as:
- Less than High School
- High School Graduate
- Some College
- Bachelor's Degree
- Graduate/Professional Degree
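The model used Linear Discriminant Analysis (LDA) to classify data points and achieved 100% accuracy on the training dataset according to the confusion matrix output. Because this figure was measured on the training data, it does not by itself show how well the model generalizes to unseen data. Unlike regression-based models, which predict a continuous value, this implementation predicts a category.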
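To make the classification step concrete, here is a minimal Python sketch of LDA with a single feature (income). It is not the project's MATLAB model: the function names `fit_lda_1d` and `predict_lda_1d` are hypothetical, and the income values are made up for illustration rather than taken from Census data. It assumes the standard LDA setup in which every class shares one pooled variance.

```python
import math
from collections import Counter, defaultdict


def fit_lda_1d(samples):
    """Fit one-feature LDA: class means, class priors, and one pooled variance."""
    by_class = defaultdict(list)
    for income, label in samples:
        by_class[label].append(income)
    n = len(samples)
    means = {k: sum(v) / len(v) for k, v in by_class.items()}
    priors = {k: len(v) / n for k, v in by_class.items()}
    # LDA assumes every class shares the same variance, so pool the
    # within-class squared deviations across all classes.
    pooled_ss = sum((income - means[label]) ** 2 for income, label in samples)
    variance = pooled_ss / (n - len(by_class))
    return means, priors, variance


def predict_lda_1d(model, income):
    """Return the class with the largest linear discriminant score."""
    means, priors, variance = model

    def score(label):
        mu = means[label]
        return income * mu / variance - mu ** 2 / (2 * variance) + math.log(priors[label])

    return max(means, key=score)


# Made-up (income in thousands of USD, label) pairs, for illustration only.
samples = [
    (18, "Less than High School"), (22, "Less than High School"),
    (30, "High School Graduate"), (34, "High School Graduate"),
    (40, "Some College"), (44, "Some College"),
    (60, "Bachelor's Degree"), (66, "Bachelor's Degree"),
    (90, "Graduate/Professional Degree"), (98, "Graduate/Professional Degree"),
]
model = fit_lda_1d(samples)

# Confusion counts keyed by (true label, predicted label) on the training data.
confusion = Counter(
    (label, predict_lda_1d(model, income)) for income, label in samples
)
training_accuracy = sum(
    count for (true, pred), count in confusion.items() if true == pred
) / len(samples)

print(predict_lda_1d(model, 63))
print(training_accuracy)
```

On this toy set every training point is classified correctly, which mainly reflects how cleanly the made-up classes are separated. Estimating performance on new data would need a held-out split.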
Our objective was to build an intuitive platform where users could explore U.S. Census data topics and gain meaningful insights. The tool highlights how data literacy and machine learning can inform public understanding and policy-making, particularly around education and income disparity.
- Data Processing & Visualization: Developed skills in handling large datasets, cleaning data, and producing compelling visual outputs.
- Applied Machine Learning: Gained experience with model training, evaluation, and deployment using MATLAB's Classification Learner.
- Human-Centered Design: Created user experiences that prioritize clarity, simplicity, and engagement when interacting with data.
- Data-Driven Problem Solving: Applied analytical thinking to address complex societal challenges through software tools.
- Technical Growth: Strengthened proficiency in data science, machine learning, and full-cycle software development.
- Social Impact: Designed a platform that educates and empowers users, demonstrating how software can drive progress.
- Languages:
  - MATLAB (Machine Learning & Classification)
  - Python (Data Preprocessing & Visualization)
- Libraries & Tools:
  - MATLAB Classification Learner
  - Pandas, Seaborn, Matplotlib
- Data Source:
  - U.S. Census Bureau Public Datasets
"Bias in machine learning systems can reflect or amplify societal inequalities."
We incorporated an ethical lens by examining Prabhakar Krishnamurthy's article "Understanding Data Bias", which highlights examples such as Amazon's recruitment model and ad delivery systems that reinforced bias. To mitigate such issues, we discussed data pre-processing, in-processing (during training), and post-processing evaluation as best practices for fairness and accountability in predictive modeling.
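As one possible form of post-processing evaluation, predictions can be compared across groups after the model has run. The sketch below is an assumption about what such a check might look like, not something the project implemented: the helper `accuracy_by_group`, the group names "A" and "B", and all records are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} from (group, true_label, predicted_label) triples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, true_label, predicted in records:
        total[group] += 1
        correct[group] += int(true_label == predicted)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical predictions; "A" and "B" are placeholder group names.
records = [
    ("A", "Some College", "Some College"),
    ("A", "Bachelor's Degree", "Bachelor's Degree"),
    ("A", "High School Graduate", "Some College"),
    ("A", "Some College", "Some College"),
    ("B", "Some College", "High School Graduate"),
    ("B", "Bachelor's Degree", "Bachelor's Degree"),
    ("B", "High School Graduate", "Some College"),
    ("B", "Some College", "Some College"),
]

rates = accuracy_by_group(records)
# A large gap between groups is a signal to revisit the data or the model.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

Accuracy is only one metric that could be compared this way. The same loop could track any per-group rate that matters for the question being asked.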
Knowledge Is Power reflects the belief that informed citizens can drive meaningful change. By combining data science with user-friendly design, this project empowers individuals to understand key social and economic trends. Education remains at the heart of societal progress, and this tool aims to make that understanding accessible to all.
"Knowledge is power. Education is the premise of progress in every society."
This project is licensed under the MIT License.