Knowledge Is Power is a data analysis and prediction tool that uses U.S. Census data to provide insights into societal topics. Using machine learning, specifically MATLAB's Classification Learner, the project predicts educational attainment from income and offers interactive visualizations of veterans' data.

mar19a/KnowledgeIsPower


Knowledge Is Power

🔗 View Final Project Presentation (PDF)


📘 Overview

Knowledge Is Power is a comprehensive data-driven analysis and prediction tool that utilizes U.S. Census data to explore key societal themes. The project applies machine learning techniques and statistical analysis to uncover insights into demographics, education, and economic factors. This initiative demonstrates practical applications of data science and predictive modeling to address real-world issues through accessible and engaging visualizations.


✨ Key Features

  • Veterans Data Analysis
    Explore demographic breakdowns (e.g., by gender) of veteran data using interactive graphs and visual reports.

  • Educational Attainment Prediction
    Predict a person's likely level of educational attainment from their income using a supervised learning model.

  • Interactive Visualizations
    Dynamic, user-friendly charts and dashboards provide clear interpretation of complex data trends.
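
As a rough illustration of the demographic breakdown described under Veterans Data Analysis, the sketch below computes each gender's share of a record set. The records and the `share_by` helper are hypothetical and invented for this example; they are not taken from the project or from Census data. The project lists Pandas for preprocessing, but this version uses only the Python standard library to stay self-contained.

```python
from collections import Counter

# Hypothetical records, invented for illustration (not Census figures).
veterans = [
    {"gender": "Male"},
    {"gender": "Female"},
    {"gender": "Male"},
    {"gender": "Male"},
]

def share_by(records, key):
    """Return each category's share of the records, as a fraction of 1."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

gender_share = share_by(veterans, "gender")
print(gender_share)
```

A dictionary like `gender_share` is the kind of summary a bar or pie chart could be drawn from.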


🤖 Machine Learning Implementation

🔍 Approach

We used MATLAB's Classification Learner app (part of the Statistics and Machine Learning Toolbox) to build a supervised machine learning model that predicts educational attainment from income. Using U.S. Census data, the model was trained to assign records to attainment categories such as:

  • Less than High School
  • High School Graduate
  • Some College
  • Bachelor's Degree
  • Graduate/Professional Degree

The model used Linear Discriminant Analysis (LDA) to classify data points and reached 100% accuracy on the training dataset, as reported by the confusion matrix. Because the target is a category rather than a continuous value, this is a classification task, not a regression.
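
To make the LDA step concrete, here is a minimal sketch of one-feature LDA in plain Python. It is not the project's code: the project used MATLAB's Classification Learner, and the income values below are invented toy data, not Census figures. The function names (`fit_lda_1d`, `predict`, `confusion_matrix`) are hypothetical. With a single feature, LDA estimates one mean per class and one pooled variance shared by all classes, then picks the class with the highest linear discriminant score.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (annual income in USD, attainment label).
# Invented for illustration; NOT Census figures.
TRAIN = [
    (21000, "Less than High School"), (24000, "Less than High School"),
    (31000, "High School Graduate"), (34000, "High School Graduate"),
    (41000, "Some College"), (44000, "Some College"),
    (61000, "Bachelor's Degree"), (64000, "Bachelor's Degree"),
    (81000, "Graduate/Professional Degree"),
    (84000, "Graduate/Professional Degree"),
]

def fit_lda_1d(samples):
    """Estimate class means, class priors, and the pooled variance."""
    by_class = defaultdict(list)
    for income, label in samples:
        by_class[label].append(income)
    n, k = len(samples), len(by_class)
    means = {c: sum(xs) / len(xs) for c, xs in by_class.items()}
    priors = {c: len(xs) / n for c, xs in by_class.items()}
    # LDA assumes all classes share one variance, so pool the
    # within-class squared deviations and divide by (n - k).
    ss = sum((x - means[c]) ** 2 for c, xs in by_class.items() for x in xs)
    return means, priors, ss / (n - k)

def predict(model, income):
    """Return the class with the largest linear discriminant score."""
    means, priors, variance = model
    def score(c):
        return (income * means[c] / variance
                - means[c] ** 2 / (2 * variance)
                + math.log(priors[c]))
    return max(means, key=score)

def confusion_matrix(model, samples):
    """Count (true label, predicted label) pairs."""
    counts = defaultdict(int)
    for income, label in samples:
        counts[(label, predict(model, income))] += 1
    return dict(counts)

model = fit_lda_1d(TRAIN)
matrix = confusion_matrix(model, TRAIN)
training_accuracy = sum(
    n for (truth, pred), n in matrix.items() if truth == pred
) / len(TRAIN)
print(training_accuracy)
```

On this deliberately well-separated toy data every training point lands on the diagonal of the confusion matrix, which is how a 100% training accuracy can arise. Training accuracy alone says nothing about new data; a held-out split would be needed for that.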

🎯 Project Goals

Our objective was to build an intuitive platform where users could explore U.S. Census data topics and gain meaningful insights. The tool highlights how data literacy and machine learning can inform public understanding and policy-making, particularly around education and income disparity.


📈 What I Learned

  • Data Processing & Visualization
    Developed skills in handling large datasets, cleaning data, and producing compelling visual outputs.

  • Applied Machine Learning
    Gained experience with model training, evaluation, and deployment using MATLAB's Classification Learner.

  • Human-Centered Design
    Created user experiences that prioritize clarity, simplicity, and engagement when interacting with data.


💼 Relevance to Software Engineering

  • Data-Driven Problem Solving
    Applied analytical thinking to address complex societal challenges through software tools.

  • Technical Growth
    Strengthened proficiency in data science, machine learning, and full-cycle software development.

  • Social Impact
    Designed a platform that educates and empowers users, demonstrating how software can drive progress.


🛠 Technologies Used

  • Languages:

    • MATLAB (Machine Learning & Classification)
    • Python (Data Preprocessing & Visualization)
  • Libraries & Tools:

    • MATLAB Classification Learner
    • Pandas, Seaborn, Matplotlib
  • Data Source:

    • U.S. Census Bureau Public Datasets

📚 Additional Reading: Understanding Data Bias

"Bias in machine learning systems can reflect or amplify societal inequalities."

We incorporated an ethical lens by examining Prabhakar Krishnamurthy's article "Understanding Data Bias", which highlights examples such as Amazon's recruitment model and ad delivery systems that reinforced bias. To mitigate such issues, we discussed data pre-processing, in-processing (during training), and post-processing evaluation as best practices for fairness and accountability in predictive modeling.
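
One possible form of the post-processing evaluation mentioned above is to compare a model's behavior across groups after predictions are made. The sketch below is an assumption about what such a check could look like, not something the project implemented: the groups, labels, and predictions are invented, and the demographic-parity gap (the difference in positive-prediction rates between groups) is one common fairness measure chosen here for illustration.

```python
from collections import defaultdict

# Hypothetical (group, true_label, predicted_label) triples,
# invented for illustration only.
results = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 0),
]

def per_group_rates(rows):
    """Compute accuracy and positive-prediction rate for each group."""
    tallies = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for group, truth, predicted in rows:
        tally = tallies[group]
        tally["n"] += 1
        tally["correct"] += int(truth == predicted)
        tally["positive"] += int(predicted == 1)
    return {
        group: {
            "accuracy": t["correct"] / t["n"],
            "positive_rate": t["positive"] / t["n"],
        }
        for group, t in tallies.items()
    }

rates = per_group_rates(results)
# Demographic-parity gap: difference in positive-prediction rates.
parity_gap = abs(rates["A"]["positive_rate"] - rates["B"]["positive_rate"])
print(rates, parity_gap)
```

In this toy data both groups have the same accuracy, yet the model predicts the positive class far more often for group A. That is the kind of disparity a per-group check can surface and overall accuracy would hide.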


📝 Conclusion

Knowledge Is Power reflects the belief that informed citizens can drive meaningful change. By combining data science with user-friendly design, this project empowers individuals to understand key social and economic trends. Education remains at the heart of societal progress, and this tool aims to make that understanding accessible to all.

"Knowledge is power. Education is the premise of progress in every society."


🔗 Links


📄 License

This project is licensed under the MIT License.
