Developed and optimized linear regression and locally weighted linear regression models, performed overfitting detection, evaluated logistic regression for gender classification, visualized decision boundaries, and conducted comparative analysis with KNN and Naïve Bayes, highlighting feature removal effects.

ramidimeghanareddy/Machine_Learning

Machine Learning

Meghana Ramidi

Project1:

Question 1 - K-Nearest Neighbors (KNN):

  • Defined three distance metrics (Cartesian, i.e. Euclidean; Manhattan; and Minkowski of order 3) and built a KNN algorithm with steps for distance calculation, nearest-neighbor retrieval, and prediction.
  • Made predictions on test data using K values 1, 3, and 7.
  • Implemented leave-one-out evaluation for KNN with Cartesian distance to assess the algorithm's performance for different K values.
  • Evaluated KNN performance after excluding 'age' data, highlighting the significance of age in predicting labels.
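The KNN steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function names, the majority-vote rule, and the toy (height, weight, age) rows are all assumptions made for the example.

```python
from collections import Counter

def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    """The "Cartesian" distance: Minkowski of order 2."""
    return minkowski(a, b, 2)

def manhattan(a, b):
    """Minkowski of order 1."""
    return minkowski(a, b, 1)

def knn_predict(train_X, train_y, query, k, dist=euclidean):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(X, y, k, dist=euclidean):
    """Hold out each point in turn, predict it from the rest, and average the hits."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k, dist) == y[i]
    return hits / len(X)

# Hypothetical toy rows: (height in m, weight in kg, age in years) -> label
X = [(1.60, 55.0, 30), (1.62, 58.0, 28), (1.58, 52.0, 35),
     (1.80, 82.0, 31), (1.78, 80.0, 29), (1.83, 85.0, 40)]
y = ["W", "W", "W", "M", "M", "M"]

for k in (1, 3):
    print(k, leave_one_out_accuracy(X, y, k))

# Dropping the age column repeats the evaluation on height and weight only.
X_no_age = [row[:2] for row in X]
print(leave_one_out_accuracy(X_no_age, y, 1))
```

Passing `dist=lambda a, b: minkowski(a, b, 3)` would select the order-3 Minkowski metric. The features are left unscaled here, so weight dominates the distance; whether the original implementation normalizes them is not stated.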

Question 2 - Gaussian Naive Bayes:

  • Developed Gaussian Naive Bayes algorithm with steps for dataset separation, summarization, Gaussian Probability Density Function calculation, class probability calculation, and prediction.
  • Made predictions on test data using the Gaussian Naive Bayes algorithm.
  • Evaluated algorithm performance using Leave One Out Evaluation.
  • Assessed algorithm performance after removing 'age' data.
  • Compared the performance of KNN and Gaussian Naive Bayes, concluding that Gaussian Naive Bayes outperforms KNN.
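As a rough sketch of the Gaussian Naive Bayes steps above (separate by class, summarize, evaluate the Gaussian density, combine with the class prior, predict): the names, the use of the sample standard deviation, and the toy data below are assumptions for illustration, and the density is evaluated in log form to avoid underflow.

```python
import math
from collections import defaultdict

def summarize_by_class(X, y):
    """Return {label: (prior, [(mean, std) for each feature])}.

    Assumes every class has at least two rows and non-zero spread per feature.
    """
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    summaries = {}
    for label, rows in groups.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / (len(column) - 1)
            stats.append((mean, math.sqrt(var)))
        summaries[label] = (len(rows) / len(X), stats)
    return summaries

def gaussian_log_pdf(x, mean, std):
    """Logarithm of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def gnb_predict(summaries, query):
    """Pick the class with the largest log prior plus summed log densities."""
    scores = {}
    for label, (prior, stats) in summaries.items():
        score = math.log(prior)
        for x, (mean, std) in zip(query, stats):
            score += gaussian_log_pdf(x, mean, std)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy rows: (height in m, weight in kg, age in years) -> label
X = [(1.60, 55.0, 30), (1.62, 58.0, 28), (1.58, 52.0, 35),
     (1.80, 82.0, 31), (1.78, 80.0, 29), (1.83, 85.0, 40)]
y = ["W", "W", "W", "M", "M", "M"]

model = summarize_by_class(X, y)
print(gnb_predict(model, (1.61, 56.0, 30)))
```

Leave-one-out evaluation follows the same pattern as for KNN: rebuild the summaries without one row, predict that row, and average the hits.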

Project2:

Question 1 - Linear Regression and Function Depth Analysis:

Title: Exploring Linear Regression and Function Depth Impact

Description: Developed linear regression models with function depths up to 6, applied them to generated data, and assessed each model by its error on test data. Found depth 4 to be the best fit because it minimized the mean squared error, while noting that the small dataset may limit the reliability of this result.
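The depth comparison can be illustrated along these lines. The description does not say which basis functions a "depth" refers to, so this sketch assumes a polynomial basis (1, x, ..., x^depth) as a stand-in; the generated data, the depth range 0 to 6, and all names are likewise hypothetical.

```python
import numpy as np

def design_matrix(x, depth):
    """Columns 1, x, ..., x**depth (assumed basis; the source does not name one)."""
    return np.vander(x, depth + 1, increasing=True)

def fit_least_squares(x, y, depth):
    """Ordinary least-squares coefficients for the given depth."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, depth), y, rcond=None)
    return coeffs

def mse(x, y, coeffs):
    """Mean squared error of the fitted model on (x, y)."""
    pred = design_matrix(x, len(coeffs) - 1) @ coeffs
    return float(np.mean((pred - y) ** 2))

# Hypothetical generated data: a noisy sine, split into train and test sets.
rng = np.random.default_rng(0)
x_train = rng.uniform(-2, 2, 30)
y_train = np.sin(2 * x_train) + rng.normal(0, 0.1, 30)
x_test = rng.uniform(-2, 2, 15)
y_test = np.sin(2 * x_test) + rng.normal(0, 0.1, 15)

# Fit on the training set, score on the test set, keep the depth with lowest error.
errors = {d: mse(x_test, y_test, fit_least_squares(x_train, y_train, d))
          for d in range(0, 7)}
best_depth = min(errors, key=errors.get)
print(errors, best_depth)
```

Selecting the depth by test error rather than training error is what exposes overfitting: training error can only fall as depth grows, while test error eventually rises.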

Question 2 - Locally Weighted Linear Regression and Dataset Size Influence:

Title: Investigating Locally Weighted Linear Regression and Dataset Size Effects

Description: Explored locally weighted linear regression for 1-dimensional data, applied the method to generated data, and compared its performance with the linear regression model. Noted that the locally weighted model outperformed the linear regression model on the test data. Additionally, examined the impact of dataset size reduction on model performance and observed increased mean squared error and reduced fit quality. Concluded that the original data might not adhere to the assumed function format.
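For reference, one common formulation of locally weighted linear regression on 1-dimensional data looks like the sketch below. The Gaussian kernel, the bandwidth `tau`, and the function name are assumptions; the description does not state which weighting the project used.

```python
import numpy as np

def lwlr_predict(x_query, x_train, y_train, tau=0.3):
    """Fit a weighted straight line around x_query and evaluate it there.

    Each training point gets weight exp(-(x - x_query)^2 / (2 * tau^2)),
    so nearby points dominate the fit (assumed kernel and bandwidth).
    """
    w = np.exp(-((x_train - x_query) ** 2) / (2 * tau ** 2))
    A = np.column_stack([np.ones_like(x_train), x_train])
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(A * sw[:, None], y_train * sw, rcond=None)
    return float(theta[0] + theta[1] * x_query)

# Hypothetical generated data
rng = np.random.default_rng(1)
x_train = np.sort(rng.uniform(0, 3, 40))
y_train = np.sin(2 * x_train) + rng.normal(0, 0.1, 40)

x_test = np.linspace(0.2, 2.8, 10)
pred = np.array([lwlr_predict(x, x_train, y_train) for x in x_test])
print(float(np.mean((pred - np.sin(2 * x_test)) ** 2)))
```

Because a separate line is fitted for every query point, the method needs enough nearby training points to be stable, which is consistent with the degradation observed when the dataset is reduced.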

Question 3 - Classification with Logistic Regression and Comparative Analysis:

Title: Classification Analysis using Logistic Regression: Comparing Performance and Feature Impact

Description: Implemented logistic regression to classify data based on height, weight, and age. Created plots of the separation boundaries and data points. Evaluated the logistic regression model with leave-one-out validation and compared the results with the KNN and Naïve Bayes classifiers. Logistic regression outperformed KNN and slightly surpassed Naïve Bayes in accuracy (70.83% versus 63.33% for KNN and 70% for Naïve Bayes). Re-evaluated the model after removing the age feature and observed lower accuracy than KNN and Naïve Bayes, attributed to the reduced dimensionality; this indicates that the latter two models perform better in lower-dimensional scenarios.
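A minimal sketch of logistic regression with leave-one-out validation is shown below. It assumes batch gradient descent on the log-loss, per-fold feature standardization, and a 0.5 decision threshold; the learning rate, epoch count, names, and toy data are hypothetical and not taken from the project, so the accuracies reported above are not reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; a bias column is prepended."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        grad = Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
        w -= lr * grad
    return w

def predict(w, X):
    """Class 1 where the predicted probability is at least 0.5."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return (sigmoid(Xb @ w) >= 0.5).astype(int)

def leave_one_out_accuracy(X, y):
    """Hold out each row, standardize with the remaining rows' statistics, refit, score."""
    hits = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        mu, sd = X[mask].mean(axis=0), X[mask].std(axis=0)
        w = fit_logistic((X[mask] - mu) / sd, y[mask])
        hits += int(predict(w, (X[i:i + 1] - mu) / sd)[0] == y[i])
    return hits / len(X)

# Hypothetical toy rows: (height in m, weight in kg, age in years); labels 0 or 1
X = np.array([[1.60, 55.0, 30], [1.62, 58.0, 28], [1.58, 52.0, 35],
              [1.80, 82.0, 31], [1.78, 80.0, 29], [1.83, 85.0, 40]])
y = np.array([0, 0, 0, 1, 1, 1])

print(leave_one_out_accuracy(X, y))          # all three features
print(leave_one_out_accuracy(X[:, :2], y))   # age column removed
```

The learned boundary is the set of points where the weighted sum equals zero, so with two features it is a straight line that can be drawn over a scatter plot of the data.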

Project3:

Question 1 - Decision Tree Implementation Project:

Description: This project implements a decision tree algorithm in Python. The primary objective is to build a working decision tree model and demonstrate understanding of information gain, entropy calculation, and data splitting. Use the provided run_code.sh script to execute the Python code. Add extensive code comments to clarify critical sections, including entropy calculation, information gain, split evaluation, and threshold determination. The project includes a reference decision tree output (decision_tree_output.png), which may differ from your implementation. The dummy_sample_output.txt file provides an example of the expected output format. Successful completion of this project will showcase your understanding of decision trees and effective coding practices.
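The core calculations named above (entropy, information gain, split evaluation, threshold determination) can be sketched as follows for a single numeric feature. The function names and the midpoint rule for candidate thresholds are assumptions for illustration, not the project's required interface.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder

def best_threshold(values, labels):
    """Evaluate midpoints between consecutive distinct sorted values.

    Returns (threshold, gain) for the split "value <= threshold" with the
    highest information gain, or (None, 0.0) if no split improves on the parent.
    """
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for (v1, _), (v2, _) in zip(pairs, pairs[1:]):
        if v1 == v2:
            continue  # identical values cannot be separated by a threshold
        t = (v1 + v2) / 2
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        gain = information_gain(labels, left, right)
        if gain > best[1]:
            best = (t, gain)
    return best

print(best_threshold([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]))  # (2.5, 1.0)
```

A full tree would call `best_threshold` for every feature at each node, split on the feature with the largest gain, and recurse until a node is pure or no split yields a positive gain.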
