Model Evaluation with Sampling Techniques

This project evaluates five machine learning models on a credit card fraud detection dataset under five sampling techniques. Each model is scored by accuracy. The sampling techniques are:

Simple Random Sampling - Randomly selects a fraction of the data.
Stratified Sampling - Samples 80% of each class to maintain class distribution.
Systematic Sampling - Selects every 5th sample from the data.
Cluster Sampling - Selects 50% of the data randomly.
Bootstrap Sampling - Randomly samples the data with replacement.
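
The five techniques above can be sketched with the Python standard library alone. This is a minimal illustration and not the code in Sampling code.py: the toy rows, the "Class" label key, the 0.7 fraction for simple random sampling, and all function names are assumptions made for the example. Cluster Sampling is written as a random 50% draw because that is how it is described above; classical cluster sampling would instead select whole groups of rows.

```python
import random
from collections import defaultdict


def simple_random_sample(rows, frac, rng):
    """Draw round(frac * len(rows)) rows without replacement."""
    return rng.sample(rows, round(frac * len(rows)))


def stratified_sample(rows, label_of, frac, rng):
    """Draw the same fraction from every class so class proportions are kept."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sample = []
    for members in by_class.values():
        sample.extend(rng.sample(members, round(frac * len(members))))
    return sample


def systematic_sample(rows, step, start=0):
    """Take every `step`-th row, beginning at index `start`."""
    return rows[start::step]


def bootstrap_sample(rows, rng, size=None):
    """Draw rows with replacement, so duplicates are expected."""
    return rng.choices(rows, k=len(rows) if size is None else size)


# Hypothetical toy data: 1000 rows, roughly 14% of them in the positive class.
rows = [{"id": i, "Class": 1 if i % 7 == 0 else 0} for i in range(1000)]
rng = random.Random(42)  # fixed seed so the draws are reproducible

samples = {
    "Simple Random Sampling": simple_random_sample(rows, 0.7, rng),  # fraction assumed
    "Stratified Sampling": stratified_sample(rows, lambda r: r["Class"], 0.8, rng),
    "Systematic Sampling": systematic_sample(rows, 5),
    "Cluster Sampling": simple_random_sample(rows, 0.5, rng),  # random 50%, as described
    "Bootstrap Sampling": bootstrap_sample(rows, rng),
}

for name, sample in samples.items():
    positives = sum(row["Class"] for row in sample)
    print(f"{name}: {len(sample)} rows, {positives} positive")
```

With a pandas DataFrame, DataFrame.sample (with its frac, replace, and random_state parameters) covers the random, cluster-as-described, and bootstrap draws in one call each.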

Models Evaluated

The following models are evaluated:

Decision Tree
K-Nearest Neighbors
Logistic Regression
Naive Bayes
Random Forest
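
One way to fit and score these five models is sketched below with scikit-learn. Several details are assumptions, since this README does not state them: the data is a synthetic stand-in from make_classification rather than Creditcard_data.csv, Naive Bayes is taken to be GaussianNB, hyperparameters are library defaults, and the 75/25 train/test split is illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for one sampled dataset (not the project's data).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The five model families listed above; hyperparameters are assumed defaults.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in accuracy.items():
    print(f"{name}: {score:.4f}")
```

Repeating this loop once per sampled dataset would produce one column of an accuracy table per sampling technique.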

Results

Accuracy Score Results Table:

Model	Bootstrap Sampling	Cluster Sampling	Simple Random Sampling	Stratified Sampling	Systematic Sampling
Decision Tree	0.990170	0.968533	0.982297	0.982787	0.934691
K-Nearest Neighbors	0.885966	0.803414	0.851902	0.845902	0.702591
Logistic Regression	0.946924	0.904352	0.926611	0.923770	0.878953
Naive Bayes	0.839486	0.876832	0.855168	0.836885	0.836330
Random Forest	0.998689	0.992140	0.995412	0.995902	0.993443

Key Findings

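Bootstrap Sampling yields the highest accuracy for four of the five models (Decision Tree, K-Nearest Neighbors, Logistic Regression, and Random Forest). Naive Bayes is the exception and scores highest with Cluster Sampling.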
Stratified Sampling is ideal for datasets with imbalanced classes, ensuring that each class is proportionally represented.
Systematic Sampling produces the lowest accuracy for every model except Random Forest, with the largest drop for K-Nearest Neighbors (0.702591).
Cluster Sampling performs well with Decision Tree (0.968533) and Random Forest (0.992140), though not as well with K-Nearest Neighbors (0.803414).

On this dataset, Bootstrap Sampling gives the highest accuracy for most of the models evaluated, which makes it the most effective of the five techniques by this measure.
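
The rankings behind these findings can be recomputed directly from the results table. The sketch below copies the accuracy values from the table and reports the best and worst technique per model and the mean accuracy per technique. The variable names are illustrative, and the technique labels are shortened from the table's column headers.

```python
# Column order matches the results table.
techniques = ["Bootstrap", "Cluster", "Simple Random", "Stratified", "Systematic"]

# Accuracy values copied from the results table, one row per model.
scores = {
    "Decision Tree":       [0.990170, 0.968533, 0.982297, 0.982787, 0.934691],
    "K-Nearest Neighbors": [0.885966, 0.803414, 0.851902, 0.845902, 0.702591],
    "Logistic Regression": [0.946924, 0.904352, 0.926611, 0.923770, 0.878953],
    "Naive Bayes":         [0.839486, 0.876832, 0.855168, 0.836885, 0.836330],
    "Random Forest":       [0.998689, 0.992140, 0.995412, 0.995902, 0.993443],
}

# Technique with the highest and lowest accuracy for each model.
best = {model: techniques[row.index(max(row))] for model, row in scores.items()}
worst = {model: techniques[row.index(min(row))] for model, row in scores.items()}

# Mean accuracy of each technique across the five models.
mean_by_technique = {
    technique: sum(row[i] for row in scores.values()) / len(scores)
    for i, technique in enumerate(techniques)
}

for model in scores:
    print(f"{model}: best = {best[model]}, worst = {worst[model]}")
for technique, mean in mean_by_technique.items():
    print(f"{technique}: mean accuracy = {mean:.4f}")
```

Averaged over the five models, Bootstrap Sampling has the highest mean accuracy and Systematic Sampling the lowest.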

Visualizations

Heatmap of Model Accuracy Scores Across Sampling Techniques

The heatmap (comp_1_heatmap.png) shows the accuracy of each model under each sampling technique.

It gives a visual comparison of model performance for each sampling technique. Random Forest and Decision Tree are the two most accurate models under every technique.
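
A heatmap of this kind can be drawn from the results table with matplotlib, as in the sketch below. It is not the script that produced the repository's image: the colour map, figure size, colour range, and the output filename accuracy_heatmap.png are all choices made for this example.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

techniques = ["Bootstrap", "Cluster", "Simple Random", "Stratified", "Systematic"]
models = [
    "Decision Tree",
    "K-Nearest Neighbors",
    "Logistic Regression",
    "Naive Bayes",
    "Random Forest",
]
# Accuracy values copied from the results table (rows = models, columns = techniques).
scores = [
    [0.990170, 0.968533, 0.982297, 0.982787, 0.934691],
    [0.885966, 0.803414, 0.851902, 0.845902, 0.702591],
    [0.946924, 0.904352, 0.926611, 0.923770, 0.878953],
    [0.839486, 0.876832, 0.855168, 0.836885, 0.836330],
    [0.998689, 0.992140, 0.995412, 0.995902, 0.993443],
]

fig, ax = plt.subplots(figsize=(8, 4))
image = ax.imshow(scores, cmap="Blues", vmin=0.6, vmax=1.0)

# Label the axes with technique and model names.
ax.set_xticks(range(len(techniques)))
ax.set_xticklabels(techniques, rotation=30, ha="right")
ax.set_yticks(range(len(models)))
ax.set_yticklabels(models)

# Annotate every cell; use white text on the darker (higher-accuracy) cells.
for i, row in enumerate(scores):
    for j, value in enumerate(row):
        ax.text(j, i, f"{value:.3f}", ha="center", va="center",
                color="white" if value > 0.9 else "black")

fig.colorbar(image, ax=ax, label="Accuracy")
ax.set_title("Model accuracy by sampling technique")
fig.tight_layout()
fig.savefig("accuracy_heatmap.png", dpi=150)  # hypothetical output filename
```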

Bar Chart of Model Performance

A bar chart comparing the accuracy of each model across all sampling techniques is also included. It makes the differences between models easier to read at a glance.

Project Insights

Bootstrap Sampling gives the highest accuracy for four of the five models. Random Forest is the most accurate model under every technique and peaks at 0.998689 with Bootstrap Sampling.
Stratified Sampling is critical for handling class imbalances, ensuring that both the majority and minority classes are properly represented.
Systematic Sampling gave the weakest results for most models, especially K-Nearest Neighbors, which suggests it is a poor fit for certain models.
Cluster Sampling had mixed results: it worked well with Decision Tree and Random Forest and gave Naive Bayes its best score, but it was the second-worst technique for K-Nearest Neighbors.
