End-to-end credit risk analysis on LendingClub loan data using Python and Machine Learning - feature engineering, imbalanced classification, and model calibration for default prediction

abailey81/Credit-Risk---Lending-Club

Credit Risk Default Prediction (Lending Club)

Predicting loan default probability on a 2007–2015 Lending Club–style dataset.
This repository shows a full, reproducible ML workflow, from data prep and EDA to model benchmarking and evaluation.


🔎 Problem

Given borrower and loan attributes, estimate the probability a loan will default.
This supports better risk-based decisions (pricing, approvals, limits) and model explainability for stakeholders.


📦 Models Benchmarked

  • Logistic Regression: interpretable baseline
  • Decision Tree (no CV): simple, high variance
  • Decision Tree (CV-tuned): better bias/variance balance
  • Random Forest: robust ensemble
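
As a rough illustration of how such a benchmark could be set up, here is a minimal sketch that assumes scikit-learn and uses a synthetic, imbalanced dataset from `make_classification` in place of the loan data. The hyperparameter grid, split sizes and random seeds are illustrative assumptions, not the notebook's actual settings, and the scores it prints say nothing about the results reported for this repository.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (assumption): roughly 15% positives.
X, y = make_classification(
    n_samples=2000,
    n_features=12,
    n_informative=6,
    weights=[0.85, 0.15],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four model families listed above; all settings are illustrative.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree (no CV)": DecisionTreeClassifier(random_state=0),
    "Decision Tree (CV-tuned)": GridSearchCV(
        DecisionTreeClassifier(random_state=0),
        param_grid={"max_depth": [3, 5, 8], "min_samples_leaf": [10, 50]},
        scoring="roc_auc",
        cv=5,
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Score with the predicted probability of the positive (default) class.
    proba = model.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, proba)

for name, auc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:26s} ROC-AUC = {auc:.3f}")
```

Tuning the tree through `GridSearchCV` restricts depth and leaf size using only the training folds, which is one common way to get the bias/variance trade-off described in the list.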

📈 Headline Test Results

| Model | ROC-AUC (test) | Notes |
|---|---|---|
| Decision Tree (CV-tuned) | ~0.84 | Best balance of accuracy & interpretability |
| Random Forest | ~0.83 | Strong generalisation, less transparent |
| Logistic Regression | ~0.76 | Reliable, easy to explain |
| Decision Tree (no CV) | ~0.64 | Overfits without tuning |
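
For readers less familiar with the metric, ROC-AUC can be read as the probability that a randomly chosen defaulted loan receives a higher predicted score than a randomly chosen non-defaulted one. The sketch below computes it directly from that definition in plain Python. The function name `roc_auc` and the six labels and scores are made up for illustration and are not taken from the repository's data.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = default, 0 = repaid; scores are predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]

auc = roc_auc(y_true, scores)
print(round(auc, 3))  # 8 of 9 pairs are ordered correctly -> 0.889
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic and only meant to show the definition; a library routine is a better fit for real test sets.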

Full write-up, figures and confusion matrices: report/credit_risk_analysis_report.pdf
Reproduce the pipeline in notebooks/credit_risk_models.ipynb.


🗂 Repository Structure
