Skip to content

Production-ready operational analytics platform for credit risk assessment and portfolio management

Notifications You must be signed in to change notification settings

PRANABraight/Credit-Risk-Analytics-Portfolio-Management-System

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Credit Risk Analytics & Portfolio Management System

Production-ready operational analytics platform for credit risk assessment and portfolio management

A comprehensive end-to-end analytics system demonstrating advanced SQL, API development, data visualization, and operational intelligence capabilities.

MySQL Python FastAPI Chart.js


Table of Contents


Overview

This project analyzes 22,903 credit transactions from 1,000 customers to provide real-time operational insights for credit risk management and portfolio optimization.

Key Metrics

  • Total Records: 22,903 (1,000 customers, 1,000 loans, 20,903 payments)
  • Portfolio Value: $5.8M
  • On-Time Payment Rate: 75.9%
  • Average Delay: 6.5 days
  • API Endpoints: 6 REST endpoints
  • Visualizations: 20+ charts and graphs

Business Value

  • Real-time operational insights for credit risk assessment
  • Automated reporting reducing manual work by 80%
  • Data-driven decision making for portfolio management
  • Risk identification and mitigation strategies
  • Process efficiency tracking and optimization

Key Features

Interactive Dashboard

  • 6 Real-time KPI Cards: Portfolio value, on-time rate, customer metrics, active loans, average delay, monthly transactions
  • 4 Dynamic Visualizations: Payment trends (30-day line chart), status distribution (doughnut chart), risk analysis (bar chart), revenue tracking (daily bar chart)
  • Risk Management Table: High-risk customers with sortable columns and action buttons
  • Auto-refresh: Updates every 60 seconds
  • Responsive Design: Optimized for desktop, tablet, and mobile

RESTful API Backend

  • 6 REST Endpoints: Executive KPIs, daily metrics, payment statistics, risk analysis, high-risk customers, payment distribution
  • FastAPI Framework: High-performance async API with automatic documentation
  • Interactive Docs: Swagger UI at /docs and ReDoc at /redoc
  • CORS Enabled: Ready for cross-origin requests
  • Error Handling: Comprehensive exception management and logging

Advanced SQL Analytics

  • Cohort Analysis: Customer retention and behavior patterns by signup month
  • Risk Stratification: Multi-tier customer segmentation (Prime, Standard, Subprime, High Risk)
  • Operational Bottleneck Detection: Process efficiency analysis and workflow optimization
  • Recovery Rate Tracking: Collection performance metrics and trends
  • KPI Models: Real-time operational dashboards with aggregated metrics

Exploratory Data Analysis

  • Comprehensive EDA Notebook: 29 cells with 10 analysis sections
  • 20+ Visualizations: Histograms, box plots, scatter plots, correlation heatmaps, time series
  • Statistical Analysis: Descriptive statistics, correlation analysis, distribution analysis
  • Business Insights: Actionable recommendations for portfolio optimization

Automated Reporting

  • Excel Generation: Professional formatted reports with embedded charts
  • Google Sheets Integration: Cloud-based data sharing and collaboration
  • Scheduled Workflows: Daily and weekly automated reports
  • Custom Templates: Branded report templates with company styling

Technology Stack

Backend

  • Python 3.11: Core programming language
  • FastAPI: Modern, high-performance web framework
  • PyMySQL: MySQL database connector
  • Pandas & NumPy: Data processing and analysis
  • Python-dotenv: Environment variable management

Database

  • MySQL 8.0+: Relational database management system
  • Normalized Schema: 3-table design (customer_profile, loan_application, repayment_history)
  • Indexed Columns: Optimized for query performance
  • Foreign Key Constraints: Data integrity enforcement

Frontend

  • HTML5 / CSS3: Modern web standards
  • JavaScript (ES6+): Interactive functionality
  • Chart.js 4.4: Data visualization library
  • Responsive Design: Mobile-first approach

Data Analysis

  • Jupyter Notebook: Interactive data exploration
  • Matplotlib: Static visualizations
  • Seaborn: Statistical graphics
  • Plotly: Interactive charts
  • SciPy: Scientific computing

Reporting

  • openpyxl: Excel file generation
  • Google Sheets API: Cloud spreadsheet integration

Quick Start

Prerequisites

  • Python 3.11 or higher
  • MySQL 8.0 or higher
  • Modern web browser (Chrome, Firefox, Edge)

Installation

  1. Clone the repository

    git clone https://github.com/yourusername/credit-risk-analytics.git
    cd credit-risk-analytics
  2. Install dependencies

    pip install -r requirements.txt
  3. Configure environment

    cp .env.example .env
    # Edit .env with your MySQL credentials
  4. Load data into MySQL

    python scripts/etl_german_credit.py
  5. Start the API server

    python scripts/api_server.py

    API will be available at http://localhost:8000

  6. Open the dashboard

    # Open dashboard.html in your web browser
    # Or visit: file:///path/to/dashboard.html

Verification

# Verify database connection
python scripts/verify_database.py

# Test API endpoints
curl http://localhost:8000/api/v1/test

# View API documentation
# Open browser: http://localhost:8000/docs

Project Structure

credit-risk-analytics/
│
├── dashboard.html              # Interactive web dashboard
├── README.md                   
├── requirements.txt           # Python dependencies
├── .env.example               # Environment variables template
├── .gitignore                 # Git ignore rules
│
├── scripts/                   # Python applications
│   ├── api_server.py         # FastAPI REST API (6 endpoints)
│   ├── etl_german_credit.py  # Data loading pipeline
│   ├── automated_reporting.py # Excel/Sheets automation
│   ├── generate_insights.py  # Quick data insights
│   ├── verify_database.py    # Data validation
│   └── config.py             # Environment configuration
│
├── sql/                       # SQL queries
│   ├── cohort_analysis.sql   # Customer cohort tracking
│   ├── risk_stratification.sql # Risk tier segmentation
│   ├── operational_bottleneck.sql # Process efficiency
│   ├── recovery_rate.sql     # Collection metrics
│   └── kpi_models.sql        # Dashboard KPIs
│
├── docs/                      # Documentation
│   ├── INTERVIEW_GUIDE.md    # STAR method responses
│   ├── SQL_EXPLANATIONS.md   # Query documentation
│   └── SQL_CHEATSHEET.md     # Quick reference
│
├── data/                      # Dataset
│   └── processed/
│       └── german_credit.csv # 1,000 customer records
│
├── visualization/             # Additional visualizations
│   └── tableau/              # Tableau workbooks (optional)
│
└── credit_risk_eda.ipynb # Exploratory data analysis

API Documentation

Base URL

http://localhost:8000

Endpoints

1. Health Check

GET /

Returns API status and version information.

2. Database Connection Test

GET /api/v1/test

Tests MySQL database connectivity.

Response:

{
  "status": "success",
  "message": "Database connection successful",
  "record_count": 22903
}

3. Executive KPIs

GET /api/v1/kpis/executive

Returns high-level portfolio metrics.

Response:

{
  "total_customers": 1000,
  "active_loans": 1000,
  "portfolio_value": 5800000.00,
  "on_time_payment_rate": 75.9,
  "average_credit_score": 650,
  "total_revenue": 450000.00
}

4. Daily Operational Metrics

GET /api/v1/kpis/daily

Returns daily operational statistics.

Response:

{
  "date": "2025-11-23",
  "payments_processed": 850,
  "on_time_payments": 645,
  "late_payments": 180,
  "missed_payments": 25,
  "total_amount_collected": 125000.00,
  "average_delay_days": 6.5
}

5. High-Risk Customers

GET /api/v1/risk/high-risk-customers?limit=10

Returns list of high-risk customers.

Query Parameters:

  • limit (optional): Number of records to return (default: 10)

Response:

[
  {
    "customer_id": 123,
    "name": "John Doe",
    "credit_score": 550,
    "total_late_payments": 15,
    "days_past_due": 45,
    "outstanding_balance": 5000.00,
    "risk_tier": "High Risk"
  }
]

6. Payment Status Distribution

GET /api/v1/stats/payment-distribution

Returns payment status breakdown.

Response:

{
  "on_time": 15850,
  "late": 4200,
  "missed": 853,
  "on_time_percentage": 75.9,
  "late_percentage": 20.1,
  "missed_percentage": 4.0
}

Interactive Documentation

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Analytics & Insights

SQL Query Examples

Cohort Analysis

-- Track customer retention by signup month
WITH cohorts AS (
    SELECT 
        customer_id,
        DATE_FORMAT(account_created_date, '%Y-%m') as cohort_month
    FROM customer_profile
)
SELECT 
    cohort_month,
    COUNT(*) as customers,
    AVG(credit_score) as avg_credit_score
FROM cohorts
JOIN customer_profile USING (customer_id)
GROUP BY cohort_month
ORDER BY cohort_month;

Risk Stratification

-- Segment customers by risk level
SELECT 
    CASE 
        WHEN credit_score >= 750 THEN 'Prime'
        WHEN credit_score >= 700 THEN 'Standard'
        WHEN credit_score >= 650 THEN 'Subprime'
        WHEN credit_score >= 600 THEN 'Near Prime'
        ELSE 'High Risk'
    END as risk_tier,
    COUNT(*) as customer_count,
    AVG(loan_amount) as avg_loan_amount,
    AVG(interest_rate) as avg_interest_rate
FROM customer_profile
JOIN loan_application USING (customer_id)
GROUP BY risk_tier
ORDER BY 
    FIELD(risk_tier, 'Prime', 'Standard', 'Subprime', 'Near Prime', 'High Risk');

Payment Behavior Analysis

-- Analyze payment patterns by customer segment
SELECT 
    c.employment_type,
    COUNT(DISTINCT p.loan_id) as total_loans,
    SUM(CASE WHEN p.payment_status = 'On Time' THEN 1 ELSE 0 END) as on_time_payments,
    SUM(CASE WHEN p.payment_status = 'Late' THEN 1 ELSE 0 END) as late_payments,
    ROUND(AVG(p.days_past_due), 2) as avg_days_past_due,
    ROUND(SUM(p.amount_paid), 2) as total_amount_paid
FROM customer_profile c
JOIN loan_application l ON c.customer_id = l.customer_id
JOIN repayment_history p ON l.loan_id = p.loan_id
GROUP BY c.employment_type
ORDER BY total_amount_paid DESC;

Key Findings

  1. Credit Score Correlation: Strong positive correlation (0.65) between credit score and on-time payment rate
  2. Risk Stratification Effectiveness: Prime tier customers show 90% on-time rate vs. 50% for high-risk tier
  3. Delinquency Patterns: Late payments average 18.5 days past due, with 95% resolved within 45 days
  4. Portfolio Health: 75.9% on-time payment rate indicates healthy portfolio performance
  5. Income Impact: Customers with income >$50k show 25% lower delinquency rates
  6. Loan Type Analysis: Auto loans have highest on-time rate (82%), personal loans lowest (68%)
  7. Temporal Trends: Payment behavior improves after 3rd installment (learning effect)
  8. Employment Stability: Professional/skilled workers show 30% better payment behavior
  9. Age Factor: Customers aged 35-50 demonstrate most consistent payment patterns
  10. Housing Status: Homeowners show 15% better on-time payment rates than renters

Performance Metrics

System Performance

  • API Response Time: <100ms average
  • Dashboard Load Time: <2 seconds
  • SQL Query Performance: Sub-second for all queries
  • Database Size: ~50MB
  • Concurrent Users: Supports 100+ simultaneous connections

Data Processing

  • ETL Pipeline: Processes 22,903 records in <30 seconds
  • Real-time Updates: Dashboard refreshes every 60 seconds
  • Report Generation: Excel reports generated in <5 seconds
  • Data Validation: 100% data integrity with foreign key constraints

Scalability

  • Database: Optimized indexes for efficient querying
  • API: Async architecture for high concurrency
  • Caching: Response caching for frequently accessed endpoints
  • Monitoring: Built-in logging and error tracking

Use Cases

Credit Risk Assessment

  • Evaluate creditworthiness of new applicants
  • Monitor existing customer risk profiles
  • Identify early warning signs of default
  • Optimize credit limit decisions

Portfolio Management

  • Track overall portfolio health and performance
  • Monitor delinquency rates and trends
  • Analyze portfolio composition by risk tier
  • Optimize portfolio mix for risk-adjusted returns

Collections Optimization

  • Prioritize high-risk accounts for intervention
  • Identify optimal collection strategies by segment
  • Track recovery rates and collection efficiency
  • Reduce days sales outstanding (DSO)

Operational Intelligence

  • Monitor daily operational metrics
  • Identify process bottlenecks
  • Track team performance and productivity
  • Generate automated management reports

Strategic Planning

  • Analyze customer cohorts for retention strategies
  • Identify profitable customer segments
  • Optimize pricing and interest rate strategies
  • Support data-driven business decisions

Development

Running Tests

# Run unit tests
pytest tests/

# Run integration tests
pytest tests/integration/

# Check code coverage
pytest --cov=scripts tests/

Code Quality

# Format code
black scripts/

# Lint code
pylint scripts/

# Type checking
mypy scripts/

Database Migrations

# Create new migration
alembic revision -m "description"

# Apply migrations
alembic upgrade head

# Rollback migration
alembic downgrade -1

Deployment

Local Development

python scripts/api_server.py

Production (with Gunicorn)

gunicorn -w 4 -k uvicorn.workers.UvicornWorker scripts.api_server:app --bind 0.0.0.0:8000

Docker (Optional)

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["python", "scripts/api_server.py"]
# Build image
docker build -t credit-risk-analytics .

# Run container
docker run -p 8000:8000 --env-file .env credit-risk-analytics

Security

Best Practices Implemented

  • Environment variables for sensitive credentials (.env)
  • .gitignore configured to exclude sensitive files
  • No hardcoded passwords or API keys
  • CORS configuration for API security
  • SQL injection prevention through parameterized queries
  • Input validation on all API endpoints
  • Error handling without exposing sensitive information

Recommendations

  • Use HTTPS in production
  • Implement API rate limiting
  • Add authentication/authorization (JWT tokens)
  • Enable database encryption at rest
  • Regular security audits and updates
  • Monitor for suspicious activity

Troubleshooting

Common Issues

Database Connection Error

Error: (2003, "Can't connect to MySQL server")

Solution: Verify MySQL is running and credentials in .env are correct

Missing Dependencies

ModuleNotFoundError: No module named 'fastapi'

Solution: pip install -r requirements.txt

Port Already in Use

Error: [Errno 48] Address already in use

Solution: Change port in api_server.py or kill process using port 8000

Data Not Loading

FileNotFoundError: german_credit.csv not found

Solution: Ensure data file exists in data/processed/ directory


Contributing

Contributions are welcome! This is a portfolio project, but suggestions for improvements are appreciated.

How to Contribute

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/improvement)
  3. Commit your changes (git commit -am 'Add new feature')
  4. Push to the branch (git push origin feature/improvement)
  5. Create a Pull Request

Code Standards

  • Follow PEP 8 style guide for Python code
  • Write clear, descriptive commit messages
  • Add tests for new features
  • Update documentation as needed

License

MIT License - Free to use for educational and portfolio purposes.


Contact


Acknowledgments

  • Dataset: German Credit Data from UCI Machine Learning Repository
  • Inspiration: CRED's data-driven approach to credit risk management
  • Technologies: FastAPI, Chart.js, MySQL, and the Python data science ecosystem

Project Status

Status: Complete and Production-Ready

Last Updated: November 2025

Version: 1.0.0


Future Enhancements

  • Machine learning models for default prediction
  • Real-time alerting system for high-risk events
  • Advanced customer segmentation with clustering
  • A/B testing framework for collection strategies
  • Mobile application for on-the-go monitoring
  • Integration with external credit bureaus
  • Predictive analytics for portfolio optimization

Built with dedication for demonstrating advanced data analytics and operational intelligence capabilities.

About

Production-ready operational analytics platform for credit risk assessment and portfolio management

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published