🌲 UN SDG Goal 15 – Mountain Forest Cover

Comprehensive Exploratory Data Analysis & Preprocessing Pipeline

This is a Python-based, end-to-end Exploratory Data Analysis (EDA) project aligned with UN Sustainable Development Goal 15 – Life on Land, focusing on mountain forest cover and degraded mountain land.

The pipeline takes raw SDG data through quality checks, cleaning, analysis, and preprocessing, and produces cleaned datasets, plots, a text report, and a machine-learning-ready dataset.

🎯 Project Objective

Analyze mountain land degradation trends
Identify geographic, temporal, and environmental patterns
Generate high-quality visual insights
Prepare clean and ML-ready datasets
Ensure reproducibility using requirements.txt

🚀 Key Features

📊 Exploratory Data Analysis

Dataset structure & feature inspection
Numerical vs categorical feature separation
Summary statistics and profiling
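
The inspection steps above could look roughly like the sketch below. It is not taken from main.py: the column names are assumed from the attribute list in the Dataset Description, and the rows are made-up placeholder values.

```python
import pandas as pd

# Placeholder rows; column names are assumptions based on the README's attribute list.
df = pd.DataFrame({
    "Geographic Area": ["Region A", "Region B", "Region C", "Region A"],
    "Time Period": [2015, 2015, 2018, 2018],
    "Bioclimatic Belt": ["Alpine", "Montane", "Montane", "Nival"],
    "Degraded Area (sq. km)": [120.5, 310.0, 95.2, 14.8],
})

# Structure: number of rows/columns and the dtype of each column.
print(df.shape)
print(df.dtypes)

# Separate numerical from categorical features by dtype.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# Summary statistics: count/mean/std/quantiles for numbers,
# count/unique/top/freq for categories.
numeric_summary = df[numeric_cols].describe()
categorical_summary = df[categorical_cols].describe()
print(numeric_summary)
print(categorical_summary)
```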

🔬 Data Quality Analysis

Missing value detection
Duplicate identification
Outlier detection using IQR method
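
As a minimal sketch of these three checks (not the project's actual code), the snippet below counts missing values and duplicate rows and flags outliers with the common 1.5 × IQR rule. The helper name iqr_outlier_mask, the multiplier default, and the sample values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Placeholder data with one missing value, one duplicate row, and one extreme value.
df = pd.DataFrame({
    "Geographic Area": ["Region A", "Region B", "Region B", "Region C", "Region D", "Region E"],
    "Degraded Area (sq. km)": [10.0, 12.0, 12.0, np.nan, 11.0, 500.0],
})

# Missing values per column and number of fully duplicated rows.
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())


def iqr_outlier_mask(series, k=1.5):
    """Return True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# NaN compares as False on both sides, so missing values are never flagged.
outliers = df[iqr_outlier_mask(df["Degraded Area (sq. km)"])]
print(missing_per_column, duplicate_rows, outliers, sep="\n")
```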

🧹 Data Cleaning

Duplicate removal
Median imputation (numerical)
Mode imputation (categorical)
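
One way to express the cleaning rules above in pandas is sketched here. The column names and values are placeholders, and the order of operations (drop duplicates first, then impute) is an assumption rather than something the README specifies.

```python
import numpy as np
import pandas as pd

# Placeholder data: the last row duplicates the one before it.
df = pd.DataFrame({
    "Bioclimatic Belt": ["Alpine", "Montane", None, "Montane", "Montane"],
    "Degraded Area (sq. km)": [10.0, np.nan, 30.0, 20.0, 20.0],
})

# 1. Remove exact duplicate rows.
cleaned = df.drop_duplicates().reset_index(drop=True)

# 2. Impute: median for numerical columns, most frequent value for the rest.
for col in cleaned.columns:
    if pd.api.types.is_numeric_dtype(cleaned[col]):
        cleaned[col] = cleaned[col].fillna(cleaned[col].median())
    else:
        mode = cleaned[col].mode(dropna=True)
        if not mode.empty:  # an all-missing column has no mode to fill with
            cleaned[col] = cleaned[col].fillna(mode.iloc[0])

print(cleaned)
```

Dropping duplicates before imputing keeps repeated rows from skewing the median and mode used as fill values.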

📈 Statistical & Pattern Analysis

Most affected geographic regions
Degradation by bioclimatic belt
Land cover distribution
Temporal trend analysis
Correlation analysis
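
The analyses listed above mostly reduce to group-and-aggregate operations. The sketch below shows one possible shape for them; the column names are assumed from the Dataset Description, the numbers are placeholders, and the choice of sum or mean as the aggregate is an assumption.

```python
import pandas as pd

# Placeholder rows; column names are assumptions.
df = pd.DataFrame({
    "Geographic Area": ["Region A", "Region A", "Region B", "Region B"],
    "Time Period": [2015, 2018, 2015, 2018],
    "Bioclimatic Belt": ["Alpine", "Alpine", "Montane", "Montane"],
    "Degraded Area (sq. km)": [100.0, 150.0, 40.0, 30.0],
})
area = "Degraded Area (sq. km)"

# Most affected regions: total degraded area, largest first.
by_region = df.groupby("Geographic Area")[area].sum().sort_values(ascending=False)

# Degradation by bioclimatic belt: mean degraded area per belt.
by_belt = df.groupby("Bioclimatic Belt")[area].mean()

# Temporal trend: total degraded area per period, in chronological order.
by_year = df.groupby("Time Period")[area].sum().sort_index()

# Pairwise Pearson correlation between numerical columns.
corr = df.select_dtypes(include="number").corr()

print(by_region, by_belt, by_year, corr, sep="\n\n")
```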

📊 Automated Visualizations

Geographic degradation ranking
Bioclimatic & land cover plots
Time-series trends
Distribution & correlation plots
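
A rough sketch of how two of these plots might be produced and saved without a display is given below. It uses only Matplotlib and pandas plotting; the figure layout, titles, and the temporary output folder are illustrative assumptions, and the data is placeholder.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script can run headless
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder rows; column names are assumptions.
df = pd.DataFrame({
    "Geographic Area": ["Region A", "Region A", "Region B", "Region B"],
    "Time Period": [2015, 2018, 2015, 2018],
    "Degraded Area (sq. km)": [100.0, 150.0, 40.0, 30.0],
})
area = "Degraded Area (sq. km)"

# A temporary folder stands in for outputs/plots/ in this sketch.
out_dir = Path(tempfile.mkdtemp()) / "plots"
out_dir.mkdir(parents=True, exist_ok=True)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: regions ranked by total degraded area.
df.groupby("Geographic Area")[area].sum().sort_values().plot.barh(
    ax=axes[0], title="Degraded area by region"
)
axes[0].set_xlabel(area)

# Right panel: total degraded area over time.
df.groupby("Time Period")[area].sum().plot(
    ax=axes[1], marker="o", title="Degraded area over time"
)
axes[1].set_ylabel(area)

fig.tight_layout()
plot_path = out_dir / "temporal_trends.png"
fig.savefig(plot_path, dpi=150)
plt.close(fig)  # release the figure's memory once it is on disk
```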

🤖 Machine Learning Preparation

Label encoding of categorical features
Feature scaling using StandardScaler
Dimensionality reduction with PCA
ML-ready dataset export
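
The preprocessing chain above can be sketched with scikit-learn as follows. This is an illustration under assumptions, not the code in main.py: the column names and values are placeholders, and keeping two principal components is an arbitrary choice for the example.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Placeholder rows; column names are assumptions.
df = pd.DataFrame({
    "Geographic Area": ["Region A", "Region B", "Region C", "Region A"],
    "Bioclimatic Belt": ["Alpine", "Montane", "Montane", "Nival"],
    "Time Period": [2015, 2015, 2018, 2018],
    "Degraded Area (sq. km)": [100.0, 40.0, 75.0, 20.0],
})

# Label-encode each categorical column; keep the encoders so codes can be
# mapped back to labels later with inverse_transform.
encoded = df.copy()
encoders = {}
for col in encoded.select_dtypes(exclude="number").columns:
    encoders[col] = LabelEncoder()
    encoded[col] = encoders[col].fit_transform(encoded[col])

# Standardize every column to zero mean and unit variance.
scaled = StandardScaler().fit_transform(encoded)
ml_ready = pd.DataFrame(scaled, columns=encoded.columns)

# Project onto the first two principal components.
pca = PCA(n_components=2)
components = pca.fit_transform(scaled)
print(pca.explained_variance_ratio_)
```

Label encoding assigns integer codes in alphabetical order, which imposes an ordering that may not be meaningful; for models sensitive to that, one-hot encoding is a common alternative.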

📋 Report Generation

Auto-generated EDA report
Cleaned dataset export
ML-ready dataset export
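
The README does not describe the report layout, so the following is only a sketch of how a plain-text summary and a CSV export could be written. The section titles and the temporary folder (standing in for outputs/results/) are assumptions; the file names match the Project Structure.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Placeholder cleaned data; column names are assumptions.
cleaned = pd.DataFrame({
    "Geographic Area": ["Region A", "Region B", "Region C"],
    "Degraded Area (sq. km)": [100.0, 40.0, 75.0],
})

# A temporary folder stands in for outputs/results/ in this sketch.
results_dir = Path(tempfile.mkdtemp()) / "results"
results_dir.mkdir(parents=True, exist_ok=True)

# Assemble the report as a list of lines, then write it in one call.
lines = [
    "EDA REPORT",
    f"Rows: {len(cleaned)}",
    f"Columns: {', '.join(cleaned.columns)}",
    "",
    "Summary statistics:",
    cleaned.describe().to_string(),
]
report_path = results_dir / "eda_report.txt"
report_path.write_text("\n".join(lines), encoding="utf-8")

# Export the cleaned dataset next to the report.
csv_path = results_dir / "cleaned_data.csv"
cleaned.to_csv(csv_path, index=False)
```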

🛠️ Tech Stack

Python 3.9+
Pandas – Data manipulation
NumPy – Numerical computation
Matplotlib & Seaborn – Visualization
SciPy – Statistical analysis
Scikit-learn – Scaling, Encoding & PCA
OpenPyXL – Excel file handling

📁 Project Structure

SDG_ExploratoryDataAnalysis/
│
├── data/
│   └── Goal15.xlsx
│
├── outputs/
│   ├── plots/
│   │   ├── geographic_analysis.png
│   │   ├── bioclimatic_landcover.png
│   │   ├── temporal_trends.png
│   │   └── comprehensive_analysis.png
│   │
│   └── results/
│       ├── cleaned_data.csv
│       ├── ml_ready_data.csv
│       └── eda_report.txt
│
├── main.py
├── requirements.txt
└── README.md
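
If the outputs/ folders are not committed, a script can create them before writing anything. The helper below is a hypothetical sketch (prepare_output_dirs is not a function from this repository) that mirrors the layout shown above.

```python
import tempfile
from pathlib import Path


def prepare_output_dirs(project_root):
    """Create outputs/plots and outputs/results under project_root if missing."""
    root = Path(project_root)
    dirs = {
        "plots": root / "outputs" / "plots",
        "results": root / "outputs" / "results",
    }
    for directory in dirs.values():
        directory.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
    return dirs


# A temporary directory is used here so the example leaves no files behind
# in a real checkout; in the project this would be the repository root.
dirs = prepare_output_dirs(tempfile.mkdtemp())
print(dirs["plots"], dirs["results"], sep="\n")
```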

⚙️ System Requirements

Python 3.9 or higher
pip package manager

📦 Dependencies

All required libraries are listed in requirements.txt.

Install them using:

pip install -r requirements.txt

Installing from requirements.txt gives every user the same set of libraries, which helps keep results reproducible across machines.
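
To confirm the installation, a short check like the one below can list which packages are present. The package list is assumed from the Tech Stack section (the contents of requirements.txt are not shown here), and installed_versions is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names as published on PyPI; assumed to match requirements.txt.
required = ["pandas", "numpy", "matplotlib", "seaborn", "scipy", "scikit-learn", "openpyxl"]

for name, ver in installed_versions(required).items():
    print(f"{name:<14}{ver or 'NOT INSTALLED'}")
```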

⚡ Quick Start

1️⃣ Clone the Repository

git clone https://github.com/AnkeshGG/SDG_ExploratoryDataAnalysis.git
cd SDG_ExploratoryDataAnalysis

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Run the EDA Pipeline

python main.py

📦 Generated Outputs

📊 Visualizations

Geographic degradation ranking
Bioclimatic belt & land cover analysis
Temporal degradation trends
Correlation and distribution plots

📁 Datasets

cleaned_data.csv – Cleaned dataset
ml_ready_data.csv – Encoded & scaled dataset

📄 Report

eda_report.txt – Detailed EDA summary and insights

📌 Dataset Description

UN SDG Goal: 15 – Life on Land
Target: 15.4 – Conservation of mountain ecosystems (data on mountain land degradation)

Key Attributes:

Geographic Area
Time Period
Bioclimatic Belt
Land Cover Type
Degraded Area (sq. km)

Note: If data/Goal15.xlsx is missing, the pipeline generates synthetic sample data so that every step can still run. Results produced from the sample data are for demonstration only and do not reflect real SDG figures.
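
The fallback described in the note could be structured as in this sketch. The function names, category labels, row count, and value distribution are all hypothetical and are not taken from main.py.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def make_sample_data(n_rows=200, seed=42):
    """Build a synthetic frame with the README's five attributes (values are made up)."""
    rng = np.random.default_rng(seed)  # fixed seed keeps the demo data repeatable
    return pd.DataFrame({
        "Geographic Area": rng.choice(["Region A", "Region B", "Region C"], size=n_rows),
        "Time Period": rng.choice([2000, 2005, 2010, 2015, 2018], size=n_rows),
        "Bioclimatic Belt": rng.choice(["Nival", "Alpine", "Montane"], size=n_rows),
        "Land Cover Type": rng.choice(["Forest", "Grassland", "Cropland"], size=n_rows),
        # Gamma draws are non-negative and right-skewed, a plausible shape for areas.
        "Degraded Area (sq. km)": rng.gamma(shape=2.0, scale=50.0, size=n_rows).round(2),
    })


def load_dataset(path="data/Goal15.xlsx"):
    """Read the Excel file if it exists, otherwise fall back to synthetic data."""
    path = Path(path)
    if path.exists():
        return pd.read_excel(path)  # .xlsx reading relies on openpyxl
    print(f"{path} not found; using synthetic sample data")
    return make_sample_data()


# Pointing at a path that does not exist exercises the fallback branch.
df = load_dataset("data/does_not_exist.xlsx")
print(df.head())
```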

💡 Use Cases

Environmental & sustainability analysis
Climate and forest degradation studies
Machine learning feature engineering
Academic mini/major projects

🧪 Future Enhancements

Interactive dashboards (Streamlit / Plotly)
Degradation trend forecasting
Region-wise clustering
Integration with live UN SDG APIs
Predictive ML models

🤝 Contributing

Contributions are welcome!

Fork the repository
Create a feature branch
Commit your changes
Push to your fork
Submit a pull request

📄 License

This project is licensed under the MIT License.

👨‍💻 Author

Ankesh Kumar
CSE Undergraduate | Data & ML Enthusiast

🌐 GitHub: AnkeshGG
💼 LinkedIn: Ankesh Kumar
🔗 Medium: ankeshGG

🌍 Final Note

This project demonstrates how structured EDA and preprocessing pipelines can convert raw sustainability data into insightful, reproducible, and ML-ready outputs, contributing toward UN SDG Goal 15 – Life on Land.

⭐ If you found this useful, consider starring the repository.

