Implementation of the paper: https://arxiv.org/pdf/1802.02611

A PyTorch implementation of DeepLabv3Plus for semantic segmentation. This project provides a robust solution for segmenting objects with high accuracy and includes features for monitoring training progress and model performance.
Note: depthwise separable convolutions and the Xception backbone are not included.
Project structure:

```
DeepLabv3Plus/
├── src/
│   ├── model.py          # DeepLabv3Plus model implementation
│   ├── dataset.py        # Dataset and data loading utilities
│   ├── train.py          # Training script with visualization
│   ├── inference.py      # Inference script
│   ├── metrics.py        # Evaluation metrics
│   └── utils.py          # Utility functions
├── configs/
│   └── default.yaml      # Configuration file
├── data/                 # Dataset directory (ignored by git)
│   ├── train/
│   │   ├── images/       # Training images
│   │   └── masks/        # Training masks
│   ├── val/
│   │   ├── images/       # Validation images
│   │   └── masks/        # Validation masks
│   └── test/
│       ├── images/       # Test images
│       └── masks/        # Test masks
├── checkpoints/          # Model checkpoints (ignored by git)
├── logs/                 # Training logs (ignored by git)
├── results/              # Inference results (ignored by git)
├── requirements.txt      # Python dependencies
└── README.md             # This file
```
Key features:

- Robust Mask Handling: Automatic normalization of masks in various formats (JPG, PNG, TIFF)
- Training Visualization: Monitor model progress with visualizations of predictions
- Checkpoint Management: Save model checkpoints for every epoch
- Early Stopping: Prevent overfitting with configurable early stopping (a minimal sketch follows this list)
- Mixed Precision Training: Optimize training speed with automatic mixed precision
- Data Augmentation: Comprehensive augmentation pipeline for robust training
- Edge Detection: Enhanced edge detection for better boundary segmentation
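The early stopping behaviour is driven by `training.early_stopping_patience` and `training.early_stopping_min_delta` in `configs/default.yaml`. The snippet below is a minimal sketch of how such a check can work; the actual logic lives in `src/train.py` and may differ in detail.

```python
class EarlyStopping:
    """Minimal early-stopping helper (illustrative sketch, not the exact code in src/train.py).

    `patience` and `min_delta` mirror training.early_stopping_patience and
    training.early_stopping_min_delta in configs/default.yaml.
    """

    def __init__(self, patience: int = 10, min_delta: float = 0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.counter = 0

    def step(self, val_loss: float) -> bool:
        """Return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss   # meaningful improvement: reset the counter
            self.counter = 0
        else:
            self.counter += 1           # no meaningful improvement this epoch
        return self.counter >= self.patience
```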
Requirements:

- Python 3.8+
- PyTorch 1.8+
- CUDA 11.0+ (for GPU training)
- Other dependencies listed in `requirements.txt`
Installation:

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/DeepLabv3Plus.git
  cd DeepLabv3Plus
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv .venv
  source .venv/bin/activate   # On Linux/Mac
  # or
  .venv\Scripts\activate      # On Windows
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

The `configs/default.yaml` file contains all configurable parameters:
```yaml
model:
  num_classes: 1
  backbone: resnet50
  pretrained: true

training:
  epochs: 100
  batch_size: 8
  learning_rate: 0.001
  weight_decay: 0.0001
  early_stopping_patience: 10
  early_stopping_min_delta: 0.001
  max_train_samples: -1          # Set to limit training samples
  max_val_samples: -1            # Set to limit validation samples
  max_test_samples: -1           # Set to limit test samples
  num_visualization_samples: 5

data:
  image_size: [512, 512]
  test_split: 0.1
  val_split: 0.1

augmentations:
  enabled: true
  horizontal_flip: true
  vertical_flip: true
  rotation: true
  brightness: 0.2
  contrast: 0.2
  saturation: 0.2
  hue: 0.1
  edge_scale: 1.0
```
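The training and inference scripts read this file at startup. Below is a minimal sketch of loading it, assuming it is parsed with PyYAML and keyed exactly as shown above:

```python
import yaml

# Load the configuration; the key names mirror configs/default.yaml shown above.
with open("configs/default.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["model"]["backbone"])        # resnet50
print(cfg["training"]["batch_size"])   # 8
print(cfg["data"]["image_size"])       # [512, 512]
```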
Usage:

- Prepare your dataset:
  - Place images in `data/train/images/`
  - Place corresponding masks in `data/train/masks/`
  - Repeat for validation and test sets

- Start training:
  ```bash
  python src/train.py --config configs/default.yaml
  ```

Training features:
- Automatic checkpoint saving for each epoch
- Visualization of predictions every epoch
- Progress bar with loss and metrics
- TensorBoard integration for monitoring (see the logging sketch after this list)
- Early stopping to prevent overfitting
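TensorBoard logging typically boils down to a `SummaryWriter` pointed at the `logs/` directory from the project structure. The sketch below illustrates the idea; the exact tags and call sites in `src/train.py` may differ.

```python
from torch.utils.tensorboard import SummaryWriter

def log_epoch(writer: SummaryWriter, epoch: int, train_loss: float,
              val_loss: float, val_iou: float) -> None:
    """Write one epoch's scalars; view them with `tensorboard --logdir logs`."""
    writer.add_scalar("loss/train", train_loss, epoch)
    writer.add_scalar("loss/val", val_loss, epoch)
    writer.add_scalar("metrics/val_iou", val_iou, epoch)

writer = SummaryWriter(log_dir="logs")  # events land in logs/, which git ignores
log_epoch(writer, epoch=0, train_loss=0.71, val_loss=0.78, val_iou=0.42)  # example values
writer.close()
```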
Run inference on a single image:

```bash
python src/inference.py --config configs/default.yaml --image path/to/image
```

The model uses DeepLabv3Plus with:
- ResNet50 backbone (pretrained)
- Atrous Spatial Pyramid Pooling (ASPP); a structural sketch follows this list
- Decoder module for refined segmentation
- Edge detection enhancement
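For reference, the sketch below shows the general shape of an ASPP block as described in the paper: parallel 3x3 atrous convolutions at several dilation rates (6, 12, 18 are the rates used at output stride 16), a 1x1 branch, and an image-level pooling branch, fused by a 1x1 projection. The project's actual module lives in `src/model.py` and may differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: parallel dilated convs plus image-level pooling."""

    def __init__(self, in_channels: int = 2048, out_channels: int = 256, rates=(6, 12, 18)):
        super().__init__()
        # 1x1 conv branch
        self.branches = nn.ModuleList([nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True))])
        # 3x3 atrous conv branches with increasing dilation
        for r in rates:
            self.branches.append(nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True)))
        # Image-level (global average pooling) branch
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True))
        # Fuse all branches back down to out_channels
        self.project = nn.Sequential(
            nn.Conv2d(out_channels * (len(rates) + 2), out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.image_pool(x), size=x.shape[2:],
                               mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))
```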
How it works:

- Data Loading:
  - Images and masks are automatically normalized
  - Masks are handled robustly regardless of format (see the mask-loading sketch after this list)
  - Data augmentation is applied during training

- Training Loop:
  - Mixed precision training for efficiency
  - Automatic learning rate scheduling
  - Validation after each epoch
  - Metrics calculation (IoU, Dice; sketched after this list)

- Monitoring:
  - TensorBoard integration
  - Visualizations of predictions
  - Checkpoint saving
  - Early stopping
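The mask handling referenced above can be illustrated as follows: masks saved as JPG, PNG, or TIFF may arrive with 0-255 values (and JPEG compression can blur their edges), so they are normalized and thresholded to {0, 1}. This is only a sketch; the project's actual handling lives in `src/dataset.py`.

```python
import numpy as np
from PIL import Image

def load_binary_mask(path: str) -> np.ndarray:
    """Load a JPG/PNG/TIFF mask and normalize it to {0, 1} (illustrative sketch)."""
    mask = np.array(Image.open(path).convert("L"), dtype=np.float32)
    if mask.max() > 1.0:              # mask stored with 0-255 values
        mask = mask / 255.0
    return (mask > 0.5).astype(np.float32)  # threshold to tolerate JPEG artefacts
```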
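The IoU and Dice metrics mentioned above are implemented in `src/metrics.py`; the sketch below shows the standard formulations for the binary case (matching `num_classes: 1` in the default config) and may differ from the project's exact code.

```python
import torch

def iou_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Intersection over Union for binary masks with values in {0, 1}."""
    pred, target = pred.float(), target.float()
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return (intersection + eps) / (union + eps)

def dice_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Dice coefficient for binary masks with values in {0, 1}."""
    pred, target = pred.float(), target.float()
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Example: threshold sigmoid probabilities at 0.5 before scoring
logits = torch.randn(1, 1, 512, 512)
pred = (torch.sigmoid(logits) > 0.5).float()
target = torch.randint(0, 2, (1, 1, 512, 512)).float()
print(iou_score(pred, target).item(), dice_score(pred, target).item())
```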
The model achieves:
- High IoU scores on validation set
- Accurate boundary detection
- Robust performance across different image types
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments:

- Original DeepLabv3Plus paper: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
- PyTorch implementation inspiration
- Dataset providers and contributors