This repository contains our team's (HUST_TinySmart) first-place solution to the Global Wheat Full Semantic Segmentation (GWFSS) challenge. Our solution is based on Guided Distillation. We also integrate the SAPA feature upsampling operator and use ViT-Adapter as the backbone network within a semi-supervised training framework.
For details, see the paper: First Place Solution to the MLCAS 2025 GWFSS Challenge: The Devil is in the Detail and Minority
Songliang Cao, Tianqi Hu, Hao Lu
Correspondence to: hlu@hust.edu.cn, songliangcao@126.com
National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, China.
Our solution consists of three stages: in stage one, we train a supervised ViT-Adapter baseline on the labeled training set and enhance its detail delineation with the dynamic upsampler SAPA; in stage two, we apply a semi-supervised pipeline with guided distillation on both the labeled data and selected unlabeled data; in stage three, we implement a form of test-time scaling by zooming in on images and segmenting twice using sliding-window-style inference.
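The stage-three test-time scheme can be sketched as follows. This is a simplified illustration only: the function names, window/stride sizes, the nearest-neighbour zoom, and the averaging-based fusion are assumptions, not the repository's actual inference code.

```python
import numpy as np

def sliding_window_segment(image, predict, window=512, stride=256, num_classes=3):
    """Accumulate per-pixel class logits from (possibly overlapping) windows.
    `predict(patch)` is assumed to return logits of shape (C, h, w)."""
    H, W = image.shape[:2]
    logits = np.zeros((num_classes, H, W), dtype=np.float32)
    count = np.zeros((H, W), dtype=np.float32)
    for y in range(0, max(H - window, 0) + 1, stride):
        for x in range(0, max(W - window, 0) + 1, stride):
            patch = image[y:y + window, x:x + window]
            logits[:, y:y + window, x:x + window] += predict(patch)
            count[y:y + window, x:x + window] += 1
    return logits / np.maximum(count, 1)

def zoomed_inference(image, predict, scale=2):
    """'Segment twice': once at the original resolution, once zoomed in
    (nearest-neighbour zoom here for simplicity), then fuse the logits."""
    base = sliding_window_segment(image, predict)
    big = image.repeat(scale, axis=0).repeat(scale, axis=1)
    zoom = sliding_window_segment(big, predict)
    # average-pool zoomed logits back onto the original pixel grid
    C, H2, W2 = zoom.shape
    zoom = zoom.reshape(C, H2 // scale, scale, W2 // scale, scale).mean((2, 4))
    return ((base + zoom) / 2).argmax(0)
```

For clarity the loop assumes the window/stride tile the image exactly; a production version would pad or clamp the last row/column of windows.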
```shell
conda create -n gwfss python=3.8 -y
conda activate gwfss
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
```

```shell
# build MSDeformableAttention
cd ops
sh make.sh

# build detectron2
cd ../projects/detectron2
pip install -e .

# build SAPA operators
cd ../sapa/sapa
python setup.py develop
```
To test our model on the GWFSS validation set, follow these instructions:

- Download our trained model from this link.
- Modify `inference.py` to change the model path and data path.
- Run `sh test.sh`.
Here are the results of our solution in the GWFSS competition:
| Backbone | #Param. | Public Leaderboard | Private Leaderboard |
|---|---|---|---|
| BEiTv2-L | 348.7M | 0.77 | 0.75 |
- Convert the weights to d2 format:

  ```shell
  python tools/convert-pretrained-model-to-d2.py weight.pth weight_stage1.pkl stage1
  python tools/convert-pretrained-model-to-d2.py weight.pth weight_stage2.pkl stage2
  ```

- Labeled data: modify both `data/data/datasets/bultin.py` and `projects/detectron2/detectron2/data/datasets/builtin.py` (refer to this issue).
- Unlabeled data: modify `data/datasets/gwfss_images.py`. For GWFSS unlabeled data, you can select a subset of samples via `unlabeled_4500.txt`.
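For context, a Detectron2 `.pkl` checkpoint is essentially a pickled dict with a `model` entry mapping parameter names to arrays, plus a little metadata. The sketch below shows only that layout; the repository's conversion script additionally performs stage-specific key handling, and `to_d2_format` with its metadata values is an illustrative assumption, not the script itself.

```python
import pickle

def to_d2_format(state_dict, author="GWFSS"):
    """Repack a plain {param_name: array} state dict into the pickled
    layout Detectron2 checkpoint loaders expect."""
    return {
        "model": dict(state_dict),
        "__author__": author,
        # lets Detectron2 fuzzy-match parameter names when loading
        "matching_heuristics": True,
    }

def save_d2_checkpoint(state_dict, path):
    """Serialize the repacked checkpoint to disk as a .pkl file."""
    with open(path, "wb") as f:
        pickle.dump(to_d2_format(state_dict), f)
```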
- Stage 1: supervised training

  ```shell
  bash stage1_train.sh
  ```

- Stage 2: guided distillation

  ```shell
  bash stage2_train.sh
  ```
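Conceptually, the distillation stage maintains an exponential-moving-average (EMA) teacher whose predictions supervise the student on unlabeled images. The toy sketch below shows the two core ingredients in pure Python; the function names, scalar weights, and the 0.95 confidence threshold are illustrative assumptions, not the repository's training code.

```python
def ema_update(teacher, student, momentum=0.999):
    """EMA teacher update: slowly track the student's weights.
    Here weights are plain {name: float} dicts for illustration."""
    return {k: momentum * teacher[k] + (1 - momentum) * student[k]
            for k in teacher}

def confident_pseudo_labels(probs, threshold=0.95):
    """Turn teacher class probabilities into pseudo-labels, keeping only
    confident pixels; unconfident ones get -1 (the ignore index)."""
    labels = []
    for p in probs:                                # p: per-pixel class probs
        c = max(range(len(p)), key=p.__getitem__)  # argmax class
        labels.append(c if p[c] >= threshold else -1)
    return labels
```

The student is then trained on labeled images with ground truth and on unlabeled images with these filtered teacher pseudo-labels, while the teacher is refreshed via `ema_update` after each step.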
If you find this work or code useful for your research, please consider giving it a star and a citation:
```bibtex
@inproceedings{songliang2025gwfss,
  title={First Place Solution to the MLCAS 2025 GWFSS Challenge: The Devil is in the Detail and Minority},
  author={Cao, Songliang and Hu, Tianqi and Lu, Hao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  year={2025}
}
```
Code and model weights are released under the MIT license. See LICENSE for additional details.
