
LEAD++: Unsupervised Fine-grained Visual Recognition with Multi-context Enhanced Entropy-based Adaptive Distillation

This work is an extension of our previous method, LEAD, which focuses on entropy-based adaptive distillation for fine-grained visual representation learning. The official implementation of LEAD is available at: https://github.com/HuiGuanLab/LEAD.

📦 Datasets Preparation

In our experiments, we use the following publicly available fine-grained datasets. All datasets can be downloaded by clicking the corresponding links below.

| Dataset | Download Link |
| --- | --- |
| CUB-200-2011 | Download |
| Stanford Cars | Download |
| FGVC Aircraft | Download |
| Stanford Dogs | Download |

All datasets are expected to be processed and organized in a unified ImageFolder format. Please download the datasets and arrange them following the structure below. For the CUB-200-2011 and FGVC Aircraft datasets, you can use the following commands to convert them into the desired ImageFolder format:

python aircraft_organize.py --ds /path/to/fgvc-aircraft-2013b --out /path/to/aircraft --link none
python bird_organize.py --cub_root /path/to/CUB_200_2011 --output_root /path/to/bird_imagefolder

All datasets are expected to be processed into the following structure:

LEAD++
├── bird/ 🐦
│   ├── train/
│   │   ├── 001.Black_footed_Albatross
│   │   ├── 002.Laysan_Albatross
│   │   └── ……
│   └── test/
├── car/ 🚙
│   ├── train/
│   │   ├── Acura Integra Type R 2001
│   │   ├── Acura RL Sedan 2012
│   │   └── ……
│   └── test/
├── aircraft/ ✈️
│   ├── train/
│   │   ├── 707-320
│   │   ├── 727-200
│   │   └── ……
│   └── test/
└── ……
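The layout above follows the standard torchvision ImageFolder convention: one subdirectory per class under each split. As a sanity check before training, a small stdlib-only sketch can count the class subdirectories per split (the helper name and toy class names here are illustrative, not part of the repository):

```python
from pathlib import Path
import tempfile

def check_imagefolder(root, splits=("train", "test")):
    """Return the number of class subdirectories found in each split.

    Raises FileNotFoundError if an expected split directory is missing.
    """
    counts = {}
    for split in splits:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Each immediate subdirectory is treated as one class.
        counts[split] = sum(1 for p in split_dir.iterdir() if p.is_dir())
    return counts

# Demo: build a toy layout and verify it.
with tempfile.TemporaryDirectory() as tmp:
    for split in ("train", "test"):
        for cls in ("001.Black_footed_Albatross", "002.Laysan_Albatross"):
            (Path(tmp) / split / cls).mkdir(parents=True)
    print(check_imagefolder(tmp))  # {'train': 2, 'test': 2}
```

For CUB-200-2011, `check_imagefolder("bird")` should report 200 classes in both splits.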

🌏 Environments

  • Ubuntu 22.04
  • CUDA 12.4

Use the following commands to create the corresponding conda environment. In addition, you should download the ResNet-50 pre-trained model by clicking here and save it in this folder.

conda create --name LEAD++ python=3.9.1
conda activate LEAD++
pip install -r requirements.txt

🕹 Mutual-information-based Localization Preprocessing

Before training, we need to generate cropped images by following the steps below. Please make sure you are in the LEAD++ root directory before running the commands.

cd DDT
chmod +x ./run_ddt.sh
./run_ddt.sh $task $dataset $pretrained $cuda_device

$task is the task name (bird, car, aircraft, or others).

$dataset is the dataset path for unsupervised pre-training.

$pretrained indicates whether to use pretrained weights for the model. In our implementation, we use pretrained ResNet-50 weights.

$cuda_device is the ID of the GPU to use.

After that, we will obtain the cropped version of the dataset.
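The cropped copy produced by run_ddt.sh should mirror the original dataset's class structure, with one cropped image per source image. A stdlib sketch of that sanity check (the path names and image extensions are illustrative assumptions, not something the script guarantees):

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def count_images_per_class(root):
    """Map each class subdirectory name to its number of image files."""
    return {
        cls.name: sum(1 for f in cls.iterdir() if f.suffix.lower() in IMAGE_EXTS)
        for cls in sorted(Path(root).iterdir())
        if cls.is_dir()
    }

def mirrors(original_root, cropped_root):
    """True if the cropped copy has identical class names and image counts."""
    return count_images_per_class(original_root) == count_images_per_class(cropped_root)

# Hypothetical usage, assuming the cropped output lives next to the original:
# assert mirrors("bird/train", "bird_crop/train")
```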

🚀 Direct Training and Downstream Testing

  • For ease of use, we have pre-converted the text descriptions generated by the LLM into tensor format and placed them in the text_description_tensor folder. The original descriptions, as well as the descriptions of random categories generated by the LLM, are in the text_description folder.
  • Run the following scripts for pre-training and downstream linear probing and image retrieval.
chmod +x ./run_*.sh
./run_train_test.sh $task $dataset $llm_description $train_ckpt_name $num_classes $cuda_device $linear_name

$task is the task name (bird, car, or aircraft).

$dataset is the dataset path for unsupervised pre-training.

$llm_description is the path to the text descriptions generated by the LLM.

$train_ckpt_name is the name of the folder where the pre-training checkpoints are saved.

$num_classes is the number of classes: bird 200, car 196, aircraft 100.

$cuda_device is the ID(s) of the GPU(s) to use.

$linear_name is the name of the folder where the linear probing checkpoints are saved.

  • An example of pre-training and downstream testing on CUB-200-2011:
./run_train_test.sh bird bird/ text_description_tensor/bird_text_tensor.pt result_bird 200 0,1 linear_bird

⚗ Single Unsupervised Training

  • For ease of use, we have pre-converted the text descriptions generated by the LLM into tensor format and placed them in the text_description_tensor folder. The original descriptions, as well as the descriptions of random categories generated by the LLM, are in the text_description folder.
  • Run the following script for pretraining. It will save the checkpoints to ./checkpoints/$checkpoints_name/.
chmod +x ./run_train.sh
./run_train.sh $task $dataset $llm_description $checkpoints_name $num_classes $cuda_device

$task is the task name (bird, car, or aircraft).

$dataset is the dataset path for unsupervised pre-training.

$llm_description is the path to the text descriptions generated by the LLM.

$checkpoints_name is the name of the folder where the checkpoints are saved.

$num_classes is the number of classes: bird 200, car 196, aircraft 100.

$cuda_device is the ID(s) of the GPU(s) to use.

  • An example of pre-training on CUB-200-2011:
./run_train.sh bird bird/ text_description_tensor/bird_text_tensor.pt result_bird 200 0,1

📋 Single Downstream Task Evaluation

Linear probing

  • Run the following script for linear probing. We use a single machine with a single GPU for linear probing. It will save the checkpoints to ./checkpoints_linear/$checkpoints_name/.
chmod +x ./run_linear.sh
./run_linear.sh $task $pretrained $checkpoints_name $num_classes $cuda_device

$task is the task name (bird, car, or aircraft).

$pretrained is the name of the folder where the pre-training checkpoints are saved.

$checkpoints_name is the name of the folder where the linear probing checkpoints are saved.

$num_classes is the number of classes: bird 200, car 196, aircraft 100.

$cuda_device is the ID of the GPU to use.

  • An example of linear probing on CUB-200-2011:
./run_linear.sh bird result_bird linear_bird 200 0

Image Retrieval

  • Run the following script for image retrieval. We use a single machine with a single GPU for image retrieval.
chmod +x ./run_retrieval.sh
./run_retrieval.sh $task $dataset $pretrained $cuda_device

$task is the task name (bird, car, or aircraft).

$dataset is the path to the cropped dataset generated in the Mutual-information-based Localization Preprocessing step.

$pretrained is the name of the folder where the pre-training checkpoints are saved.

$cuda_device is the ID of the GPU to use.

  • An example of image retrieval on CUB-200-2011:
./run_retrieval.sh bird bird/ result_bird 0
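Retrieval quality on fine-grained datasets is commonly reported as Recall@K: the fraction of queries whose K most similar items (by feature similarity, excluding the query itself) contain at least one item of the same class. The exact metric computed by run_retrieval.sh may differ; the sketch below is a generic, stdlib-only illustration with a toy similarity matrix:

```python
def recall_at_k(similarity, labels, k=1):
    """Fraction of queries whose top-k most similar items (excluding the
    query itself) include at least one item with the same label.

    similarity: square list of lists, similarity[i][j] between items i and j.
    labels: class label per item.
    """
    hits = 0
    n = len(labels)
    for i in range(n):
        # Rank all other items by similarity to query i, highest first.
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: similarity[i][j], reverse=True)
        if any(labels[j] == labels[i] for j in ranked[:k]):
            hits += 1
    return hits / n

# Toy example: items 0 and 1 share a class; item 2 is a singleton class,
# so at most 2 of the 3 queries can ever be hits.
sims = [[1.0, 0.9, 0.2],
        [0.9, 1.0, 0.1],
        [0.2, 0.1, 1.0]]
print(recall_at_k(sims, ["bird_a", "bird_a", "bird_b"], k=1))  # 0.6666666666666666
```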
