[ICLR 2026] Trust but Verify: Adaptive Conditioning for Reference-Based Diffusion Super-Resolution via Implicit Reference Correlation Modeling
Yuan Wang*1,2 Yuhao Wan1 Siming Zheng2 Bo Li2 Qibin Hou1 Peng-Tao Jiang2,†
1 VCIP, School of Computer Science, Nankai University 2 vivo Mobile Communication Co., Ltd
* Work done during an internship at vivo. † Corresponding author.
🎯 AICG: Adaptive Implicit Correlation Gating. We propose AICG, a lightweight implicit correlation gating module that directly addresses a key challenge in RefSR: how to reliably use reference information when restoring LQ inputs degraded by real-world artifacts. By reusing the existing projections in the attention module and introducing only a few learnable summary tokens, AICG implicitly models LQ–Ref correlations while adding negligible computational overhead (see the sketch after these highlights).
🚀 Ada-RefSR: Strong Generalization, Robustness, and Speed. Built upon AICG, Ada-RefSR achieves stable reference-based enhancement across diverse tasks and degradation scenarios. Its single-step diffusion design provides over 30× speedup compared to multi-step RefSR baselines, enabling fast and robust SR in both aligned and mismatched reference conditions.
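The sketch below illustrates the gating idea in PyTorch. It is a toy reconstruction, not the released implementation: the class, argument, and tensor names are all hypothetical, and the actual AICG module in this repository may be wired differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImplicitCorrelationGate(nn.Module):
    """Toy sketch of AICG-style gating (hypothetical names, not the release code)."""

    def __init__(self, dim: int, num_summary_tokens: int = 4):
        super().__init__()
        # The only new parameters: a few summary tokens and a tiny gate head.
        self.summary = nn.Parameter(0.02 * torch.randn(1, num_summary_tokens, dim))
        self.gate_head = nn.Linear(dim, 1)

    def forward(self, lq_q: torch.Tensor, ref_k: torch.Tensor) -> torch.Tensor:
        # lq_q: (B, N_lq, C) queries from the LQ branch, reusing the attention
        # module's existing projection; ref_k: (B, N_ref, C) keys from the
        # reference branch, likewise reused.
        b = lq_q.size(0)
        tokens = self.summary.expand(b, -1, -1)       # (B, S, C)
        evidence = torch.cat([lq_q, ref_k], dim=1)    # (B, N_lq + N_ref, C)
        # Summary tokens cross-attend over joint LQ/Ref features, pooling
        # correlation evidence implicitly (no explicit matching map).
        pooled = F.scaled_dot_product_attention(tokens, evidence, evidence)
        # Collapse to one scalar gate per sample: "trust" the reference
        # pathway only when the pooled evidence supports it.
        return torch.sigmoid(self.gate_head(pooled.mean(dim=1)))  # (B, 1)

# Illustrative use inside a reference cross-attention block:
#   gate = aicg(lq_q, ref_k)                           # (B, 1)
#   fused = lq_feat + gate.unsqueeze(-1) * ref_attended
```

Because the queries and keys are reused from the attention block, the only new parameters are the summary tokens and the small gate head, which is why the added overhead stays negligible.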
The code is developed using Python 3.10 and PyTorch.

```bash
# Create and activate environment
conda create -n adarefsr python=3.10
conda activate adarefsr

# Install dependencies
pip install -r ./requirements.txt
```
Please download the following weights and place them in the `./models` directory.
| Component | Source / Link | Config Parameter |
|---|---|---|
| SD Turbo | stabilityai/sd-turbo | sd_path |
| S3Diff (Backbone) | ArcticHare105/S3Diff | pretrained_backbone_path |
| RAM & DAPE | RAM Swin-L / SeeSR (DAPE) | ram_path / dape_path |
| Ada-RefSR (Ours) | Download Link | pretrained_ref_gen_path |
Path Configuration: After downloading, please ensure the local paths are correctly updated in `./my_utils/training_utils.py` and `./my_utils/testing_utils.py` to match your directory structure.
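For orientation, the mapping from the table's config parameters to local files might look like the following. This is purely illustrative: the filenames and the exact variable layout inside `training_utils.py` / `testing_utils.py` are assumptions, not the repository's actual code.

```python
# Illustrative only: the filenames below are hypothetical, and the real
# layout in ./my_utils/training_utils.py and ./my_utils/testing_utils.py
# may differ. Keys mirror the "Config Parameter" column of the table above.
WEIGHT_PATHS = {
    "sd_path": "./models/sd-turbo",                       # SD Turbo
    "pretrained_backbone_path": "./models/s3diff.pkl",    # S3Diff backbone (hypothetical name)
    "ram_path": "./models/ram_swin_large.pth",            # RAM Swin-L (hypothetical name)
    "dape_path": "./models/dape.pth",                     # DAPE (hypothetical name)
    "pretrained_ref_gen_path": "./models/ada_refsr.pkl",  # Ada-RefSR (hypothetical name)
}
```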
- General SR Datasets:
- Face-specific reference SR dataset:
  - Download CelebFaceRef-HQ (Link).
  - Processing: run the provided script to partition the dataset:

    ```bash
    python ./data/create_celebref.py
    ```

- Download all four RefSR testing datasets from Hugging Face (Link) and save them into the `./data/test` directory.
- For those interested in specialized domains, such as fine-grained retrieval and restoration, our Bird Retrieval Dataset is available here: (Link)
You can quickly test our model on your own images using the provided demo script. This script automatically handles image resizing (to multiples of 8) and color alignment.
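For reference, the preprocessing the demo performs might look roughly like the sketch below. This is an illustrative approximation only (the actual logic lives in `./demo.py`, and both function names here are hypothetical): spatial sizes are rounded up to multiples of 8, and a simple channel-wise mean/std transfer aligns the output colors to the input.

```python
import torch
import torch.nn.functional as F

def resize_to_multiple_of_8(img: torch.Tensor) -> torch.Tensor:
    # img: (B, C, H, W). Round H and W up to the nearest multiple of 8.
    h, w = img.shape[-2:]
    new_h, new_w = (h + 7) // 8 * 8, (w + 7) // 8 * 8
    return F.interpolate(img, size=(new_h, new_w), mode="bicubic", align_corners=False)

def align_colors(sr: torch.Tensor, lq: torch.Tensor) -> torch.Tensor:
    # Match the per-channel mean/std of the SR output to the upsampled LQ input.
    lq_up = F.interpolate(lq, size=sr.shape[-2:], mode="bicubic", align_corners=False)
    mu_sr, std_sr = sr.mean((-2, -1), keepdim=True), sr.std((-2, -1), keepdim=True)
    mu_lq, std_lq = lq_up.mean((-2, -1), keepdim=True), lq_up.std((-2, -1), keepdim=True)
    return (sr - mu_sr) / (std_sr + 1e-6) * std_lq + mu_lq
```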
```bash
# Basic usage
python ./demo.py \
    --config "./configs/demo_config.yaml" \
    --lq_path "./assets/pic/lq.png" \
    --ref_path "./assets/pic/ref.png" \
    --output_path "./assets/pic/result.png"
```

Ensure the training datasets are prepared (see Section 3). Training configurations for both real and virtual scenarios are located in the shell scripts:
```bash
cd ./main_code/train

# Training includes weights and config information
sh run_training.sh
```

We provide specific validation scripts for different benchmarks. Navigate to the corresponding directories to run evaluations:
```bash
# CUFED5
cd ./main_code/test/cufed5 && sh run_validation.sh

# WRSR
cd ./main_code/test/wrsr && sh run_validation.sh

# Bird
cd ./main_code/test/bird && sh run_validation.sh

# Face
cd ./main_code/test/face && sh run_validation.sh
```

The following performance metrics for Ada-RefSR were measured on a single NVIDIA A40 GPU. Our method is specifically optimized for high-resolution generation, achieving high-fidelity restoration with remarkable computational efficiency:
- At $512 \times 512$ resolution: Ada-RefSR requires 12.66 GB of GPU memory and completes inference in just 0.41 seconds.
- At $1024 \times 1024$ resolution: Ada-RefSR requires 15.54 GB of GPU memory with an inference time of only 1.35 seconds.
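Numbers of this kind can be reproduced with a standard CUDA timing loop; the sketch below shows one way to do it. It is illustrative only: the `model(lq, ref)` call signature is an assumption, not the repository's actual inference API.

```python
import time
import torch

def profile(model, lq, ref, warmup: int = 3):
    # Peak GPU memory and wall-clock latency for one forward pass.
    torch.cuda.reset_peak_memory_stats()
    with torch.no_grad():
        for _ in range(warmup):        # warm up kernels and the allocator
            model(lq, ref)
        torch.cuda.synchronize()       # ensure queued work is done before timing
        t0 = time.perf_counter()
        model(lq, ref)
        torch.cuda.synchronize()
        latency = time.perf_counter() - t0
    peak_gb = torch.cuda.max_memory_allocated() / 1024**3
    return latency, peak_gb
```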
This project is built upon the following excellent open-source repositories:
- S3Diff: The base generative backbone for our framework.
- ReFIR: For reference-based logic and benchmark implementations.
- SeeSR: For the RAM and DAPE-based semantic conditioning.
- Stability AI: For the foundational SD-Turbo model.
- diffusers: For the powerful and flexible diffusion model training and inference suite.
We thank the authors of these projects for their great work and for making their code available to the community, which has significantly facilitated our research.
If you find our work or code useful for your research, please cite:
```bibtex
@inproceedings{wang2026trust,
  title={Trust but Verify: Adaptive Conditioning for Reference-Based Diffusion Super-Resolution via Implicit Reference Correlation Modeling},
  author={Wang, Yuan and Wan, Yuhao and Zheng, Siming and Li, Bo and Hou, Qibin and Jiang, Peng-Tao},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}
```
