# StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces

Kyeongmin Yeo *, Jaihoon Kim *, Minhyuk Sung
KAIST
*Equal contribution
We propose StochSync, a stochastic diffusion synchronization method for generating images in arbitrary spaces, such as 360° panoramas and textures on 3D meshes.
- Python: 3.9
- CUDA: 12.1
- GPU: Tested on NVIDIA RTX 3090 and RTX A6000
- **Clone the Repository with Submodules:**

  ```bash
  git clone --recursive https://github.com/KAIST-Visual-AI-Group/StochSync.git && cd StochSync
  ```
- **Create Conda Environment:**

  ```bash
  conda create -n stochsync python=3.9 -y
  conda activate stochsync
  ```
- **Install Core Dependencies:**

  First, install PyTorch and xformers compatible with your CUDA environment. For example:

  ```bash
  pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 xformers --index-url https://download.pytorch.org/whl/cu121
  ```
- **Install Python Dependencies:**

  Install the remaining dependencies from `requirements.txt` and the additional modules in `third_party/`:

  ```bash
  pip install -r requirements.txt
  pip install third_party/gsplat/
  pip install third_party/nvdiffrast/
  ```
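After installing, a quick sanity check can confirm the key packages are importable. This is a minimal sketch (not part of the repository) that only probes availability without actually importing anything:

```python
import importlib.util

def check_modules(names):
    """Return a dict mapping each module name to whether it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Packages installed by the steps above; results depend on your environment.
for name, ok in check_modules(["torch", "xformers", "gsplat", "nvdiffrast"]).items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```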
We provide several example configurations in the `config/` directory. Below are examples for different applications:
- **Format:**

  ```bash
  python main.py --config "your_config.yaml" root_dir="root_dir_for_results" tag="run_name" text_prompt="your text prompt here" [other application-specific options]
  ```
- **360° Panorama Generation:**

  ```bash
  python main.py --config config/stochsync_panorama.yaml text_prompt="A vibrant urban alleyway filled with colorful graffiti, and stylized lettering on wall"
  ```
- **3D Mesh Texturing:**

  ```bash
  python main.py --config config/stochsync_mesh.yaml mesh_path="./data/mesh/face.obj" text_prompt="Kratos bust, God of War, god of power, hyper-realistic and extremely detailed."
  ```
- **Sphere & Torus Texture Generation:**

  ```bash
  python main.py --config config/stochsync_sphere.yaml text_prompt="Paint splatter texture."
  python main.py --config config/stochsync_torus.yaml text_prompt="Paint splatter texture."
  ```
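The commands above pass configuration overrides as `key=value` tokens after the config file. A minimal sketch of how such overrides could be parsed into a dictionary (hypothetical; the repository's actual config handling may differ):

```python
def parse_overrides(args):
    """Split 'key=value' tokens into a dict; splits at the first '='."""
    overrides = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        overrides[key] = value
    return overrides

print(parse_overrides(["root_dir=./results", "tag=run_01", "text_prompt=Paint splatter texture."]))
```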
We provide comprehensive tests to validate the functionality of our modules. To run the tests, execute:
```bash
python run_unit_test.py --extensive --devices {list of gpu indices to use}
```

Test results will be stored in the directory `unit_test_results/{application}`.
We provide a unified script to compute Clean-FID, CLIP text-image alignment, and GIQA metrics for any set of generated images.
```bash
pip install clean-fid clip
python evaluate/evaluate.py \
  -r "path/to/reference/images/*.png" \
  -f "path/to/generated/images/prompt_:0:/*.png" \
  -o path/to/output.txt
```

**Argument details**
| flag | meaning |
|---|---|
| `-r` | Glob pattern pointing to reference images. |
| `-f` | Glob for generated images. Replace the substring that encodes the text prompt with the special token `:0:`. This lets the script recover the prompt string when computing the CLIP score. |
| `-o` | Output file for the aggregated metric table. |
Example: if your reference and generated files have the following structure,

```
reference/
└── panorama/
    ├── graffiti_alley/
    │   ├── 000000.png
    │   ├── 000001.png
    │   └── …
    ├── golden_sunset/
    │   ├── 000000.png
    │   └── …
    └── …
results/
└── run_01/
    ├── graffiti_alley/   # ← use text prompts as folder names
    │   ├── 000000.png
    │   ├── 000001.png
    │   └── …
    ├── golden_sunset/
    │   ├── 000000.png
    │   └── …
    └── …
```

write `-r "reference/panorama/*/*.png" -f "results/run_01/:0:/*.png"` for evaluation.
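To make the `:0:` mechanism concrete, here is an illustrative sketch (not the repository's actual implementation) of how a prompt string could be recovered from a concrete file path given a glob pattern containing the placeholder:

```python
import re

def recover_prompt(pattern, path):
    """Recover the substring matched by ':0:' in a glob-like pattern."""
    regex = re.escape(pattern)
    regex = regex.replace(re.escape(":0:"), "(.+?)")   # placeholder -> capture group
    regex = regex.replace(re.escape("*"), "[^/]*")     # glob '*' -> one path component
    match = re.fullmatch(regex, path)
    return match.group(1) if match else None

print(recover_prompt("results/run_01/:0:/*.png", "results/run_01/graffiti_alley/000003.png"))
# → graffiti_alley
```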
The script automatically:
- groups images by prompt,
- computes Clean-FID and GIQA against the matching reference set,
- measures the average CLIP alignment (text ↔ image).
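The grouping step amounts to bucketing files by their prompt-named parent folder. A minimal sketch under that assumption (the script's real logic may differ):

```python
from collections import defaultdict
from pathlib import PurePosixPath

def group_by_prompt(paths, prompt_depth=2):
    """Group file paths by the directory component at index `prompt_depth`
    (here, the prompt-named folder, e.g. results/run_01/<prompt>/...)."""
    groups = defaultdict(list)
    for p in paths:
        groups[PurePosixPath(p).parts[prompt_depth]].append(p)
    return dict(groups)

files = [
    "results/run_01/graffiti_alley/000000.png",
    "results/run_01/graffiti_alley/000001.png",
    "results/run_01/golden_sunset/000000.png",
]
print(group_by_prompt(files))
```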
If you find our work useful, please consider citing our paper:
@article{yeo2025stochsync,
title={StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces},
author={Yeo, Kyeongmin and Kim, Jaihoon and Sung, Minhyuk},
journal={arXiv e-prints},
pages={arXiv--2501},
year={2025}
}
This repository builds upon several outstanding projects and libraries. We would like to express our gratitude to the developers and contributors of:
- NVDiffrast
- paint-it
- gsplat
- mvdream
Their work has been instrumental in the development of StochSync.
