
InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs

🎉 CVPR 2026

Project Page arXiv Models

Bin Li* 1     Ruichi Zhang* 2     Han Liang† 3     Jingyan Zhang1
Juze Zhang4     Xin Chen3     Lan Xu1     Jingyi Yu1     Jingya Wang† 1,5

1ShanghaiTech University     2University of Pennsylvania     3ByteDance     4Stanford University     5InstAdapt
*Equal contribution    †Corresponding Author


✨ Overview

InterAgent is the first end-to-end framework for text-driven physics-based multi-agent humanoid control.

It introduces:

  • 🧠 An autoregressive diffusion transformer
  • 🔀 Multi-stream blocks decoupling proprioception, exteroception, and action
  • 🔗 A novel interaction-graph representation for exteroception
  • ⚡ Sparse edge-based attention for robust interaction modeling

InterAgent produces coherent, physically plausible, and semantically faithful multi-agent behaviors from only text prompts.


🛠️ Installation

Clone the repository and create the environment:

```bash
git clone https://github.com/BinLee26/InterAgent.git
cd InterAgent

conda create -n interagent python=3.8 -y
conda activate interagent

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -e .
```
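After installation, a quick sanity check can confirm that the core dependencies are importable. This is a minimal sketch using only the standard library; the package names below come from the conda install step, and any additional names you check should match your own environment.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    # "torch" and "torchvision" are installed by the conda command above.
    missing = missing_packages(["torch", "torchvision"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Core dependencies found.")
```

If anything is reported missing, re-run the install commands inside the activated `interagent` environment.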

🗃️ Data Preparation

1️⃣ Download Training Data

Download `interhuman_train.pkl` from Hugging Face and place it under `data/`.

2️⃣ (Optional) Track Your Own Data

We rely on the official ASE implementation to obtain tracking state-action pairs.

If you wish to track your own motion data:

  • Modify ASE to support two agents
  • You may refer to the implementation in ./inference for a working example of the two-agent setup.

3️⃣ Build LMDB Dataset

After downloading the data, run:

```bash
python training/interagent/dataset/dataset_create_lmdb.py
```
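To confirm the data is in place before and after the build, a small stdlib-only check can be used. Note that the LMDB output directory name (`lmdb`) is an assumption for illustration; consult `dataset_create_lmdb.py` for the actual output path.

```python
from pathlib import Path

def dataset_status(root="data"):
    """Report whether the raw pickle and the built LMDB files are present.

    The "lmdb" subdirectory name is an assumption; the build script may
    write its output elsewhere.
    """
    root = Path(root)
    return {
        "raw_pkl": (root / "interhuman_train.pkl").is_file(),
        # An LMDB environment stores its records in a data.mdb file.
        "lmdb_built": (root / "lmdb" / "data.mdb").is_file(),
    }
```

Running `dataset_status()` from the repository root should report both entries as `True` once the download and build steps have completed.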

🔥 Training

To train the policy network:

```bash
bash scripts/train.sh
```

You may modify hyperparameters inside the corresponding config files if needed.


🤖 Inference

1️⃣ Environment Setup

Follow the setup instructions in the official ASE repository to configure the simulation environment.

2️⃣ Download Checkpoint

Download the pretrained checkpoint from Hugging Face and place it under `checkpoint/`.

3️⃣ Run Inference

```bash
cd inference/multi-agent
bash infer.sh
```

📝 Citation

If you find our work useful, please consider citing:

```bibtex
@article{li2025interagent,
  title={InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs},
  author={Li, Bin and Zhang, Ruichi and Liang, Han and Zhang, Jingyan and Zhang, Juze and Chen, Xin and Xu, Lan and Yu, Jingyi and Wang, Jingya},
  journal={arXiv preprint arXiv:2512.07410},
  year={2025}
}
```

🙏 Acknowledgments

This project builds in part on several excellent open-source works; we sincerely thank their authors for making their code publicly available.


📜 License

Please refer to the LICENSE file for details.
