A requirements.txt and a Dockerfile are provided. Install the dependencies with either:
pip install -r requirements.txt
or
bash docker_build.sh
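If you take the Docker route, you can then start a container from the built image. A minimal sketch, assuming the build script tags the image as rore (check docker_build.sh for the actual tag):

# Run an interactive container with GPU access and the repository mounted.
# "rore:latest" is a placeholder image name -- check docker_build.sh for the real tag.
docker run --gpus all -it --rm -v "$(pwd)":/workspace -w /workspace rore:latest bash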
For RealEstate10K we use the same data format as pixelSplat; please follow the data formatting instructions provided there. You can also download a preprocessed dataset here. The dataset can be left in the zip file and loaded directly from it.
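The zip does not need to be extracted; as a quick sanity check you can list its contents. The layout suggested below (per-split .torch chunk files plus an index.json, as in pixelSplat's preprocessed release) is an assumption, so verify it against the pixelSplat instructions:

# Peek at the preprocessed RealEstate10K zip without extracting it.
# Expected pixelSplat-style layout (assumed): train/*.torch, test/*.torch and an index.json per split.
unzip -l /path/to/re10k.zip | head -n 20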
A subset of our synthetic multimodal dataset can be found here.
Training is run as follows.
For training rgb on RealEstate10K:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --dataset_path /path/to/re10k.zip
For training rgb-thermal on our synthetic rgb-thermal dataset:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --dataset_path /path/to/MultimodalBlender --dataset_type multimodal
Alternative embedding methods can be selected via the --ray_encoding and --pos_enc flags.
Validation at different zoom-in (focal length) factors can be run via:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --test-zoom-in "2" --dataset_path "/path/to/re10k.zip"
And at different synthetic distortion factors on RealEstate10K:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --test-distortion "2" --dataset_path "/path/to/re10k.zip"
Testing only on the multimodal dataset:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --dataset_path "/path/to/MultimodalBlender" --dataset_type multimodal --test-only
NOTE: For simplicity, the network architecture used here for rgb and rgb-thermal is the same. In the paper, the multimodal network differed slightly from the rgb-only network.
This repository is built on top of the LVSM and PRoPE repositories; we thank their authors for making their work open-source.
If you find this work useful, please consider citing:
@inproceedings{rore2026,
  title={RoRE: Rotary Ray Embedding for Generalised Multi-Modal Scene Understanding},
  author={Ryan Griffiths and Donald G. Dansereau},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=BR2ItBcqOo}
}