RMRL (xArm-6)

RMRL is the official code of our paper "RM-RL: Role-Model Reinforcement Learning for Precise Robot Manipulation" (ICRA 2026). It is a reinforcement learning project for robotic manipulation built specifically for the xArm-6 robot. The core training pipeline lives in train.py, which coordinates data collection and policy optimization.

This repository is intended for xArm-6 only. Other robot platforms are not supported in this codebase.

Highlights

  • End-to-end training loop in train.py
  • Online + replay training modes
  • Image + pose policy model (model/)
  • RealSense camera integration (environment/camera.py)
  • xArm-6 control wrapper (environment/robot.py)

Repository Layout

  • train.py: Core training entrypoint
  • train_discrete.py: Alternative discrete training script
  • environment/: xArm-6 environment, camera, and SAM2 integration
  • model/: Policy networks and decoders
  • training_data_collector.py: Data collection and replay support
  • config/: Configuration files (e.g., environment.yaml)
  • experiments/: Experiment scripts

Requirements

Hardware:

  • xArm-6 robot
  • Intel RealSense camera (for vision input)

Software:

  • Python 3.10
  • Conda (recommended)
  • CUDA-enabled GPU recommended for training
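
To verify the RealSense camera is reachable before training, a minimal frame-grab check with pyrealsense2 (the RealSense Python SDK) can be used. This snippet is a standalone sanity check, not part of the repo:

import pyrealsense2 as rs

# Open the default RealSense device and stream a single color frame.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
try:
    frames = pipeline.wait_for_frames()
    color = frames.get_color_frame()
    print("Color frame:", color.get_width(), "x", color.get_height())
finally:
    pipeline.stop()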

xArm Python SDK

RMRL uses the official xArm Python SDK. You can install it either from PyPI or from source.

Install from PyPI:

pip install xarm-python-sdk

Install from source:

git clone https://github.com/xArm-Developer/xArm-Python-SDK.git
cd xArm-Python-SDK
pip install .

Install from source (build wheel):

pip install build
python -m build
pip install dist/xarm_python_sdk-1.16.0-py3-none-any.whl
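
Once installed, a quick connectivity check with the SDK's XArmAPI wrapper (the IP below is a placeholder; use your controller's address):

from xarm.wrapper import XArmAPI

# Connect to the xArm-6 controller and query its firmware version.
arm = XArmAPI("192.168.1.xxx")  # placeholder IP
code, version = arm.get_version()
print("xArm firmware:", version if code == 0 else "error code %s" % code)
arm.disconnect()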

Conda Environment

Create the environment from environment.yml:

conda env create -f environment.yml
conda activate rmrl
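
To confirm the environment is usable (assuming environment.yml pulls in PyTorch), a quick check is:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"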

Configuration

Training configuration is currently defined inside train.py in the main() function. Update values such as the following (a hypothetical sketch follows the list):

  • num_instances, num_episodes
  • learning_rate
  • save_dir, save_frequency
  • optimizer, replay, Normal_replay
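
A purely illustrative sketch of that block in main() (names from the list above; the values shown are hypothetical, not the repo's defaults):

# Hypothetical values -- the real ones live in main() inside train.py.
num_instances = 2          # parallel task instances
num_episodes = 500         # total training episodes
learning_rate = 3e-4
save_dir = "checkpoints/"  # where checkpoints are written
save_frequency = 50        # save every N episodes
optimizer = "adam"
replay = True              # enable replay training mode
Normal_replay = False      # alternative replay variant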

Hardware and model paths are configured in config/environment.yaml (e.g., xArm IP, camera intrinsics, SAM2/GroundingDINO checkpoints).

config/environment.yaml parameters (a minimal loader example follows the list):

  • sam2_checkpoint: Path to the SAM2 checkpoint (.pt).
  • sam2_model_config: Path to the SAM2 model config (.yaml).
  • grounding_dino_config: Path to the GroundingDINO config (.py).
  • grounding_dino_checkpoint: Path to the GroundingDINO checkpoint (.pth).
  • device: Compute device override ("cuda" or "cpu"). Leave empty to auto-detect.
  • robot_ip: IP address of the xArm-6 controller.
  • camera2_pose: Camera extrinsics as a 7D SE(3) LieTensor [x, y, z, qw, qx, qy, qz].
  • text_prompt: GroundingDINO text prompt for object detection.
  • box_threshold: Bounding box confidence threshold for GroundingDINO.
  • text_threshold: Text matching threshold for GroundingDINO.
  • camera_intrinsics: 3x3 camera intrinsic matrix.
  • ground_plane: Z height of the ground plane in mm (negative if below camera origin).
  • home_pose: Robot home pose [x, y, z, roll, pitch, yaw] in xArm units.
  • camera_config_path: Path to the RealSense camera config YAML.
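
A minimal sketch of reading these values in Python (keys from the list above; assumes PyYAML is installed):

import yaml

# Load the environment configuration and pull out a few documented keys.
with open("config/environment.yaml") as f:
    cfg = yaml.safe_load(f)

robot_ip = cfg["robot_ip"]             # xArm-6 controller address
prompt = cfg["text_prompt"]            # GroundingDINO detection prompt
intrinsics = cfg["camera_intrinsics"]  # 3x3 intrinsic matrix
print(robot_ip, prompt, intrinsics)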

Running Training

python train.py

By default, the script:

  • Connects to the xArm-6 at the hard-coded IP in train.py
  • Collects data online
  • Saves checkpoints into checkpoints/
  • Logs metrics to Weights & Biases (wandb)

If you use Weights & Biases, ensure you are logged in and have an API key set (WANDB_API_KEY).
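
For example, either log in once via the CLI or export the key in your shell:

wandb login
# or
export WANDB_API_KEY=<your-api-key>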

Notes

  • The environment uses modules under environment/, including SAM2 and GroundingDINO dependencies. Ensure the conda environment installs all required packages from environment.yml (an illustrative detection/segmentation sketch follows this list).
  • RealSense and xArm SDKs are required at runtime when running on hardware.
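
As an illustration only (not this repo's actual pipeline), the checkpoints and thresholds documented in config/environment.yaml map onto the standard GroundingDINO-then-SAM2 usage roughly as below; all paths, the prompt, and the threshold values are placeholders:

import torch
from torchvision.ops import box_convert
from groundingdino.util.inference import load_model, load_image, predict
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Detect objects matching the text prompt (grounding_dino_* config keys).
dino = load_model("path/to/grounding_dino_config.py", "path/to/grounding_dino.pth")
image_source, image = load_image("frame.jpg")  # numpy RGB + preprocessed tensor
boxes, logits, phrases = predict(
    model=dino, image=image,
    caption="object .",   # text_prompt
    box_threshold=0.35,   # box_threshold
    text_threshold=0.25,  # text_threshold
)

# GroundingDINO returns normalized cxcywh boxes; SAM2 expects pixel xyxy.
h, w, _ = image_source.shape
boxes_xyxy = box_convert(boxes * torch.tensor([w, h, w, h]),
                         in_fmt="cxcywh", out_fmt="xyxy").numpy()

# Segment the detected boxes (sam2_checkpoint / sam2_model_config keys).
predictor = SAM2ImagePredictor(build_sam2("path/to/sam2_config.yaml", "path/to/sam2.pt"))
predictor.set_image(image_source)
masks, scores, _ = predictor.predict(box=boxes_xyxy, multimask_output=False)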

License

MIT license.

Citation

If you use RMRL in your research, please cite:

@inproceedings{chen2025rm,
  title={RM-RL: Role-Model Reinforcement Learning for Precise Robot Manipulation},
  author={Chen, Xiangyu and Zhou, Chuhao and Liu, Yuxi and Yang, Jianfei},
  booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
  year={2026}
}
