RMRL is a robotic manipulation reinforcement learning project built specifically for the xArm-6 robot. The core training pipeline lives in `train.py`, which coordinates data collection and policy optimization.
This repository is intended for xArm-6 only. Other robot platforms are not supported in this codebase.
- End-to-end training loop in `train.py`
- Online + replay training modes
- Image + pose policy model (`model/`)
- RealSense camera integration (`environment/camera.py`)
- xArm-6 control wrapper (`environment/robot.py`)
- `train.py`: Core training entrypoint
- `train_discrete.py`: Alternative discrete training script
- `environment/`: xArm-6 environment, camera, and SAM2 integration
- `model/`: Policy networks and decoders
- `training_data_collector.py`: Data collection and replay support
- `config/`: Configuration files (if present)
- `experiments/`: Experiment scripts
Hardware:
- xArm-6 robot
- Intel RealSense camera (for vision input)
Software:
- Python 3.10
- Conda (recommended)
- CUDA-enabled GPU recommended for training
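Training falls back to CPU when no GPU is visible; a quick way to confirm which device will be used is a small check like the sketch below (not part of the repo; assumes PyTorch, which the conda environment provides):

```python
def pick_device():
    """Return "cuda" when a CUDA-capable GPU is visible, else "cpu"."""
    try:
        import torch  # assumed to come from environment.yml
    except ImportError:
        return "cpu"  # PyTorch not installed yet
    return "cuda" if torch.cuda.is_available() else "cpu"

print(pick_device())
```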
RMRL uses the official xArm Python SDK. You can install it either from PyPI or from source.
Install from PyPI:
```bash
pip install xarm-python-sdk
```

Install from source:

```bash
git clone https://github.com/xArm-Developer/xArm-Python-SDK.git
cd xArm-Python-SDK
pip install .
```

Install from source (build wheel):

```bash
pip install build
python -m build
pip install dist/xarm_python_sdk-1.16.0-py3-none-any.whl
```

Create the environment from `environment.yml`:

```bash
conda env create -f environment.yml
conda activate rmrl
```

Training configuration is currently defined inside `train.py` in the `main()` function. Update values such as:

- `num_instances`, `num_episodes`
- `learning_rate`
- `save_dir`, `save_frequency`
- `optimizer`, `replay`, `Normal_replay`
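For orientation, the tunables above can be pictured as a plain dictionary. The key names match the README, but every default value below is an illustrative assumption, not the repository's actual setting:

```python
# Hypothetical defaults -- edit the corresponding values in train.py's main().
train_config = {
    "num_instances": 4,         # parallel data-collection instances (assumed)
    "num_episodes": 1000,       # total training episodes (assumed)
    "learning_rate": 3e-4,      # optimizer step size (assumed)
    "save_dir": "checkpoints/",
    "save_frequency": 50,       # episodes between checkpoint saves (assumed)
    "optimizer": "adam",
    "replay": True,             # enable replay training mode (assumed)
    "Normal_replay": False,
}
```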
Hardware and model paths are configured in config/environment.yaml (e.g., xArm IP, camera intrinsics, SAM2/GroundingDINO checkpoints).
config/environment.yaml parameters:
- `sam2_checkpoint`: Path to the SAM2 checkpoint (`.pt`).
- `sam2_model_config`: Path to the SAM2 model config (`.yaml`).
- `grounding_dino_config`: Path to the GroundingDINO config (`.py`).
- `grounding_dino_checkpoint`: Path to the GroundingDINO checkpoint (`.pth`).
- `device`: Compute device override (`"cuda"` or `"cpu"`). Leave empty to auto-detect.
- `robot_ip`: IP address of the xArm-6 controller.
- `camera2_pose`: Camera extrinsics as a 7D SE(3) LieTensor `[x, y, z, qw, qx, qy, qz]`.
- `text_prompt`: GroundingDINO text prompt for object detection.
- `box_threshold`: Bounding box confidence threshold for GroundingDINO.
- `text_threshold`: Text matching threshold for GroundingDINO.
- `camera_intrinsics`: 3x3 camera intrinsic matrix.
- `ground_plane`: Z height of the ground plane in mm (negative if below camera origin).
- `home_pose`: Robot home pose `[x, y, z, roll, pitch, yaw]` in xArm units.
- `camera_config_path`: Path to the RealSense camera config YAML.
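Putting the parameters together, a `config/environment.yaml` might look like the sketch below. Every value is a placeholder for illustration, not a working calibration:

```yaml
sam2_checkpoint: checkpoints/sam2.pt            # placeholder path
sam2_model_config: configs/sam2.yaml            # placeholder path
grounding_dino_config: configs/gdino.py         # placeholder path
grounding_dino_checkpoint: checkpoints/gdino.pth
device: ""                                      # empty = auto-detect
robot_ip: 192.168.1.100                         # placeholder IP
camera2_pose: [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # [x, y, z, qw, qx, qy, qz]
text_prompt: "red cube."
box_threshold: 0.35
text_threshold: 0.25
camera_intrinsics: [[600.0, 0.0, 320.0],
                    [0.0, 600.0, 240.0],
                    [0.0, 0.0, 1.0]]
ground_plane: -100.0                            # mm, below camera origin
home_pose: [300.0, 0.0, 200.0, 180.0, 0.0, 0.0] # [x, y, z, roll, pitch, yaw]
camera_config_path: config/camera.yaml          # placeholder path
```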
```bash
python train.py
```

By default, the script:

- Connects to the xArm-6 at the hard-coded IP in `train.py`
- Collects data online
- Saves checkpoints into `checkpoints/`
- Logs metrics to Weights & Biases (`wandb`)
If you use Weights & Biases, ensure you are logged in and have an API key set (`WANDB_API_KEY`).
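A minimal pre-flight check (a sketch, not part of the repo) could verify the key is set before launching a long training run:

```python
import os

def wandb_key_present() -> bool:
    """True if WANDB_API_KEY is set to a non-empty value."""
    return bool(os.environ.get("WANDB_API_KEY"))

# Example: fail fast before training starts
if not wandb_key_present():
    print("WANDB_API_KEY is not set; run `wandb login` or export the key.")
```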
- The environment uses modules under `environment/`, including SAM2 and GroundingDINO dependencies. Ensure the conda environment installs all required packages from `environment.yml`.
- RealSense and xArm SDKs are required at runtime when running on hardware.
- xArm Python SDK: https://github.com/xArm-Developer/xArm-Python-SDK
- Grounded-SAM-2: https://github.com/IDEA-Research/Grounded-SAM-2/tree/main
MIT license.
If you use RMRL in your research, please cite:
```bibtex
@article{chen2025rm,
  title={RM-RL: Role-Model Reinforcement Learning for Precise Robot Manipulation},
  author={Chen, Xiangyu and Zhou, Chuhao and Liu, Yuxi and Yang, Jianfei},
  journal={IEEE International Conference on Robotics & Automation (ICRA)},
  year={2026}
}
```