This repository provides a complete vision pipeline that supports object detection with YOLOv8 and 6D pose estimation. The project is structured to support fine-tuning, inference, and integration with robotic systems using ROS 2.
Python version: 3.12
ROS 2: Jazzy
- Cloning the Repository
- Project Structure
- Pose Estimation Setup
- Launching the Camera and Robot
- Running the Nodes
- RViz Visualization
- Topics
- Fine-Tuning Instructions
git clone git@github.com:Robots4Sustainability/perception.git
cd perception

Project structure:

- ros_perception_scripts: contains the node-specific scripts, which go inside the src folder.
- Annotation_and_fine-tuning: contains scripts/utility scripts if you wish to train/fine-tune on your custom dataset.
- Create a virtual environment in the root folder:

  python3 -m venv venv
  source venv/bin/activate

- Install dependencies (use the root folder's requirements.txt):

  pip install -r requirements.txt

- Follow the documentation on how to set up the robot and the ROS 2 workspace.

- Once eddie_ros, kinova_vision, and the ROS 2 workspace have been set up, install this package:

  sudo apt update
  sudo apt install ros-jazzy-image-pipeline
ros2 launch realsense2_camera rs_launch.py \
enable_rgbd:=true \
enable_sync:=true \
align_depth.enable:=true \
enable_color:=true \
enable_depth:=true \
pointcloud.enable:=true \
rgb_camera.color_profile:=640x480x30 \
depth_module.depth_profile:=640x480x30 \
pointcloud.ordered_pc:=true

Left Arm IP: 192.168.1.10
Right Arm IP: 192.168.1.12
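Before starting the perception nodes, you can check that the camera is actually streaming. The exact topic names depend on your realsense2_camera version and namespace, so adjust them to whatever ros2 topic list reports:

# list active topics and keep only the camera ones
ros2 topic list | grep -i camera

# confirm color frames arrive at the expected rate (the profile above is 30 FPS)
ros2 topic hz /camera/color/image_raw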
For the Kinova Vision node, you must export RMW_IMPLEMENTATION=rmw_zenoh_cpp in every terminal. Before doing so, follow the instructions below:
sudo apt install ros-jazzy-rmw-zenoh-cpp
pkill -9 -f ros && ros2 daemon stop

Install the Zenoh router and then source /opt/ros/jazzy/setup.bash:
# terminal 1
ros2 run rmw_zenoh_cpp rmw_zenohd
# terminal 2
export RMW_IMPLEMENTATION=rmw_zenoh_cpp
# your kinova vision node

Command to run kinova_vision.launch.py:
ros2 launch kinova_vision kinova_vision.launch.py \
device:=192.168.1.12 \
depth_registration:=true \
color_camera_info_url:=package://kinova_vision/launch/calibration/default_color_calib_1280x720.ini

If you get resolution errors, go to the admin portal of the arm (192.168.1.1X) and, in the Camera settings, set the calibration to 1280x720.
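The same launch can be pointed at the left arm by changing the device IP (a sketch based on the arm IPs listed above; keep the calibration file matching your configured camera resolution):

ros2 launch kinova_vision kinova_vision.launch.py \
  device:=192.168.1.10 \
  depth_registration:=true \
  color_camera_info_url:=package://kinova_vision/launch/calibration/default_color_calib_1280x720.ini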
The project provides multiple ways to launch the nodes:
- Both Camera and the Nodes
- All the Nodes via one launch file
- Separate Nodes/Terminals
Both camera and the nodes:

ros2 launch perception camera_and_perception_launch.py input_source:=robot

input_source: robot or realsense (default)
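For the default RealSense input, the argument can be set explicitly (or omitted, since realsense is the default):

ros2 launch perception camera_and_perception_launch.py input_source:=realsense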
All the nodes via one launch file:

In this mode, you must run the camera in a separate terminal; the rest of the nodes are started by one launch file.

ros2 launch perception perception_launch.py input_source:=robot

input_source: robot or realsense (default)
Separate nodes/terminals:

ros2 run perception yolo_node --ros-args \
-p input_mode:=realsense \
-p model_type:=fine_tuned \
-p conf_threshold:=0.6 \
-p device:="cpu"input_mode:robotorrealsense(default)model_type:fine_tunedordefault(YOLOv8n COCO Dataset)model_path: optional explicit path to weights (.pt). If empty, falls back to defaults based onmodel_type.conf_threshold: float, default0.6.device: optional device string (e.g.,cpu,cuda).class_names: optional list to override class names; must align with training order.
ros2 run perception pose_node --ros-args -p input_mode:=default

input_mode: robot or realsense (default)
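To consume the robot (Kinova) camera stream instead, set input_mode accordingly:

ros2 run perception pose_node --ros-args -p input_mode:=robot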
To visualize the results:
rviz2

You can load the pre-configured RViz setup:

ros2_ws/src/perception/rviz/pose_estimation.rviz
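RViz can also be started with that configuration directly:

rviz2 -d ros2_ws/src/perception/rviz/pose_estimation.rviz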
| Topic Name | Description |
|---|---|
| /annotated_images | Publishes YOLO-annotated image data |
| /cropped_pointcloud | Publishes the cropped point cloud used for pose estimation |
| /pickable_objects | Publishes a YOLO-detection array for objects which can be picked by the robot (see object_classifier_node.py) |
| /non_pickable_objects | Publishes a YOLO-detection array for objects which can't be picked by the robot (see object_classifier_node.py) |
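These topics can be inspected from the command line, for example:

# show message type, publishers, and subscribers
ros2 topic info /pickable_objects

# print detections as they are published
ros2 topic echo /pickable_objects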
- Navigate to the fine-tuning folder:

  cd Annotation_and_fine-tuning

- Create a Python virtual environment and activate it:

  python3.12 -m venv venvTrain
  source venvTrain/bin/activate

- Install dependencies:

  pip install -r requirements.txt

- Download the dataset from the Releases page.

- Run main.ipynb, which:

  - Renames files (if needed)
  - Applies a Vision Transformer to assist in image labeling
  - Displays YOLOv8-compatible bounding boxes
  - Trains a YOLOv8 model

  The trained model will be saved at runs/detect/xxxx/weights/best.pt (see the example after this list for using these weights with the ROS 2 detector).

- Use Inference.ipynb to test the trained model.
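The trained weights can also be used directly by the ROS 2 detector via the model_path parameter described earlier (a sketch; replace the run directory with your own):

ros2 run perception yolo_node --ros-args \
  -p model_type:=fine_tuned \
  -p model_path:=/absolute/path/to/runs/detect/xxxx/weights/best.pt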