
Vision-Based Object Detection and Pose Estimation Framework

This repository provides a complete vision pipeline that supports object detection with YOLOv8 and 6D pose estimation. The project is structured to support fine-tuning, inference, and integration with robotic systems using ROS 2.

Python version: 3.12
ROS 2: Jazzy


Table of Contents

  • Cloning the Repository
  • Project Structure
  • Pose Estimation Setup
  • Launching the Camera and Robot
  • Running the Nodes
  • RViz Visualization
  • Topics
  • Fine-Tuning Instructions


Cloning the Repository

git clone git@github.com:Robots4Sustainability/perception.git
cd perception

Project Structure

  • ros_perception_scripts: Contains node-specific scripts that go inside the src folder of your ROS 2 workspace.
  • Annotation_and_fine-tuning: Contains utility scripts and notebooks for training or fine-tuning on your own dataset.

Pose Estimation Setup

  1. Create a virtual environment in the root folder:

    python3 -m venv venv
    source venv/bin/activate
  2. Install dependencies (Use the root folder's requirements.txt):

    pip install -r requirements.txt
  3. Follow the documentation on how to set up the robot and the ROS 2 workspace.

  4. Once eddie_ros, kinova_vision, and the ROS 2 workspace have been set up, install this package:

    sudo apt update
    sudo apt install ros-jazzy-image-pipeline
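
To verify the install, you can check that the image pipeline packages are visible to ROS 2 (assuming your environment is sourced):

# should list packages such as image_proc and depth_image_proc
ros2 pkg list | grep image_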

Launching the Camera and Robot

RealSense Camera Launch

ros2 launch realsense2_camera rs_launch.py \
  enable_rgbd:=true \
  enable_sync:=true \
  align_depth.enable:=true \
  enable_color:=true \
  enable_depth:=true \
  pointcloud.enable:=true \
  rgb_camera.color_profile:=640x480x30 \
  depth_module.depth_profile:=640x480x30 \
  pointcloud.ordered_pc:=true
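
To confirm the camera is streaming, you can list the topics in another terminal (exact topic names depend on your RealSense driver configuration):

# with the launch arguments above you should see color, aligned depth, and point cloud topics
ros2 topic list | grep camera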

Kinova Robot Vision Node Launch

  • Left Arm IP: 192.168.1.10
  • Right Arm IP: 192.168.1.12

For the Kinova vision node, you must export RMW_IMPLEMENTATION=rmw_zenoh_cpp in every terminal. Before that, follow the instructions below:

sudo apt install ros-jazzy-rmw-zenoh-cpp

pkill -9 -f ros && ros2 daemon stop

Install the Zenoh router, then source /opt/ros/jazzy/setup.bash:

# terminal 1
ros2 run rmw_zenoh_cpp rmw_zenohd


# terminal 2
export RMW_IMPLEMENTATION=rmw_zenoh_cpp

# then run your Kinova vision node (launch command below)

Command to run kinova_vision.launch.py:

ros2 launch kinova_vision kinova_vision.launch.py \
  device:=192.168.1.12 \
  depth_registration:=true \
  color_camera_info_url:=package://kinova_vision/launch/calibration/default_color_calib_1280x720.ini

If you get resolution errors, go to the arm's admin portal (192.168.1.1X) and, under the Camera settings, set the calibration to 1280x720.
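
Once the vision node is running, you can confirm that images are arriving. The topic name below is an assumption based on the default kinova_vision configuration; adjust it to whatever ros2 topic list reports:

# report the publishing rate of the color stream
ros2 topic hz /camera/color/image_raw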


Running the Nodes

The project provides multiple ways to launch the nodes:

  1. Both Camera and the Nodes
  2. All the Nodes via one launch file
  3. Separate Nodes/Terminals

1. Both Camera and the Nodes

ros2 launch perception camera_and_perception_launch.py input_source:=robot
  • input_source: robot or realsense (default)

2. All the Nodes via one launch file

In this option, you must run the camera in a separate terminal; the rest of the nodes are executed from a single launch file.

ros2 launch perception perception_launch.py input_source:=robot
  • input_source: robot or realsense (default)

3. Separate Nodes/Terminals

Run YOLO Object Detection

ros2 run perception yolo_node --ros-args \
  -p input_mode:=realsense \
  -p model_type:=fine_tuned \
  -p conf_threshold:=0.6 \
  -p device:="cpu"
  • input_mode: robot or realsense (default)
  • model_type: fine_tuned or default (YOLOv8n COCO Dataset)
  • model_path: optional explicit path to weights (.pt). If empty, falls back to defaults based on model_type.
  • conf_threshold: float, default 0.6.
  • device: optional device string (e.g., cpu, cuda).
  • class_names: optional list to override class names; must align with training order.
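
For example, to run the detector on the robot camera stream with an explicitly specified weights file (the path below is a placeholder):

ros2 run perception yolo_node --ros-args \
  -p input_mode:=robot \
  -p model_type:=fine_tuned \
  -p model_path:=/path/to/best.pt \
  -p device:="cuda"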

Run Pose Estimation

ros2 run perception pose_node --ros-args -p input_mode:=realsense
  • input_mode: robot or realsense (default)
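
To run pose estimation against the Kinova stream instead:

ros2 run perception pose_node --ros-args -p input_mode:=robot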

RViz Visualization

To visualize the results:

rviz2

You can load the pre-configured RViz setup:

ros2_ws/src/perception/rviz/pose_estimation.rviz
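
Alternatively, pass the config file directly when starting RViz:

rviz2 -d ros2_ws/src/perception/rviz/pose_estimation.rviz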

Topics

  • /annotated_images: Publishes YOLO-annotated image data
  • /cropped_pointcloud: Publishes the cropped point cloud used for pose estimation
  • /pickable_objects: Publishes a YOLO detection array for objects the robot can pick (see object_classifier_node.py)
  • /non_pickable_objects: Publishes a YOLO detection array for objects the robot cannot pick (see object_classifier_node.py)
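
To inspect these topics from the command line (ros2 topic info will report the exact message type each node publishes):

# show the message type and publisher count
ros2 topic info /pickable_objects
# stream incoming detections
ros2 topic echo /pickable_objects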

Fine-Tuning Instructions

  1. Navigate to the fine-tuning folder:

    cd Annotation_and_fine-tuning
  2. Create a Python virtual environment and activate it:

    python3.12 -m venv venvTrain
    source venvTrain/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Download the dataset from the Releases page.

  5. Run main.ipynb:

    • Renames files (if needed)
    • Applies a Vision Transformer to assist in image labeling
    • Displays YOLOv8-compatible bounding boxes
    • Trains a YOLOv8 model

    The trained model will be saved at:

    runs/detect/xxxx/weights/best.pt
    
  6. Use Inference.ipynb to test the trained model.
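
If you prefer the command line over the notebooks, roughly equivalent Ultralytics CLI calls look like this (the dataset YAML, run name, and image paths are placeholders, and the notebooks may use different training settings):

# fine-tune YOLOv8n on your dataset
yolo detect train data=path/to/dataset.yaml model=yolov8n.pt epochs=100 imgsz=640

# run inference with the trained weights
yolo detect predict model=runs/detect/xxxx/weights/best.pt source=path/to/test_images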
