Tianming Liang Qirui Du Jian-Fang Hu Haichao Jiang Zicheng Lin Wei-Shi Zheng
ISEE Lab, Sun Yat-sen University
Through multi-turn interleaved reasoning and web search, Seg-ReSearch is able to localize and segment any text-guided target in images or videos, even those involving new concepts or up-to-date information that lies beyond the internal knowledge of MLLMs.
- [2026/02/06] All training and inference code for Seg-ReSearch has been released. Check it out!
- [2026/02/06] We have released OK-VOS, a challenging VOS Benchmark explicitly requiring external knowledge.
- [2026/02/04] The paper is available on arXiv.
Seg-ReSearch conducts multi-turn interactions with the search engine throughout the dynamic Multi-modal Chain-of-Thought (MCoT). This capability is incentivized by a hierarchical reward design: IGR pilots the initial planning, TPR encourages extensive exploration, and OR ensures final task accuracy.
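To make the hierarchical reward design concrete, here is a toy sketch of how three reward signals might be combined into a single scalar for policy optimization. The weights and the simple weighted-sum form are illustrative assumptions, not the paper's actual formulation.

```python
# Toy sketch of a hierarchical reward. The weighted-sum form and the
# default weights are hypothetical, for illustration only.
def hierarchical_reward(igr: float, tpr: float, outcome: float,
                        w_igr: float = 0.2, w_tpr: float = 0.3,
                        w_or: float = 0.5) -> float:
    """Combine three reward signals into one scalar.

    igr     -- initial planning reward in [0, 1]
    tpr     -- per-turn exploration reward in [0, 1]
    outcome -- final task-accuracy (outcome) reward in [0, 1]
    """
    return w_igr * igr + w_tpr * tpr + w_or * outcome
```

With these (hypothetical) weights, `hierarchical_reward(igr=1.0, tpr=0.5, outcome=1.0)` yields 0.85: the outcome term dominates, while planning and exploration still contribute.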
To support Qwen3-VL, we use verl==0.7.0.dev0 and vllm==0.11.0, which require pytorch>=2.8.0 and cuda>=12.6.
git submodule update --init --recursive
conda create --name seg-research python=3.10
conda activate seg-research
pip install -e verl
pip install -e ".[vllm,search_tool]"
pip install "flash-attn==2.8.3" --no-build-isolation

We recommend creating a separate environment for the retrieval server to avoid dependency conflicts.
conda create --name retrieval python=3.10
conda activate retrieval
conda install -c pytorch -c nvidia faiss-gpu=1.8.0
pip install transformers datasets fastapi numpy torch uvicorn

Download the OK-VOS dataset:

bash data/download_okvos.sh

Run the preprocessing script to prepare the datasets for training and evaluation. This script extracts the bounding boxes and points.
# Training Set
python examples/data_preprocess/okvos.py --split train --num_frames 6 --max_size 448 --min_size 448 --model_type Qwen3VL
# Test set
python examples/data_preprocess/okvos.py --split test --num_frames 6 --max_size 448 --min_size 448 --model_type Qwen3VL

- Activate the retrieval environment and start the retrieval server.

bash examples/train/okvos/start_retrieval.sh

- Activate the seg-research environment and run one of the following scripts.
- Qwen3-VL-4B-Instruct:
# GRPO Training
bash examples/train/okvos/train_4b.sh
# DAPO Training
bash examples/train/okvos/train_4b_dapo.sh

- Qwen3-VL-8B-Instruct:
Note: We set tensor_model_parallel_size=2 for 48 GB GPU memory. You can reduce it to 1 if your GPUs have more memory.
# GRPO Training
bash examples/train/okvos/train_8b.sh
# DAPO Training
bash examples/train/okvos/train_8b_dapo.sh

Remember to replace the checkpoint path in the evaluation script before running.
# Serper API (Google Search)
bash examples/train/okvos/eval.sh
# DuckDuckGo (Free API)
bash examples/train/okvos/eval_ddg.sh

The prediction results will be saved in a JSONL file.
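If you want to sanity-check the predictions before post-segmentation, a minimal loader like the one below can help. It assumes only the standard one-JSON-object-per-line format; the actual field names depend on the evaluation script.

```python
# Minimal sketch for inspecting a prediction JSONL file.
# Only the one-JSON-object-per-line format is assumed; field
# names vary with the evaluation script that produced the file.
import json

def load_predictions(path: str) -> list[dict]:
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

For example, `load_predictions("path/to/predictions.jsonl")` (path hypothetical) returns all records, so you can quickly check the record count and available keys before running SAM 2.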
Based on the generated jsonl file, run the segmentation model to generate object masks:
python post_segmentation/sam2_okvos.py [path_to_jsonl]

You can integrate an auxiliary LLM (e.g., Qwen3-Next-80B-A3B-Instruct-FP8) to act as a summarizer, empowering Seg-ReSearch with web browsing capabilities for more precise retrieval results. Give it a try! 😊
To enable this feature,
- Launch the LLM service:
vllm serve Qwen/Qwen3-Next-80B-A3B-Instruct-FP8 \
--port 8000 \
--tensor-parallel-size 2 \
--max-model-len 262144

- Then, set SUMM_MODEL_URL and SUMM_MODEL_PATH in eval.sh, for example:
export SUMM_MODEL_URL="http://localhost:8000/v1"
export SUMM_MODEL_PATH="Qwen/Qwen3-Next-80B-A3B-Instruct-FP8"

Our work is built upon verl-tool, Seg-Zero, and SeC. We sincerely appreciate these excellent works.
If you find our work helpful for your research, please consider citing our paper.
@article{liang2026segresearch,
title={Seg-ReSearch: Segmentation with Interleaved Reasoning and External Search},
author={Tianming Liang and Qirui Du and Jian-Fang Hu and Haichao Jiang and Zicheng Lin and Wei-Shi Zheng},
journal={arXiv preprint arXiv:2602.04454},
year={2026}
}