From 64ba3b3a4bcba7d1b71c76c467b9a1027e6fbe5b Mon Sep 17 00:00:00 2001
From: DuangZhu
Date: Wed, 31 Dec 2025 13:37:13 +0800
Subject: [PATCH 1/2] add VL-LN Bench doc

---
 source/en/user_guide/internnav/index.md       |   1 +
 .../internnav/projects/benchmark.md           | 157 ++++++++++++++++++
 .../en/user_guide/internnav/projects/index.md |  16 ++
 3 files changed, 174 insertions(+)
 create mode 100644 source/en/user_guide/internnav/projects/benchmark.md
 create mode 100644 source/en/user_guide/internnav/projects/index.md

diff --git a/source/en/user_guide/internnav/index.md b/source/en/user_guide/internnav/index.md
index 8afca93..4aa56aa 100644
--- a/source/en/user_guide/internnav/index.md
+++ b/source/en/user_guide/internnav/index.md
@@ -13,4 +13,5 @@ myst:
 
 quick_start/index
 tutorials/index
+projects/index
 ```
diff --git a/source/en/user_guide/internnav/projects/benchmark.md b/source/en/user_guide/internnav/projects/benchmark.md
new file mode 100644
index 0000000..2e51932
--- /dev/null
+++ b/source/en/user_guide/internnav/projects/benchmark.md
@@ -0,0 +1,157 @@
+# Extension Benchmark in InternNav
+
+This guidance would detail to show how to use specific dataset for training a VLA model for different navigation benchmark.
+
+## VL-LN Bench
+
+VL-LN Bench is a large-scale benchmark for Interactive Instance Goal Navigation. It provides (1) an automatically dialog-augmented trajectory generation pipeline, (2) a comprehensive evaluation protocol for training and assessing dialog-capable navigation models, and (3) the dataset and base model used in our experiments. For full details, see our [paper](https://arxiv.org/abs/2512.22342) and the [project website](https://0309hws.github.io/VL-LN.github.io/).
+
+- [Data Collection Pipeline](https://github.com/InternRobotics/VL-LN)
+- [Training and Evaluation](https://github.com/InternRobotics/InternNav)
+- [Dataset](https://huggingface.co/datasets/InternRobotics/VL-LN-Bench) and [Base Model](https://huggingface.co/InternRobotics/VL-LN-Bench-basemodel)
+
+
+### 1. Download Data & Assets
+VL-LN Bench is built on Matterport3D (MP3D) scenes, so you’ll need to download both the MP3D scene dataset and the VL-LN Bench dataset.
+- Scene Datasets
+
+  Download the scene dataset of [MP3D](https://niessner.github.io/Matterport/)
+- [VL-LN Data](https://huggingface.co/datasets/InternRobotics/VL-LN-Bench)
+- [VL-LN Base Model](https://huggingface.co/InternRobotics/VL-LN-Bench-basemodel)
+
+After unzipping the base model, scene datasets, and trajectory data, put everything under VL-LN-Bench/ in the layout below.
+  ```bash
+  VL-LN-Bench/
+  ├── base_model/
+  │   └── iion/
+  ├── raw_data/
+  │   └── mp3d/
+  │       ├── scene_summary/
+  │       ├── train/
+  │       │   ├── train_ion.json.gz
+  │       │   └── train_iion.json.gz
+  │       └── val_unseen/
+  │           ├── val_unseen_ion.json.gz
+  │           └── val_unseen_iion.json.gz
+  ├── scene_datasets/
+  │   └── mp3d/
+  │       ├── 17DRP5sb8fy/
+  │       ├── 1LXtFkjw3qL/
+  │       ...
+  └── traj_data/
+      ├── mp3d_split1/
+      ├── mp3d_split2/
+      └── mp3d_split3/
+  ```
+
+### 2. Environment Setup
+Here we set up the Python environment for VL-LN Bench and InternVLA-N1. If you’ve already installed the InternVLA-N1 environment, you can skip those steps and only run the commands related to VL-LN Bench.
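+
+Before installing anything, you can optionally confirm that the layout from Section 1 is complete. The snippet below is only a minimal sketch: the folder names are taken from the tree above, and `DATA_ROOT` is an assumed placeholder for wherever you unpacked VL-LN-Bench.
+  ```bash
+  # Sanity-check the expected top-level folders from Section 1.
+  DATA_ROOT=./VL-LN-Bench   # adjust to your actual unpack location
+  for d in base_model raw_data scene_datasets traj_data; do
+      [ -d "$DATA_ROOT/$d" ] && echo "ok: $d" || echo "missing: $DATA_ROOT/$d"
+  done
+  ```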
+
+- Get Code
+  ```bash
+  git clone git@github.com:InternRobotics/VL-LN.git # code for data collection
+  git clone git@github.com:InternRobotics/InternNav.git # code for training and evaluation
+  ```
+
+- Create Conda Environment
+  ```bash
+  conda create -n vlln python=3.9 -y
+  conda activate vlln
+  ```
+
+- Install Dependencies
+  ```bash
+  conda install habitat-sim=0.2.4 withbullet headless -c conda-forge -c aihabitat
+  cd VL-LN
+  pip install -r requirements.txt
+  cd ../InternNav
+  pip install -e .
+  ```
+
+### 3. Guidance for Data Collection
+This step is optional. You can either use our collected data for policy training, or follow this step to collect your own training data.
+
+
+- Prerequisites:
+  - Get pointnav_weights.pth from [VLFM](https://github.com/bdaiinstitute/vlfm/tree/main/data)
+  - Arrange the Directory Structure as Follows
+    ```bash
+    VL-LN
+    ├── dialog_generation/
+    ├── images/
+    ├── VL-LN-Bench/
+    │   ├── base_model/
+    │   ├── raw_data/
+    │   ├── scene_datasets/
+    │   ├── traj_data/
+    │   └── pointnav_weights.pth
+    ...
+    ```
+
+- Collect Trajectories
+  ```bash
+  # If you have Slurm
+  sbatch generate_frontiers_dialog.sh
+
+  # Or directly run
+  python generate_frontiers_dialog.py \
+      --task instance \
+      --vocabulary hm3d \
+      --scene_ids all \
+      --shortest_path_threshold 0.1 \
+      --target_detected_threshold 5 \
+      --episodes_file_path VL-LN-Bench/raw_data/mp3d/train/train_iion.json.gz \
+      --habitat_config_path dialog_generation/config/tasks/dialog_mp3d.yaml \
+      --baseline_config_path dialog_generation/config/expertiments/gen_videos.yaml \
+      --normal_category_path dialog_generation/normal_category.json \
+      --pointnav_policy_path VL-LN-Bench/pointnav_weights.pth \
+      --scene_summary_path VL-LN-Bench/raw_data/mp3d/scene_summary \
+      --output_dir \
+  ```
+
+### 4. Guidance for Training and Evaluation
+Here we show how to train your own model for the IIGN task and evaluate it on VL-LN Bench.
+
+- Prerequisites
+  ```bash
+  cd InternNav
+  # Link VL-LN Bench data into InternNav
+  mkdir projects && cd projects
+  ln -s /path/to/your/VL-LN-Bench ./VL-LN-Bench
+  ```
+  - Write Your Api Key of OpenAI in api_key.txt.
+  ```bash
+  # Your final repo structure may look like
+  InternNav
+  ├── assets/
+  ├── internnav/
+  │   ├── habitat_vlln_extensions
+  │   │   ├── simple_npc
+  │   │   │   ├── api_key.txt
+  │   ...  ...  ...
+  ...
+  ├── projects
+  │   ├── VL-LN-Bench/
+  │   │   ├── base_model/
+  │   │   ├── raw_data/
+  │   │   ├── scene_datasets/
+  │   │   ├── traj_data/
+  ...  ...
+  ```
+
+- Start Training
+  ```bash
+  # Before running, please open this script and make sure
+  # the "llm" path points to the correct checkpoint on your machine.
+  sh ./scripts/train/qwenvl_train/train_system2_vlln.sh
+  ```
+
+- Start Evaluation
+  ```bash
+  # If you have Slurm
+  sh ./scripts/eval/bash/srun_eval_dialog.sh
+
+  # Or directly run
+  python scripts/eval/eval.py \
+      --config scripts/eval/configs/habitat_dialog_cfg.py
+  ```
diff --git a/source/en/user_guide/internnav/projects/index.md b/source/en/user_guide/internnav/projects/index.md
new file mode 100644
index 0000000..f4d7f08
--- /dev/null
+++ b/source/en/user_guide/internnav/projects/index.md
@@ -0,0 +1,16 @@
+---
+myst:
+  html_meta:
+    "description lang=en": |
+      Documentation for extended benchmarks and projects built on
+      InternNav, including VL-LN Bench.
+---
+
+# Projects
+
+```{toctree}
+:caption: Projects
+:maxdepth: 2
+
+benchmark
+```

From b698f76d546fae0e0141ee0ca1bfcdb35712482e Mon Sep 17 00:00:00 2001
From: DuangZhu
Date: Wed, 31 Dec 2025 15:55:05 +0800
Subject: [PATCH 2/2] Solve the issue from kew6688

---
 .../user_guide/internnav/projects/benchmark.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/source/en/user_guide/internnav/projects/benchmark.md b/source/en/user_guide/internnav/projects/benchmark.md
index 2e51932..c35c70a 100644
--- a/source/en/user_guide/internnav/projects/benchmark.md
+++ b/source/en/user_guide/internnav/projects/benchmark.md
@@ -1,21 +1,21 @@
-# Extension Benchmark in InternNav
+# Extended Benchmarks in InternNav
 
-This guidance would detail to show how to use specific dataset for training a VLA model for different navigation benchmark.
+This guide details how to use benchmark-specific datasets to train a VLA model for different navigation benchmarks.
 
 ## VL-LN Bench
 
-VL-LN Bench is a large-scale benchmark for Interactive Instance Goal Navigation. It provides (1) an automatically dialog-augmented trajectory generation pipeline, (2) a comprehensive evaluation protocol for training and assessing dialog-capable navigation models, and (3) the dataset and base model used in our experiments. For full details, see our [paper](https://arxiv.org/abs/2512.22342) and the [project website](https://0309hws.github.io/VL-LN.github.io/).
+VL-LN Bench is a large-scale benchmark for Interactive Instance Goal Navigation. VL-LN Bench provides: (1) an automatically dialog-augmented trajectory generation pipeline, (2) a comprehensive evaluation protocol for training and assessing dialog-capable navigation models, and (3) the dataset and base model used in our experiments. For full details, see our [paper](https://arxiv.org/abs/2512.22342) and the [project website](https://0309hws.github.io/VL-LN.github.io/).
 
 - [Data Collection Pipeline](https://github.com/InternRobotics/VL-LN)
-- [Training and Evaluation](https://github.com/InternRobotics/InternNav)
+- [Training and Evaluation Code](https://github.com/InternRobotics/InternNav)
 - [Dataset](https://huggingface.co/datasets/InternRobotics/VL-LN-Bench) and [Base Model](https://huggingface.co/InternRobotics/VL-LN-Bench-basemodel)
 
 
 ### 1. Download Data & Assets
-VL-LN Bench is built on Matterport3D (MP3D) scenes, so you’ll need to download both the MP3D scene dataset and the VL-LN Bench dataset.
+VL-LN Bench is built on the Matterport3D (MP3D) Scene Dataset, so you need to download both the MP3D scene dataset and the VL-LN Bench dataset.
 - Scene Datasets
 
-  Download the scene dataset of [MP3D](https://niessner.github.io/Matterport/)
+  Download the [MP3D Scene Dataset](https://niessner.github.io/Matterport/)
 - [VL-LN Data](https://huggingface.co/datasets/InternRobotics/VL-LN-Bench)
 - [VL-LN Base Model](https://huggingface.co/InternRobotics/VL-LN-Bench-basemodel)
 
@@ -45,7 +45,7 @@ After unzipping the base model, scene datasets, and trajectory data, put everyth
   ```
 
 ### 2. Environment Setup
-Here we set up the Python environment for VL-LN Bench and InternVLA-N1. If you’ve already installed the InternVLA-N1 environment, you can skip those steps and only run the commands related to VL-LN Bench.
+Here we set up the Python environment for VL-LN Bench and InternVLA-N1. If you've already installed the InternNav Habitat environment, you can skip those steps and only run the commands related to VL-LN Bench.
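+
+If you are not sure whether an existing environment already covers these steps, a quick import check is usually enough. The sketch below only assumes that `habitat_sim` (installed in the next step at version 0.2.4) is on your Python path:
+  ```bash
+  # Print the habitat-sim version; this guide installs 0.2.4.
+  python -c "import habitat_sim; print(habitat_sim.__version__)"
+  ```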
 
 - Get Code
   ```bash
@@ -119,7 +119,7 @@ Here we show how to train your own model for the IIGN task and evaluate it on VL
   mkdir projects && cd projects
   ln -s /path/to/your/VL-LN-Bench ./VL-LN-Bench
   ```
-  - Write Your Api Key of OpenAI in api_key.txt.
+  - Write your OpenAI API key to api_key.txt.
   ```bash
   # Your final repo structure may look like
   InternNav
@@ -143,7 +143,7 @@ Here we show how to train your own model for the IIGN task and evaluate it on VL
   ```bash
   # Before running, please open this script and make sure
   # the "llm" path points to the correct checkpoint on your machine.
-  sh ./scripts/train/qwenvl_train/train_system2_vlln.sh
+  sbatch ./scripts/train/qwenvl_train/train_system2_vlln.sh
   ```
 
- Start Evaluation