After cloning the repository, run the following commands:

```shell
git submodule update --init --recursive
conda create -n point_tracking python=3.12 -y
conda activate point_tracking
pip install -r requirements.txt

# Detectron2 for gripper tracking
pip install git+https://github.com/facebookresearch/detectron2@9604f5995cc628619f0e4fd913453b4d7d61db3f

# Download the Detectron2 gripper weights from LLARVA
gdown --fuzzy https://drive.google.com/file/d/1qQV-yPZHqW9Z_eKR_U0aTnkcKpQ1em9V/view
mv model_final.pth detectron_gripper.pth
```
For CoTracker:

```shell
pip install git+https://github.com/facebookresearch/co-tracker
```

To download OpenX, make sure the `rlds_dataset_mod` submodule is initialized, check that the paths in `rlds_dataset_mod/prepare_open_x.sh` are correct, and then run the script:
```shell
cd rlds_dataset_mod
conda env create -f environment_ubuntu.yml
conda activate rlds_env
bash prepare_open_x.sh
cd ..
```

However, this downloads an old version of BRIDGE. To download the latest version, run the following script, passing the same save directory you used for the OpenX datasets:
```shell
bash download_bridge_v2.sh [/path/to/save/tensorflow_datasets]
```

To label OXE datasets, run the `run_openx_processing.sh` script.
See README_processing_scripts.md for instructions.
The script now processes datasets sequentially instead of running multiple iterations of the same dataset:
- `SAVE_DIR` required: `./run_openx_processing.sh /path/to/save/directory` processes all datasets
- With a dataset index: `./run_openx_processing.sh /path/to/save/directory 3` processes the dataset at index 3 (see `run_openx_processing.sh` for the list of datasets)
- With a dataset name: `./run_openx_processing.sh /path/to/save/directory bridge_v2` processes that specific dataset (see `run_openx_processing.sh` for the list of datasets)
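The three invocation modes above amount to a simple selector rule. The sketch below restates it in Python for clarity (the actual script is bash; `interpret_selector` is a hypothetical name, not part of the repo):

```python
def interpret_selector(selector=None):
    """Map the optional second CLI argument to a processing mode."""
    if selector is None:
        return ("all", None)             # no argument: process every dataset
    if str(selector).isdigit():
        return ("index", int(selector))  # numeric argument: dataset index
    return ("name", str(selector))       # anything else: dataset name

print(interpret_selector())              # ('all', None)
print(interpret_selector("3"))           # ('index', 3)
print(interpret_selector("bridge_v2"))   # ('name', 'bridge_v2')
```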
To label LIBERO data, first download the OpenVLA-style processed LIBERO data here (LIBERO). Then run the following script:

```shell
python label_videos_libero.py --save_dir /path/to/save/directory --dataset_location [LOCATION_OF_FOLDERS_CONTAINING_LIBERO_HDF5_FILES] --which_dataset [path/to/libero_90_openvla_processed]
```

To label OXE datasets, run the following script:

```shell
bash run_openx_processing.sh /path/to/save/directory /path/to/openx_datasets_saved_location  # by default this is ~/tensorflow_datasets/openx_datasets/
```

Or, for an individual dataset:

```shell
bash run_openx_processing.sh /path/to/save/directory /path/to/openx_datasets_saved_location [dataset_name]
```

The scripts keep track of processed videos in separate JSON files:
- `finished_videos.json`: tracks videos that have been fully processed with tracking
- `labeled_videos.json`: tracks videos that have been labeled with trajectory segments
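For example, the two progress files can be diffed to find videos that still need labeling. This is a minimal sketch assuming each JSON file holds a flat list of video identifiers (the README does not specify the schema, and `tracked_but_unlabeled` is a hypothetical helper):

```python
import json
from pathlib import Path

def tracked_but_unlabeled(save_dir):
    """Videos finished with tracking but not yet labeled with trajectory segments."""
    def load(name):
        # Missing file means nothing recorded yet (an assumption)
        path = Path(save_dir) / name
        return set(json.loads(path.read_text())) if path.exists() else set()
    return load("finished_videos.json") - load("labeled_videos.json")
```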
The processed data is saved in HDF5 files with the following structure:
```
dataset_movement_and_masks.h5
├── {dataset_name}/
│   ├── {episode_key}/
│   │   ├── {img_key}/
│   │   │   ├── gripper_positions: uint16 array
│   │   │   ├── significant_points: uint16 array
│   │   │   ├── stopped_points: uint16 array
│   │   │   ├── movement_across_subtrajectory: float array
│   │   │   ├── masked_frames: uint8 array (compressed with gzip)
│   │   │   └── traj_splits_indices: uint16 array
```
Where:

- `dataset_name`: name of the dataset (e.g., "oxe")
- `episode_key`: unique identifier for each episode in the dataset
- `img_key`: camera view identifier (e.g., "primary", "secondary")
- `gripper_positions`: binary mask indicating gripper positions
- `significant_points`: binary mask indicating points with significant movement
- `stopped_points`: binary mask indicating points that have stopped moving
- `movement_across_subtrajectory`: array tracking movement across video frames
- `masked_frames`: video frames with object masks applied (compressed)
- `traj_splits_indices`: indices indicating trajectory splits
The data is organized hierarchically to preserve the relationship between datasets, episodes, and camera views. The point masks are stored as compact unsigned-integer arrays for efficiency, the masked frames (uint8) are compressed with gzip to reduce file size, and the trajectory labels are stored as variable-length strings to accommodate captions of different lengths.
Note: LIBERO files do not follow this structure exactly: they omit the top-level `{dataset_name}` group but are otherwise the same.
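Putting the two layouts together, the in-file path to a given field can be sketched as follows (`field_path` is an illustrative helper, not part of the repo's code):

```python
# Field names are taken from the layout documented above.
FIELDS = (
    "gripper_positions",
    "significant_points",
    "stopped_points",
    "movement_across_subtrajectory",
    "masked_frames",
    "traj_splits_indices",
)

def field_path(episode_key, img_key, field, dataset_name=None):
    """Build the in-file path for one field.

    dataset_name=None mirrors the LIBERO layout, which omits the
    top-level {dataset_name} group.
    """
    if field not in FIELDS:
        raise KeyError(field)
    parts = ([dataset_name] if dataset_name else []) + [episode_key, img_key, field]
    return "/".join(parts)

# OXE-style (with dataset_name) vs LIBERO-style (without):
print(field_path("episode_0", "primary", "masked_frames", dataset_name="oxe"))
print(field_path("episode_0", "primary", "masked_frames"))
```

With `h5py`, `f[field_path(...)]` would then address the corresponding dataset directly.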