This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos".

- Ubuntu 20.04
- CUDA 11.7
- Python 3.7
Install the other required packages with:
pip install -r requirements.txt
To build a testing environment for the TSG-RF task, we reconstructed the validation and test sets of two datasets widely used in temporal sentence grounding (TSG), Charades-STA and ActivityNet Captions, yielding Charades-STA-RF and ActivityNet Captions-RF. The reconstructed datasets are located in the ./data/dataset directory.
Feature preparation for Charades-STA and ActivityNet Captions follows previous work: VSLNet Datasets Preparation. Alternatively, you can download the prepared visual features from Mega and place them in the ./data/features/ directory.
Download the word embeddings from here and place them in the ./data/features/ directory.
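Before training, it can help to confirm that the directories mentioned above are in place. The following is a minimal sketch and is not part of the repository: the function name check_data_layout is hypothetical, and it only checks the two directories named in this README, since the file names inside them are not listed here.

```shell
# Hypothetical helper (not shipped with the repository): report whether the
# directories named in this README exist under a given project root.
check_data_layout() {
    root="${1:-.}"          # project root, defaults to the current directory
    missing=0
    for dir in data/dataset data/features; do
        if [ -d "$root/$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing: $dir"
            missing=1
        fi
    done
    # Exit status 0 only if every directory was found.
    return "$missing"
}
```

Calling check_data_layout . from the repository root prints one line per directory and returns a non-zero status if either is absent, so it can be chained in front of a training script.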
Train
# train RaTSG on Charades-STA-RF dataset
bash charades_RF_train.sh
# train RaTSG on ActivityNet Captions-RF dataset
bash activitynet_RF_train.sh
Run the following scripts to test the trained models:
Test
# test RaTSG on Charades-STA-RF dataset
bash charades_RF_test.sh
# test RaTSG on ActivityNet Captions-RF dataset
bash activitynet_RF_test.sh
We release several pretrained checkpoints. Please download them and put them into ./ckpt/:
- RaTSG on Charades-STA-RF: RaTSG_charades_RF_i3d_128
- RaTSG on ActivityNet Captions-RF: RaTSG_activitynet_RF_i3d_128
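To check that the downloaded checkpoints are where the test scripts are expected to look, something like the sketch below may be useful. It is not part of the repository: the function name check_ckpt is hypothetical, and it assumes each checkpoint ends up in the checkpoint directory under exactly the name listed above, which may differ depending on how the download is packaged.

```shell
# Hypothetical helper (not shipped with the repository): look for the two
# released checkpoints by name. Assumes each one sits directly inside the
# checkpoint directory, as a file or a directory, under the listed name.
check_ckpt() {
    ckpt_root="${1:-./ckpt}"   # checkpoint directory, defaults to ./ckpt
    status=0
    for name in RaTSG_charades_RF_i3d_128 RaTSG_activitynet_RF_i3d_128; do
        if [ -e "$ckpt_root/$name" ]; then
            echo "found: $name"
        else
            echo "not found: $name"
            status=1
        fi
    done
    # Exit status 0 only if both checkpoints were found.
    return "$status"
}
```

Running check_ckpt ./ckpt before a test script makes a missing or misplaced checkpoint visible early.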
If you find this repository useful, please consider citing our paper:
@article{dong2024temporal,
title={Temporal sentence grounding with relevance feedback in videos},
author={Dong, Jianfeng and Peng, Xiaoman and Liu, Daizong and Qu, Xiaoye and Yang, Xun and Bao, Cuizhu and Wang, Meng},
journal={Advances in Neural Information Processing Systems},
volume={37},
pages={43107--43132},
year={2024}
}