Code repository for the paper "TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities".
The paper has been accepted to the EMNLP 2024 main conference!
The TransferTOD-7B model is available at https://www.modelscope.cn/models/Mee1ong/TransferTOD-7B
arXiv: https://arxiv.org/abs/2407.21693
ACL Anthology: https://aclanthology.org/2024.emnlp-main.710.pdf
Overall statistics of the TransferTOD dataset are as follows:
| | Train | ID Test | OOD Test |
|---|---|---|---|
| #Domain | 27 | 27 | 3 |
| #Slot | 188 | 188 | 27 |
| #Dialogue | 4320 | 540 | 600 |
| #Turns | 28680 | 3585 | 3700 |
| #Slots/Dialogue | 10.3 | 10.3 | 9.7 |
| #Tokens/Turn | 66.4 | 66.4 | 76.8 |
"ID Test" is the in-domain test set and "OOD Test" is the out-of-domain test set. The three domains of the OOD test set are Water-Delivery, Sanitation, and Courier.
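The table does not list the average number of turns per dialogue, but it follows from the #Turns and #Dialogue rows. A minimal sketch of that arithmetic, using only the numbers in the table (the dictionary layout and function name are illustrative, not part of the repository):

```python
# Counts copied from the statistics table above.
stats = {
    "Train": {"dialogues": 4320, "turns": 28680},
    "ID Test": {"dialogues": 540, "turns": 3585},
    "OOD Test": {"dialogues": 600, "turns": 3700},
}


def turns_per_dialogue(split):
    """Average number of turns per dialogue for one split."""
    return split["turns"] / split["dialogues"]


for name, split in stats.items():
    print(f"{name}: {turns_per_dialogue(split):.2f} turns per dialogue")
# Train: 6.64 turns per dialogue
# ID Test: 6.64 turns per dialogue
# OOD Test: 6.17 turns per dialogue
```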
All data used in the two-stage fine-tuning, along with the raw TransferTOD data, is in the ./data directory. For each version, train.json mixes train_slot.json with an equal amount of data from ./data/raw_data/belle_data/belle_filtered_950k_train.jsonl.
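The mixing step described above could look roughly like the sketch below. It is an illustration, not the script the authors used: it assumes train_slot.json holds a JSON list of examples and that the BELLE file is JSON Lines, and the helper names (`load_json_list`, `load_jsonl`, `mix_equal`) and the fixed seed are hypothetical.

```python
import json
import random


def load_json_list(path):
    """Load a JSON file assumed to hold a list of examples."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_jsonl(path):
    """Load a JSON Lines file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def mix_equal(slot_examples, general_examples, seed=0):
    """Combine the slot examples with an equally sized random sample
    of general examples, then shuffle the result."""
    rng = random.Random(seed)
    # Cap the sample size in case the general pool is the smaller one.
    k = min(len(slot_examples), len(general_examples))
    sampled = rng.sample(general_examples, k)
    mixed = list(slot_examples) + sampled
    rng.shuffle(mixed)
    return mixed
```

Under those assumptions, `mix_equal(load_json_list(slot_path), load_jsonl(belle_path))` returns a list that can be written out with `json.dump`. Fixing the seed keeps the sampled subset reproducible across runs.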
For full fine-tuning, run ./fine_tune/scripts/finetune_full.sh; for LoRA fine-tuning, run ./fine_tune/scripts/finetune_lora.sh.
For inference and evaluation on the TransferTOD test set, run ./inference/inference_and_eval.sh.
If you find this project useful in your research, please cite:
@inproceedings{DBLP:conf/emnlp/ZhangHWLZDSDZYZ24,
author = {Ming Zhang and
Caishuang Huang and
Yilong Wu and
Shichun Liu and
Huiyuan Zheng and
Yurui Dong and
Yujiong Shen and
Shihan Dou and
Jun Zhao and
Junjie Ye and
Qi Zhang and
Tao Gui and
Xuanjing Huang},
editor = {Yaser Al{-}Onaizan and
Mohit Bansal and
Yun{-}Nung Chen},
title = {TransferTOD: {A} Generalizable Chinese Multi-Domain Task-Oriented
Dialogue System with Transfer Capabilities},
booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural
Language Processing, {EMNLP} 2024, Miami, FL, USA, November 12-16,
2024},
pages = {12750--12771},
publisher = {Association for Computational Linguistics},
year = {2024},
url = {https://aclanthology.org/2024.emnlp-main.710},
timestamp = {Thu, 14 Nov 2024 17:20:55 +0100},
biburl = {https://dblp.org/rec/conf/emnlp/ZhangHWLZDSDZYZ24.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Contact Us: mingzhang23@m.fudan.edu.cn