Source code for *Rethinking Style Transformer with Energy-based Interpretation: Adversarial Unsupervised Style Transfer using a Pretrained Model*, accepted at EMNLP 2022.
## Environment

- Check `docker/requirements.txt` to install the dependencies.
- Check `Dockerfile` and `docker-compose.yml` to set up the Docker environment.
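The README only points at these files; the following is a minimal setup sketch, assuming the dependencies are pip-installable and that the compose file accepts the same `USER_ID`/`GROUP_ID` variables used for `run` below:

```bash
# Non-Docker (assumption): install the pinned dependencies into a virtualenv
python -m venv .venv
source .venv/bin/activate
pip install -r docker/requirements.txt

# Docker (assumption): build the image defined in docker/docker-compose.yml
USER_ID=$(id -u) GROUP_ID=$(id -g) docker compose -f docker/docker-compose.yml build
```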
To run inside Docker, prepend the Docker prefix below to each `python` command instead of `CUDA_VISIBLE_DEVICES=<device_id>`:

```bash
# Non-docker version
# CUDA_VISIBLE_DEVICES=<device_id> python -m ...
CUDA_VISIBLE_DEVICES=0 python -m style_bart.train data=gyafc_fr

# Docker version
# USER_ID=$(id -u) GROUP_ID=$(id -g) docker compose -f docker/docker-compose.yml run -e NVIDIA_VISIBLE_DEVICES=<device_id> app python -m ...
USER_ID=$(id -u) GROUP_ID=$(id -g) docker compose -f docker/docker-compose.yml run -e NVIDIA_VISIBLE_DEVICES=0 app python -m style_bart.train data=gyafc_fr
```

## Repository Structure

- `.venv`: Python environment. This folder will be generated automatically.
- `config`: configs for experiments
- `content`: folder for experiment outputs. This folder will be generated automatically.
- `content/pretrain`: folder for pretraining
- `content/main`: folder for main training
- `content/eval`: folder for evaluation
- `data`: folder for train/dev/test data
- `data/preprocess`: folder for preprocessed data. This folder will be generated automatically.
- `data/yelp`: Yelp dataset. `yelp_academic_dataset_review.json` should be included. Download from https://www.yelp.com/dataset
- `data/gyafc`: GYAFC dataset including the `Entertainment_Music` and `Family_Relationships` folders. Download from https://github.com/raosudha89/GYAFC-corpus
- `data/amazon`: Amazon dataset. Download from https://github.com/Nrgeup/controllable-text-attribute-transfer/tree/master/data/amazon
- `docker`: Docker configs
- `evaluate`: evaluation source code
- `style_bart`: StyleBART source code
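As a quick sanity check before preprocessing, the raw datasets are expected at the paths listed above; this snippet is only illustrative, not part of the official tooling:

```bash
# Verify the expected raw-data layout (paths from the list above)
ls data/yelp/yelp_academic_dataset_review.json
ls data/gyafc/Entertainment_Music data/gyafc/Family_Relationships
ls data/amazon
```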
## Preprocessing

```bash
# python -m style_bart.data.preprocess [--dataset_name]
python -m style_bart.data.preprocess --gyafc --yelp --amazon
```
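Presumably the dataset flags are independent, so a single corpus can be preprocessed on its own; for example:

```bash
# Preprocess only the GYAFC corpus (flag as in the usage line above)
python -m style_bart.data.preprocess --gyafc
```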
## Evaluation Setup

Please check `evaluate/README.md`. This procedure is also required to run the training commands below.
## Pretraining

### Classifier

```bash
# CUDA_VISIBLE_DEVICES=<device_id> python -m style_bart.pretrain.classifier data=<dataset_name> [args]
CUDA_VISIBLE_DEVICES=0 python -m style_bart.pretrain.classifier data=gyafc_fr
```

Depending on the dataset (especially Amazon), classifier pretraining may not converge. In that case, a larger batch size helps:
```bash
CUDA_VISIBLE_DEVICES=0 python -m style_bart.pretrain.classifier data=amazon train.batch_size=512 # train.accumulation=2
```

### Autoencoder

```bash
# CUDA_VISIBLE_DEVICES=<device_id> python -m style_bart.pretrain.autoencoder data=<dataset_name> [args]
CUDA_VISIBLE_DEVICES=0 python -m style_bart.pretrain.autoencoder data=gyafc_fr
```

### Language Model

```bash
# CUDA_VISIBLE_DEVICES=<device_id> python -m style_bart.pretrain.lm data=<dataset_name> label=<style> [args]
CUDA_VISIBLE_DEVICES=0 python -m style_bart.pretrain.lm data=gyafc_fr label=0
```

Language models should be trained for both labels 0 and 1.
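Since one language model is needed per style, a simple loop over both labels (same command as above) covers this:

```bash
# Train one style-conditional language model per label (0 and 1)
for label in 0 1; do
    CUDA_VISIBLE_DEVICES=0 python -m style_bart.pretrain.lm data=gyafc_fr label=$label
done
```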
## Training

```bash
# CUDA_VISIBLE_DEVICES=<device_id> python -m style_bart.train data=<dataset_name> [args]
CUDA_VISIBLE_DEVICES=0 python -m style_bart.train data=gyafc_fr # train.accumulation=2
```

You can download the trained StyleBART weights from http://gofile.me/6XWMw/L53iBR52U
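The expected on-disk location of the downloaded weights is not spelled out here; assuming they mirror the training output, they would go under the model path used by the transfer examples below (`content/main/<dataset_name>/dump`):

```bash
# Hypothetical placement of downloaded weights; the archive layout is an assumption
mkdir -p content/main/gyafc_fr
mv /path/to/downloaded/dump content/main/gyafc_fr/
```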
## Transfer

```bash
# CUDA_VISIBLE_DEVICES=<device_id> python -m style_bart.transfer -m <model_path> -l <target_style_label> <prompt>
CUDA_VISIBLE_DEVICES=0 python -m style_bart.transfer -m content/main/gyafc_fr/dump -l 0 "He loves you, too, girl...Time will tell."
```

Another option is redirecting the entire corpus to standard input:
```bash
CUDA_VISIBLE_DEVICES=0 python -m style_bart.transfer -m content/main/gyafc_fr/dump -l 0 < data/preprocessed/gyafc_fr/sentences.test.1.txt
```

If you are using Docker, add the `-T` option (which disables pseudo-TTY allocation) so that the corpus file can be redirected:
```bash
docker compose -f docker/docker-compose.yml run -e NVIDIA_VISIBLE_DEVICES=0 -T app python -m style_bart.transfer -m content/main/gyafc_fr/dump -l 1 < data/preprocessed/gyafc_fr/sentences.test.0.txt > output.txt
```
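To transfer both test splits to their opposite styles in one go, a small wrapper loop over the commands above (non-Docker form, output names are illustrative) might look like:

```bash
# Transfer each test split to the opposite style label (0 -> 1 and 1 -> 0)
for src in 0 1; do
    tgt=$((1 - src))
    CUDA_VISIBLE_DEVICES=0 python -m style_bart.transfer -m content/main/gyafc_fr/dump -l $tgt \
        < data/preprocessed/gyafc_fr/sentences.test.$src.txt > output.$src-to-$tgt.txt
done
```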