Skip to content

Different results with different tfkit version #10

@jyw777

Description

@jyw777

Hi, I'm trying to reproduce your fantastic results based on BART model. I use the trained model you provided:
https://github.com/voidful/BDG/releases/download/v2.0/BDG_ANPM.pt

When I use tfkit==0.7.0(suggested by readme), I get the result like this:
{'Bleu_1': 0.4116063603355367, 'Bleu_2': 0.2629480211200134, 'Bleu_3': 0.19128546675900487, 'Bleu_4': 0.1484759134861437, 'ROUGE_L': 0.2184638476496905, 'CIDEr': 0.07954905358236805}
The value of ROUGE_L is much lower than the reported value, while the BLEU value is similar to the reported value. It takes me about half an hour for evaluation.

However, when I use tfkit==0.8.1(latest), I get the result like this:
'Bleu_1': 0.40226892712763984, 'Bleu_2': 0.2566475644205321, 'Bleu_3': 0.18535836171285228, 'Bleu_4': 0.14348238003117275, 'ROUGE_L': 0.3556143135035776, 'CIDEr': 0.6532226297900213
The value is similar to the reported one, but it takes much more time (about 2.5 hours) for evaluation on the same GPU, and the tqdm doesn't show the progress bar.

I was wondering why different tfkit versions would cause different results and different evaluation time. Which version should I use?
Thank you very much!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions