This repository was archived by the owner on Sep 18, 2025. It is now read-only.

Support streaming ASR evaluation #30

Open
hirofumi0810 wants to merge 1 commit into main from wer_eval

@hirofumi0810

Support streaming ASR evaluation in WER. Migrated the same tokenizer from fairseq.
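For context, word error rate (WER) is conventionally the word-level edit distance between hypothesis and reference, divided by the reference length. The sketch below illustrates that definition only. The function name `word_error_rate` and the plain whitespace tokenization are assumptions for illustration; they are not taken from this PR, which uses a tokenizer migrated from fairseq.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because the denominator is the reference length, the value can exceed 1.0 when the hypothesis contains many insertions.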

@hirofumi0810 hirofumi0810 requested a review from xutaima March 3, 2023 01:25
@facebook-github-bot facebook-github-bot added the CLA Signed label Mar 3, 2023
facebook-github-bot pushed a commit that referenced this pull request Mar 3, 2023
Summary: Pull Request resolved: fairinternal/SimulEval#30

Test Plan: `python tt_waitk_unity_v2.py --min-unit-chunk-size 10  --latency-metrics EndOffset --no-use-ref-len`

Reviewed By: xutaima

Differential Revision: D43581361

Pulled By: annasun28

fbshipit-source-id: 86b06784888bcfd452f6228fd55736fc01b72429
xutaima pushed a commit that referenced this pull request Apr 14, 2023
Summary: Pull Request resolved: fairinternal/SimulEval#30

Test Plan: `python tt_waitk_unity_v2.py --min-unit-chunk-size 10  --latency-metrics EndOffset --no-use-ref-len`

Reviewed By: xutaima

Differential Revision: D43581361

Pulled By: annasun28

fbshipit-source-id: 86b06784888bcfd452f6228fd55736fc01b72429
xutaima added a commit that referenced this pull request Apr 14, 2023
* Fix several bugs 20230109 (#23)

Summary:
- Fix the bug where the scorers' options are not passed to the CLI.
- Update the sacrebleu dependency to 2.3.1 to support the ja-mecab tokenizer.
- Fix the error when `eval_latency_unit=char`. This option is used for languages written without spaces (e.g. zh and ja).
- Fix several bugs in the speech-to-text data loader.
- Fix the bug where the last delay is ignored when computing CA AL.
- Fix minor errors when running remote evaluation.
- Fix typos and type hint mismatches.
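
The `eval_latency_unit=char` item concerns how a prediction is split into units before delays are counted. A minimal sketch of the idea follows. The helper `split_into_units` is hypothetical and is not SimulEval's actual function, and dropping whitespace in character mode is an assumption made for this example.

```python
from typing import List


def split_into_units(text: str, latency_unit: str = "word") -> List[str]:
    """Split text into the units latency is measured over (hypothetical helper)."""
    if latency_unit == "word":
        # Whitespace-delimited languages: one unit per word.
        return text.split()
    if latency_unit == "char":
        # Languages written without spaces: one unit per non-space character.
        return [ch for ch in text if not ch.isspace()]
    raise ValueError(f"unsupported latency unit: {latency_unit}")
```

With word units, a Chinese or Japanese sentence without spaces would collapse into a single unit, which is why a character-level unit is useful for those languages.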

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/23

Reviewed By: annasun28

Differential Revision: D42455147

Pulled By: xutaima

fbshipit-source-id: 05b63ad0ed16c37093ad58b49b4b1a1c97fa6070

* Add missing license comments (#24)

Summary:
To address the task [T140465752](https://www.internalfb.com/intern/tasks/?t=140465752)

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/24

Reviewed By: annasun28

Differential Revision: D42765319

Pulled By: xutaima

fbshipit-source-id: be595eafce62d4b333ee206db7b103f4a7911a21

* Add ATDScore (#28)

Summary:
Added Average Token Delay (ATD) as a latency metric.
Paper: Average Token Delay: A Latency Metric for Simultaneous Translation (https://arxiv.org/abs/2211.13173)

X-link: #28

Reviewed By: annasun28

Differential Revision: D42768122

Pulled By: xutaima

fbshipit-source-id: f5cbeb785486dfdbb48a156859dd0451e96fd4cc

* Enable --system-dir option (#25)

Summary:
Simplify the agent-building argument to just a directory name, with an optional config name.

- [x] Documentation on readthedocs
- [x] Provide example system directories based on current s2t and s2s

Given a system directory `${system_dir}`
```bash
> ls ${system_dir}
main.yaml  checkpoint.pt  config.yaml  dict.txt  sentence.bpe.model  wav2vec_small.yaml
```
and `main.yaml` has
```yaml
agent_class: fairseq.models.streaming.agents.TestTimeWaitKS2T
checkpoint: checkpoint.pt
sentencepiece_model: sentence.bpe.model
config_yaml: config.yaml
wav2vec_yaml: wav2vec_small.yaml
waitk_lagging: 2
fixed_pre_decision_ratio: 4
device: cuda:0
```
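
The `agent_class` entry is a dotted path to a Python class. One way such a path could be resolved is with `importlib`. The sketch below is an assumption about the mechanism, not SimulEval's actual loader, and `load_class` is a hypothetical name. It uses a standard-library class in place of the fairseq agent so that it runs without fairseq installed.

```python
import importlib


def load_class(dotted_path: str) -> type:
    """Resolve a 'package.module.ClassName' string to the class object."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in for a parsed main.yaml; the class path here is illustrative only.
config = {"agent_class": "collections.OrderedDict", "waitk_lagging": 2}
agent_cls = load_class(config["agent_class"])
```

Keeping the class path in the config file means a new system can be added by dropping in a directory, without changing the calling code.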

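From the CLI:
```bash
simuleval --standalone --system-dir ${system_dir}
```

In Python:
```python
from simuleval.utils import build_system_from_dir

# Build the system from a system directory (here, a directory named "system").
system = build_system_from_dir("system")

print(system)
# `audio_frontend` is not defined in this snippet; it stands for whatever
# object supplies the incoming speech segments.
while True:
    speech_segment = audio_frontend.send_segment()
    output_segment = system.pushpop(speech_segment)
    print(output_segment)
    # Stop once the system marks its output as finished.
    if output_segment.finished:
        break
```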

Systems available now (under `/large_experiments/seamless/ust/xutaima/2023_H1/demo/systems`):
| Path | Modality | Language | Description |
| ---- | -------- | -------- | ----------- |
| `s2t_es-en_tt-waitk_multidomain` | speech-to-text | es -> en | Multidomain (2022 H2) |
| `s2t_en-de_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> de | MuST-C for IWSLT 2023 |
| `s2t_en-zh_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> zh | MuST-C for IWSLT 2023 |
| `s2t_en-ja_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> ja | MuST-C for IWSLT 2023 |
| `s2s_es-en_tt-waitk-cascaded_multidomain` | speech-to-speech | es -> en | Multidomain Cascaded Model (2022 Q3) |
| `s2s_es-en_tt-waitk-unity2_multidomain` | speech-to-speech | es -> en | Multidomain UnitY2 model (2022 H2) |

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/25

Reviewed By: schwarzmx

Differential Revision: D42976452

Pulled By: xutaima

fbshipit-source-id: 64d2224ff88fe6ceadabc42236f3c791298d967d

* Enable stateless Agent (#26)

Summary:
Add an optional `states` argument to the `policy`, `push` and `pop` functions to enable stateless agents. We may want stateless to be the only option in the future.

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/26

Reviewed By: annasun28

Differential Revision: D43005153

Pulled By: xutaima

fbshipit-source-id: 67612b96e0d2f46a1f7573aeb0296942b4a1fdd9

* Modify the current agents to stateless for production setting (#3817)

Summary:
Now we can use an agent like this:
```python
from simuleval.utils import build_system_from_dir

system = build_system_from_dir(
    "/large_experiments/seamless/ust/xutaima/2023_H1/demo/systems/s2t_es-en_tt-waitk_multidomain"
)
system.to("cuda:0")

system_states = system.build_states()

while True:
    speech_segment = audio_frontend.send_segment()
    output_segment = system.pushpop(speech_segment, system_states)
    print(output_segment)
    if output_segment.finished:
        break
```
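
Passing `system_states` explicitly means one system object can serve several independent streams. The toy sketch below illustrates that pattern only. `ToySegment`, `ToyStates`, and `ToyEchoSystem` are hypothetical stand-ins, not SimulEval classes, and the real `build_states` and `pushpop` may behave differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySegment:
    content: str
    finished: bool = False


@dataclass
class ToyStates:
    # All per-stream data lives here, not on the system object.
    source: List[str] = field(default_factory=list)


class ToyEchoSystem:
    """Keeps no per-stream data on itself, so it can be shared across streams."""

    def build_states(self) -> ToyStates:
        return ToyStates()

    def pushpop(self, segment: ToySegment, states: ToyStates) -> ToySegment:
        # Record the new input in the caller-owned states, then emit an output.
        states.source.append(segment.content)
        return ToySegment(" ".join(states.source).upper(), segment.finished)


system = ToyEchoSystem()
states_a, states_b = system.build_states(), system.build_states()

# Interleave two streams through the same system object.
system.pushpop(ToySegment("hola"), states_a)
out_b = system.pushpop(ToySegment("adios", finished=True), states_b)
out_a = system.pushpop(ToySegment("mundo", finished=True), states_a)
```

The two streams do not interfere with each other because each call reads and writes only the states object it was given.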

X-link: https://github.com/fairinternal/fairseq-py/pull/3817

Reviewed By: annasun28

Differential Revision: D43008938

Pulled By: xutaima

fbshipit-source-id: 89da787199cf961b33aea822f0a11cc46e30d8c7

* Fix ASR BLEU (#27)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/27

Reviewed By: padentomasello

Differential Revision: D43139557

Pulled By: xutaima

fbshipit-source-id: 8093550156b591459a19e85e78f67c80ece03491

* Fix the bugs introduced in recent PRs (#28)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/28

Reviewed By: schwarzmx

Differential Revision: D43371745

Pulled By: xutaima

fbshipit-source-id: 4440d87a02534ad441b04363ff3696806fdf61a0

* fixed the issue when running jobs through slurm (#29)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/29

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Kick off evaluation through the `--slurm` option.

Previous PRs introduced a bug for this option.

Reviewed By: annasun28

Differential Revision: D43506304

Pulled By: xutaima

fbshipit-source-id: f47299a59b5f0e1c796edc53efadfd54967ddc6f

* EndOffset using playback intervals (#30)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/30

Test Plan: `python tt_waitk_unity_v2.py --min-unit-chunk-size 10  --latency-metrics EndOffset --no-use-ref-len`

Reviewed By: xutaima

Differential Revision: D43581361

Pulled By: annasun28

fbshipit-source-id: 86b06784888bcfd452f6228fd55736fc01b72429

* discontinuity metrics (#31)

---------

Co-authored-by: master-possible <kano.yasumasa.kw4@is.naist.jp>
Co-authored-by: Anna Sun <13106449+annasun28@users.noreply.github.com>
xutaima added a commit that referenced this pull request Apr 14, 2023
* Fix several bugs 20230109 (#23)

Summary:
- Fix the bug where the scorers' options are not passed to the CLI.
- Update the sacrebleu dependency to 2.3.1 to support the ja-mecab tokenizer.
- Fix the error when `eval_latency_unit=char`. This option is used for languages written without spaces (e.g. zh and ja).
- Fix several bugs in the speech-to-text data loader.
- Fix the bug where the last delay is ignored when computing CA AL.
- Fix minor errors when running remote evaluation.
- Fix typos and type hint mismatches.

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/23

Reviewed By: annasun28

Differential Revision: D42455147

Pulled By: xutaima

fbshipit-source-id: 05b63ad0ed16c37093ad58b49b4b1a1c97fa6070

* Add missing license comments (#24)

Summary:
To address the task [T140465752](https://www.internalfb.com/intern/tasks/?t=140465752)

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/24

Reviewed By: annasun28

Differential Revision: D42765319

Pulled By: xutaima

fbshipit-source-id: be595eafce62d4b333ee206db7b103f4a7911a21

* Add ATDScore (#28)

Summary:
Added Average Token Delay (ATD) as a latency metric.
Paper: Average Token Delay: A Latency Metric for Simultaneous Translation (https://arxiv.org/abs/2211.13173)

X-link: #28

Reviewed By: annasun28

Differential Revision: D42768122

Pulled By: xutaima

fbshipit-source-id: f5cbeb785486dfdbb48a156859dd0451e96fd4cc

* Enable --system-dir option (#25)

Summary:
Simplify the agent-building argument to just a directory name, with an optional config name.

- [x] Documentation on readthedocs
- [x] Provide example system directories based on current s2t and s2s

Given a system directory `${system_dir}`
```bash
> ls ${system_dir}
main.yaml  checkpoint.pt  config.yaml  dict.txt  sentence.bpe.model  wav2vec_small.yaml
```
and `main.yaml` has
```yaml
agent_class: fairseq.models.streaming.agents.TestTimeWaitKS2T
checkpoint: checkpoint.pt
sentencepiece_model: sentence.bpe.model
config_yaml: config.yaml
wav2vec_yaml: wav2vec_small.yaml
waitk_lagging: 2
fixed_pre_decision_ratio: 4
device: cuda:0
```

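From the CLI:
```bash
simuleval --standalone --system-dir ${system_dir}
```

In Python:
```python
from simuleval.utils import build_system_from_dir

# Build the system from a system directory (here, a directory named "system").
system = build_system_from_dir("system")

print(system)
# `audio_frontend` is not defined in this snippet; it stands for whatever
# object supplies the incoming speech segments.
while True:
    speech_segment = audio_frontend.send_segment()
    output_segment = system.pushpop(speech_segment)
    print(output_segment)
    # Stop once the system marks its output as finished.
    if output_segment.finished:
        break
```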

Systems available now (under `/large_experiments/seamless/ust/xutaima/2023_H1/demo/systems`):
| Path | Modality | Language | Description |
| ---- | -------- | -------- | ----------- |
| `s2t_es-en_tt-waitk_multidomain` | speech-to-text | es -> en | Multidomain (2022 H2) |
| `s2t_en-de_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> de | MuST-C for IWSLT 2023 |
| `s2t_en-zh_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> zh | MuST-C for IWSLT 2023 |
| `s2t_en-ja_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> ja | MuST-C for IWSLT 2023 |
| `s2s_es-en_tt-waitk-cascaded_multidomain` | speech-to-speech | es -> en | Multidomain Cascaded Model (2022 Q3) |
| `s2s_es-en_tt-waitk-unity2_multidomain` | speech-to-speech | es -> en | Multidomain UnitY2 model (2022 H2) |

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/25

Reviewed By: schwarzmx

Differential Revision: D42976452

Pulled By: xutaima

fbshipit-source-id: 64d2224ff88fe6ceadabc42236f3c791298d967d

* Enable stateless Agent (#26)

Summary:
Add an optional `states` argument to the `policy`, `push` and `pop` functions to enable stateless agents. We may want stateless to be the only option in the future.

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/26

Reviewed By: annasun28

Differential Revision: D43005153

Pulled By: xutaima

fbshipit-source-id: 67612b96e0d2f46a1f7573aeb0296942b4a1fdd9

* Modify the current agents to stateless for production setting (#3817)

Summary:
Now we can use an agent like this:
```python
from simuleval.utils import build_system_from_dir

system = build_system_from_dir(
    "/large_experiments/seamless/ust/xutaima/2023_H1/demo/systems/s2t_es-en_tt-waitk_multidomain"
)
system.to("cuda:0")

system_states = system.build_states()

while True:
    speech_segment = audio_frontend.send_segment()
    output_segment = system.pushpop(speech_segment, system_states)
    print(output_segment)
    if output_segment.finished:
        break
```

X-link: https://github.com/fairinternal/fairseq-py/pull/3817

Reviewed By: annasun28

Differential Revision: D43008938

Pulled By: xutaima

fbshipit-source-id: 89da787199cf961b33aea822f0a11cc46e30d8c7

* Fix ASR BLEU (#27)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/27

Reviewed By: padentomasello

Differential Revision: D43139557

Pulled By: xutaima

fbshipit-source-id: 8093550156b591459a19e85e78f67c80ece03491

* Fix the bugs introduced in recent PRs (#28)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/28

Reviewed By: schwarzmx

Differential Revision: D43371745

Pulled By: xutaima

fbshipit-source-id: 4440d87a02534ad441b04363ff3696806fdf61a0

* fixed the issue when running jobs through slurm

* fixed the issue when running jobs through slurm (#29)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/29

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Kick off evaluation through the `--slurm` option.

Previous PRs introduced a bug for this option.

Reviewed By: annasun28

Differential Revision: D43506304

Pulled By: xutaima

fbshipit-source-id: f47299a59b5f0e1c796edc53efadfd54967ddc6f

* EndOffset using playback intervals (#30)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/30

Test Plan: `python tt_waitk_unity_v2.py --min-unit-chunk-size 10  --latency-metrics EndOffset --no-use-ref-len`

Reviewed By: xutaima

Differential Revision: D43581361

Pulled By: annasun28

fbshipit-source-id: 86b06784888bcfd452f6228fd55736fc01b72429

* add instruction for speech-to-speech evaluation

* Fix bugs in starting offset

* update results

* update readme

* fixed ATD for s2s (#34)

* add new line to files

* discontinuity metrics (#31)

* Add whisper ASR BLEU

---------

Co-authored-by: master-possible <kano.yasumasa.kw4@is.naist.jp>
Co-authored-by: Anna Sun <13106449+annasun28@users.noreply.github.com>
Co-authored-by: master-possible <66279784+master-possible@users.noreply.github.com>
xutaima added a commit that referenced this pull request Apr 14, 2023
* Fix several bugs 20230109 (#23)

Summary:
- Fix the bug where the scorers' options are not passed to the CLI.
- Update the sacrebleu dependency to 2.3.1 to support the ja-mecab tokenizer.
- Fix the error when `eval_latency_unit=char`. This option is used for languages written without spaces (e.g. zh and ja).
- Fix several bugs in the speech-to-text data loader.
- Fix the bug where the last delay is ignored when computing CA AL.
- Fix minor errors when running remote evaluation.
- Fix typos and type hint mismatches.

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/23

Reviewed By: annasun28

Differential Revision: D42455147

Pulled By: xutaima

fbshipit-source-id: 05b63ad0ed16c37093ad58b49b4b1a1c97fa6070

* Add missing license comments (#24)

Summary:
To address the task [T140465752](https://www.internalfb.com/intern/tasks/?t=140465752)

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/24

Reviewed By: annasun28

Differential Revision: D42765319

Pulled By: xutaima

fbshipit-source-id: be595eafce62d4b333ee206db7b103f4a7911a21

* Add ATDScore (#28)

Summary:
Added Average Token Delay (ATD) as a latency metric.
Paper: Average Token Delay: A Latency Metric for Simultaneous Translation (https://arxiv.org/abs/2211.13173)

X-link: #28

Reviewed By: annasun28

Differential Revision: D42768122

Pulled By: xutaima

fbshipit-source-id: f5cbeb785486dfdbb48a156859dd0451e96fd4cc

* Enable --system-dir option (#25)

Summary:
Simplify the agent-building argument to just a directory name, with an optional config name.

- [x] Documentation on readthedocs
- [x] Provide example system directories based on current s2t and s2s

Given a system directory `${system_dir}`
```bash
> ls ${system_dir}
main.yaml  checkpoint.pt  config.yaml  dict.txt  sentence.bpe.model  wav2vec_small.yaml
```
and `main.yaml` has
```yaml
agent_class: fairseq.models.streaming.agents.TestTimeWaitKS2T
checkpoint: checkpoint.pt
sentencepiece_model: sentence.bpe.model
config_yaml: config.yaml
wav2vec_yaml: wav2vec_small.yaml
waitk_lagging: 2
fixed_pre_decision_ratio: 4
device: cuda:0
```

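From the CLI:
```bash
simuleval --standalone --system-dir ${system_dir}
```

In Python:
```python
from simuleval.utils import build_system_from_dir

# Build the system from a system directory (here, a directory named "system").
system = build_system_from_dir("system")

print(system)
# `audio_frontend` is not defined in this snippet; it stands for whatever
# object supplies the incoming speech segments.
while True:
    speech_segment = audio_frontend.send_segment()
    output_segment = system.pushpop(speech_segment)
    print(output_segment)
    # Stop once the system marks its output as finished.
    if output_segment.finished:
        break
```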

Systems available now (under `/large_experiments/seamless/ust/xutaima/2023_H1/demo/systems`):
| Path | Modality | Language | Description |
| ---- | -------- | -------- | ----------- |
| `s2t_es-en_tt-waitk_multidomain` | speech-to-text | es -> en | Multidomain (2022 H2) |
| `s2t_en-de_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> de | MuST-C for IWSLT 2023 |
| `s2t_en-zh_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> zh | MuST-C for IWSLT 2023 |
| `s2t_en-ja_tt-waitk_iwslt2023-must-c` | speech-to-text | en -> ja | MuST-C for IWSLT 2023 |
| `s2s_es-en_tt-waitk-cascaded_multidomain` | speech-to-speech | es -> en | Multidomain Cascaded Model (2022 Q3) |
| `s2s_es-en_tt-waitk-unity2_multidomain` | speech-to-speech | es -> en | Multidomain UnitY2 model (2022 H2) |

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/25

Reviewed By: schwarzmx

Differential Revision: D42976452

Pulled By: xutaima

fbshipit-source-id: 64d2224ff88fe6ceadabc42236f3c791298d967d

* Enable stateless Agent (#26)

Summary:
Add an optional `states` argument to the `policy`, `push` and `pop` functions to enable stateless agents. We may want stateless to be the only option in the future.

Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/26

Reviewed By: annasun28

Differential Revision: D43005153

Pulled By: xutaima

fbshipit-source-id: 67612b96e0d2f46a1f7573aeb0296942b4a1fdd9

* Modify the current agents to stateless for production setting (#3817)

Summary:
Now we can use an agent like this:
```python
from simuleval.utils import build_system_from_dir

system = build_system_from_dir(
    "/large_experiments/seamless/ust/xutaima/2023_H1/demo/systems/s2t_es-en_tt-waitk_multidomain"
)
system.to("cuda:0")

system_states = system.build_states()

while True:
    speech_segment = audio_frontend.send_segment()
    output_segment = system.pushpop(speech_segment, system_states)
    print(output_segment)
    if output_segment.finished:
        break
```

X-link: https://github.com/fairinternal/fairseq-py/pull/3817

Reviewed By: annasun28

Differential Revision: D43008938

Pulled By: xutaima

fbshipit-source-id: 89da787199cf961b33aea822f0a11cc46e30d8c7

* Fix ASR BLEU (#27)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/27

Reviewed By: padentomasello

Differential Revision: D43139557

Pulled By: xutaima

fbshipit-source-id: 8093550156b591459a19e85e78f67c80ece03491

* Fix the bugs introduced in recent PRs (#28)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/28

Reviewed By: schwarzmx

Differential Revision: D43371745

Pulled By: xutaima

fbshipit-source-id: 4440d87a02534ad441b04363ff3696806fdf61a0

* fixed the issue when running jobs through slurm

* fixed the issue when running jobs through slurm (#29)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/29

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Kick off evaluation through the `--slurm` option.

Previous PRs introduced a bug for this option.

Reviewed By: annasun28

Differential Revision: D43506304

Pulled By: xutaima

fbshipit-source-id: f47299a59b5f0e1c796edc53efadfd54967ddc6f

* EndOffset using playback intervals (#30)

Summary: Pull Request resolved: https://github.com/fairinternal/SimulEval/pull/30

Test Plan: `python tt_waitk_unity_v2.py --min-unit-chunk-size 10  --latency-metrics EndOffset --no-use-ref-len`

Reviewed By: xutaima

Differential Revision: D43581361

Pulled By: annasun28

fbshipit-source-id: 86b06784888bcfd452f6228fd55736fc01b72429

* add instruction for speech-to-speech evaluation

* Fix bugs in starting offset

* update results

* update readme

* fixed ATD for s2s (#34)

* add new line to files

* discontinuity metrics (#31)

* Add whisper ASR BLEU

* Save individual metrics if possible

* add metrics to log instance

---------

Co-authored-by: master-possible <kano.yasumasa.kw4@is.naist.jp>
Co-authored-by: Anna Sun <13106449+annasun28@users.noreply.github.com>
Co-authored-by: master-possible <66279784+master-possible@users.noreply.github.com>