
Please help: WER is 100% during training #49

@Onestringlab

Description


I'm using your code for training on the PHOENIX dataset, but I'm encountering an issue where the Word Error Rate (WER) is consistently 100% during evaluation. Here’s a snippet of my training log:

[ Tue Sep  3 02:31:07 2024 ]    Mean training loss: 9.5473120947.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 135/135 [02:34<00:00,  1.14s/it]
/share/work/latih15/VAC_CSLR_PHOENIX
preprocess.sh ./work_dir/baseline_res18/output-hypothesis-dev-conv.ctm ./work_dir/baseline_res18/tmp.ctm ./work_dir/baseline_res18/tmp2.ctm
Tue Sep 3 02:33:42 AM UTC 2024
Preprocess Finished.
sh: 1: ./software/sclite: not found
/bin/sh: 1: ./software/sclite: not found
Unexpected error: <class 'IndexError'>
[ Tue Sep  3 02:33:42 2024 ] Epoch 39, dev  100.00%
[ Tue Sep  3 02:33:42 2024 ] Dev WER: 100.00%

I noticed the errors about sclite not being found, and I suspect they are the direct cause: the evaluation step cannot run ./software/sclite, so there is no scoring output to parse, which would explain the IndexError and the 100% dev WER. I followed the configuration in baseline.yaml, but I'm not sure if I've missed anything. Below is my baseline.yaml configuration:

feeder: dataset.dataloader_video.BaseFeeder
phase: train
dataset: phoenix14
# dataset: phoenix14-si5
num_epoch: 40
work_dir: ./work_dir/baseline_res18/
batch_size: 2
random_seed: 0
test_batch_size: 4 #8
num_worker: 8
device: 0,1
log_interval: 50
eval_interval: 1
save_interval: 5
# python in default
evaluate_tool: sclite
loss_weights:
  SeqCTC: 1.0
  # VAC
  # ConvCTC: 1.0
  # Dist: 10.0
#load_weights: ''

optimizer_args:
  optimizer: Adam
  base_lr: 0.0001
  step: [ 20, 35]
  learning_ratio: 1
  weight_decay: 0.0001
  start_epoch: 0
  nesterov: False

feeder_args:
  mode: 'train'
  datatype: 'video'
  num_gloss: -1
  drop_ratio: 1.0

model: slr_network.SLRModel
decode_mode: beam
model_args:
  num_classes: 1296
  c2d_type: resnet18
  conv_type: 2
  use_bn: 1
  # SMKD
  share_classifier: False
  weight_norm: False

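One thing I noticed is the "# python in default" comment above evaluate_tool. As a possible fallback I am considering switching to the python evaluator so that the external binary is not needed; this is only a sketch on my side, assuming the repo actually accepts this value:

evaluate_tool: python   # fallback sketch; assumes the built-in python evaluator is supported

I have not verified whether the python evaluator reports WER the same way as sclite.
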
Could you advise on what might be going wrong? Do I need to adjust the sclite path or any other settings? Any help would be greatly appreciated.
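
For reference, this is how I am checking whether the sclite binary exists at the path the evaluation script calls (the log shows "./software/sclite: not found"); the SCTK build and symlink steps below are my own assumption, not something documented in this repo:

ls -l ./software/sclite          # does the binary exist where the script expects it?
chmod +x ./software/sclite       # in case it exists but is not executable

# If it is missing, my assumption is that it has to be built from NIST SCTK
# and copied or symlinked into ./software/:
git clone https://github.com/usnistgov/SCTK.git
cd SCTK && make config && make all && make check && make install
ln -s "$(pwd)/bin/sclite" /share/work/latih15/VAC_CSLR_PHOENIX/software/sclite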

Thank you!
