This repository was archived by the owner on Mar 3, 2026. It is now read-only.

Some tests will hang when running tests with pytest on a TPU VM #375

@erichuang-cienet

When I use pytest to run all tests in the repository on a TPU VM, some of them hang. The hanging tests are as follows:

  • torchprime/launcher/test_run_model.py
  • torchprime/tests/test_parallelism_utils.py
  • torchprime/tests/test_system_check.py
  • torchprime/torch_xla_models/tests/test_assume_pure.py
  • torchprime/torch_xla_models/tests/test_deepseek_v3.py
  • torchprime/torch_xla_models/tests/test_llama.py
  • torchprime/torch_xla_models/tests/test_llama4.py
  • torchprime/torch_xla_models/tests/test_mixtral.py
  • torchprime/torch_xla_models/tests/test_model_loading_saving.py
  • torchprime/torch_xla_models/tests/test_sft_trainer.py
  • torchprime/torch_xla_models/tests/test_trainer.py

However, when I run the above tests individually, they do not hang, except for torchprime/launcher/test_run_model.py.
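
To confirm which files hang only in a combined run, each file can be run in its own pytest process with a timeout. The following is a minimal sketch, not part of the repository: the helper names `run_with_timeout` and `run_isolated` and the 600-second default are assumptions chosen for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run `cmd` and classify the outcome as 'passed', 'failed', or 'hung'."""
    try:
        # subprocess.run kills the child process if the timeout expires.
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "hung"
    return "passed" if proc.returncode == 0 else "failed"


def run_isolated(test_files, timeout_s=600):
    """Run each test file in a separate pytest process (hypothetical helper)."""
    results = {}
    for path in test_files:
        cmd = [sys.executable, "-m", "pytest", "-v", path]
        results[path] = run_with_timeout(cmd, timeout_s)
    return results
```

Calling `run_isolated` with the paths listed above returns a dict that maps each file to its status, so a file that hangs on its own (such as torchprime/launcher/test_run_model.py) shows up as "hung" instead of blocking the whole run.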

I found that disabling the xla_tpu_use_enhanced_launch_barrier flag through LIBTPU_INIT_ARGS prevents these tests from hanging.

Command:

export LIBTPU_INIT_ARGS='--xla_tpu_use_enhanced_launch_barrier=false'
pytest -v
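
The same workaround could also be applied from Python, for example from a top-level conftest.py, so it does not rely on the shell environment. This is only a sketch under assumptions: the helper names are hypothetical, LIBTPU_INIT_ARGS is assumed to be a whitespace-separated list of flags, and the variable is assumed to be set before libtpu is initialized (that is, before anything imports torch_xla).

```python
import os

# Flag taken from the command above.
FLAG = "--xla_tpu_use_enhanced_launch_barrier=false"


def merge_libtpu_args(existing, flag):
    """Append `flag` to `existing`, dropping any earlier setting of the same flag."""
    name = flag.split("=", 1)[0]
    kept = [arg for arg in existing.split() if arg.split("=", 1)[0] != name]
    return " ".join(kept + [flag])


def apply_workaround(environ=os.environ):
    """Set LIBTPU_INIT_ARGS in `environ`, preserving unrelated flags already present."""
    current = environ.get("LIBTPU_INIT_ARGS", "")
    environ["LIBTPU_INIT_ARGS"] = merge_libtpu_args(current, FLAG)
```

Calling `apply_workaround()` at the top of conftest.py would keep any flags already present in LIBTPU_INIT_ARGS and override only the launch-barrier setting.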

Environment:

TPU VM: v6e-8
Python 3.11
torch 2.9.0.dev20250825+cpu
torch-xla 2.9.0+git8243a25
