fix: address multiple correctness and safety bugs by Ridwannurudeen · Pull Request #3 · PluralisResearch/AsyncPP

Ridwannurudeen · 2026-02-07T20:56:26Z

Summary

Remove undefined $dd variable in run.bash: The gpipe baseline command (line 30) references $dd which is never defined, causing the sync baseline to misparse arguments. Removed the stray variable to match the async command structure.
Fix unclosed file handles: json.load(open(...)) in both main_with_runtime.py and sync_main.py leaks file descriptors. Replaced with proper with statements.
Add weights_only to torch.load calls: Resolves PyTorch >=2.6 FutureWarning deprecation across main_with_runtime.py, sync_main.py, and optim/optimizer.py.
Use collections.deque in threadsafe_queue.py: Replaces list.pop(0) (O(n)) with deque.popleft() (O(1)) in the inter-stage communication path.
Add missing __init__.py files: optim/ and runtime/ lack __init__.py, relying on implicit namespace packages. Added explicit init files for robust package resolution.
Fix argparse description: Changed from "PyTorch ImageNet Training" (leftover from PipeDream) to match the actual language model training use case.

Motivation

Found these while reading through the codebase after the ICML 2025 paper. The $dd bug silently breaks the gpipe baseline script, and the torch.load deprecation will become an error in future PyTorch versions.

Test plan

Verified $dd is not defined anywhere in run.bash — confirmed stray variable
Confirmed the async basecmdstr (line 13) does not use $dd, so removing it from the sync basecmdstr is consistent
Verified deque.popleft() maintains identical FIFO semantics to list.pop(0)
Checked that optim/ and runtime/ are imported via sys.path.append — adding __init__.py makes this more robust without breaking existing imports

- Remove undefined $dd variable in run.bash that causes gpipe baseline to misparse command-line arguments - Close file handles properly when loading JSON config (use with-statement instead of bare open() in both main_with_runtime.py and sync_main.py) - Add weights_only parameter to all torch.load calls to resolve PyTorch >=2.6 deprecation warnings - Replace list with collections.deque in threadsafe_queue.py for O(1) popleft instead of O(n) list.pop(0) in the pipeline communication path - Add missing __init__.py files for optim/ and runtime/ to ensure proper Python package resolution - Fix argparse description from "PyTorch ImageNet Training" to match the actual language model training use case

Ridwannurudeen · 2026-03-02T08:26:10Z

Hi! Friendly ping on this PR — fixes multiple correctness and safety bugs identified during a code review. Happy to walk through the changes or make adjustments if needed. Let me know if this repo is still actively maintained.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: address multiple correctness and safety bugs#3

fix: address multiple correctness and safety bugs#3
Ridwannurudeen wants to merge 1 commit intoPluralisResearch:mainfrom
Ridwannurudeen:fix/correctness-and-safety-bugs

Ridwannurudeen commented Feb 7, 2026

Uh oh!

Ridwannurudeen commented Mar 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Ridwannurudeen commented Feb 7, 2026

Summary

Motivation

Test plan

Uh oh!

Ridwannurudeen commented Mar 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant