This PR updates the TorchElastic Lab configuration to leverage multi-GPU nodes more effectively by allocating 4 GPUs per worker pod instead of 1. This change enables more efficient distributed training on modern GPU clusters. #1
Open
MagellaX wants to merge 1 commit into lenisha:main from
Commits on Jun 20, 2025
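
To make the effect of moving from 1 to 4 GPUs per worker pod concrete, here is a minimal Python sketch of the resulting process layout. It is not taken from this PR's diff. The names `WorkerLayout`, `pods_needed`, and the pod counts are hypothetical, and the contiguous rank numbering (`pod_index * gpus_per_pod + local_rank`) is an assumption about a typical launch with one process per GPU. In a real job the launcher provides these values through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerLayout:
    """Hypothetical description of a job: pods x GPUs per pod."""

    num_pods: int      # assumed number of worker pods (not stated in the PR)
    gpus_per_pod: int  # 1 before this PR, 4 after

    @property
    def world_size(self) -> int:
        # One training process per GPU across all pods.
        return self.num_pods * self.gpus_per_pod

    def global_rank(self, pod_index: int, local_rank: int) -> int:
        # Assumes ranks are assigned contiguously per pod.
        if not 0 <= pod_index < self.num_pods:
            raise ValueError("pod_index out of range")
        if not 0 <= local_rank < self.gpus_per_pod:
            raise ValueError("local_rank out of range")
        return pod_index * self.gpus_per_pod + local_rank


def pods_needed(total_gpus: int, gpus_per_pod: int) -> int:
    """Number of pods required to host total_gpus processes (ceiling division)."""
    if total_gpus <= 0 or gpus_per_pod <= 0:
        raise ValueError("arguments must be positive")
    return -(-total_gpus // gpus_per_pod)


if __name__ == "__main__":
    before = WorkerLayout(num_pods=8, gpus_per_pod=1)
    after = WorkerLayout(num_pods=2, gpus_per_pod=4)
    print(before.world_size, after.world_size)
    print(after.global_rank(pod_index=1, local_rank=2))
```

Under these assumptions, the same world size is reached with a quarter of the pods, so more of the gradient traffic stays between GPUs on the same node rather than crossing the pod network.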