This PR updates the TorchElastic Lab configuration to leverage multi-GPU nodes more effectively by allocating 4 GPUs per worker pod instead of 1. This change enables more efficient distributed training on modern GPU clusters. #1

Open
MagellaX wants to merge 1 commit into lenisha:main from MagellaX:fix-step3-entrypoint

Commits on Jun 20, 2025