🐛 Describe the bug
I built PyTorch and torchvision from source against CUDA 12.4.1, since that is the latest CUDA version available for IBM POWER9 ppc64le with a V100 GPU.
python3 -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA version: {torch.version.cuda}'); print(f'CUDA available: {torch.cuda.is_available()}')"
PyTorch version: 2.10.0
CUDA version: 12.4
CUDA available: True

It isn't clearly documented how to build the wheel, so I tried to recreate the process based on the GitHub Actions log:
Inside a manylinux_2_28_ppc64le container with CUDA 12.4.1 installed:
curl -L https://micro.mamba.pm/api/micromamba/linux-ppc64le/latest | tar -xvj
export MAMBA_ROOT_PREFIX=$HOME/.micromamba # optional, defaults to ~/micromamba
eval "$(./bin/micromamba shell hook -s posix)"
yum install -y libjpeg-turbo-devel libwebp-devel freetype gnutls zip
export rel=v0.25.0
export ver=cp312-cp312
export BUILD_VERSION=${rel:1}
export PYTORCH_BUILD_NUMBER=0
export PYTORCH_BUILD_VERSION=${rel:1}
export FORCE_CUDA=1
git clone --depth 1 -b ${rel} --recursive https://github.com/pytorch/vision.git && cd vision
curl -LO https://raw.githubusercontent.com/pytorch/test-infra/refs/heads/main/.github/scripts/repair_manylinux_2_28.sh
chmod +x repair_manylinux_2_28.sh
sed -i "s/aarch64/ppc64le/g" packaging/post_build_script.sh
export BUILD_VERSION=${rel:1}
export PYTORCH_BUILD_NUMBER=0
export PYTORCH_BUILD_VERSION=${rel:1}
micromamba create -n py-${ver} -c conda-forge python=${ver} conda libwebp libjpeg-turbo -y
micromamba activate py-${ver}
bash packaging/pre_build_script.sh
pip3 install Cython "auditwheel<6.3" numpy future ninja pyyaml http://10.x.x.x/whl/torch/cu124/torch-2.10.0-cp${ver//./}-cp${ver//./}-manylinux_2_28_ppc64le.whl --upgrade setuptools==72.1.0
python3 setup.py clean
python3 setup.py bdist_wheel
./repair_manylinux_2_28.sh /vision/$(ls dist/*whl)
bash packaging/post_build_script.sh

The wheel is then uploaded to the 10.x.x.x server, from where I install it.
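A quick way to confirm the repaired wheel actually bundles the compiled extension is to list its archive contents. A minimal sketch; `has_compiled_extension` is a hypothetical helper of mine, not part of the torchvision build scripts:

```python
# Sketch: verify a built torchvision wheel bundles the compiled _C extension.
# A wheel is just a zip archive, so we can inspect it with the stdlib.
import sys
import zipfile

def has_compiled_extension(wheel_path: str) -> bool:
    """True if the wheel archive contains a torchvision/_C*.so library."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return any(
            name.startswith("torchvision/_C") and name.endswith(".so")
            for name in wheel.namelist()
        )

if __name__ == "__main__" and len(sys.argv) > 1:
    # e.g. python3 check_wheel.py dist/torchvision-0.25.0-*.whl
    print(has_compiled_extension(sys.argv[1]))
```

If this prints False for the repaired wheel, the `torchvision::nms` operator can never be registered at import time, because the shared library that defines it was never packaged.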
When I install it into a python:3.12-slim container and import torchvision, I get this error:
export TORCH_VER=2.10.0
export PY_VER=cp312-cp312
pip3 install numpy \
http://10.x.x.x/whl/torch/cu124/torch-${TORCH_VER}-${PY_VER}-manylinux_2_28_ppc64le.whl \
http://10.x.x.x/whl/torchvision/cu124/torchvision-0.25.0-${PY_VER}-manylinux_2_28_ppc64le.whl
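Before importing, it can help to check whether the compiled `torchvision._C` library actually made it into the installed package. A minimal diagnostic sketch; `extension_present` is a hypothetical helper of mine, not a torch or torchvision API:

```python
# Sketch: check whether a package's compiled _C extension module can be
# located by the import machinery, without triggering the full package import
# side effects that crash here.
import importlib.util

def extension_present(pkg: str = "torchvision") -> bool:
    """True if <pkg>._C is resolvable by Python's import system."""
    try:
        return importlib.util.find_spec(f"{pkg}._C") is not None
    except ModuleNotFoundError:
        # The package itself is not installed at all.
        return False

if __name__ == "__main__":
    print(extension_present())
```

If this prints False, the installed wheel ships only the pure-Python files, which matches the operator-registration failure on import.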
python3 -c "import torchvision"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.12/site-packages/torchvision/__init__.py", line 10, in <module>
from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils # usort:skip
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torchvision/_meta_registrations.py", line 163, in <module>
@torch.library.register_fake("torchvision::nms")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/library.py", line 1073, in register
use_lib._register_fake(
File "/usr/local/lib/python3.12/site-packages/torch/library.py", line 203, in _register_fake
handle = entry.fake_impl.register(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_library/fake_impl.py", line 50, in register
if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: operator torchvision::nms does not exist

Versions
python collect_env.py
Collecting environment information...
PyTorch version: 2.10.0
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A
OS: Debian GNU/Linux 13 (trixie) (ppc64le)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.41
Python version: 3.12.13 (main, Mar 3 2026, 20:38:43) [GCC 14.2.0] (64-bit runtime)
Python platform: Linux-4.18.0-553.36.1.el8_10.ppc64le-ppc64le-with-glibc2.41
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: Tesla V100-SXM2-32GB
Nvidia driver version: 550.54.15
cuDNN version: Could not collect
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: False
Caching allocator config: N/A
CPU:
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 160
On-line CPU(s) list: 0-159
Model name: POWER9, altivec supported
Model: 2.3 (pvr 004e 1203)
Thread(s) per core: 4
Core(s) per socket: 20
Socket(s): 2
Frequency boost: enabled
CPU(s) scaling MHz: 100%
CPU max MHz: 3800.0000
CPU min MHz: 2300.0000
L1d cache: 1.3 MiB (40 instances)
L1i cache: 1.3 MiB (40 instances)
L2 cache: 10 MiB (20 instances)
L3 cache: 200 MiB (20 instances)
NUMA node(s): 6
NUMA node0 CPU(s): 0-79
NUMA node8 CPU(s): 80-159
NUMA node252 CPU(s):
NUMA node253 CPU(s):
NUMA node254 CPU(s):
NUMA node255 CPU(s):
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Mitigation; RFI Flush, L1D private per thread
Vulnerability Mmio stale data: Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Kernel entry/exit barrier (eieio)
Vulnerability Spectre v1: Mitigation; __user pointer sanitization, ori31 speculation barrier enabled
Vulnerability Spectre v2: Mitigation; Software count cache flush (hardware accelerated), Software link stack flush
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] numpy==2.4.3
[pip3] torch==2.10.0
[pip3] torchvision==0.25.0
[conda] Could not collect