
The problem of CUDA #2

@wuyeyexvnainai

Description

When I run
python -m diffusha.diffusion.evaluation.eval_assistance --env-name LunarLander-v1 --out-dir /outdir --save-video
I get the following error:

pybullet build time: Mar 10 2023 17:09:46
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.14) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
torch runs deteministically!!
WARN: num_envs and num_episodes are overwritten to save the video.
LunarLander task: reach
/usr/local/lib/python3.9/dist-packages/torch/cuda/__init__.py:88: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
sigma_max tensor(39.3563)
Loading a model from /data/ddpm/lunarlander-v1/step_00029999.pt
Traceback (most recent call last):
File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/code/diffusha/diffusion/evaluation/eval_assistance.py", line 193, in
diffusion = prepare_diffusha(
File "/code/diffusha/diffusion/evaluation/helper.py", line 62, in prepare_diffusha
checkpoint = torch.load(model_path)
File "/usr/local/lib/python3.9/dist-packages/torch/serialization.py", line 789, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.9/dist-packages/torch/serialization.py", line 1131, in _load
result = unpickler.load()
File "/usr/local/lib/python3.9/dist-packages/torch/serialization.py", line 1101, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "/usr/local/lib/python3.9/dist-packages/torch/serialization.py", line 1083, in load_tensor
wrap_storage=restore_location(storage, location),
File "/usr/local/lib/python3.9/dist-packages/torch/serialization.py", line 215, in default_restore_location
result = fn(storage, location)
File "/usr/local/lib/python3.9/dist-packages/torch/serialization.py", line 182, in _cuda_deserialize
device = validate_cuda_device(location)
File "/usr/local/lib/python3.9/dist-packages/torch/serialization.py", line 166, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

But my CUDA version is already 11.6 and my torch build is cu116, so the two versions match each other.
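
For context: the CUDA initialization warning near the top of the log means that torch.cuda.is_available() is returning False inside this environment, regardless of the matching toolkit and wheel versions, which is why torch.load then fails. A minimal sanity check, assuming it is run inside the same container/environment that produced the log above, would show what PyTorch itself detects:

import torch

# Quick sanity check: run inside the same environment that produced the log above.
print("torch version:   ", torch.__version__)
print("built for CUDA:  ", torch.version.cuda)         # should be 11.6 for a cu116 wheel
print("CUDA available:  ", torch.cuda.is_available())  # False here, per the warning above
print("visible devices: ", torch.cuda.device_count())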
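
The RuntimeError itself also points at a workaround: pass map_location to torch.load so the checkpoint is mapped to the CPU when no GPU is visible. Below is a minimal sketch of the load call from diffusha/diffusion/evaluation/helper.py (line 62 in the traceback); only the torch.load(model_path) call and the checkpoint path come from the log, and the device fallback around it is an assumption, not the repository's actual code:

import torch

# Assumed adaptation of the load in prepare_diffusha(); only
# `checkpoint = torch.load(model_path)` appears in the traceback above.
model_path = "/data/ddpm/lunarlander-v1/step_00029999.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint = torch.load(model_path, map_location=device)

With this fallback the checkpoint loads on a CPU-only machine, but the evaluation will run without GPU acceleration until the underlying CUDA visibility issue is resolved.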
