Runtime error during init_geo #90

Hello,

I am hitting "RuntimeError: CUDA driver error: out of memory" every time I run "run_infer.sh" on custom images. The code works perfectly for the Art and Santorini datasets, but it crashes on custom images, even when I reduce the number of images from 20 down to 3. I get the following in 01_init_geo.log:

ic| torch.cuda.is_available(): True
ic| torch.cuda.device_count(): 1
>> Doing 3 views reconstrution!
... loading model from ./mast3r/checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth
instantiating : AsymmetricMASt3R(enc_depth=24, dec_depth=12, enc_embed_dim=1024, dec_embed_dim=768, enc_num_heads=16, dec_num_heads=12, pos_embed='RoPE100',img_size=(512, 512), head_type='catmlp+dpt', output_mode='pts3d+desc24', depth_mode=('exp', -inf, inf), conf_mode=('exp', 1, inf), patch_embed_cls='PatchEmbedDust3R', two_confs=True, desc_conf_mode=('exp', 0, inf), landscape_only=False)
<All keys matched successfully>
>> Loading a list of 5 images
 - adding /root/InstantSplat/assets/sora/pupil/images/frame_01450.jpg with resolution 1088x1080 --> 512x496
 - adding /root/InstantSplat/assets/sora/pupil/images/frame_01451.jpg with resolution 1088x1080 --> 512x496
 - adding /root/InstantSplat/assets/sora/pupil/images/frame_01452.jpg with resolution 1088x1080 --> 512x496
 - adding /root/InstantSplat/assets/sora/pupil/images/frame_01453.jpg with resolution 1088x1080 --> 512x496
 - adding /root/InstantSplat/assets/sora/pupil/images/frame_01454.jpg with resolution 1088x1080 --> 512x496
 (Found 5 images)
>> Making pairs...
>> Inference...
>> Inference with model on 20 image pairs

  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:01<00:22,  1.19s/it]
 10%|█         | 2/20 [00:01<00:17,  1.04it/s]
 15%|█▌        | 3/20 [00:02<00:15,  1.12it/s]
 20%|██        | 4/20 [00:03<00:13,  1.17it/s]
 25%|██▌       | 5/20 [00:04<00:12,  1.19it/s]
 30%|███       | 6/20 [00:05<00:11,  1.19it/s]
 35%|███▌      | 7/20 [00:06<00:10,  1.22it/s]
 40%|████      | 8/20 [00:06<00:09,  1.24it/s]
 50%|█████     | 10/20 [00:07<00:05,  1.81it/s]
 55%|█████▌    | 11/20 [00:08<00:05,  1.64it/s]
 60%|██████    | 12/20 [00:08<00:05,  1.54it/s]
 65%|██████▌   | 13/20 [00:09<00:04,  1.46it/s]
 70%|███████   | 14/20 [00:10<00:04,  1.37it/s]
 75%|███████▌  | 15/20 [00:11<00:03,  1.34it/s]
 80%|████████  | 16/20 [00:12<00:03,  1.27it/s]
 85%|████████▌ | 17/20 [00:12<00:02,  1.27it/s]
 90%|█████████ | 18/20 [00:13<00:01,  1.24it/s]
 95%|█████████▌| 19/20 [00:14<00:00,  1.26it/s]
100%|██████████| 20/20 [00:15<00:00,  1.26it/s]
100%|██████████| 20/20 [00:15<00:00,  1.30it/s]
>> Global alignment...
Traceback (most recent call last):
  File "/root/InstantSplat/./init_geo.py", line 154, in <module>
    main(args.source_path, args.model_path, args.ckpt_path, args.device, args.batch_size, args.image_size, args.schedule, args.lr, args.niter,
  File "/root/InstantSplat/./init_geo.py", line 47, in main
    scene = global_aligner(output, device=args.device, mode=GlobalAlignerMode.PointCloudOptimizer)
  File "/root/InstantSplat/dust3r/cloud_opt/__init__.py", line 25, in global_aligner
    net = PointCloudOptimizer(view1, view2, pred1, pred2, **optim_kw).to(device)
  File "/opt/conda/envs/instantsplat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1355, in to
    return self._apply(convert)
  File "/opt/conda/envs/instantsplat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 915, in _apply
    module._apply(fn)
  File "/opt/conda/envs/instantsplat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 942, in _apply
    param_applied = fn(param)
  File "/opt/conda/envs/instantsplat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1341, in convert
    return t.to(
RuntimeError: CUDA driver error: out of memory

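I am not sure what is going wrong, since inference on all 20 pairs completes and the crash only happens when the optimizer is moved to the GPU at the start of global alignment.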
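
For anyone trying to reproduce this, here is a minimal diagnostic sketch, not part of InstantSplat. It assumes one pair is built for every ordered pair of distinct images, which is consistent with the 5 images and 20 pairs in the log above. It also reports free GPU memory through `torch.cuda.mem_get_info`, which could be called just before `global_aligner(...)`. The helper names `ordered_pair_count`, `format_gib` and `free_gpu_memory_report` are hypothetical.

```python
def ordered_pair_count(num_images: int) -> int:
    """Number of ordered pairs (i, j) with i != j among num_images images.

    Assumption: every image is paired with every other image in both
    directions, which matches "5 images -> 20 pairs" in the log.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return num_images * (num_images - 1)


def format_gib(num_bytes: int) -> str:
    """Format a byte count as GiB with two decimals."""
    return f"{num_bytes / 2**30:.2f} GiB"


def free_gpu_memory_report(device_index: int = 0):
    """Return a 'free of total' string, or None if torch/CUDA is unavailable."""
    try:
        import torch  # imported lazily so the helpers above work without it
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info returns (free_bytes, total_bytes) for the given device.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device_index)
    return f"{format_gib(free_bytes)} free of {format_gib(total_bytes)}"


if __name__ == "__main__":
    print("pairs for 5 images:", ordered_pair_count(5))
    print("GPU memory:", free_gpu_memory_report())
```

If the report shows very little free memory before global alignment starts, the GPU may already be occupied (for example by another process); that is only one possible explanation, not a confirmed cause.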