Description
🐛 Describe the bug
Summary
Repeated calls to torchvision.io.image.decode_jpeg() on a malformed JPEG cause near-linear RSS growth until OOM. Normal JPEGs do not show this behavior. This looks like an error-path memory leak in the CPU JPEG decode path.
I have checked past issues #3613 and #4378; those reports are about GPU/nvJPEG memory leaks. This report is CPU-only: the leak occurs on the error path when decoding malformed JPEGs, and RSS grows linearly even after gc.collect() and malloc_trim().
This issue mirrors a report I previously filed through the repo’s GitHub Security Advisory (private), including PoC and malformed JPEG samples. Since there has been no maintainer response for over 90 days, I’m posting a public issue to ensure the problem is visible and can be tracked.
For responsible disclosure, I will not publish the malformed JPEG samples here. I can provide them privately to maintainers, or they can review the samples already attached in the Security Advisory thread.
Reproduction
Command:
python poc.py case1.jpg --repeat 50 --mode RGB --quiet
Modes tested: UNCHANGED / RGB / GRAY (all leak to varying degrees)
PoC script:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os, sys, argparse, contextlib, gc, ctypes

os.environ.setdefault("OMP_NUM_THREADS", "1")
os.environ.setdefault("MKL_NUM_THREADS", "1")
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "")

import torch, torchvision
from torchvision.io import ImageReadMode
from torchvision.io.image import decode_jpeg


@contextlib.contextmanager
def swallow_stderr(enable=True):
    if not enable:
        yield; return
    sys.stderr.flush()
    fd = sys.stderr.fileno()
    old = os.dup(fd)
    try:
        with open(os.devnull, "wb") as null:
            os.dup2(null.fileno(), fd)
            yield
    finally:
        os.dup2(old, fd); os.close(old)


def rss_hwm_kb():
    rss = hwm = None
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                rss = int(line.split()[1])
            elif line.startswith("VmHWM:"):
                hwm = int(line.split()[1])
    return rss, hwm


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("unit", help="the case path")
    ap.add_argument("--repeat", type=int, default=50)
    ap.add_argument("--mode", choices=["UNCHANGED","RGB","GRAY"], default="RGB")
    ap.add_argument("--quiet", action="store_true")
    args = ap.parse_args()

    print("torch:", torch.__version__)
    print("torchvision:", torchvision.__version__)
    print("cuda_available:", torch.cuda.is_available())

    with open(args.unit, "rb") as f:
        data = f.read()

    mode = {
        "UNCHANGED": ImageReadMode.UNCHANGED,
        "RGB": ImageReadMode.RGB,
        "GRAY": ImageReadMode.GRAY,
    }[args.mode]

    # reduce noise
    u8 = torch.frombuffer(bytearray(data), dtype=torch.uint8).contiguous()
    libc = ctypes.CDLL("libc.so.6")
    torch.set_num_threads(1)

    print(f"[repro] unit={args.unit} bytes={len(data)} repeat={args.repeat} mode={args.mode}")
    for i in range(1, args.repeat + 1):
        try:
            with swallow_stderr(args.quiet):
                _ = decode_jpeg(u8, mode=mode)
        except Exception as e:
            # Bad JPEG will come here: this is exactly where we need to verify if there is an 'error path leak'
            pass
        # Try to recycle the 'non leaking' parts as much as possible
        gc.collect()
        try:
            libc.malloc_trim(0)
        except Exception:
            pass
        rss, hwm = rss_hwm_kb()
        print(f"[{i}/{args.repeat}] VmRSS={rss/1024:.1f} MB VmHWM={hwm/1024:.1f} MB", flush=True)


if __name__ == "__main__":
    main()

Observed results
Normal JPEG: RSS stabilizes around ~269 MB after repeated calls.
Malformed JPEG: RSS grows ~linearly to ~5 GB after 50 iterations (see logs below).
Log for the normal JPEG:
torch: 2.9.0+cpu
torchvision: 0.24.0+cpu
cuda_available: False
...
[45/50] VmRSS=269.0 MB VmHWM=270.9 MB
[46/50] VmRSS=269.0 MB VmHWM=270.9 MB
[47/50] VmRSS=269.0 MB VmHWM=270.9 MB
[48/50] VmRSS=269.0 MB VmHWM=270.9 MB
[49/50] VmRSS=269.0 MB VmHWM=270.9 MB
[50/50] VmRSS=269.0 MB VmHWM=270.9 MB
Log for the malformed JPEG:
torch: 2.9.0+cpu
torchvision: 0.24.0+cpu
cuda_available: False
[1/50] VmRSS=363.8 MB VmHWM=366.2 MB
[2/50] VmRSS=457.4 MB VmHWM=457.4 MB
[3/50] VmRSS=551.1 MB VmHWM=551.1 MB
[4/50] VmRSS=644.7 MB VmHWM=644.7 MB
[5/50] VmRSS=738.3 MB VmHWM=738.3 MB
[6/50] VmRSS=831.9 MB VmHWM=831.9 MB
[7/50] VmRSS=925.6 MB VmHWM=925.6 MB
...
[45/50] VmRSS=4483.3 MB VmHWM=4483.3 MB
[46/50] VmRSS=4576.9 MB VmHWM=4576.9 MB
[47/50] VmRSS=4670.6 MB VmHWM=4670.6 MB
[48/50] VmRSS=4764.2 MB VmHWM=4764.2 MB
[49/50] VmRSS=4857.8 MB VmHWM=4857.8 MB
[50/50] VmRSS=4951.4 MB VmHWM=4951.4 MB
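To put a number on "near-linear", the per-call growth can be estimated directly from the log lines above. The snippet below is only a rough sketch: the helper name `rss_growth_per_call` is made up for illustration, and it takes the slope between the first and last reported samples rather than doing a proper fit.

```python
import re

# Matches lines such as "[7/50] VmRSS=925.6 MB VmHWM=925.6 MB".
LOG_LINE = re.compile(r"\[(\d+)/\d+\]\s+VmRSS=([\d.]+) MB")


def rss_growth_per_call(log_text):
    """Average RSS growth in MB per decode call, from first to last log line."""
    points = [(int(m.group(1)), float(m.group(2))) for m in LOG_LINE.finditer(log_text)]
    if len(points) < 2:
        raise ValueError("need at least two log lines")
    (first_i, first_rss), (last_i, last_rss) = points[0], points[-1]
    return (last_rss - first_rss) / (last_i - first_i)


# First and last lines of the malformed-JPEG log from this report.
sample = """
[1/50] VmRSS=363.8 MB VmHWM=366.2 MB
[50/50] VmRSS=4951.4 MB VmHWM=4951.4 MB
"""

print(f"{rss_growth_per_call(sample):.1f} MB per call")  # prints "93.6 MB per call"
```

Consecutive lines in the log differ by the same amount (for example 457.4 - 363.8 = 93.6 MB), which is consistent with a fixed-size allocation being lost on every failed decode.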
The growth is also visible in htop while the PoC runs. For case 1, memory usage reaches about 5 GB; for case 2, it exceeds 100 GB.
Sample files
I can provide the malformed samples to maintainers privately.
Impact
If a service decodes untrusted user-provided JPEGs, an attacker could repeatedly submit crafted malformed images to exhaust memory and trigger DoS.
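Until the leak is fixed, one possible stopgap for such a service is to run each untrusted decode in a short-lived child process, so that whatever the error path leaks is returned to the OS when the child exits. The sketch below is not part of torchvision and has not been tested against the malformed samples. `run_isolated` and `_worker` are hypothetical names, the `fork` start method is assumed (Linux, as in the PoC), and forking per image adds noticeable overhead.

```python
import multiprocessing as mp


def _worker(conn, func, payload):
    # Runs in the child: send back either the result or a description of the error.
    try:
        conn.send(("ok", func(payload)))
    except Exception as exc:
        conn.send(("err", repr(exc)))
    finally:
        conn.close()


def run_isolated(func, payload, timeout=30.0):
    """Call func(payload) in a forked child; memory it leaks dies with the child."""
    ctx = mp.get_context("fork")  # assumption: Linux; avoids pickling func
    parent, child = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_worker, args=(child, func, payload))
    proc.start()
    child.close()  # parent keeps only the read end
    try:
        if parent.poll(timeout):
            status, value = parent.recv()
        else:
            status, value = "err", "timed out"
    except EOFError:
        # Child died without reporting, e.g. killed by the OOM killer.
        status, value = "err", "worker exited without a result"
    finally:
        proc.join(timeout=1.0)
        if proc.is_alive():
            proc.kill()
            proc.join()
        parent.close()
    if status == "err":
        raise RuntimeError(value)
    return value


if __name__ == "__main__":
    # Stand-in callable; a real service would pass a function wrapping decode_jpeg.
    print(run_isolated(len, b"\xff\xd8\xff"))  # prints 3
```

In a real deployment `func` would be a small wrapper around `decode_jpeg`, and the result has to be picklable to cross the pipe. A pool of workers that are recycled after a fixed number of tasks would amortize the fork cost, at the price of letting some leaked memory accumulate between recycles.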
Versions
torch: 2.9.0+cpu
torchvision: 0.24.0+cpu (0.25.0 is also affected)