🐛 Describe the bug
Hi,
can someone please help clarify the following situation?
The docs for pre-trained VGG16 (https://docs.pytorch.org/vision/stable/models/generated/torchvision.models.vgg16.html#torchvision.models.vgg16) and other models state
[...] Finally the values are first rescaled to [0.0, 1.0] and then normalized using [...].
However, looking at the code I see that the tensor is resized, cropped, and normalized, but I do not see where it is rescaled to [0.0, 1.0]:
The transformations are first defined here: https://github.com/pytorch/vision/blob/main/torchvision/models/vgg.py#L217-L222
```python
transforms=partial(
    ImageClassification,
    crop_size=224,
    mean=(0.48235, 0.45882, 0.40784),
    std=(1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0),
)
```
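For reference, with std = 1/255 the normalize step is effectively (x - mean) * 255, so an input already rescaled to [0.0, 1.0] ends up roughly in [-123, 151]. A plain-Python sketch of that per-channel arithmetic (no torchvision required; mean/std values taken from the preset above):

```python
# Per-channel parameters of VGG16_Weights.IMAGENET1K_FEATURES, from the preset above.
mean = (0.48235, 0.45882, 0.40784)
std = (1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0)

def normalize_value(x, m, s):
    """One pixel value through the same formula F.normalize applies per channel."""
    return (x - m) / s

# For an input rescaled to [0.0, 1.0], the normalized range stays modest:
lo = normalize_value(0.0, mean[0], std[0])  # ≈ -123.0
hi = normalize_value(1.0, mean[2], std[2])  # ≈ 151.0
```

So dividing by std = 1/255 is the same as multiplying by 255, which only makes sense if the input was brought down to [0.0, 1.0] first.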
Then the ImageClassification transforms are executed here, if I am not mistaken: https://github.com/pytorch/vision/blob/main/torchvision/transforms/_presets.py#L58-L65
```python
def forward(self, img: Tensor) -> Tensor:
    img = F.resize(img, self.resize_size, interpolation=self.interpolation, antialias=self.antialias)
    img = F.center_crop(img, self.crop_size)
    if not isinstance(img, Tensor):
        img = F.pil_to_tensor(img)
    img = F.convert_image_dtype(img, torch.float)
    img = F.normalize(img, mean=self.mean, std=self.std)
    return img
```
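If I read the preset correctly, the candidate for the [0.0, 1.0] rescale is the F.convert_image_dtype call: for an integer image it divides by the dtype's maximum value (255 for uint8), while a floating-point image is passed through without rescaling. A minimal plain-Python sketch of that dispatch (my reading, not the actual torchvision implementation):

```python
def convert_image_dtype_sketch(value, input_is_uint8):
    """Sketch of F.convert_image_dtype(img, torch.float) for a single pixel value:
    integer images are rescaled by 1/255, float images come back unchanged."""
    if input_is_uint8:
        return value / 255.0  # uint8 -> float32: rescale into [0.0, 1.0]
    return value              # float input: dtype conversion only, no rescale

# uint8 pixel 255 becomes 1.0; a float pixel 200.0 stays 200.0.
```

That would mean the documented rescaling only happens when the input tensor is uint8 (or a PIL image converted via pil_to_tensor); a float tensor in [0, 255] would skip it entirely.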
When is the actual rescaling happening?
Context: I am currently using VGG16_Weights.IMAGENET1K_FEATURES.transforms() for feature extraction. My image is scaled to [0, 255], and the transformation maps it to roughly [-117, 65000]. This results in VGG16 activations around 700,000 right before the classification head. If I instead manually scale the image to [0, 1] first, the highest activation is around 236, which seems more reasonable to me.
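The two ranges I am seeing are consistent with the normalize arithmetic alone. A small sketch of both scenarios, using the preset's green-channel values (plain Python, torchvision not needed):

```python
mean = 0.45882        # green-channel mean from the IMAGENET1K_FEATURES preset
std = 1.0 / 255.0     # green-channel std from the same preset

# Scenario 1: float input in [0, 255] fed straight into normalize
# (i.e. assuming no rescale happened on the way).
unscaled_max = (255.0 - mean) / std  # ≈ 64908

# Scenario 2: input manually rescaled to [0, 1] before normalize.
scaled_max = (1.0 - mean) / std      # ≈ 138
```

Scenario 1 reproduces the ~65,000 upper bound I observe; scenario 2 matches the far more reasonable range I get after manual rescaling.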
Thank you for any hints.
Versions
Applies to latest torchvision (0.25)