Hi,
Thanks a lot for sharing the code. I have a small question. It seems that the biases of the layers are neither masked by the score nor frozen, and I'm confused about this. Is the bias term treated differently from the other trainable parameters (the weights)?
Thanks!