I tried adapting the model by adding a new parameter inside a submodule, as follows:
# Added within the __init__ of ESMplusplusForMaskedLM
self.some_submodel = Submodel()

# Added within the __init__ of the new Submodel class
self.register_parameter(
    "some_param", torch.nn.Parameter(torch.randn(n1, n2)))
Although torch.randn(n1, n2) essentially never returns exact zeros, the new parameter actually contains many zero values after from_pretrained and has to be reinitialized manually (otherwise the losses do not behave well):
model = ESMplusplusForMaskedLM.from_pretrained(
    path_esm, local_files_only=True, trust_remote_code=True,
)
print((model.some_submodel.some_param == 0).sum())
torch.nn.init.normal_(model.some_submodel.some_param)
print((model.some_submodel.some_param == 0).sum())
# Output:
# tensor(61426)  # after loading
# tensor(0)      # after reinitialization
Is there a recommended way to handle this in from_pretrained rather than reinitializing the parameter manually?
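One approach I considered is to teach the model's _init_weights hook about the new submodule, since from_pretrained routes weights that are missing from the checkpoint through that hook. This is only a sketch, assuming ESMplusplusForMaskedLM follows the usual transformers PreTrainedModel conventions (which I have not verified for this remote-code model), with Submodel being the placeholder class from above:

from transformers import PreTrainedModel

class ESMplusplusForMaskedLM(PreTrainedModel):  # simplified; the real base class may differ
    def _init_weights(self, module):
        super()._init_weights(module)  # keep the stock initialization
        if isinstance(module, Submodel):
            # from_pretrained calls _init_weights for modules whose weights
            # are missing from the checkpoint, so the new parameter would get
            # a proper normal init here instead of staying zero.
            torch.nn.init.normal_(module.some_param)

Would that be the intended pattern here, or is there a better-supported mechanism?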