-
Notifications
You must be signed in to change notification settings - Fork 84
Open
Description
Hello,
I'm trying to add adapters to the codet5p-110m-embedding model, but I encountered an error stating that the architecture is not supported and that I need to specify the modified modules manually.
I provided the modules explicitly, for example:
modified_modules = ['encoder.block.1.layer.0.SelfAttention']
When I visualize the model, I can see that an adapter layer has been inserted at the specified module. However, the adapter layer appears empty, and upon debugging, I noticed that the instantiate function in the AdapterLayer class is not being executed at all.
Do I need to modify anything in the code beyond specifying the modified_modules parameter to ensure the adapters are instantiated and functional?
Thank you for your help!
T5Stack(
(embed_tokens): Embedding(32103, 768)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
(relative_attention_bias): Embedding(32, 12)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=768, out_features=3072, bias=False)
(wo): Linear(in_features=3072, out_features=768, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
(adapter): AdapterLayer()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=768, out_features=3072, bias=False)
(wo): Linear(in_features=3072, out_features=768, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(2-11): 10 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=768, out_features=768, bias=False)
(k): Linear(in_features=768, out_features=768, bias=False)
(v): Linear(in_features=768, out_features=768, bias=False)
(o): Linear(in_features=768, out_features=768, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseActDense(
(wi): Linear(in_features=768, out_features=3072, bias=False)
(wo): Linear(in_features=3072, out_features=768, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): ReLU()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels