Hey there, I appreciate what you guys are doing; it's great work.
I'm trying to load the model weights from HF using the transformers library, but I'm stuck on a swiglu error. Any help with that would be really great. Second, where can I find a direct implementation of the attn-360 or 1.4b variant? I have a 1-billion-token dataset extracted from the Pile that I want to use for off-the-shelf training of the attn-360 models.