Swiglu Issue #17

@AsadMir10

Description

Hey there, I appreciate what you're doing; it's great work.
I'm trying to load the model weights from HF using the transformers library, but I'm stuck on a swiglu error. Any help with that would be great. Secondly, where can I find a direct implementation of the attn-360 or 1.4b variant? I have a 1-billion-token dataset extracted from the Pile that I'd like to use for off-the-shelf training on the attn-360 models.
