I implemented my own Llama model to learn the LLM architecture, following this helpful tutorial 🤩.
- Sampling Module
- Simple Trainer & LoRA Trainer
| Module | File | Reference |
|---|---|---|
| RMSNorm | norm.py | tutorial, Llama RMSNorm |
| Vocabulary Embedding | vocab_emb.py | tutorial, ChatGLM |
| NTK-aware RoPE | pos_emb.py | tutorial, Llama RotaryEmb |
| Offline/Online Sliding Window Attention | attention.py | tutorial, LlamaAttention, GQA |
| MoE MLP | mlp.py | tutorial, LlamaMLP, Mixtral MoE |
| KV Cache Manager | TransformerDecoderKVCache | tutorial, HF Cache, vLLM PagedAttention |
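For reference, the normalization used throughout is RMSNorm. A minimal NumPy sketch (the actual `norm.py` implementation may differ; `rms_norm` and its `gain`/`eps` parameters are illustrative names, not the repo's API):

```python
import numpy as np

def rms_norm(x: np.ndarray, gain: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # RMSNorm: rescale each vector by the reciprocal of its root-mean-square,
    # then apply a learned per-dimension gain. Unlike LayerNorm there is no
    # mean subtraction and no bias term.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

x = np.array([[3.0, 4.0]])
g = np.ones(2)
y = rms_norm(x, g)  # each row now has unit root-mean-square
```

With a unit gain, the output of RMSNorm always has root-mean-square ≈ 1 along the last axis, which is what stabilizes activations across layers.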
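The NTK-aware RoPE row refers to scaling the rotary base so the context window can be stretched without retraining. A hedged sketch, assuming the commonly used `base * scale**(d/(d-2))` scaling rule (function names here are illustrative, not the `pos_emb.py` API):

```python
import numpy as np

def ntk_rope_freqs(head_dim: int, base: float = 10000.0, scale: float = 1.0) -> np.ndarray:
    # NTK-aware scaling: enlarge the base by scale**(d/(d-2)) so the lowest
    # frequencies stretch the most, extending the usable context length.
    base = base * scale ** (head_dim / (head_dim - 2))
    return 1.0 / base ** (np.arange(0, head_dim, 2) / head_dim)

def apply_rope(x: np.ndarray, pos: float, freqs: np.ndarray) -> np.ndarray:
    # Rotate each consecutive (even, odd) channel pair by angle pos * freq.
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

Because RoPE is a pure rotation of channel pairs, it preserves the norm of each head vector; position 0 leaves the input unchanged.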
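The KV cache manager row covers reusing past key/value projections during decoding. A minimal append-style sketch of the idea, not the repo's `TransformerDecoderKVCache` and not vLLM's paged allocation (class and method names are illustrative):

```python
import numpy as np

class SimpleKVCache:
    """Minimal per-layer KV cache: append new key/value tensors along the
    sequence axis so each decode step reuses all past projections instead
    of recomputing them."""

    def __init__(self):
        self.k = None
        self.v = None

    def update(self, k_new: np.ndarray, v_new: np.ndarray):
        # Tensors are shaped (heads, seq_len, head_dim); concatenate on seq_len.
        if self.k is None:
            self.k, self.v = k_new, v_new
        else:
            self.k = np.concatenate([self.k, k_new], axis=-2)
            self.v = np.concatenate([self.v, v_new], axis=-2)
        return self.k, self.v
```

Real managers (HF `Cache`, vLLM PagedAttention) add preallocation, eviction, and block-based paging on top of this basic append-and-return contract.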