We just released a set of ablation studies on TinyEngram’s hyperparameters.
If you're experimenting with Engram or wondering how its settings affect training and performance, this is a good place to discuss.
We’ve tested things like:
- N-gram order (`max_ngram_size`)
- Embedding dimension per n-gram (`n_embed_per_ngram`)
- Number of hash heads (`n_head_per_ngram`)
- Where to inject Engram in the model
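
If you're setting up your own sweep over these knobs, a minimal, hypothetical config object might look like the sketch below. The parameter names come from the ablations above, but the dataclass, the default values, and the assumption that orders 2 through `max_ngram_size` each get their own embedding are illustrative only, not values from the report:

```python
# Hypothetical sweep config; names from the ablation study, defaults invented.
from dataclasses import dataclass

@dataclass
class EngramConfig:
    max_ngram_size: int = 3      # highest n-gram order to hash
    n_embed_per_ngram: int = 64  # embedding dimension per n-gram order
    n_head_per_ngram: int = 4    # independent hash heads per n-gram order

    def total_embed_dim(self) -> int:
        # Assumes one embedding per order in 2..max_ngram_size, concatenated.
        n_orders = self.max_ngram_size - 1
        return n_orders * self.n_embed_per_ngram

cfg = EngramConfig()
print(cfg.total_embed_dim())  # 2 orders * 64 dims = 128
```

Check the actual defaults against the report before reusing any of these numbers.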
Check out the full report here: `engram_parameters_tuning.md`
Feel free to share your configs, unexpected results, tuning tips, or questions below.
Previous Discussion: #3