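In instruct_llama/core/generation.py, line 92, "model = model.eval()" looks wrong. It should be just "model.eval()". Assuming model is a torch.nn.Module, eval() switches the module to evaluation mode in place and returns the same object, so the reassignment is redundant rather than harmful, but the plain call is the idiomatic form.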
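
A minimal sketch to check how the two forms compare, assuming the model is a torch.nn.Module. The small Sequential below is a hypothetical stand-in, not the model built in generation.py:

```python
import torch.nn as nn

# Hypothetical stand-in for the real model; Dropout is included because
# its behavior depends on the train/eval mode.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

# eval() flips the mode on the module itself and hands back that module.
returned = model.eval()

same_object = returned is model        # no new module is created
in_training_mode = model.training      # mode flag on the top-level module
dropout_training = model[1].training   # mode flag propagated to submodules

print(same_object, in_training_mode, dropout_training)
```

If same_object is true and both training flags are false, then "model = model.eval()" and "model.eval()" leave the program in the same state, and dropping the assignment is a cleanup rather than a behavior change.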