Install uv:
curl -LsSf https://astral.sh/uv/install.sh | shInstall dependencies:
uv syncAdd your HuggingFace token to .env:
HF_TOKEN=hf_...
uv run --env-file .env greedy_decode.pyResults on 1x 4090:
- Time taken: 0.75s
- Output throughput: 33.44 tok/s
uv run --env-file .env speculative_decode.pyResults on 1x 4090:
- Time taken: 0.62s
- Output throughput: 40.16 tok/s