Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct
A single shareable card for the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct matchup. Each number is the best decode speed, in tokens per second (tok/s), submitted to the llm-speed suite, and each side links back to the run it came from.
Best decode tok/s across every rig submitted for each model.
At least one side has no submitted data yet, so this card shows only what is known so far. Run the suite on the missing side to fill it in.
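As a rough illustration of how a "best decode tok/s per model" figure could be derived, the sketch below takes the maximum over all submitted runs for each model and reports a missing side explicitly. The record fields (`run_id`, `model`, `rig`, `decode_tok_s`), the helper names, and every value in `runs` are hypothetical placeholders invented for this example; they are not the llm-speed data format or real results.

```python
from typing import Optional

# Hypothetical run records. The ids, rigs, and speeds are made-up
# placeholders for illustration only, not real benchmark results.
runs = [
    {"run_id": "run-a", "model": "Qwen2.5-7B-Instruct", "rig": "rig-1", "decode_tok_s": 41.5},
    {"run_id": "run-b", "model": "Qwen2.5-7B-Instruct", "rig": "rig-2", "decode_tok_s": 58.0},
]


def best_run(runs: list, model: str) -> Optional[dict]:
    """Return the run with the highest decode tok/s for a model, or None."""
    candidates = [r for r in runs if r["model"] == model]
    if not candidates:
        return None  # this side of the matchup has no submitted run
    return max(candidates, key=lambda r: r["decode_tok_s"])


def matchup(runs: list, left: str, right: str) -> dict:
    """Pair the best run of each model; a missing side maps to None."""
    return {left: best_run(runs, left), right: best_run(runs, right)}


card = matchup(runs, "Qwen2.5-7B-Instruct", "Llama-3.1-8B-Instruct")
for model, run in card.items():
    if run is None:
        print(f"{model}: no submitted run yet")
    else:
        print(f"{model}: {run['decode_tok_s']} tok/s (run {run['run_id']})")
```

Keeping the whole winning record, rather than only the number, is what would let each side link back to its source run.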
Want to fill the gap?
At least one side of this matchup has no submitted run yet. Run the suite, and the next refresh of this page will show your number with the run id linked.
$ pipx install llm-speed && llm-speed bench
Need the long-form table? Open the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct comparison for every overlapping (model × hardware) row, source runs, and methodology.