Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct

A single shareable card for the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct matchup. Each number is the best decode tok/s submitted to the llm-speed suite for that model, and each side links back to the run it came from.

Best decode tok/s across every rig submitted for each model.

Qwen · 7B: no submitted run yet (decode tok/s, best submitted run)
Meta · 8B: no submitted run yet (decode tok/s, best submitted run)
Neither side has a submitted run yet. Run the suite on either model to fill it in.

Want to fill the gap?

Neither side of this matchup has a submitted run yet. Run the suite and the next refresh of this page will show your number, with the run id linked.

$ pipx install llm-speed && llm-speed bench

Need the long-form table? Open the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct comparison for every overlapping (model × hardware) row, source runs, and methodology.
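
As a rough illustration of how a card like this could pick "the best decode tok/s across every rig" for each model, here is a minimal Python sketch. It is not the suite's actual code: the record fields (`run_id`, `model`, `rig`, `decode_tok_s`), the helper names, and the sample values are all hypothetical placeholders, not real results.

```python
from typing import Optional

# Hypothetical run records. The field names and numbers are made-up
# placeholders for illustration, not the llm-speed schema or real results.
RUNS = [
    {"run_id": "run-a", "model": "Qwen2.5-7B-Instruct", "rig": "rig-1", "decode_tok_s": 10.0},
    {"run_id": "run-b", "model": "Qwen2.5-7B-Instruct", "rig": "rig-2", "decode_tok_s": 12.5},
]


def best_decode(runs: list, model: str) -> Optional[dict]:
    """Return the run with the highest decode tok/s for `model`, or None."""
    candidates = [r for r in runs if r["model"] == model]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["decode_tok_s"])


def card_line(runs: list, model: str) -> str:
    """Format one side of the card, falling back when no run exists."""
    best = best_decode(runs, model)
    if best is None:
        return f"{model}: no submitted run yet"
    return f"{model}: {best['decode_tok_s']:.1f} tok/s (run {best['run_id']})"


if __name__ == "__main__":
    for name in ("Qwen2.5-7B-Instruct", "Llama-3.1-8B-Instruct"):
        print(card_line(RUNS, name))
```

Keeping the whole winning record, rather than only the number, is what lets each side link back to its source run.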