llm-speed · Leaderboard / Showdown

Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct

A single shareable card for the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct matchup. Numbers are the best decode tok/s submitted to the llm-speed suite; each side links back to the run it came from.
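For readers unfamiliar with the metric, "decode tok/s" is the standard decode-throughput definition: generated tokens divided by wall-clock decode time. This sketch is illustrative only and is not code from the llm-speed suite:

```python
def decode_tok_per_s(tokens_generated: int, decode_seconds: float) -> float:
    """Decode throughput: tokens produced during generation (excluding the
    prompt/prefill phase) divided by the wall-clock time spent decoding."""
    return tokens_generated / decode_seconds

# e.g. 512 tokens generated in 8 seconds of decode time
print(decode_tok_per_s(512, 8.0))  # 64.0 tok/s
```

The leaderboard reports the best such number among all submitted runs for each model.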


Best decode tok/s across every rig submitted for each model.

Qwen · 7B, decode tok/s (best submitted run): no submitted run yet
Meta · 8B, decode tok/s (best submitted run): no submitted run yet
Neither side has submitted data yet, so the card above is empty. Run the suite on either model to start filling it in.

Want to fill the gap?

Neither side of this matchup has a submitted run yet. Run the suite and the next refresh of this page will show your number, with the run id linked.

$ pipx install llm-speed && llm-speed bench

Need the long-form table? Open the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct comparison for every overlapping (model × hardware) row, source runs, and methodology.