Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct

A single shareable card for the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct matchup. Each number is the best decode throughput (tok/s) submitted for that model on the llm-speed suite, and each side links back to the run it came from.

Best decode tok/s across every rig submitted for each model.

Qwen · 7B

Qwen2.5-7B-Instruct

Decode tok/s (best submitted run): no submitted run yet

Meta · 8B

Llama-3.1-8B-Instruct

Decode tok/s (best submitted run): no submitted run yet

At least one side has no submitted data, so the card above reflects only what has been submitted so far. Run the suite on the missing side to fill it in.
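As a rough illustration of how a "best submitted run" figure could be derived, the sketch below takes the maximum decode tok/s across every rig submitted for a model and falls back to the "no submitted run yet" text when nothing has been submitted. The `Run` record, its field names, and both helper functions are hypothetical and are not taken from the llm-speed suite; this is only one plausible reading of the rule described above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Run:
    """Hypothetical record for one submitted benchmark run (not llm-speed's schema)."""
    run_id: str          # assumed identifier the card would link back to
    model: str           # e.g. "Qwen2.5-7B-Instruct"
    rig: str             # free-form hardware label
    decode_tok_s: float  # decode throughput in tokens per second


def best_decode_run(runs: Iterable[Run], model: str) -> Optional[Run]:
    """Return the run with the highest decode tok/s for `model`, or None if there is none."""
    candidates = [r for r in runs if r.model == model]
    return max(candidates, key=lambda r: r.decode_tok_s, default=None)


def card_text(run: Optional[Run]) -> str:
    """Render one side of the card, mirroring the page's wording for a missing run."""
    if run is None:
        return "no submitted run yet"
    return f"{run.decode_tok_s:.1f} tok/s (run {run.run_id})"


# With no submissions at all, both sides render the same fallback text.
submitted: list[Run] = []
for name in ("Qwen2.5-7B-Instruct", "Llama-3.1-8B-Instruct"):
    print(name, "->", card_text(best_decode_run(submitted, name)))
```

Keeping the whole `Run` rather than only the number is what would let each side link back to its source run.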

Want to fill the gap?

At least one side of this matchup has no submitted run yet. Run the suite, and the next refresh of this page will show your number with the run id linked.

$ pipx install llm-speed && llm-speed bench

Need the long-form table? Open the Qwen2.5-7B-Instruct vs Llama-3.1-8B-Instruct comparison for every overlapping (model × hardware) row, source runs, and methodology.