Qwen2.5-72B-Instruct vs Llama-3.3-70B-Instruct

Side-by-side decode throughput (tok/s), prefill throughput (tok/s), and time to first token (TTFT) for Qwen2.5-72B-Instruct and Llama-3.3-70B-Instruct, sourced from community-submitted runs of the llm-speed suite. Every number on this page links back to the run it came from.

model

Qwen2.5-72B-Instruct

Qwen · 72B

View Qwen2.5-72B-Instruct page →

model

Llama-3.3-70B-Instruct

Meta · 70B

View Llama-3.3-70B-Instruct page →

No overlapping Qwen2.5-72B-Instruct ↔ Llama-3.3-70B-Instruct benchmarks yet.

No submitted run covers both models in this comparison yet. Run the suite on either model to populate this page:

$ pipx install llm-speed && llm-speed bench

Read the methodology