llm-speed

qwen3-coder-bench-32k on M4 Max (40-core GPU) + 128GB unified

suite suite-v1
cli 0.0.3
signed ya0OvDcfH4…
submitted Jun 27, 2026

Workload results

| Workload | Backend | Model | decode (tok/s) | prefill (tok/s) | TTFT (ms) | p50 (ms) | p95 (ms) |
|---|---|---|---|---|---|---|---|
| chat-short | ollama@0.30.11 | qwen3-coder-bench-32k Q4_K_M | 113.3 | 84.12 | 1,308 | 8.9 | 9.2 |
| chat-long | ollama@0.30.11 | qwen3-coder-bench-32k Q4_K_M | 97.40 | 1,342.8 | 2,344 | 10.3 | 10.8 |
| concurrent-decode | ollama@0.30.11 | qwen3-coder-bench-32k Q4_K_M | 109.7 | - | - | 9.1 | 9.5 |
| agent-trace | ollama@0.30.11 | qwen3-coder-bench-32k Q4_K_M | 103.2 | 3,371.6 | 477 | 9.5 | 10.5 |
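As a rough reading aid, the chat-short figures can be combined into per-token and end-to-end estimates. The sketch below is an illustration, not how llm-speed computes its numbers: it assumes decode throughput stays steady after the first token, and the 256-token response length is a hypothetical value chosen for the example.

```python
def ms_per_token(decode_tok_s: float) -> float:
    """Mean time per generated token, in milliseconds."""
    return 1000.0 / decode_tok_s


def estimate_response_seconds(ttft_ms: float, decode_tok_s: float, n_tokens: int) -> float:
    """Estimate wall-clock time for a response of n_tokens.

    Assumption: TTFT covers the first token, and the remaining
    n_tokens - 1 tokens arrive at the steady decode rate.
    """
    if n_tokens < 1:
        raise ValueError("n_tokens must be at least 1")
    return ttft_ms / 1000.0 + (n_tokens - 1) / decode_tok_s


# Values from the chat-short row of the results table.
CHAT_SHORT_DECODE_TOK_S = 113.3
CHAT_SHORT_TTFT_MS = 1308.0
# Hypothetical response length, not part of the published run.
HYPOTHETICAL_OUTPUT_TOKENS = 256

print(f"{ms_per_token(CHAT_SHORT_DECODE_TOK_S):.2f} ms per token")
print(
    f"{estimate_response_seconds(CHAT_SHORT_TTFT_MS, CHAT_SHORT_DECODE_TOK_S, HYPOTHETICAL_OUTPUT_TOKENS):.2f} s "
    f"for {HYPOTHETICAL_OUTPUT_TOKENS} tokens"
)
```

The mean of roughly 8.8 ms per token sits close to the reported p50 of 8.9 ms. That would be consistent with the p50 and p95 columns being per-token latencies, though the page does not say so.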

Reproduce on your machine

Same workload, same model, signed on your rig. The exact command that produced this run:

$ pipx install llm-speed && llm-speed bench --model 'qwen3-coder-bench-32k' --workload 'chat-short'

Runs in about a minute. Your result lands on the leaderboard, signed and linkable.
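The command above covers only the chat-short workload. To repeat all four rows of the results table, one option is a small wrapper such as the sketch below. It uses only the `--model` and `--workload` flags shown in the command, and it assumes the other workload names from the table are accepted by `--workload` in the same way, which this page does not confirm. By default it prints the commands instead of running them.

```python
import shlex
import subprocess

MODEL = "qwen3-coder-bench-32k"
# Workload names as listed in the results table; only chat-short
# appears in the published command, the rest are assumed to work alike.
WORKLOADS = ["chat-short", "chat-long", "concurrent-decode", "agent-trace"]


def build_bench_command(model: str, workload: str) -> list[str]:
    """Build the argument list for one benchmark invocation."""
    return ["llm-speed", "bench", "--model", model, "--workload", workload]


def run_all(dry_run: bool = True) -> list[list[str]]:
    """Print each command (dry run) or execute it, one workload at a time."""
    commands = [build_bench_command(MODEL, w) for w in WORKLOADS]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops at the first failing workload.
            subprocess.run(cmd, check=True)
    return commands


run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` would execute the commands, which requires `llm-speed` to be installed and on the PATH.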

Embed this run

Drop the badge into a README, blog post, or signature. Each render is a backlink to the signed result.

[![llm-speed: 113 tok/s on M4 Max (40-core GPU) (qwen3-coder-bench-32k)](https://llm-speed.com/badge/r_roktphpc--8.svg)](https://llm-speed.com/r/r_roktphpc--8)
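If the same badge is needed in several places, the snippet can be generated instead of pasted. The sketch below infers the URL pattern (`/badge/<run id>.svg` and `/r/<run id>`) from the one snippet on this page; the pattern is an assumption, not a documented API, and the function names are hypothetical.

```python
import html

BASE_URL = "https://llm-speed.com"


def badge_markdown(run_id: str, alt_text: str) -> str:
    """Markdown image link: the badge SVG wrapped in a link to the result page."""
    badge_url = f"{BASE_URL}/badge/{run_id}.svg"
    result_url = f"{BASE_URL}/r/{run_id}"
    return f"[![{alt_text}]({badge_url})]({result_url})"


def badge_html(run_id: str, alt_text: str) -> str:
    """HTML equivalent, for pages that do not render Markdown."""
    badge_url = f"{BASE_URL}/badge/{run_id}.svg"
    result_url = f"{BASE_URL}/r/{run_id}"
    alt = html.escape(alt_text, quote=True)
    return f'<a href="{result_url}"><img src="{badge_url}" alt="{alt}"></a>'


ALT = "llm-speed: 113 tok/s on M4 Max (40-core GPU) (qwen3-coder-bench-32k)"
print(badge_markdown("r_roktphpc--8", ALT))
print(badge_html("r_roktphpc--8", ALT))
```

The first line printed reproduces the Markdown snippet above; the second is an HTML variant of the same link.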

Provenance

Run ID: r_roktphpc--8
Fingerprint hash: 275ecb2b79296aab
Public key: ya0OvDcfH4La0mEhEhM8iESwvi9/MZz9uibPMNfpovE=
Received: 2026-06-27 16:14:03
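A few basic consistency checks can be run on the provenance fields without any knowledge of the signing scheme. The sketch below confirms only that the public key is valid base64, that the truncated `signed` value in the page header is a prefix of it, and that the fingerprint is valid hex. It does not verify the signature itself; the page does not state the signature algorithm or the signed payload.

```python
import base64

# Values copied from the Provenance section and the page header.
PUBLIC_KEY_B64 = "ya0OvDcfH4La0mEhEhM8iESwvi9/MZz9uibPMNfpovE="
SIGNED_PREFIX = "ya0OvDcfH4"  # the header shows this followed by an ellipsis
FINGERPRINT_HEX = "275ecb2b79296aab"


def decode_public_key(b64_text: str) -> bytes:
    """Strictly decode the base64 key; raises binascii.Error on bad input."""
    return base64.b64decode(b64_text, validate=True)


key_bytes = decode_public_key(PUBLIC_KEY_B64)
fingerprint_bytes = bytes.fromhex(FINGERPRINT_HEX)

print(len(key_bytes))                            # 32
print(PUBLIC_KEY_B64.startswith(SIGNED_PREFIX))  # True
print(len(fingerprint_bytes))                    # 8
```

A 32-byte key is consistent with an Ed25519 public key, among other schemes, but the length alone does not establish which one is in use.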