Qwen3.6-35B-A3B-Q4_K_M.gguf
26 workload results across 1 hardware configuration.
Fastest local config
224.0 decode tok/s
on RTX 5090 (32GB) + AMD Ryzen 7 9850X3D 8-Core Processor (8c) + 30GB via llama.cpp
Local runs (26 runs)
Runs from contributors' own machines via MLX, llama.cpp, vLLM, exllamav2, or ollama. Signed on the submitter's hardware.
RTX 5090 (32GB) + AMD Ryzen 7 9850X3D 8-Core Processor (8c) + 30GB
| Workload | Backend | Quant | Decode (tok/s) | Prefill (tok/s) | TTFT (ms) | Run |
|---|---|---|---|---|---|---|
| chat-short | llama.cpp | — | 223.3 | no data | 263 | r_dp6sr2_iwcf |
| chat-short | llama.cpp | — | 224.0 | no data | 167 | r_a56-wxl21lk |
| chat-short | llama.cpp | — | 216.5 | no data | 185 | r_k070lz99uzi |
| chat-short | llama.cpp | — | 221.6 | no data | 207 | r_dqelw_a8f2c |
| chat-short | llama.cpp | — | 221.8 | no data | 204 | r_c25tn-sl4bn |
| chat-short | llama.cpp | — | 208.6 | no data | 220 | r_d1ydh-i8f9j |
| chat-short | llama.cpp | — | 221.0 | no data | 166 | r_36c-t9il61l |
| chat-short | llama.cpp | — | 222.4 | no data | 172 | r_xbis0gb1uq3 |
| chat-short | llama.cpp | — | 216.6 | no data | 349 | r_rrrs_2wn0__ |
| chat-short | llama.cpp | — | 223.8 | no data | 280 | r_10yvku19-xt |
| chat-short | llama.cpp | — | 216.3 | no data | 170 | r_sgqjiuomsug |
| chat-short | llama.cpp | — | 219.8 | no data | 179 | r_sc_ghtmn-ti |
| chat-short | llama.cpp | — | 223.8 | no data | 148 | r_p-pbvaxim4m |
| chat-short | llama.cpp | — | 224.0 | no data | 170 | r_g1aw3mizd32 |
| chat-short | llama.cpp | — | 222.4 | no data | 145 | r_hj-6scdybyq |
| chat-short | llama.cpp | — | 222.4 | no data | 294 | r_xsajby03vhb |
| chat-short | llama.cpp | — | 223.3 | no data | 154 | r_cxuvzpfj-yn |
| chat-short | llama.cpp | — | 212.8 | no data | 168 | r_a9n_rervciy |
| chat-short | llama.cpp | — | 217.4 | no data | 199 | r_148vtu_7e08 |
| chat-short | llama.cpp | — | 214.0 | no data | 211 | r_pnei9cgfzl5 |
| chat-short | llama.cpp | — | 219.0 | no data | 152 | r_f14_uxceu0r |
| chat-short | llama.cpp | — | 216.2 | no data | 243 | r_c6t0q3m_tti |
| chat-short | llama.cpp | — | 223.2 | no data | 191 | r_ta1y71vmvhf |
| chat-short | llama.cpp | — | 212.6 | no data | 240 | r_ql_2f6nzxj_ |
| chat-short | llama.cpp | — | 217.5 | no data | 288 | r_0-uc27u4l1s |
| chat-short | llama.cpp | — | 217.0 | no data | 295 | r_udgtg0c3pjv |
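The "fastest local config" figure is the maximum decode rate across the runs listed for this model. As a rough illustration of how that headline number could be recomputed from the published rows, here is a minimal Python sketch (3.9+ for `str.removesuffix`). The helper names `parse_row` and `fastest_run` are hypothetical, not part of any tool used by this site, and the sample holds only the first three runs. The parser accepts numeric cells with or without a trailing `tok/s` or `ms` suffix.

```python
# Three runs from the table for this model (sample only, not all 26).
SAMPLE_ROWS = """\
| chat-short | llama.cpp | — | 223.3tok/s | no data | 263ms | r_dp6sr2_iwcf |
| chat-short | llama.cpp | — | 224.0tok/s | no data | 167ms | r_a56-wxl21lk |
| chat-short | llama.cpp | — | 216.5tok/s | no data | 185ms | r_k070lz99uzi |
"""


def parse_row(line):
    """Split one pipe-delimited table row into a dict (hypothetical helper)."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    workload, backend, _quant, decode, _prefill, ttft, run_id = cells
    return {
        "workload": workload,
        "backend": backend,
        # Drop the unit suffix if present, then convert to a number.
        "decode_tok_s": float(decode.removesuffix("tok/s")),
        "ttft_ms": float(ttft.removesuffix("ms")),
        "run": run_id,
    }


def fastest_run(rows):
    """Return the row with the highest decode rate (first one wins on ties)."""
    return max(rows, key=lambda r: r["decode_tok_s"])


rows = [parse_row(line) for line in SAMPLE_ROWS.splitlines() if line.strip()]
best = fastest_run(rows)

if __name__ == "__main__":
    print(best["run"], best["decode_tok_s"], best["ttft_ms"])
```

Note that `max` returns the first row on a tie, so when two runs share the top decode rate, the one listed earlier is reported.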