llm-speed

Qwen3.6-35B-A3B-Q4_K_M.gguf

26 workload results across 1 hardware configuration.

Fastest local config

224.0 decode tok/s

on RTX 5090 (32GB) + AMD Ryzen 7 9850X3D 8-Core Processor (8c) + 30GB via llama.cpp

Local runs (26 runs)

Runs from contributors' own machines via MLX, llama.cpp, vLLM, exllamav2, or ollama. Signed on the submitter's hardware.

RTX 5090 (32GB) + AMD Ryzen 7 9850X3D 8-Core Processor (8c) + 30GB

| Workload | Backend | Quant | decode tok/s | prefill tok/s | TTFT | Run |
|---|---|---|---|---|---|---|
| chat-short | llama.cpp | | 223.3 | no data | 263 ms | r_dp6sr2_iwcf |
| chat-short | llama.cpp | | 224.0 | no data | 167 ms | r_a56-wxl21lk |
| chat-short | llama.cpp | | 216.5 | no data | 185 ms | r_k070lz99uzi |
| chat-short | llama.cpp | | 221.6 | no data | 207 ms | r_dqelw_a8f2c |
| chat-short | llama.cpp | | 221.8 | no data | 204 ms | r_c25tn-sl4bn |
| chat-short | llama.cpp | | 208.6 | no data | 220 ms | r_d1ydh-i8f9j |
| chat-short | llama.cpp | | 221.0 | no data | 166 ms | r_36c-t9il61l |
| chat-short | llama.cpp | | 222.4 | no data | 172 ms | r_xbis0gb1uq3 |
| chat-short | llama.cpp | | 216.6 | no data | 349 ms | r_rrrs_2wn0__ |
| chat-short | llama.cpp | | 223.8 | no data | 280 ms | r_10yvku19-xt |
| chat-short | llama.cpp | | 216.3 | no data | 170 ms | r_sgqjiuomsug |
| chat-short | llama.cpp | | 219.8 | no data | 179 ms | r_sc_ghtmn-ti |
| chat-short | llama.cpp | | 223.8 | no data | 148 ms | r_p-pbvaxim4m |
| chat-short | llama.cpp | | 224.0 | no data | 170 ms | r_g1aw3mizd32 |
| chat-short | llama.cpp | | 222.4 | no data | 145 ms | r_hj-6scdybyq |
| chat-short | llama.cpp | | 222.4 | no data | 294 ms | r_xsajby03vhb |
| chat-short | llama.cpp | | 223.3 | no data | 154 ms | r_cxuvzpfj-yn |
| chat-short | llama.cpp | | 212.8 | no data | 168 ms | r_a9n_rervciy |
| chat-short | llama.cpp | | 217.4 | no data | 199 ms | r_148vtu_7e08 |
| chat-short | llama.cpp | | 214.0 | no data | 211 ms | r_pnei9cgfzl5 |
| chat-short | llama.cpp | | 219.0 | no data | 152 ms | r_f14_uxceu0r |
| chat-short | llama.cpp | | 216.2 | no data | 243 ms | r_c6t0q3m_tti |
| chat-short | llama.cpp | | 223.2 | no data | 191 ms | r_ta1y71vmvhf |
| chat-short | llama.cpp | | 212.6 | no data | 240 ms | r_ql_2f6nzxj_ |
| chat-short | llama.cpp | | 217.5 | no data | 288 ms | r_0-uc27u4l1s |
| chat-short | llama.cpp | | 217.0 | no data | 295 ms | r_udgtg0c3pjv |
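
The headline figure of 224.0 decode tok/s appears to be the maximum over the 26 chat-short runs listed above. The following is a minimal Python sketch for checking that and for summarizing the spread across runs. The (decode tok/s, TTFT ms) pairs are copied by hand from the table; the names RUNS and summarize are illustrative assumptions and not part of llm-speed or llama.cpp.

```python
from statistics import mean, median

# (decode tok/s, TTFT in ms) for each chat-short run, copied from the table above.
# RUNS and summarize are hypothetical names used only for this illustration.
RUNS = [
    (223.3, 263), (224.0, 167), (216.5, 185), (221.6, 207), (221.8, 204),
    (208.6, 220), (221.0, 166), (222.4, 172), (216.6, 349), (223.8, 280),
    (216.3, 170), (219.8, 179), (223.8, 148), (224.0, 170), (222.4, 145),
    (222.4, 294), (223.3, 154), (212.8, 168), (217.4, 199), (214.0, 211),
    (219.0, 152), (216.2, 243), (223.2, 191), (212.6, 240), (217.5, 288),
    (217.0, 295),
]


def summarize(runs):
    """Return basic summary statistics for a list of (decode, ttft_ms) pairs."""
    decode = [d for d, _ in runs]
    ttft = [t for _, t in runs]
    return {
        "runs": len(runs),
        "decode_max": max(decode),
        "decode_min": min(decode),
        "decode_mean": round(mean(decode), 1),
        "decode_median": median(decode),
        "ttft_min_ms": min(ttft),
        "ttft_max_ms": max(ttft),
        "ttft_median_ms": median(ttft),
    }


summary = summarize(RUNS)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Reporting the median alongside the maximum gives a fairer picture of typical throughput, since a single best run can sit several tok/s above the rest.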

Qwen3.6-35B-A3B-Q4_K_M.gguf on hardware