llm-speed

Qwen3.6-27B-Q4_K_M.gguf

49 workload results across 1 hardware configuration.

Fastest local config

74.0 decode tok/s

on RTX 5090 (32GB) + AMD Ryzen 7 9850X3D 8-Core Processor (8c) + 30GB via llama.cpp

Local runs (49 runs)

Runs submitted from contributors' own machines via MLX, llama.cpp, vLLM, exllamav2, or ollama. Each run is signed on the submitter's hardware.

RTX 5090 (32GB) + AMD Ryzen 7 9850X3D 8-Core Processor (8c) + 30GB

| Workload | Backend | Quant | decode tok/s | prefill tok/s | TTFT | Run |
|---|---|---|---|---|---|---|
| chat-short | llama.cpp | | 72.38 | no data | 167 ms | r_yluotk909p8 |
| chat-short | llama.cpp | | 72.48 | no data | 172 ms | r_v2vrkr4uah1 |
| chat-short | llama.cpp | | 69.56 | no data | 227 ms | r_u4wa_y_y3vt |
| chat-short | llama.cpp | | 72.57 | no data | 172 ms | r_l-wg5a6o-vg |
| chat-short | llama.cpp | | 69.56 | no data | 174 ms | r_w482rf72v6z |
| chat-short | llama.cpp | | 72.32 | no data | 290 ms | r_ercjdbdw2gi |
| chat-short | llama.cpp | | 69.60 | no data | 382 ms | r_aj808r0dw53 |
| chat-short | llama.cpp | | 72.31 | no data | 198 ms | r_4upad7ubcpi |
| chat-short | llama.cpp | | 68.00 | no data | 241 ms | r_qiq8q_cqfk5 |
| chat-short | llama.cpp | | 72.49 | no data | 178 ms | r_3un_x00tm0t |
| chat-short | llama.cpp | | 70.89 | no data | 185 ms | r_ea1db-lhv0r |
| chat-short | llama.cpp | | 72.90 | no data | 165 ms | r_7snxv0llk6f |
| chat-short | llama.cpp | | 72.79 | no data | 297 ms | r_p307g-1cdkl |
| chat-short | llama.cpp | | 69.84 | no data | 398 ms | r_bgylub2qqr- |
| chat-short | llama.cpp | | 69.70 | no data | 167 ms | r_5b-e5ortgcv |
| chat-short | llama.cpp | | 72.68 | no data | 166 ms | r_2g-h0epeovp |
| chat-short | llama.cpp | | 70.07 | no data | 173 ms | r_es8a88t0ez0 |
| chat-short | llama.cpp | | 72.63 | no data | 170 ms | r_j0xki0x2kn2 |
| chat-short | llama.cpp | | 69.68 | no data | 178 ms | r_2673sv7x5m4 |
| chat-short | llama.cpp | | 72.54 | no data | 171 ms | r_7u98v6--uli |
| chat-short | llama.cpp | | 69.05 | no data | 191 ms | r_4fxpo_ja16m |
| chat-short | llama.cpp | | 67.86 | no data | 185 ms | r_qhp37dyi9n1 |
| chat-short | llama.cpp | | 72.56 | no data | 215 ms | r_5m7gbxjpw_c |
| chat-short | llama.cpp | | 73.20 | no data | 291 ms | r_qlo6j6lnnh3 |
| chat-short | llama.cpp | | 69.95 | no data | 291 ms | r_q5xsw1p-7f8 |
| chat-short | llama.cpp | | 70.37 | no data | 269 ms | r_wowv__0b5s6 |
| chat-short | llama.cpp | | 72.59 | no data | 314 ms | r_8qwzo79cf0s |
| chat-short | llama.cpp | | 71.99 | no data | 317 ms | r_fo83sbnxax4 |
| chat-short | llama.cpp | | 73.96 | no data | 190 ms | r_f7ulllqu4vk |
| chat-short | llama.cpp | | 72.79 | no data | 187 ms | r_uhv-sxt4h97 |
| chat-short | llama.cpp | | 71.50 | no data | 213 ms | r_gipbd9a8wrn |
| chat-short | llama.cpp | | 73.31 | no data | 296 ms | r_wax_x2ryqhk |
| chat-short | llama.cpp | | 69.65 | no data | 308 ms | r_mqx-i-thqtq |
| chat-short | llama.cpp | | 69.90 | no data | 180 ms | r_jrwk2oj1el- |
| chat-short | llama.cpp | | 72.27 | no data | 182 ms | r_432qs45nio9 |
| chat-short | llama.cpp | | 72.39 | no data | 225 ms | r_p3m8a1lj4kg |
| chat-short | llama.cpp | | 69.56 | no data | 205 ms | r_qrpvh5hd9km |
| chat-short | llama.cpp | | 69.62 | no data | 177 ms | r_th_ounxmvpx |
| chat-short | llama.cpp | | 72.44 | no data | 172 ms | r_jbit-ru039h |
| chat-short | llama.cpp | | 70.04 | no data | 194 ms | r_wmexo75dt0o |
| chat-short | llama.cpp | | 71.26 | no data | 410 ms | r_-1k-8qr_v8a |
| chat-short | llama.cpp | | 71.09 | no data | 241 ms | r_mj85rtghwmg |
| chat-short | llama.cpp | | 68.97 | no data | 189 ms | r_-cpk_tabogc |
| chat-short | llama.cpp | | 73.76 | no data | 346 ms | r_j1cr6asrnn6 |
| chat-short | llama.cpp | | 68.82 | no data | 413 ms | r_v05axrj3dti |
| chat-short | llama.cpp | | 73.56 | no data | 167 ms | r_az687c4xasr |
| chat-short | llama.cpp | | 68.61 | no data | 183 ms | r_2o9kb8hy9uk |
| chat-short | llama.cpp | | 70.98 | no data | 200 ms | r_carzhcoe-mi |
| chat-short | llama.cpp | | 72.36 | no data | 168 ms | r_uvy0zplf-sh |
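
The 74.0 decode tok/s headline appears to match the highest decode rate among the 49 runs (73.96 tok/s, shown to one decimal place); the page does not state how the figure is derived, so treat that as an assumption. A minimal Python sketch for summarizing the decode column is given below. The names `DECODE_TOK_S` and `summarize` are illustrative and not part of llm-speed.

```python
from statistics import mean, median

# Decode rates (tok/s) copied from the 49 chat-short rows, in table order.
DECODE_TOK_S = [
    72.38, 72.48, 69.56, 72.57, 69.56, 72.32, 69.60,
    72.31, 68.00, 72.49, 70.89, 72.90, 72.79, 69.84,
    69.70, 72.68, 70.07, 72.63, 69.68, 72.54, 69.05,
    67.86, 72.56, 73.20, 69.95, 70.37, 72.59, 71.99,
    73.96, 72.79, 71.50, 73.31, 69.65, 69.90, 72.27,
    72.39, 69.56, 69.62, 72.44, 70.04, 71.26, 71.09,
    68.97, 73.76, 68.82, 73.56, 68.61, 70.98, 72.36,
]


def summarize(values):
    """Return basic summary statistics for a list of throughput samples."""
    return {
        "runs": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


summary = summarize(DECODE_TOK_S)
print(summary)
# Assumption: the headline is the best run, formatted to one decimal place.
print(f"headline: {summary['max']:.1f} decode tok/s")
```

Reporting the median alongside the maximum gives a better sense of typical throughput, since a single best run can overstate what most runs achieve.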

Qwen3.6-27B-Q4_K_M.gguf on hardware