llm-speed

x - LLM benchmarks

56 workload results across 1 model.

Fastest known config on x

10.0 decode tok/s

m via llama.cpp (Q4) - see full run

m

Workload | Backend | Quant | decode tok/s | prefill tok/s | TTFT | Run
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_0_i4fok_cfg
chat-short | llama.cpp | Q4 | 10.00 | no data | 0.0 ms | r_abgapkfvfla
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_g356kkzjf5c
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_3r1vcq0s4vo
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_dnvwv68uo3z
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_59h1mxy0mzj
chat-short | llama.cpp | Q4 | 10.00 | no data | no data | r_w6ugvsylxe7

Models measured on x

Common questions about x

Direct Q&A drawn from the runs above: fastest LLM, supported model classes, backend rankings, quantization guidance.

Read the x FAQ →