About llm-speed

llm-speed is a community benchmark for LLM inference speed. One CLI, one methodology, real numbers across hosted APIs, consumer GPUs, Apple Silicon, and prosumer rigs.

Why this exists

Artificial Analysis covers hosted APIs and datacenter accelerators; MLPerf covers enterprise rigs; r/LocalLLaMA is folklore in comment threads. Nobody owns the union: consumer-local plus hosted-API numbers under one protocol, with a permalink per result. That's what this is.

Who runs it

One maintainer, publishing under the pseudonym meadow-kun. Solo project, open source, no company behind it. Issues and DMs welcome via GitHub. What you get for trusting me with a submission: every run is signed on your machine with an Ed25519 keypair, the canonical bytes are auditable from the run page, the suite version is pinned, and the entire pipeline (CLI, ingest, web) is Apache-2.0 in one public repo. Disputes happen in public GitHub issues against the run id. Numbers belong to their submitters, not to me.
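To make "canonical bytes" concrete, here is a minimal sketch of the general idea, not llm-speed's actual format. It assumes a run is a JSON-like record and that canonicalization means sorted keys, compact separators, and UTF-8 encoding. The function names, the field names, and the values are all hypothetical. The point is that two records with the same content produce identical bytes regardless of key order, so a signature computed over those bytes can be rechecked by anyone.

```python
import hashlib
import json


def canonical_bytes(run: dict) -> bytes:
    """Serialize a run record deterministically.

    Assumed scheme (illustrative only): sorted keys, no insignificant
    whitespace, UTF-8 output.
    """
    return json.dumps(
        run,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")


def digest(run: dict) -> str:
    """SHA-256 hex digest of the canonical bytes, handy for eyeballing equality."""
    return hashlib.sha256(canonical_bytes(run)).hexdigest()


# Hypothetical run records: same content, different key order.
run_a = {"suite_version": "x", "model": "example-model", "tokens_per_second": 42.5}
run_b = {"tokens_per_second": 42.5, "model": "example-model", "suite_version": "x"}

# Key order does not affect the canonical form.
same = canonical_bytes(run_a) == canonical_bytes(run_b)

# Any change to a value changes the bytes, and therefore the digest.
tampered = dict(run_a, tokens_per_second=99.0)
changed = digest(tampered) != digest(run_a)
```

An Ed25519 signature would be computed over the output of a function like canonical_bytes; the signing step is left out here because Python's standard library does not include Ed25519.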

How it works

Related work

These are the resources we've learned from and continue to point people toward:

Open source

The CLI, the methodology, and this site are open source under Apache-2.0. Submit issues or ideas via GitHub. Results belong to the people who submitted them.