llm-speed

CodeLlama-13b-Instruct-hf-4bit-MLX

1 workload result across 1 hardware configuration.

M3 Ultra (60-core GPU) + 96GB unified

Workload | Backend | Quant | decode tok/s | prefill tok/s | TTFT | Run
chat-short | mlx@0.31.3 | | no data | no data | no data | r_kl6m821pgbh
