It's official: After more than 57 runs of the MMLU-Pro CS benchmark across 25 LLMs, with over 69 hours of runtime, QwQ-32B-Preview is THE best local model!

I'm still working on the detailed analysis, but here's the main graph that accurately depicts the quality of all tested models.
[Image: graph of MMLU-Pro CS benchmark results for all tested models]
