"It’s become abundantly clear over the course of 2024 that writing good automated evals for LLM-powered systems is the skill that’s most needed to build useful applications on top of these models."
"If you have a strong eval suite you can adopt new models faster, iterate better and build more reliable and useful product features than your competition."
https://simonwillison.net/2024/Dec/31/llms-in-2024/