Looking forward to the replies
Reposted from Simon Willison
The article I most want to read right now is a detailed breakdown of a prompt engineering project from somebody who hand-wrote their own simple automated evals and then used those to iterate on a prompt over time, measuring the impact each change had on their eval score and shipping an improved app
