every LLM task needs a curated, iteratively developed quant eval. everyone who deploys a single prompt is a data scientist now, whether they realize it or not. there is a lot of money to be made in helping orgs internalize this. - ThreadSky

Reposted from Eugene Yan

Repeat after me:

I will build evals for my tasks.
I will build evals for my tasks.
I will build evals for my tasks.
