If I had a coworker who gave me incorrect answers to questions more than half the time I talked to him, I would simply stop talking to him. That guy would get fired.
Reposted from Noah Shachtman
OpenAI's "most powerful system" makes shit up more than half of the time. For its "o4-mini" model, the rate is ***79 percent***
