Confidently wrong: no model so far has been able to answer this correctly. Not o1 pro, not Gemini Advanced, not Claude Opus. The "better" the model, the more confident it was in its wrong answer.

At least Mistral and Claude Sonnet were able to say they didn't know.

Post image
