Finding the objectively correct answer to a hyper specific question is miles different.

It also concretely demolishes the naysaying that LLMs “parrot” their answers. It’s borderline impossible for the training data to have had that exact question. It had to perform actual reasoning to get there.

Comments