I wonder if it seems more impressive to you because it is parroting something in your area of expertise. As a layman I'm not sure how this is any different from, say, generating a song. The only difference is the data it's been trained on.
Comments
the part where its output is confidently wrong and full of blunders is where the illusion falls apart. not really any different from AI art not knowing how to do hands (maybe hands are better now, I don't keep up with it)
Finding the objectively correct answer to a hyper-specific question is miles different.
It also concretely demolishes the naysaying that LLMs “parrot” their answers. It’s borderline impossible for the training data to have had that exact question. It had to perform actual reasoning to get there.
"parrot" may be inaccurate, but so is "perform actual reasoning". if that were the case, it wouldn't provide confident, wildly incorrect responses. it's not thinking, it's resolving noise into a signal.
I hate these things because of some irrational humanism in my brain, but the amount of denial that writes LLMs off as mere parrots is annoying. Yes, this is impressive.
Evaluating a song is a subjective thing. I am interested in its logical reasoning ability on questions with objectively correct answers but which can't simply be looked up.
I don't think "logical reasoning ability" is the right way to think about what it's doing. training an LLM is not like teaching math to a human. consider the *ways* it gets things wrong vs the ways humans get things wrong.
You don't have to tell me that. But I am curious about how well it works regardless. I am doing the empirical investigation I ought to do to earn the right to the views I hold. I want to understand truth and reality.