If I have time I'll put together a more detailed thread tomorrow, but for now, I think this new paper about limitations of Chain-of-Thought models could be quite important. Worth a look if you're interested in these sorts of things.
https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf
Comments
You can prompt an LLM to behave in a roughly CoT-like fashion, but there are also models explicitly designed to work that way, and that is usually what "CoT" refers to.
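To make the distinction concrete: "prompted CoT" is just an instruction appended to the input, whereas a reasoning model is trained to emit the intermediate trace by default (the class of model the linked paper evaluates). A minimal sketch, where `query_llm` is a hypothetical placeholder and not any real provider's API:

```python
def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM completion call."""
    raise NotImplementedError("wire this up to your provider of choice")

def ask_direct(question: str) -> str:
    # Plain prompting: the model is expected to answer in one shot.
    return query_llm(question)

def ask_with_cot_prompt(question: str) -> str:
    # "Prompted CoT": the same base model, merely instructed to emit
    # intermediate steps before the final answer.
    return query_llm(
        question + "\nLet's think step by step, then state the final answer."
    )

# A dedicated reasoning model needs no such instruction: producing the
# intermediate trace is part of its training, not the prompt.
```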
And I'm also not a fan of following up a scientific journal article with a magazine article… appreciate your thoughts/thread
Whether "large" refers to a large volume of language (many words and phrases) or to large items of language (big words), the idea is easier to grasp than SP = stochastic parrot as a description of "AI" tools (with little intelligence).
https://en.wikipedia.org/wiki/Stochastic_parrot
Does a machine have cognition? I think we should keep the AI's underlying process in mind when defining it.
The real issue is what's meant by "modelling of reasoning going on."
I wish the AI companies would publish full details of how their models work, so that researchers could analyze their principles theoretically instead of this black-box prodding.