These are some truly incredible results. It is of course ridiculous to try to pretend that this model is really doing program synthesis (and thus previous claims about LLMs being an ‘off-ramp to intelligence’ are vindicated). A neural network has now matched average human performance on ARC.
Reposted from fchollet
Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks.
