my take is that Reinforcement Learning Works, in that you can make current-day LLMs really good at basically any task if you can find enough problems and create a robust enough reward signal

but the dream of a generalist model being Smart Enough to do these things emergently is far off
