AI hype is making AI researchers forget painfully learned lessons core to the field.
There's an emerging cope that progress in capabilities isn't slowing down—it's just invisible as benchmarks are saturated; vibe checks are useless because models are now superhuman so we can't perceive improvement.
This is exactly backwards. In the early days of AI, researchers tackled chess because they thought real-world problems like computer vision would be easy!
Prediction: as AI solves "extremely hard" benchmarks, AI boosters will start to claim that we have superintelligence yet no one's using it because they're too stupid to recognize it.
(1) The actually hard problems for AI tend to be the things that benchmarks don't measure, hence the importance of vibes.
(2) Benchmarks have always been of limited value. (We've been saying this since long before they became saturated: https://www.aisnakeoil.com/p/gpt-4-and-professional-benchmarks)
(3) Adoption metrics are far more informative than decontextualized capability measurements.
HT @howard.fm