Claim by Holden Karnofsky:
"AI systems now match human performance on long-standing benchmarks for image recognition, speech recognition, and language understanding...none of these three broad capabilities are considered major challenges for AI anymore."
Agree or disagree?
"AI systems now match human performance on long-standing benchmarks for image recognition, speech recognition, and language understanding...none of these three broad capabilities are considered major challenges for AI anymore."
Agree or disagree?
Comments
Jeez these people. Someday soon, an AI is going to tell them to drink the kool-aid and they will do it.
Someone is out of touch with reality.
No digital machine "understands" anything. It performs algorithmic computations, but it understands nothing.
A Tesla cannot understand that the object in front of it is a truck, and will kill you by slamming into it.
https://www.youtube.com/watch?v=mPUGh0qAqWA
It's mostly a rhetorical transition to the point that assuming "AI can't" equals "AI won't" is risky, baseless thinking in regulation.
But (my reading is) betting on these known weaknesses shouldn't underpin regulatory strategy. E.g., we can't assume logic puzzles can serve as the new CAPTCHA.
What is your opinion on this?
I'm not so confident all of them can realistically be ironed out.
But yes, image and speech and other pattern recognition abilities have been achieved.
https://osf.io/preprints/psyarxiv/7zcj8
Humans just don't perform very well? On the benchmarks, and anecdotally, I would think a modern model is operating in the highest decile.
Teaching similar models how to grade cucumbers has no working interface the way human-human collaboration does. (Yet!)
Image recognition appears to be somewhat brittle, but it's borderline fair to say it's mostly solved.
Language and speech have been solved, resoundingly and indisputably.
There's a lot of bizarre whataboutism and cope in the replies here ("energy use!!") because they find that answer uncomfortable(???)
however there is something funny about trying to split something as thin as a hair when the collective of hair splitters can't remotely agree where to aim the blade
not well enough to even pluck a hair
let alone split it
one also suspects they don't really care to discuss said contours, but rather want to "rule it out at the outset"
which Haidt I believe got from somewhere else, possibly Shweder
https://bsky.app/profile/cam.fyi/post/3lfbfoo4h3k2q
And that seems plausible.
But besting people on classification benchmarks does not mean besting them on general perception needs in the wild.
"AI" doesn't "understand" anything. This is obvious both to anyone who has ever had an original thought, and to anyone who actually understands what "AI" is and how it works.
I can think only of modeling another person's goals, like the game where an AI is given a key to a vault and must not open the vault for a thief when exposed to the internet.
What we are still missing is rapid skill acquisition.
The "benchmarks" appear to have been chosen based on their difficulty to contemporary models, not as indicators of broad capability.
Winograd schemas and pen-problems are fun gotchas, but solving them does not imply a fundamental shift in how LLM internals work.
If these skills were truly evaluable, we'd also be better at assessing humans.