ai-notes.bsky.social
The value of a person in no way depends on their intelligence.
109 posts
538 followers
483 following
comment in response to
post
I think for most tasks, the bottleneck is reliability, not capability. So even though capability is definitely increasing on some dimensions (for whatever reason, scaling or otherwise, I don't know), most people just don't notice. Very, very few people need the math abilities of o1-preview.
comment in response to
post
To put it another way: some folks in the NLP community would be horrified if they knew what people actually use search engines for!
comment in response to
post
It's a funny analogy, but I think the situation might be subtler than this. People use search engines for all sorts of things, not just information retrieval. For some of these other tasks, isn't it conceivable that AI would be more fit for purpose?
comment in response to
post
People in science and technology are seeing something very different from people in the humanities, but I think that's a temporary phase.
comment in response to
post
Isn't this just a matter of different subdisciplines using the word "model" in different ways? I feel like I'm watching a mathematician complaining that fields aren't just a bunch of grass; they have to be commutative.
comment in response to
post
Real-world usage spans a very broad set of tasks. Look at the data yourself if you don't believe me, e.g.:
www.nber.org/papers/w32966
And true generality is definitely an engineering goal—it's the famous G in "AGI." All frontier model companies are public and explicit about this.
comment in response to
post
I don't know of any technology adopted as fast as ChatGPT. Examples that are close (personal computers, the internet) indeed became pervasive and foundational. E.g. see www.stlouisfed.org/on-the-econo...
comment in response to
post
I've met a lot of people who are 100% certain that AI will flop. That's probably who this kind of language is aimed at. I completely agree it would be better if they hedged and said, "There's a decent chance AI will be pervasive, and we want you to help decide how we use it."
comment in response to
post
LLM-based chatbots are built for general use and in practice are used for a wide variety of things. I'm genuinely curious: what leads you to see them as application-specific artifacts? Or is this more of a normative statement, that you wish they'd be built and used in a more targeted way?
comment in response to
post
I think it sets a baseline, but not a ceiling. And LLMs have blown way past my baseline expectations for what I guessed next-token prediction would produce. Isn't it at least a reasonable hypothesis that they may be learning something deep as a byproduct of a superficial training task?
comment in response to
post
LLMs are a technique, not a tool: they're not "meant" for anything. (Is the fast Fourier transform "meant" for audio engineering or detecting nuclear tests? Why not both?) And at this point, the best LLM-based systems are far better than the average person at math. Surely that's worth exploring?
comment in response to
post
Oh, I see what you're saying! That is interesting, and I don't know of any studies.
comment in response to
post
The belief was that this made it easier to learn to translate the first word, which then made it easier to learn to translate the second, etc. I don't know if they ran careful experiments to show this was the mechanism.
comment in response to
post
I think there might be more to the story. One of the biggest AI believers I know (1) is a socially adept extrovert and (2) was incredibly skeptical, right up until LLMs became good enough to help him write a certain type of specialized code much faster.
comment in response to
post
I believe you. There seem to be dramatic differences between subdisciplines. In your work it's useless, but in chemistry, it just won a Nobel. As we figure out what universities should do, I find it helpful to take into account how different our various experiences are.
comment in response to
post
I think her analysis of the structural pressures on universities is excellent! But what I'm seeing on the ground is a mix of those pressures with "endogenous" aspects of the technology itself: its enormous utility for certain kinds of work, and its rapid improvement. Those are critical factors, too.
comment in response to
post
Excellent mini-talk! One missing variable is that many profs (in physics, chemistry, CS) are now finding AI extremely useful for their own work. That makes it harder to see as a "cheating device." This seems like a huge factor in the "pivot," and one that may not be equally visible in all disciplines.
comment in response to
post
So is it fair to say your level of belief (or disbelief) would be the same if they'd used the p < 0.05 standard?
comment in response to
post
I suppose the converse question is interesting too: what grand-but-incorrect discoveries would we have made without an understanding of null hypothesis testing?
comment in response to
post
Great essay! You ask, "What are the grand discoveries that we wouldn’t have made without an understanding of null hypothesis testing?" Would the discovery of the Higgs boson count? As I understand it, the transition from "cool theory" to "Nobel prize" hinged on a p-value.
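For the curious, the arithmetic behind that threshold is easy to check. Here's a minimal sketch in Python (assuming scipy is available): the particle-physics "5 sigma" standard used for the Higgs announcement corresponds to a one-sided p-value of roughly 3e-7.

from scipy.stats import norm

# One-sided tail probability P(Z > 5) for a standard normal:
# the "5 sigma" discovery threshold.
p_value = norm.sf(5.0)
print(f"p-value at 5 sigma: {p_value:.2e}")  # ~2.87e-07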
comment in response to
post
Yep! The argument in your paper makes sense. It was just the nonstandard use of "structural stability" that threw me. (In standard usage, e.g., the identity map on a manifold is *not* structurally stable.) Anyway, it's a great article, whatever terminology you use!
comment in response to
post
Very likely nothing will change for one inference pass, by continuity. But it's entirely possible that after many more next-token inferences you'll see a change large enough to affect which output token is produced. (This is much like roundoff error accumulating.)
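To make the accumulation intuition concrete, here's a toy sketch in Python (a chaotic logistic map, not a transformer; the parameter values and threshold are purely illustrative). A 1e-12 perturbation to one parameter changes essentially nothing on a single step, but after repeated iterations it flips a thresholded discrete "output":

# Two copies of the logistic map with imperceptibly different parameters.
r1, r2 = 3.9, 3.9 + 1e-12  # tiny "parameter perturbation"
x1 = x2 = 0.5
for step in range(1, 201):
    x1 = r1 * x1 * (1 - x1)
    x2 = r2 * x2 * (1 - x2)
    # Threshold each state to get a discrete "token" from each system.
    if (x1 > 0.5) != (x2 > 0.5):
        print(f"discrete outputs first diverge at step {step}")
        break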
comment in response to
post
I should say that by "behavior" I mean the result of just one inference pass, as opposed to long-term dynamics.
comment in response to
post
You're making a simpler and stronger point, I believe: behavior changes *discontinuously* with parameters, a major departure from most neural nets. Traditional "structural stability" is more subtle, and my guess is it would probably be hard to show any real-world transformer is structurally stable.
comment in response to
post
Thanks for this very useful survey! A question: what exactly is your definition of "structural stability"? Usually the term applies to dynamical systems, but how exactly is a transformer a dynamical system? (It actually looks to me like you might be talking about "continuity" instead?)
comment in response to
post
They very much do believe AGI is achievable, and in the (relatively) near future. There are entire social circles in San Francisco that take this for granted. Keep in mind, though, that "intelligence" means something narrow for this crowd, namely pure cognitive capability.
comment in response to
post
There is definitely a point where it breaks down. But I've used it for routine code tasks for about a year, and it's been extremely reliable. Saved me a lot of tedium!
comment in response to
post
Asking an LLM to summarize data is a terrible idea. But ChatGPT is great at writing code for mundane data transformations.
comment in response to
post
The transformer architecture in the description is the "laws of physics" for an LLM. But that's not what makes LLMs work: random transformers do nothing. The power comes from a very specific combination of billions of parameters, which (like the brain) have a rich, intricate structure.
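A minimal sketch of that point (assuming the Hugging Face transformers library; the model choice and prompt are just for illustration): the same architecture with random parameters versus pretrained ones behaves completely differently.

from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
inputs = tok("The capital of France is", return_tensors="pt")

random_model = GPT2LMHeadModel(GPT2Config())             # same "laws of physics," random parameters
trained_model = GPT2LMHeadModel.from_pretrained("gpt2")  # the specific learned parameters

for name, model in [("random", random_model), ("trained", trained_model)]:
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    print(name, "->", tok.decode(out[0]))
# The random model emits word salad; the pretrained one continues fluently.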
comment in response to
post
Are you seeing this from a dualist position (there's something outside of the laws of physics in the brain)?
comment in response to
post
Couldn't you write an equally low-level description of the brain, full of chemical formulas and equations?
comment in response to
post
That definitely doesn't sound like a win for LLMs. Seems like a classic example of a purpose-built system being a better choice when reliability is critical!
comment in response to
post
How can we be sure the process is just inductive, though? It seems conceivable that some of these systems may do some sort of reasoning. I don't think we can say much with any certainty about the high-level mechanisms inside these models (especially with closed-source frontier systems).
comment in response to
post
Extremely interesting data point! Do they pay your company the same amount as before? Or is it possible there's still some net savings?
comment in response to
post
Gemini 1.5 Pro, however, fumbles the ball! Maybe the takeaway is that questions like this could be good for differentiating between specific chatbots, but don't tell us anything intrinsic about how LLMs work in general.
comment in response to
post
I tried this on Claude, and it too produced a correct, well-explained answer.
comment in response to
post
What version of ChatGPT did you test? I just tried your exact prompt with 4o and got what looks like a perfect (and well-explained) result. (Or am I misreading?)
comment in response to
post
That's an excellent point! "Eagerness to talk about it" and "enthusiastic user" are definitely not the same.
comment in response to
post
My unpopular opinion is that the limpid geometry of linear transformations is far nobler than the bureaucratic murk of formal grammar!