nsaphra.bsky.social
Waiting on a robot body. All opinions are universal and held by both employers and family. Recruiting students to start my lab! ML/NLP/they/she.
1,992 posts 9,304 followers 1,386 following

Why can’t LMs solve puzzles about the number systems of languages, when they can solve really complex math problems? Our new paper, led by @antararb.bsky.social, looks at why this intersection of language and math is difficult, and what this means for LM reasoning! arxiv.org/abs/2506.13886

My partner & I were trying to ID a bird we heard while walking and the guy who had stopped his car to let us pass got out and was like "it's a frog, actually. Cope's Gray. I do frog surveys... Sorry" And then got in his car without a word and drove off. Absolute king

New research from MIT found that those who used ChatGPT can’t remember any of the content of their essays. Key takeaway: the product doesn’t suffer, but the process does. And when it comes to essays, the process *is* how they learn. arxiv.org/pdf/2506.088...

An easy way to tell low-information AI skepticism apart from informed skepticism: extremely confident beliefs about cognition, reasoning, and learning in real brains. (Cog/neuro)scientists don’t know how intelligence develops, but you’re *convinced* prediction objectives have no value?

What is common knowledge in your field but shocks outsiders? Digital resources, particularly eBooks and audiobooks, are going to bankrupt libraries if something isn't done to halt the extortionary pricing models of publishers.

When I was at Sheryl Sandberg’s FB lady intern BBQ I met “Cynthia” and we talked about her cool environmental engineering work in the data centers. When Cynthia scheduled an intern goodbye concert, I discovered that was the real name of Vienna Teng, who I’d been listening to since high school.

the load-bearing institution we broke first was school. this is very weird and we won't know how much it matters for years

to anyone who needs further proof that Zohran is the abundance candidate over Cuomo: Zohran wrote a bill giving the MTA streamlined permitting authority, which would speed up and reduce the costs of subway construction

Excited to share this project specifying a research direction I think will be particularly fruitful for theory-driven cognitive science that aims to explain natural behavior! We're calling this direction "Naturalistic Computational Cognitive Science"

The Power Broker. Man spent my whole childhood complaining about the fall of New York every time we drove on the Cross Bronx Expressway, West Side Parkway, the Henry Hudson Parkway, the Triborough Bridge,

If a human therapist routinely failed to pick up on clear signs of suicidality, or was regularly unable to separate patient delusions from reality, they wouldn't be allowed to practice. These outcomes are dangerous. futurism.com/stanford-the...

"Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels." arxiv.org/abs/2506.08872

Instead of like buttons we should have pike buttons and maybe also other types of fish buttons

If you're at #RLDM2025, check out our contributed talk at Session 3 (Fri 6/13, 12:10pm), presented by my brilliant co-first-author on this project @sjohnsonyu.bsky.social! Wasn't able to make it in person, but would love to hear your thoughts @kempnerinstitute.bsky.social @harvardmed.bsky.social

ACL paper alert! What structure is lost when using linearizing interp methods like Shapley? We show the nonlinear interactions between features reflect structures described by the sciences of syntax, semantics, and phonology.

Reasoning is about variable binding. It’s not about information retrieval. If a model cannot do variable binding, it is not good at grounded reasoning, and there’s evidence accruing that large scale can make LLMs worse at in-context grounded reasoning. 🧵