nsaphra.bsky.social
Waiting on a robot body. All opinions are universal and held by both employers and family. ML/NLP/they/she.
1,427 posts 8,663 followers 1,265 following

I've written a free and accessible guide to Cephalopod Sentience with Alex Schnell, Piero Amodio and Peter Morse, stunningly illustrated by Roksolana Tkach. Please download and share! It's worth it for the illustrations alone! 🐙 thebrooksinstitute.org/sites/defaul...

an old co-author emailed me looking for confirmation of a non-existent paper supposedly authored by me that one of her students has cited in an essay... so i got curious and prompted gpt-4o “papers by abeba birhane”. it listed 9 papers, none of which I authored

FUNDED BY NIH: if that isn't front and center, people won't understand the cost of science infrastructure and funding being demolished in the US

✨New paper✨ Linguistic evaluations of LLMs often implicitly assume that language is generated by symbolic rules. In a new position paper, @adelegoldberg.bsky.social, @kmahowald.bsky.social and I argue that languages are not Lego sets, and evaluations should reflect this! arxiv.org/pdf/2502.13195

When I say my name, people start speaking French to me, although my French is basic. That also happens with AI systems. We wrote a whole paper on that, testing across models for presumed cultural identity based on names w/ Siddhesh Pawar @rnv.bsky.social @iaugenstein.bsky.social

I've been shocked that a theory-driven method yields practical results this good, especially on attention approximation. I proposed my best new optimizer design originally as a dumb baseline; the fact that you can get these efficiency gains with a principled approach makes me a lil insecure.

Introducing The AI CUDA Engineer: An agentic AI system that automates the production of highly optimized CUDA kernels. sakana.ai/ai-cuda-engi... The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch. Examples:

Check out our paper on the quality of interpretability evaluations of recommender systems: cset.georgetown.edu/publication/... Led by @minanrn.bsky.social and Christian Schoeberl! @csetgeorgetown.bsky.social

GEM is so back! Our workshop for Generation, Evaluation, and Metrics is coming to an ACL near you. Evaluation in the world of GenAI is more important than ever, so please consider submitting your amazing work. CfP can be found at gem-benchmark.com/workshop

Oh damn Prime actually published their failed chatgpt prompt in the description

📣 Excited to announce our workshop "Agent-Based Models in Neuroscience: Complex Planning, Embodiment, and Beyond" at the upcoming @cosynemeeting.bsky.social #CoSyNe2025! 🧠🤖 🪱🪰🐟🐝🐭💪 Schedule: neuro-agent-models.github.io 🗓️ Join us in Mont-Tremblant, Canada, on March 31!

Our new piece in Nature Machine Intelligence: LLMs are replacing human participants, but can they simulate diverse respondents? Surveys use representative sampling for a reason, and our work shows how LLM training prevents accurate simulation of different human identities.

Data selection for instruction fine-tuning of LLMs doesn't need to be computationally costly. Great work by @wonderingishika.bsky.social! @convai-uiuc.bsky.social

The fact that only 2% of professors identify as fascists shows just how insular and out of touch universities have become.

New paper–accepted as *spotlight* at #ICLR2025! 🧵👇 We show a competition dynamic between several algorithms splits a toy model’s ICL abilities into four broad phases of train/test settings! This means ICL is akin to a mixture of different algorithms, not a monolithic ability.

tell your professors they got no friends only collaborators

"Hitchhiking shrimp" Another shot from Gray's Reef NMS during a research cruise (NOAA/NMS grants) to assess biodiversity and food web interactions. Coming up from a long dive, saw this Atlantic sea nettle (Chrysaora quinquecirrha) with a decent-sized shrimp riding along. 🐡🐙🦑

I just had a chance to watch this fantastic talk. I really recommend it for anyone interested in how LLMs can help us understand language: www.youtube.com/watch?v=DBor...

Perplexity announced their own DeepResearch that includes a free tier and a generous $20/mo tier. People who have tried both are finding the Perplexity version to be on par with, perhaps better than, OpenAI’s ($200/mo). Analysis of the Chernobyl drone strike yesterday: www.perplexity.ai/search/yeste...

I am delighted to share this new paper on AI collaboration in Chinese news organisations, led by @qingxiaohci.bsky.social, which has just been accepted at #CHI25. https://buff.ly/4gCc7hb

First CfP for #EMNLP2025 is live now. Submission deadline to May ARR cycle! Excited to be part of organizing the conference as publicity chair w/ @amuuueller.bsky.social @dallascard.bsky.social so watch out for more updates esp by following the official conf account @emnlpmeeting.bsky.social