stellaathena.bsky.social
I make sure that OpenAI et al. aren't the only people who are able to study large scale AI systems.
248 posts 5,081 followers 347 following

After a short era in which people questioned the value of academia in ML, its value is more obvious than ever. Big labs stopped publishing the minute commercial incentives showed up and are relentlessly focused on a singular vision of scaling. Academia is a meaningful complement, bringing... 1/2

It was great to speak to @chronphilanthropy.bsky.social about AI research and how it's challenging yet necessary for non-profits to work in the LLM space. Many think we should leave the field to companies, but non-profits have different values and goals and those are important. shorturl.at/8IlmF

Proud to be at the AI Action Summit representing @eleutherai.bsky.social and the open source community. The focus on AI for the public good is exciting! DM me or @aviya.bsky.social to talk about centering openness, transparency, and public good in the AI ecosystem.

In case you're curious how much of a hellscape X is. I opened it today to get greeted with a porn ad. The censored section is a 20 second video clip of a woman sucking a dick, with audio. It autoplays.

I have a new favorite academic journal.

Discussions of AI training dynamics and alignment are often underpinned by formal or informal appeals to a probability distribution over models. If you've thought about such an argument, this is a must read. It's not often you get to say prior work is off by "millions of orders of magnitude"!

I enjoy the big ML conferences *and* ACL *and* COLM so I have at least 8 deadlines a year and should never have to crunch. In reality I have 8 deadlines a year and I’m in a state of perpetual crunch.

The CDC's political leadership is editing language that they dislike out of academic papers. The primary purpose and de facto consequence of this will be the massive reduction if not complete elimination of discussion of queer people from CDC publications. insidemedicine.substack.com/p/breaking-n...

Is anyone well-read in the DS-R1 tea leaves and confident they know what distillation method was used? It's not clear to me whether they mean "train on data from another model" or something I'd consider "actually distilling." My current guess is synthetic CoT?
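For anyone unsure what distinction the post is drawing, here's a toy sketch of the two senses of "distillation" (all names and values are mine for illustration, not anything from the DeepSeek report):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def hard_label_loss(student_logits, teacher_token):
    # "Train on data from another model": plain cross-entropy on
    # tokens the teacher actually sampled (e.g. synthetic CoT traces).
    p = softmax(student_logits)
    return float(-np.log(p[teacher_token]))

def soft_distill_loss(student_logits, teacher_logits, T=2.0):
    # "Actually distilling" in the Hinton sense: match the teacher's
    # full next-token distribution via a temperature-scaled KL term.
    pt = softmax(teacher_logits / T)
    ps = softmax(student_logits / T)
    return float(np.sum(pt * (np.log(pt) - np.log(ps))))
```

The first only needs teacher *outputs*; the second needs teacher *logits*, which is a much stronger access requirement and part of why the ambiguity matters.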

The American response to DeepSeek falsifies the claim that America shouldn't be an open source leader because China will just take it and build on it. If that were true, the response would be "lol thanks for giving us free results, idiot."

This is a crime. Trump is breaking the law. The NSF is violating its contractual obligations. And because of those crimes committed by the government, researchers will not get paid and on-going scientific research projects will be ruined. You can't just "pause" access to grants.

Obligatory "actually my lab invented test-time-compute" post. In "Stay on topic with Classifier-Free Guidance," we show that CFG enables a model to expend twice as much compute at inference time and match the performance of a model twice as large. arxiv.org/abs/2306.17806

If you've automated much of your code-writing, which models do you use and what does your workflow look like? I haven't found one I'm too happy with (but also haven't been trying that hard).