stevebyrnes.bsky.social
Researching Artificial General Intelligence Safety, via thinking about neuroscience and algorithms, at Astera Institute. https://sjbyrnes.com/agi.html
138 posts 2,762 followers 82 following

In an under-appreciated post 2 years ago, someone pointed out that you can train an LLM to roleplay as a sassy talking pink unicorn, or whatever else. But what companies overwhelmingly CHOOSE to do is train LLMs to roleplay as LLMs (1/2) www.lesswrong.com/posts/tAtp4o...

Blog post: “Self-dialogue: Do behaviorist rewards make scheming AGIs?” www.alignmentforum.org/posts/SFgLBQ...

New blog post: “‘Sharp Left Turn’ discourse: An opinionated review” www.alignmentforum.org/posts/2yLyT6...

New blog post: “Heritability: Five battles”. It’s a (very) long and opinionated but hopefully beginner-friendly discussion, structured around five contexts in which people study and argue about heritability. (1/7) www.lesswrong.com/posts/xXtDCe...

Blog post: “Applying traditional economic thinking to AGI: a trilemma” www.lesswrong.com/posts/TkWCKz...

Blog post: “My AGI safety research—2024 review, ’25 plans” www.alignmentforum.org/posts/2wHaCi...

I’ve been enjoying @zefrank.bsky.social’s series of zoology videos—funny and very well researched. [Content warnings: wild animal suffering & dirty jokes.] www.youtube.com/playlist?lis...

Lots of people think human extinction from future Artificial General Intelligence is a stupid thing to worry about. (I think they’re wrong.) There are lots of different reasons people think that, so any one “demo” will miss the point for many. 1/16