davidduvenaud.bsky.social
Machine learning prof at U Toronto. Working on evals and AGI governance.
48 posts · 873 followers · 134 following

It's hard to plan for AGI without knowing what outcomes are even possible, let alone good. So we're hosting a workshop! Post-AGI Civilizational Equilibria: Are there any good ones? Vancouver, July 14th. www.post-agi.org Featuring: Joe Carlsmith, @richardngo.bsky.social, Emmett Shear ... 🧵

What to do about gradual disempowerment from AGI? We laid out a research agenda with all the concrete and feasible research projects we can think of: 🧵 www.lesswrong.com/posts/GAv4DR... with Raymond Douglas, @kulveit.bsky.social @davidskrueger.bsky.social

On top of the AISI-wide research agenda yesterday, we have more on the research agenda for the AISI Alignment Team specifically. See Benjamin's thread and full post for details; here I'll focus on why we should not give up on directly solving alignment, even though it is hard. 🧵

“What place will humans have when AI can do everything we do — only better?” In The Guardian today, SRI Chair @davidduvenaud.bsky.social explores what happens when AI doesn't destroy us — it just quietly replaces us. 🔗 www.theguardian.com/books/2025/m... #AI #AIEthics #TechAndSociety

My single rule for productive Bluesky discussions: Start every single reply with a point of agreement. It disarms the combative impulse on both sides, and forces you to try to interpret their words in the most sensible possible way.

New paper: What happens once AIs make humans obsolete? Even without AIs seeking power, we argue that competitive pressures are set to fully erode human influence and values. www.gradual-disempowerment.ai with @kulveit.bsky.social, Raymond Douglas, Nora Ammann, Deger Turan, David Krueger 🧵

Happy to have helped a little with this paper: