orpheuslummis.info
Building software & events for AI safety, collective intelligence, civ resilience – https://orpheuslummis.info – 📍Montréal
143 posts 325 followers 1,266 following

We are excited to release a short course on AGI safety! The course offers a concise and accessible introduction to AI alignment problems and our technical and governance approaches, consisting of short recorded talks and exercises (75 minutes total). deepmindsafetyresearch.medium.com/1072adb7912c

One of the first components of the CAISI (Canadian AI Safety Institute) research program has just launched: a call for Catalyst Grant Projects on AI Safety.
Funding: up to 100K for one year
Deadline to apply: February 27, 2025 (11:59 AoE)
More details: cifar.ca/ai/cifar-ai-...

Today, we are publishing the first-ever International AI Safety Report, backed by 30 countries and the OECD, UN, and EU. It summarises the state of the science on AI capabilities and risks, and how to mitigate those risks. 🧵 Full Report: assets.publishing.service.gov.uk/media/679a0c... 1/21

A draft paper (for an invited talk at AAAI next month) presenting a philosophical analysis of work on mechanistic interpretability, with special attention to methods for propositional interpretability. arxiv.org/abs/2501.15740

If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere once we have started it, [...], then we had better be quite sure that the purpose put into the machine is the purpose which we really desire – Wiener, 1960

China announces a ~140 billion USD investment in AI over the next 5 years, following the ~500 billion USD announcement by the US... www.bankofchina.com/aboutboc/bi1...

Care about AGI going well? Contribute to our conference on large-scale AI risks, 26-28 May in Leuven, Belgium, featuring Yoshua Bengio, Dawn Song, and Iason Gabriel as keynote speakers. We invite participants from a wide range of academic disciplines to submit abstracts by 15 February 2025.

AI Safety Events & Training: 2025 week 1 update aisafetyeventsandtraining.substack.com/p/ai-safety-...

Happy solstice 🌞

AI Safety Events & Training: 2024 week 51 update – and 2024 review aisafetyeventsandtraining.substack.com/p/ai-safety-...

Guaranteed Safe AI Seminars 2024 review horizonomega.substack.com/p/guaranteed... The monthly seminar series grew to 230 subscribers in 2024, hosting 8 technical talks. We had ~490 RSVPs, ~76 hours of watch time, and ~900 views of the recordings. Seeking 2025 funding; plans include a bibliography and debates.

Using PDDL Planning to Ensure Safety in LLM-based Agents
by Agustín Martinez Suñé
Thu January 9, 18:00-19:00 UTC
Join: lu.ma/08gr7mrs
Part of the Guaranteed Safe AI Seminars
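As a rough illustration of the general idea behind the talk's title (this sketch is mine, not material from the talk; the Action class and the backup/delete example are hypothetical), PDDL-style action models give every action explicit preconditions, which lets a controller check an LLM-proposed action before allowing it to execute:

```python
# Toy sketch (illustrative only): gate an LLM agent's proposed actions
# behind precondition checks, in the spirit of PDDL action models.
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset[str]  # facts that must hold in the state
    effects: frozenset[str]        # facts added after execution

def is_admissible(state: set[str], action: Action) -> bool:
    """An action may run only if all its preconditions hold."""
    return action.preconditions <= state

# Hypothetical safety rule: a file may only be deleted after it is backed up.
state = {"file_exists"}
backup = Action("backup", frozenset({"file_exists"}), frozenset({"backed_up"}))
delete = Action("delete", frozenset({"file_exists", "backed_up"}), frozenset())

for proposed in [delete, backup, delete]:  # e.g. actions proposed by an LLM
    if is_admissible(state, proposed):
        state |= proposed.effects
        print(f"executed {proposed.name}")
    else:
        print(f"blocked {proposed.name}: unmet preconditions")
```

Running this blocks the first delete (no backup yet), executes the backup, then allows the delete once its preconditions hold.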

Compact Proofs of Model Performance via Mechanistic Interpretability
by Louis Jaburi
Thu December 12, 18:00-19:00 UTC
Join: lu.ma/g24bvacw
Last Guaranteed Safe AI seminar of the year

A game idea to play with the game simulator Genie 2 (deepmind.google/discover/blo...): describe your own situation as best you can, then play it, to experience a kind of personal multiverse.

Our goals for 2025:
- Guaranteed Safe AI Seminars
- AI Safety Unconference 2025
- AI Safety Events & Training newsletter
- Monthly Montréal AI safety R&D events
- Grow partnerships
We are looking for donations to support this work. More info: manifund.org/projects/hor...

AI Safety Events and Training: 2024 Week 47 update aisafetyeventsandtraining.substack.com/p/ai-safety-...

Vision for the AI Safety Unconference 2025:
- 3 days, online
- Custom event app, enabling session creation and review, poster sessions, schedule voting, matchmaking, chatting, and awards
- Collaboratively created with the community
- Focus on x-risk/catastrophic risk, but open to all AI safety work

Little garden harvest

Today in the Guaranteed Safe AI Seminars series: Bayesian oracles and safety bounds, by Yoshua Bengio
Relevant readings:
- yoshuabengio.org/2024/08/29/b...
- arxiv.org/abs/2408.05284
Join: lu.ma/4ylbvs75

This distribution isn't uniform or even well-understood. We might be dealing with:
- Fat-tailed distributions
- Unknown unknowns
- Feedback loops
- Non-linear effects
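To make the fat-tail point concrete, here is a minimal Python sketch (my illustration, not from the post; the normal-vs-Pareto comparison is a standard textbook contrast): under a heavy-tailed loss distribution, a single draw can dominate the total, so averaging past observations badly underestimates extreme outcomes.

```python
# Illustrative sketch: why fat tails break naive risk estimates.
# Compare thin-tailed (normal) and fat-tailed (Pareto) loss models.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Thin-tailed losses: absolute values of normal draws around 1.
normal_losses = np.abs(rng.normal(loc=1.0, scale=1.0, size=n))

# Fat-tailed losses: Pareto with shape alpha < 2, i.e. infinite variance.
alpha = 1.5
pareto_losses = rng.pareto(alpha, size=n) + 1.0

for name, losses in [("normal", normal_losses), ("pareto", pareto_losses)]:
    worst = losses.max()
    share = worst / losses.sum()
    print(f"{name}: mean={losses.mean():.2f}, max={worst:.1f}, "
          f"worst-case share of total={share:.3%}")

# Typical result: the single worst Pareto draw accounts for a visible
# fraction of the total loss, while the worst normal draw is negligible.
```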