antoine-mln.bsky.social
doing a phd in RL/online learning on questions related to exploration and adaptivity > https://antoine-moulin.github.io/
56 posts 2,069 followers 205 following

another week, another LLM, and, yes, another RL seminar. join us at 6 PM UTC today!

super happy about this preprint! we can *finally* perform efficient exploration and find near-optimal stationary policies in infinite-horizon linear MDPs, and even use it for imitation learning :) working with @neu-rips.bsky.social and @lviano.bsky.social on this was so much fun!!

can Grok 3 prove a lower bound on the bayesian regret though? didn't think so... but Itai can! today at 6 PM UTC

Most of the talk around AI and energy use refers to an older 2020 estimate of GPT-3's energy consumption, but a more recent paper directly measures the energy use of Llama 65B at 3-4 joules per decoded token. At that rate, an hour of streaming Netflix is equivalent to roughly 70,000-90,000 tokens from a 65B model. arxiv.org/pdf/2310.03003
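The comparison above is easy to sanity-check. A minimal back-of-the-envelope sketch, assuming the paper's 3-4 J per decoded token and a commonly cited estimate of roughly 0.08 kWh for an hour of video streaming (an assumption on my part; real streaming figures vary widely by device and network):

```python
# Back-of-the-envelope check: how many 65B-model tokens equal
# one hour of video streaming, under stated assumptions.

JOULES_PER_KWH = 3.6e6          # exact conversion factor
streaming_kwh = 0.08            # assumed energy for 1 hour of streaming
streaming_joules = streaming_kwh * JOULES_PER_KWH  # = 288,000 J

for j_per_token in (3, 4):      # paper's measured range for Llama 65B
    tokens = streaming_joules / j_per_token
    print(f"{j_per_token} J/token -> ~{tokens:,.0f} tokens")
# prints roughly 96,000 tokens at 3 J/token and 72,000 at 4 J/token
```

which lands in the same ballpark as the 70,000-90,000 figure in the post.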

just realized 2025 is the year when the Adam paper gets a test of time award and maybe TRPO? what else?

Hi folks! I'm excited to be on BlueSky! I'm looking forward to posting about computer science research, ML, scientific advances, tasty food, nature, and making groan-worthy puns.

We’re back! Join us for the next talks 🤓

*PLS SHARE* Open position for an *Associate Professor* in Machine Learning at our department (@enginyeria-upf.bsky.social / @upf.edu), via the Serra Hunter programme. DEADLINE: January 13th 2025 www.upf.edu/web/personal...

I'm looking for an emergency reviewer or two for an AISTATS submission in the area of Bayesian ML theory, which at present has the misfortune of three low-quality reviews. If you're able to help, please DM and let me know! Please link your Google Scholar if I don't already know you personally.

Yesterday was my last day at Google DeepMind. Interning there while the first GDM Nobel Prize, AlphaProof, the Gemini releases, and more all happened, and having such amazing, ambitious colleagues, was quite humbling and really exciting. I look forward to 2025!

Disentanglement is an intriguing phenomenon that arises in generative latent variable models for reasons that are not fully understood. If you’re interested in learning why, I highly recommend giving Carl’s blog a read!

The slides for my lectures on (Bayesian) Active Learning, Information Theory, and Uncertainty are online now 🥳 They cover quite a bit from basic information theory to some recent papers: blackhc.github.io/balitu/ and I'll try to add proper course notes over time 🤗

Found slides by Ankur Moitra (presented at a TCS For All event) on "How to do theoretical research." Full of great advice! My favourite: "Find the easiest problem you can't solve. The more embarrassing, the better!" Slides: drive.google.com/file/d/15VaT... TCS For all: sigact.org/tcsforall/

Very curious about this one! Join us tomorrow 🤓

Hello Bluesky! Let's see what you are good for.

🚨 New Paper 🚨 Can LLMs perform latent multi-hop reasoning without exploiting shortcuts? We find the answer is yes – they can recall and compose facts that never appeared together in training, without guessing the answer, but success depends greatly on the type of the bridge entity (80% for country, 6% for year)! 1/N

Join us tomorrow to learn about a magic trick to avoid an annoying truncation in linear MDPs!

The RL theory virtual seminars are also on 🦋 now! Follow @rl-theory.bsky.social to hear about recent advances in RL theory :)

In addition to the Deep Learning Theory starter pack, I've also put together a starter pack for Reinforcement Learning Theory. Let me know if you'd like to be included or suggest someone to add to the list! go.bsky.app/LWyGAAu