amirmesbah.bsky.social - Profile | ThreadSky | a Reddit-style client for Bluesky

amirmesbah.bsky.social

Graduate Student - Interested in RL and its mathematics 👾 > https://amirhosein-mesbah.github.io/

6 posts 45 followers 425 following

First 11 chapters of RLHF Book have v0 draft done. Should be useful now. Next:
* Crafting more blog content into future topics
* DPO+ chapter
* Meeting with publishers to get wheels turning on physical copies
* Cleaning & cohesiveness

rlhfbook.com

submitted 1 day ago • 0 comments

🚨 Neuromatch Academy Course Applications are OPEN for 2025!! 🚨 Get your application in early to be a student or teaching assistant for this year’s courses! Applications are due Sunday, March 23. Apply & learn more: neuromatch.io/courses/ #mlsky #compneurosky #ai #climatesolutions #ScienceEdu 🧪

submitted 3 days ago • 0 comments

2014 GoogLeNet: The best image classifier was only trainable using weeks of Google's custom infrastructure. 2018 ResNet: A more accurate model is trainable in a 1/2 hour on a single GPU. What stops this from happening for LLMs?

submitted 31 days ago • 3 comments

I am teaching a class on #FoundationalModels for #robotics and Scaling #DeepRL algorithms. This class expands on last year's class and my generalist robotics policies tutorial and code. I plan to share the lectures and code assignments. Starting with the first lectures below.

submitted 39 days ago • 1 comment

I wonder why ML conferences insist on uploading workshop videos on SlideShare while they can use YouTube and the benefits of monetization. Talks on SlideShare are really hard to track!

submitted 40 days ago • 0 comments

i was recently asked to provide 4 "desert island" RL papers. if i were stuck on a desert island i'd hope to have something better to read than #RL papers... but anyway, here's a thread with my choices, maybe you can read them on your flight to @neuripsconf.bsky.social #NeurIPS2024. Enjoy!

submitted 83 days ago • 4 comments

If you're an RL researcher or RL adjacent, pipe up to make sure I've added you here! go.bsky.app/3WPHcHg

submitted 110 days ago • 51 comments

As my first post on this platform, allow me to advertise the RL theory lecture notes I have been developing with Sasha Rakhlin: arxiv.org/abs/2312.16730 (shameless repost of my pinned tweet)

submitted 98 days ago • 3 comments

I have become a fan of the game-theoretic approaches to RLHF, so here are two more papers in that category! (with one more tomorrow 😅) 1. Self-Play Preference Optimization (SPO). 2. Direct Nash Optimization (DNO). 🧵 1/3.

submitted 98 days ago • 3 comments

Hey academic Bluesky 👀👋

submitted 101 days ago • 0 comments