I really enjoyed this thought-provoking talk… thanks @natolambert.bsky.social!

Reposted from Nathan Lambert

Trying to tell the story behind this explosion of research we are in. An unexpected RL Renaissance.
New talk! Forecasting the Alpaca moment for reasoning models and why the new style of RL training is a far bigger deal than the emergence of RLHF.
YouTube: https://buff.ly/41bVRPp
