I really enjoyed this thought-provoking talk… thanks @natolambert.bsky.social!
Reposted from
Nathan Lambert
Trying to tell the story behind this explosion of research we are in. An unexpected RL Renaissance.
New talk! Forecasting the Alpaca moment for reasoning models and why the new style of RL training is a far bigger deal than the emergence of RLHF.
YouTube: https://buff.ly/41bVRPp