More OLMo! More performance! More details!
We applied Tulu post-training to OLMo 2 as well, so you can get strong model performance AND see what your model was actually trained on.
Reposted from Kyle Lo @ ICLR 2025
kicking off 2025 with our OLMo 2 tech report while paying homage to the sequelest of sequels 🫡

🚗 2 OLMo 2 Furious 🔥 is everything we learned since OLMo 1, with deep dives into:

🚖 stable pretrain recipe
🚔 lr anneal 🤝 data curricula 🤝 soups
🚘 tulu post-train recipe
🚜 compute infra setup
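The "soups" above refer to model souping: averaging the parameters of several checkpoints (e.g. trained on different data orders or anneal runs) into one model. A minimal sketch of the idea, with checkpoints stood in for by plain dicts of parameter lists rather than OLMo's actual tensors or code:

```python
# Hypothetical illustration of model souping: element-wise average of
# parameters across checkpoints. Not the OLMo 2 implementation.
def soup(checkpoints):
    """Average a list of checkpoints (dicts: param name -> list of floats)."""
    return {
        name: [sum(vals) / len(checkpoints)
               for vals in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in checkpoints[0]
    }

ckpt_a = {"w": [1.0, 2.0], "b": [0.0]}
ckpt_b = {"w": [3.0, 4.0], "b": [2.0]}
souped = soup([ckpt_a, ckpt_b])  # {"w": [2.0, 3.0], "b": [1.0]}
```

In practice the same averaging is done over the full weight tensors of models fine-tuned or annealed from a shared base, which is what makes the averaged model coherent.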

👇🧵
