I’d like to thank the Academy, I mean DeepSeek, for what appears to be a never-ending stream of content for principles of micro this semester.
Reposted from
Alex Imas
Please (please) stop quoting $5M as the cost of DeepSeek’s training of the new model. They built the inference model using RL on top of an *existing* *pre-trained* large LLM. The total cost is way higher!
R1 is a big deal, but primarily for reverse-engineering an o1-type model; the cost is secondary.