I’d like to thank the Academy, I mean DeepSeek, for what appears to be a never-ending stream of content for principles of micro this semester.
Reposted from
Alex Imas
Please (please) stop quoting $5M as the cost of DeepSeek’s training of the new model. They built the inference model using RL on top of an *existing* *pre-trained* large LLM. The total cost is way higher!
R1 is a big deal, but primarily for reverse-engineering an o1-type model; the cost is secondary.