Since GPT-4 came out, open-source LLMs have scaled dataset size much more than training compute, at least judging by the few data points we have.

Data from EpochAI