just to note that #DeepSeek, just like Meta's Llama, is a great example of the release-by-blogpost model with shiny (handpicked) eval result tables https://dl.acm.org/doi/10.1145/3630106.3659005
free model weights, but no open data
"technical report" lacking crucial details
auditing & scientific scrutiny avoided