Too much irreconcilible demands. Open datasets are simultaneously expected to clear the license of every piece of content (something not even asked of Wikipedia), be competitive with pretraining sets years in the making and solve all kind of adjacent ethical problems in AI.
I see. I hope that at some point soon there is a pan Europe project for public/transparent AI and this kind of stuff does actually get funding and support.
If we end up just buying American and avoiding looking in the box we are going to be left behind.
There are several (involved in one of them) but super fragmented, little effort of coordination overall. But yeah would be really absurd to spend billions on datacenter and nothing on the data that's supposed to run on it.
Comments
If we end up just buying American and avoiding looking in the box we are going to be left behind.