Seeing the reception of an open pretraining dataset. - ThreadSky

dorialexander.bsky.social • 19 days ago

Seeing the reception of an open pretraining dataset.

Comments

dorialexander.bsky.social•19 days ago

Super consequential work, but I look forward to only doing model releases in the near future.

jebediah98.bsky.social•3 days ago

What happened?

dorialexander.bsky.social•3 days ago

Too many irreconcilable demands. Open datasets are simultaneously expected to clear the license of every piece of content (something not even asked of Wikipedia), be competitive with pretraining sets years in the making, and solve all kinds of adjacent ethical problems in AI.

dorialexander.bsky.social•3 days ago

All of this being magically created without grants or any form of material support.

jebediah98.bsky.social•3 days ago

I see. I hope that at some point soon there is a pan-European project for public/transparent AI and this kind of stuff does actually get funding and support.

If we end up just buying American and avoiding looking in the box we are going to be left behind.

dorialexander.bsky.social•3 days ago

There are several (I'm involved in one of them), but they're super fragmented, with little coordination effort overall. But yeah, it would be really absurd to spend billions on datacenters and nothing on the data that's supposed to run on them.
