One of the things Jay laid out at SXSW earlier this week: Bluesky has a robots.txt-like proposal that lets users themselves declare whether they're okay with their data being used for AI training.
Reposted from
Jay 🦋
We put up a proposal that lays out a way for users to declare whether/how they want their data to be used by things like generative AI or public archives, check it out on github:
github.com/bluesky-soci...
Comments
"Companies and research teams building AI training sets are expected to respect this intent when they see it, either when scraping websites, or doing bulk transfers using the protocol itself."
"Realistically, a large majority of users may stick with the default "undeclared" state."
It doesn't really do anything. I would like to see the default be "disallow" so that accounts have to opt in. This "undefined" option is lame.
That still isn't right! I thought you worded words for a living!