This is a pretty high bar and requires sources that are either bleeding-edge mechanistic interpretability papers or are straight up proprietary secrets, as AI companies don't release their training data sets.
So any claims that "synthetic data is 100% certainly the direct cause of LLM misinfo" are too confident, and we should not be making them until we have that evidence.
There's a fuzzier (but less interesting) discussion around the other claim, which is that "2) AI companies have incentives to reduce the amount of misinformation their products output, because it a) embarrasses them (bad marketing) and b) degrades the utility of their product (not worth paying for)."
These incentives, again, compete with "we can release something shitty to sail on the hype and fix the bugs when we get more funding" and the result is a mixed bag.
So I do prima facie accept "we should have a healthy distrust of these companies and their negligence," but to say "they love releasing shitty products just to be evil" is too strong.
I'm not saying they're doing it to be evil. I'm saying they're doing this to appease the investment cycle. Tech is incentivized to find "the next big thing" like computers, the Internet, smartphones, and social media.
This is why LLMs are being pushed despite them having limited utility.
Very much agree with you there! With the add-on that the category of "thing that might be the next big thing" contains a few very useful things and a lot outright scams lmao, so the prior probability that something in that bucket is gonna suck is high
I literally don't know what the hell you're on about at this point. You already know these programs are effectively black-box algorithms that nobody has access to. We don't know what they're trained on exactly, or how predictions are weighted.
The alternative is "misinfo outputs might just be the result of the sampler randomly picking plausible, but wrong, tokens."
When that happens, rerunning the completion usually produces a different output. That gives you a way to tell whether the misinfo comes from the randomness of the decoding step or from bad training data.
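Concretely, a rerun test could look something like this. It's just a sketch using Hugging Face transformers with an open model; "gpt2" and the prompt are placeholders, not anything from a production system:

```python
# Rough sketch of the "rerun it and see if the answer changes" test.
# Assumptions not from the thread: local access to an open model via
# Hugging Face transformers; the model name and prompt are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "The capital of Australia is"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding is deterministic: whatever comes out is what the model
# most strongly "prefers", which points at the weights / training data.
greedy = model.generate(**inputs, do_sample=False, max_new_tokens=10)
print("greedy:", tokenizer.decode(greedy[0], skip_special_tokens=True))

# Sampled decoding is stochastic: if a wrong claim only shows up in some
# of these reruns, it's more plausibly an artifact of random token picks.
for i in range(5):
    out = model.generate(**inputs, do_sample=True, temperature=0.9, max_new_tokens=10)
    print(f"sample {i}:", tokenizer.decode(out[0], skip_special_tokens=True))
```

If the false claim survives greedy decoding and every rerun, that's at least weak evidence it comes from what the model learned rather than from sampling noise.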
What we do know is that these programs already give erroneous answers with the datasets they're currently using. If they're generating synthetic data from that dataset (which, again, they claim is the entirety of the Internet), the obvious conclusion is that the synthetic data is erroneous as well.
Ergo, if you found a false claim in the outputs, and traced it back to training data marked as synthetic, you'd be able to prove this relationship.
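To be clear about what that tracing would even look like, here's a toy sketch that assumes the ideal case where you actually have the training corpus with provenance labels. The `docs` list and the `is_synthetic` flag are hypothetical; no vendor publishes anything like this:

```python
# Sketch only: assumes the ideal case of a training corpus with provenance
# labels. The `docs` structure and `is_synthetic` flag are hypothetical.

def ngrams(text: str, n: int = 5) -> set[tuple[str, ...]]:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def trace_claim(claim: str, docs: list[dict], n: int = 5, threshold: float = 0.5):
    """Return doc ids whose n-gram overlap with the claim exceeds the
    threshold, split by whether the doc was labeled synthetic."""
    claim_grams = ngrams(claim, n)
    hits = {"synthetic": [], "organic": []}
    for doc in docs:
        overlap = len(claim_grams & ngrams(doc["text"], n)) / max(len(claim_grams), 1)
        if overlap >= threshold:
            hits["synthetic" if doc["is_synthetic"] else "organic"].append(doc["id"])
    return hits

# Toy usage: a false claim that only appears in a synthetic document.
docs = [
    {"id": "doc1", "is_synthetic": True,  "text": "the eiffel tower was moved to berlin in 1999"},
    {"id": "doc2", "is_synthetic": False, "text": "the eiffel tower stands in paris"},
]
print(trace_claim("The Eiffel Tower was moved to Berlin in 1999", docs))
# -> {'synthetic': ['doc1'], 'organic': []}
```

Which, per the rest of the thread, requires exactly the labeled training data we don't have access to.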
The process you're describing is a human manual process that cannot be automated by these programs. These programs cannot tell the truth from fiction.
https://arxiv.org/pdf/2205.11482
Disclaimer, I haven't read this, just showing that this is a studiable problem. We don't have to guess.