Au contraire: here's one paper detailing automation to solve the fact tracing problem in LLMs, one that even predates the release of ChatGPT:
https://arxiv.org/pdf/2205.11482
Disclaimer: I haven't read this; I'm just showing that this is a studiable problem. We don't have to guess.
Comments
Authority laundering via LLM is definitely a problem, and we will need to develop social conventions to combat it.
Easy to verify: just re-run the completion and it comes up with something else. Fabrication is high-variance.
In OP's example, this is an easy way to show a student the bot is "lying".
If synthetic data skews the training data too far away from factual content, that's a problem, and it will defeat this little heuristic.
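A minimal sketch of that re-run heuristic, assuming a hypothetical `ask_model` function that wraps whatever chatbot you're testing (the run count, similarity measure, and threshold are all made up for illustration):

```python
import difflib
from typing import Callable, List

def consistency_check(ask_model: Callable[[str], str], prompt: str,
                      n_runs: int = 5, threshold: float = 0.6) -> bool:
    """Re-run the same prompt several times and flag the answer as suspect
    when the completions disagree with each other.

    `ask_model` is any function that sends a prompt to the chatbot and
    returns its text completion (hypothetical; plug in your own client).
    """
    completions: List[str] = [ask_model(prompt) for _ in range(n_runs)]

    # Pairwise string similarity: fabricated citations tend to vary wildly
    # between runs, while well-grounded answers stay close to each other.
    scores = []
    for i in range(len(completions)):
        for j in range(i + 1, len(completions)):
            scores.append(difflib.SequenceMatcher(
                None, completions[i], completions[j]).ratio())

    mean_similarity = sum(scores) / len(scores)
    return mean_similarity >= threshold  # True = answers look consistent
```

This is only a heuristic, of course: a model can also be consistently wrong, and (as noted above) heavy synthetic-data skew would undermine it.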