Autocomplete needs more than autocompleted inputs.
What would be really surprising is if it didn't, like it had accidentally stumbled on the name of god and kept feeding that back to itself to calculate fragments of reality.
But, nope, just autocomplete needing good data. And no magic sky monkey.
Please, people, stop sharing papers on topics you're unfamiliar with. Synthetic data is not only "a thing", but it's incredibly important. Synthetic data is *how you learn*. That's what "mulling over" new information is - lack thereof was a historic *weakness* in LLMs.
People, use basic logic here. For example: AI image generators have been widely used over the past couple of years. Their images are all over the net. Now compare the output quality of modern AI image generators to old ones. It's night and day, *for the better*. *Way* better than old ones.
* Creators are a selective filter
* Websites are a selective filter
* Dataset curators are a selective filter
* Even automated tools, like aesthetic gradient raters, are selective filters.
Ultimately, if something looks good, *it doesn't matter* who or what made it.
You always want new sources of information, of course. Lock someone in Plato's Cave for long enough and they'll forget what the world outside looks like apart from shadows. But even the very act of selective rating is itself the addition of new information to the system.
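To make that concrete, here's a minimal sketch of a selective filter, assuming a toy quality scorer (`rater` below is a hypothetical stand-in for something like an aesthetic model): overgenerate candidates, keep only the top-scoring slice. The rejections are exactly where the new information comes in.

```python
import random

def rater(sample: str) -> float:
    """Hypothetical stand-in for a quality/aesthetic model.
    Here: a toy heuristic rewarding lexical variety."""
    words = sample.split()
    return len(set(words)) / max(len(words), 1)

def curate(candidates: list[str], keep: float = 0.05) -> list[str]:
    """Keep only the top-scoring fraction of generated candidates.
    The rejected 95% is where new information enters the system:
    the surviving set reflects the rater's preferences, not just
    the generator's raw output distribution."""
    ranked = sorted(candidates, key=rater, reverse=True)
    return ranked[: max(1, int(len(ranked) * keep))]

# Usage: overgenerate, then filter hard.
random.seed(0)
candidates = [" ".join(random.choices("a b c d e f g h".split(), k=12))
              for _ in range(1000)]
print(len(curate(candidates)))  # 50 survivors out of 1000
```

Swap the heuristic for a learned rater and this is, in miniature, the curation loop that keeps synthetic training data from degrading.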
I use synthetic data extensively in LLMs (for diffusion models, see e.g. Howard Arson - https://bsky.app/profile/theophite.bsky.social/post/3ky3flcn4vq2y). Let's say I want to make a "needle in a haystack" model that finds text on a specific exact topic, and I want a big training dataset of such things. How do you go about it?
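A sketch of one plausible recipe (not necessarily the exact pipeline behind that work; `generate` and the example topic are hypothetical stand-ins): synthesize on-topic needles with a generator model, bury each one in off-topic filler at a random position, and keep the position as a free label.

```python
import random

def generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call (local model, API, etc.)."""
    raise NotImplementedError("wire up your generator here")

TOPIC = "19th-century railway signalling"  # placeholder topic

def make_example(filler_paragraphs: list[str]) -> dict:
    """Build one labeled needle-in-a-haystack training example."""
    # Synthesize the needle: a paragraph squarely on the target topic.
    needle = generate(f"Write one factual paragraph about {TOPIC}.")
    # Bury it at a random position among off-topic filler.
    pos = random.randrange(len(filler_paragraphs) + 1)
    doc = filler_paragraphs[:pos] + [needle] + filler_paragraphs[pos:]
    # The label comes for free: we know exactly where the needle is.
    return {"text": "\n\n".join(doc), "needle_index": pos, "topic": TOPIC}
```

Loop that over topics and filler documents for scale, and filter out the botched needles with the same kind of rater described above.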
I was watching years-old Ann Reardon/How To Cook That videos last night, and in one of them she talked about content farms stealing the work of smaller YouTubers, and how that drives actual creatives offsite, so the farms all just start plagiarising each other and you get nothing new or interesting.
In-bred "AI" vapourware: it's just as crap as the data.
Myth buster: "AI" = a bit of smart code, fast processing and data, that's it, there's no magic!
As I keep pointing out, humans "train" on pre-existing art, music, literature, etc., and their output is reliably used to train the next generation, and so on, and this leads, over time, to vastly increased diversity, complexity, and quality.
1/?
2/2 If AI actually was *intelligent*, self-training would rapidly produce creative works beyond human capability, compressing millennia of artistic progress into months. But it's NOT, and there's no reason to think it will be using current paradigms.
I experienced a very small version of this first hand about 20 years ago (details are cumbersome to explain). It makes complete sense to me that people didn't grok this in advance, even tho it now looks like a pretty obvious pitfall.
The study looks at the results of "indiscriminately learning from data produced by other models." Of course there's no surprise. Novelty requires selection. Indiscriminate learning will result in collapse. Filtered learning will not. I mean, maybe it will. But then evolution is wrong.
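A toy illustration of that distinction, under the deliberately crude assumption that "training" means fitting a Gaussian and "generating" means sampling from the fit: the indiscriminate loop compounds its own estimation error each generation, while retaining even a small slice of original data (a blunt stand-in for selection) anchors it near the source.

```python
import random
import statistics

def fit_and_sample(data: list[float], n: int) -> list[float]:
    """'Train' (fit a Gaussian by mean/std) then 'generate' (sample the fit)."""
    mu, sigma = statistics.mean(data), statistics.stdev(data)
    return [random.gauss(mu, sigma) for _ in range(n)]

random.seed(0)
real = [random.gauss(0.0, 1.0) for _ in range(50)]

indiscriminate, anchored = real, real
for _ in range(50):
    # Indiscriminate: each generation trains purely on the previous output.
    indiscriminate = fit_and_sample(indiscriminate, 50)
    # Anchored: keep 20% original data in every generation's training mix.
    anchored = fit_and_sample(anchored[:40] + real[:10], 50)

# The indiscriminate estimate random-walks with a downward bias in
# variance; the anchored run stays pinned near the source distribution.
print(f"indiscriminate std: {statistics.stdev(indiscriminate):.2f}")
print(f"anchored std:       {statistics.stdev(anchored):.2f}")
```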
Except in this context:
"Hey remember those AI things from a few years back?"
"Yeah they were the worst"
"Well someone is trying it again"
"I'll get my pitchfork"
How could no one predict this sort of outcome after seeing what happened to people on FB becoming dumber with every post, or the QAnon faithful becoming dumber with every new drop?
Comments
Synthetic data is a growing portion of model training. Read the paper on the (spectacularly well-performing) LLaMA 3.1 as an example.
Seems like cog sci is pre-paradigmatic, last I checked.
"Hey remember those AI things from a few years back?"
"Yeah they were the worst"
"Well someone is trying it again"
"I'll get my pitchfork"
How could no on predict this sort of outcome after seeing what happened to people on FB becoming dumber with every post or the Qanon faithful becoming dumber with every new drop?