The left second paragraph is wrong; it should be the same as the right second paragraph. The left third paragraph is painfully wrong - despite the name, the Transformer is not at all a "linguistic model", and the very first thing it does is throw away everything linguistic. And it's literally called...
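For what it's worth, here is a minimal sketch of the point about the input side of a Transformer. The toy whitespace tokenizer, vocabulary, and random embedding table below are made up for illustration (they are not any real model's tokenizer), but they show what the model actually receives: integer token IDs and embedding vectors, with no parse trees, parts of speech, or other explicit linguistic structure anywhere in the pipeline.

```python
# Toy sketch: what a Transformer sees at its input -- integers and float vectors,
# nothing explicitly "linguistic". Vocabulary, tokenizer, and embeddings are
# hypothetical illustrations, not taken from any real model.
import numpy as np

corpus = "the model sees numbers not grammar"
vocab = {tok: i for i, tok in enumerate(sorted(set(corpus.split())))}

def encode(text: str) -> list[int]:
    """Map raw text to token IDs; linguistic structure is discarded at this step."""
    return [vocab[tok] for tok in text.split()]

ids = encode("the model sees numbers")
print(ids)  # [5, 1, 4, 3] -- just integers

rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 8))  # (vocab_size, d_model)
x = embedding_table[ids]                            # (seq_len, d_model) matrix fed to attention
print(x.shape)                                      # (4, 8)
```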
LLMs learn fine without curated data. Also, "carefully curated" is a stretch: supervised learning is curated at the scale of each training element or pair, whereas "curated" self-supervised learning operates at the scale of whole datasets, websites, etc. (or broad filters).
It's simply wrong to state that LLMs require carefully curated data. That is not factual. You can *improve* results with curated data, but the exact same thing can be said about training a human being.
I would argue this happened well before scaling further got too cumbersome. The GPT-2 paper has a single point to make: datasets at the time were full of garbage text, and curating them improved the resulting model enormously. There are no architectural innovations, etc. It's just about data.
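A hedged sketch of the distinction these comments are drawing between the two scales of "curation". The document dicts, field names, and thresholds below are hypothetical illustrations; GPT-2's WebText used outbound Reddit links with at least 3 karma as a broad quality proxy rather than labeling individual examples.

```python
# Supervised learning: curation happens per training pair --
# a human attaches a label to each individual example.
supervised_pair = {"text": "the movie was great", "label": "positive"}

# LLM pre-training: "curation" is a coarse, document-level filter over a huge
# scrape; nobody inspects or labels individual training examples.
def keep_document(doc: dict) -> bool:
    """Broad heuristic filter applied to whole documents (hypothetical thresholds)."""
    long_enough = len(doc["text"].split()) > 128  # drop short / boilerplate pages
    endorsed = doc["source_karma"] >= 3           # proxy for "a human found this worth sharing"
    return long_enough and endorsed

scraped = [
    {"url": "https://example.com/post", "source_karma": 12, "text": "word " * 300},
    {"url": "https://example.com/spam", "source_karma": 0,  "text": "buy now"},
]
pretraining_corpus = [d["text"] for d in scraped if keep_document(d)]
print(len(pretraining_corpus))  # 1 -- the spam page is dropped wholesale, never labeled
```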
Comments
I'm really glad I'm not there; I'd be having to suppress groans throughout this entire presentation.
I read Chomsky saying that was one of the key differentiators between humans and LLMs.