the tldr if anyone is unfamiliar: to store something with maximum efficiency, you have to understand it. things you understand better are more predictable, so you need less information to recreate them. there is a lot of work along these lines, and chatgpt builds on it
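a toy way to see the prediction/compression link (made-up numbers, nothing from any real system): the ideal code length of a string under a probability model is the sum of -log2 p over its symbols, so a model that predicts the string well pays far fewer bits than one that doesn't

```python
import math

def bits_needed(text, prob):
    # ideal code length under a per-character model: an entropy coder can get
    # within a fraction of a bit of -sum(log2 p(ch)) over the whole string
    return -sum(math.log2(prob(ch)) for ch in text)

text = "aaaaaaaaab"  # highly predictable: nine a's, one b

informed = lambda ch: 0.9 if ch == "a" else 0.1 / 25   # "understands" the text
clueless = lambda ch: 1 / 26                           # uniform over 26 letters

print(bits_needed(text, informed))  # ~9.3 bits
print(bits_needed(text, clueless))  # ~47.0 bits
```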
lossless over some training data? yes. lossless over training data that covers most things you care about? probably. over the string you are going to send it, which probably did not exist when it was trained? likely not
You can use an LLM to implement a lossless compressor for arbitrary text, see e.g.: https://bellard.org/nncp/
The encoder and decoder share the same predictive model, so the encoder can spend fewer bits on more predictable symbols and more bits on less predictable ones
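a crude illustration of the shared-model idea (not how nncp is actually implemented, just the shape of it): both sides run the identical predictor, so the encoder only has to transmit each symbol's rank under that shared prediction, and a good model makes most ranks 0, which a downstream entropy coder stores in very few bits

```python
def ranked_guesses(prefix):
    # stand-in predictive model: guesses characters in a fixed frequency order
    # and ignores the prefix; a real system would condition an LSTM/transformer
    # on everything decoded so far
    return list("etaoinshrdlucmfwypvbgkjqxz ")

def encode(text):
    ranks = []
    for i, ch in enumerate(text):
        guesses = ranked_guesses(text[:i])
        ranks.append(guesses.index(ch))   # send only the rank of the true char
    return ranks

def decode(ranks):
    out = ""
    for r in ranks:
        guesses = ranked_guesses(out)     # identical model, identical predictions
        out += guesses[r]
    return out

msg = "the rate sonnet"
assert decode(encode(msg)) == msg         # lossless round trip
```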
you can just overtrain a standard decoder-only model on a single block of text until it can reproduce the entire thing verbatim under greedy (max) sampling, and at that point it has compressed that text
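a hedged sketch of that trick, with a tiny GRU character model standing in for the decoder-only transformer and arbitrary sizes/step counts, just to make the point concrete: once the model's argmax matches every next character, the weights plus the first character effectively are the text

```python
import torch
import torch.nn as nn

text = "the quick brown fox jumps over the lazy dog. "
vocab = sorted(set(text))
stoi = {c: i for i, c in enumerate(vocab)}
ids = torch.tensor([stoi[c] for c in text])

class TinyLM(nn.Module):
    def __init__(self, vocab_size, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, x, h=None):
        y, h = self.rnn(self.emb(x), h)
        return self.out(y), h

model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)

for step in range(2000):                  # deliberately overtrain on one block
    logits, _ = model(x)
    loss = nn.functional.cross_entropy(logits.squeeze(0), y.squeeze(0))
    opt.zero_grad(); loss.backward(); opt.step()

# greedy ("max") sampling from the first character should now replay the text
out, h, tok = [ids[0].item()], None, ids[:1].unsqueeze(0)
for _ in range(len(text) - 1):
    logits, h = model(tok, h)
    tok = logits[:, -1].argmax(-1, keepdim=True)
    out.append(tok.item())
print("".join(vocab[i] for i in out) == text)  # expect True after enough steps
```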
the hutter prize fucking rules and marcus hutter deserves hall of fame status for putting his money on "intelligence is compression" in like 2005, way way downcourt.
it's extremely hardware constrained however, so it doesn't measure current state of the art in llms or similar
The scoffing I see is more of "this isn't state of the art, it's just its own thing now" than an attack on the merits, so this checks out and Hutter needs his jersey retired
I was legitimately talking about human memory and comprehension as just heuristic compression during an informational interview I had last week!
What do folks think they're doing when they speak or listen? In speech therapy it's literally called encoding and decoding as it concerns reading/writing
If you can guess the pattern to the digits ...627262464195387 and extend backwards reliably, you both learn about the rule and happen to compress the string
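the specific rule behind that digit string isn't spelled out here, so here's the generic move with a stand-in rule: store the generator plus a length instead of the raw digits, and you can recreate an arbitrarily long string from a few bytes of rule

```python
from itertools import count, islice

def rule_digits():
    # stand-in rule (not the one behind the digits above): the decimal digits
    # of successive powers of 2, concatenated
    for n in count(1):
        yield from str(2 ** n)

# the "compressed" form is just (this tiny function, a length); the raw digits
# can be regenerated on demand, however long you want the string to be
print("".join(islice(rule_digits(), 40)))  # 2481632641282565121024204840968192...
```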
That helps. But I’m so unversed in the philosophy and discourse of…digitalia that I’m going to have to start way back. Has anybody written clearly for the layperson on these ideas?
the core idea is information theory, which is its own field with a lot of math and not a lot of "state the core ideas clearly in english normal people can read". this stuff is just ... biting the bullet super hard that information theory describes how minds work, and using it for ai
"Intelligence is just compression"
(I don't know if I believe this but… I don't not believe it?)
at the strong end of the spectrum is "ai is just compression, and that is what intelligence is"
this does make me realize there is a gap here though, thank you
if you want to know what i mean, see e.g. this which is roughly our canonical reference: http://prize.hutter1.net/