It's fine to ignore these tools, but I actually think trying them out in the context of something you know well is worth doing. The mistakes these things make will only get harder to spot as OpenAI and others slap more coats of paint on them.
Comments
So along with all the energy and water they use, we now have to waste our time teaching them. Just cut out the middle-man and go and ask an expert in whatever you need to know. At least (you'd hope) they won't lie or give you a bullshit answer they just made up.
I mean, by all means, don't use them.
The energy required to run these is an issue, but the models will get smaller through quantization and architectural refinements. I also don't think everyone can know or ask an expert. But even if they should, people are using these and will continue to do so.
Architectural refinements might shrink the resource cost per weight, but AFAIK the expectation is still that qualitative leaps require multiplying the number of parameters, which necessarily multiplies the resource costs to both query and train
It’ll happen on multiple fronts. Quantization is already very helpful, and hardware for improving the power efficiency of running neural networks will advance a lot. It’s why Sam Altman is out there trying to raise a trillion $. Think about how Bitcoin mining on GPUs moved to FPGAs.
And you’ll probably have different models within the MoE running in different places (locally for some, in data centers for others)
Because the energy cost is largely about the hardware it’s running on. Although I’m pretty sure models will also get smaller in terms of the number of weights over time
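To make the quantization point above concrete, here's a minimal sketch of why it helps: storing the same weights in int8 instead of float32 cuts the memory footprint roughly 4x, at the cost of a small round-trip error. The layer size here is made up for illustration.

```python
# Minimal sketch: symmetric linear quantization of a weight matrix.
# int8 storage is 1/4 the size of float32; the layer shape is illustrative.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((4096, 4096)).astype(np.float32)

# Map the float range onto int8 via a single scale factor.
scale = np.abs(weights).max() / 127.0
q_weights = np.round(weights / scale).astype(np.int8)

# Dequantize to estimate what the round trip costs in accuracy.
deq = q_weights.astype(np.float32) * scale
err = np.abs(weights - deq).max()

print(f"float32: {weights.nbytes / 2**20:.1f} MiB")   # ~64 MiB
print(f"int8:    {q_weights.nbytes / 2**20:.1f} MiB") # ~16 MiB
print(f"max round-trip error: {err:.5f}")
```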
Bitcoin is the archetypal O(1/n) problem so perhaps not the best comparison.
While I haven't kept up w/ the research, AIUI the fundamental bottlenecks are that a query is at least linear in the number of weights and unavoidably *super*linear in context size, both of which are fundamental to the LLM's utility
(sorry, not trying to harp on you personally at all, just had lots of remarks)
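As a rough illustration of the scaling claims in this thread, here's a back-of-envelope sketch. All the concrete numbers are illustrative, not measurements: per-token cost grows linearly with parameter count, while the attention work for a full sequence grows quadratically with context length.

```python
# Back-of-envelope sketch of the scaling claims: per-token cost is roughly
# 2 FLOPs per weight (linear in parameter count), while attention adds a
# term that, summed over a full sequence, is quadratic in context length.
# All concrete numbers here are illustrative, not measurements.

def flops_per_token(n_params: int, ctx: int, d_model: int, n_layers: int) -> float:
    matmul = 2 * n_params                # weight multiplies: linear in the weights
    attn = 2 * n_layers * ctx * d_model  # attending over ctx prior tokens
    return matmul + attn

# Doubling the parameter count roughly doubles per-token cost...
for n in (7e9, 14e9):
    g = flops_per_token(int(n), ctx=4096, d_model=4096, n_layers=32) / 1e9
    print(f"{n / 1e9:.0f}B params: ~{g:.1f} GFLOPs/token")

# ...while total attention work per sequence grows quadratically with context.
for ctx in (4096, 8192):
    total_attn = 2 * 32 * 4096 * ctx * ctx  # n_layers * d_model * ctx tokens * ctx context
    print(f"context {ctx}: ~{total_attn / 1e12:.1f} TFLOPs of attention per sequence")
```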
Verifiable correctness is an insanely strong property in general, even for normal code. I can't conceive how you'd get anywhere close with LLMs, given they're black boxes
No worries. And agreed. There is a whole field related to it, and I don’t intend to do more than use some of its primitives to see if I can improve the current state of LLMs, not to formally verify LLMs themselves.
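For what that generate-then-check approach might look like in practice, here's a minimal sketch; `llm_generate` is a hypothetical stand-in for whatever model API is in use, and the JSON checker is just one example of an executable specification.

```python
# Minimal sketch of the generate-then-check pattern: instead of verifying
# the model itself, verify each output against an executable specification
# and retry on failure. `llm_generate` is a hypothetical stand-in for a
# real model API.
import json
from typing import Callable

def llm_generate(prompt: str) -> str:
    raise NotImplementedError("stand-in for a real model call")

def generate_verified(prompt: str,
                      check: Callable[[str], bool],
                      max_attempts: int = 3) -> str:
    """Return the first output that passes the checker, or raise."""
    for _ in range(max_attempts):
        candidate = llm_generate(prompt)
        if check(candidate):  # the checker, not the model, is what we trust
            return candidate
    raise ValueError(f"no output passed verification in {max_attempts} attempts")

# Example spec: the output must parse as valid JSON.
def is_valid_json(s: str) -> bool:
    try:
        json.loads(s)
        return True
    except ValueError:
        return False
```

The point of the pattern is that the trust lives in a small, auditable checker rather than in the black-box model.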
Agreed. I think they are making the determination that no matter what happens, they will be out-competed if they don't build them now. This, to me, has less to do with the models and where things may go than markets and first-mover advantage.
It's why it would be nice to have rules like banning methane generators from powering data centers, or requiring new construction to rely more on renewables each year, but that's not the current regulatory climate, to say the least
For me, the only broader implication of LLM use I find interesting is a harm reduction strategy. And the open-sourcing of models (weights and all)