I've studied AI programming a little bit, and I can explain it. What AI basically does is, it looks at correct answers that it can learn from, and then it uses that learning to create new content that RESEMBLES what it has seen before. But that need not be correct. It only has to have the same form.
ChatGPT / LLMs are things that give you a response in the shape of an answer. It’s not using logic to work anything out. It’s essentially autocomplete, but at a much larger scale.
It cannot do the thing we use most computer software for (working through processes with maths reliably & accurately).
AFAIK it's because LLMs convert everything to words, so they don't actually do the maths, but search their library for the most likely answer. And given how often people are wrong online, they're working with bad data.
AI basically stops computers doing the one thing they're good at.
That would break it. It's an autocomplete on steroids, and the layers added on top of it to avoid admitting to copyright infringement have already made it much less efficient at its supposed job.
I tried to use it for simple language learning and it got super basic grammar wrong. It also mixed up several Romance languages and had a hard time admitting to being wrong.
one of the fundamental engineering challenges of these things is that there's no clear on ramp or off ramp for specialized logic. every attempt to bolt another non-llm feature onto the side of the llm requires you to try to understand what's happening in the conversation
It's because it's just fundamentally not possible: they haven't written code to make ChatGPT, they've just shovelled data into a black box that spits out words. They can't just "fix" the code because they didn't actually write it.
Exactly. It's basically one of those chat bots from the early 00s with access to the internet and a library of OG works. Its function is to give you an answer, whether it makes sense or not.
This is also, I think, why so many people working on LLMs at places like OpenAI and Anthropic have gotten weirdly religious about it. Coders already tend to be superstitious, but working constantly around this black box they can't comprehend the workings of? That's gonna twist your mind Ito style.
Yeah, they don't understand the thing they're working on because it's just a black box, so all they can do is tech-priest rituals. "Did we actually improve the LLM, or was it just random chance that this hand we generated had the correct number of fingers?"
In fact ChatGPT is more of a text generator; they could make an agent specialized for it, but their priority right now is starter-pack pictures. Have you tried Claude, which is more accurate on such things?
It's the same as using a Ferrari to deliver parcels: it's a good car, but inefficient for that.
Search for how iterative models work and you will understand why.
Okay, but in this analogy LLMs are the Cybertruck. A big stupid status symbol that marks you out as being a big stupid rube with no taste or understanding who is tricked out of money by grifters.
LOL no just no. A system presented as intellectually superior to humans should be able to make calculations. Humans can make calculations without calculators, you know that right?
It has one! ChatGPT can run limited Python and see the output, and 123*456 is a valid Python program. So it's a calculator that has a calculator and STILL can't do math.
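For anyone curious, that really is the whole program; a quick sketch of the deterministic path (plain Python, nothing ChatGPT-specific assumed here):

```python
# The entire "program" in question, done the way computers normally do it:
# deterministically, same answer every time.
print(123 * 456)  # 56088
```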
We have calculators. And so many websites to do math. I use them quite frequently because some of that shit is just a big nope for me.
They all have the gigantic advantage of NOT slurping up grotesque amounts of electricity and water, so why would we put them inside a framework that does do that?
That’s fantastic, it’s the one that shows all the steps, right? Helped me pass college algebra because of that (my main weakness academically is higher math).
And not because I cheated! It would help get through the steps when I was stuck in spots.
I can write my ass off though, don’t need ChatGPT.
I think the point is that calculations are literally the only thing computers do. It's how they work at their very core. They should not be doing them wrong, especially the supposedly fancy new super computer nonsense.
The GenAI does word calculations. It does not read the words, it just does a statistical analysis of how often they are close to other words. So it is doing the same thing to the numbers. It doesn't recognise them as numbers in the traditional sense, and isn't trained to apply maths to them.
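If you want to see roughly what that looks like in practice, here's a small sketch using the open-source tiktoken tokenizer (my assumption that it's installed; the exact splits vary by model):

```python
# Rough illustration of why digits aren't "numbers" to an LLM: the tokenizer
# chops text (digits included) into sub-pieces before the model sees anything.
# Assumes the open-source `tiktoken` package is available; splits vary by model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for text in ["123456 * 789", "two plus two"]:
    pieces = [enc.decode([tok]) for tok in enc.encode(text)]
    print(f"{text!r} -> {pieces}")
# The digit string comes back as a few text chunks, not one quantity, so
# "doing maths" on it means predicting plausible-looking chunks of digits.
```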
My point is that we HAVE plenty of online interfaces that will "get a fucking sum right", and that use far fewer resources to do it while being much more reliable.
The lying theft machine does NOT need that feature, which will do the exact same thing but cost 10x more per calculation.
One mildly depressing thing is that one could easily build a low cost local voice assistant that could pretty reliably assemble any sum you dictated to it, chuck it at a calculator, and give you the answer, yet I suspect most people wouldn't like it - not chatty/friendly/sycophantic enough.
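As a rough sketch of how small that "assemble the sum and chuck it at a calculator" piece is (the speech-to-text part is assumed to exist separately; every name and the word list below are made up):

```python
# Toy sketch of the "dictate a sum, hand it to a real calculator" idea.
import ast
import operator

SPOKEN = {"plus": "+", "minus": "-", "times": "*", "divided": "/", "by": ""}

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str):
    """Evaluate +, -, *, / over plain numbers only; no eval(), no surprises."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("not a plain sum")
    return walk(ast.parse(expr, mode="eval"))

def dictated_to_expr(spoken: str) -> str:
    # "12 times 4 plus 6" -> "12 * 4 + 6"
    words = (SPOKEN.get(w, w) for w in spoken.lower().split())
    return " ".join(w for w in words if w)

print(safe_eval(dictated_to_expr("12 times 4 plus 6")))  # 54
```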
Genuinely get the impression that the lure of the LLM is at least in part one of those things that boils down to The Psychology Of The Individual, possibly sometimes in slightly hair-raising ways.
If I remember correctly, ChatGPT has access to a Python playground where it can check the answer, so they basically did put a calculator in it. I think it's called Advanced Data Analysis mode. (ChatGPT is still shit but it can check. It still doesn't do it nearly often enough)
Honestly my guess would be that since LLMs don't 'understand' anything that's asked of them (they can only infer relationships to other words in their database), nobody has worked out how to recognise when the LLM is being asked a maths question and let the calculator take the wheel.
They do put calculators in it. The problem is that you have an unreliable mediator (the LLM itself) inserted between you and the calculator. If you can't rely on the LLM 100% for non-calculator tasks, you also can't rely on it 100% to put your question into the calculator and return the answer.
it's like if you have an employee who has access to a calculator and you ask them to do some math for you, but you know the employee is generally incompetent. sometimes they might still find a way to mess it up.
Because people are being told by their bosses and by ads for Gen AI that it CAN in fact do calculations correctly: it's a one-stop shop for all possible use cases, and if you can't figure out the prompt, that's a you problem. So they use it, and when they say it's wrong they're told to check their prompt.
Yeah I know, it's nonsense. Saw a thread yesterday (I may be conflating two) about how it's being marketed as a tool for which we must find a use - but that means it's not a tool, doesn't it? Tools are designed with a use in mind.
They're using the same strategy as car marketers - use this for *everything* even when it's not suitable and uses way more energy than you need. It's society's fault if you can't park! It's society's fault if there's traffic!
... And it's worked for them, so I'm scared it's working for this too.
Then you need to learn about the technology before you try to form good critiques. Because it is a large language model. NO ONE would tell you that makes it good at mathematics.
and yet an awful lot of people are telling me that it's somehow going to evolve into an all-knowing general intelligence that one would certainly presume would be good at mathematics
I think what I mean is we shouldn’t waste too much time arguing against those people; they really aren’t the strategically dangerous ones here, just grifters going with the flow.
There's a looong answer to this involving the difference between the pseudo eigenvectors in a massive data set and actual knowledge if you fancy being bored into genuine tears by linear algebra
TL;DR - LLMs 'understand' literally nothing and never will, ironically because of maths
LLMs have no real concept of numbers. They can deal in abstracts (longer; shorter) but ask one to write 250 words and it will faceplant. So much of it is little more than fancy autocomplete based on probability driven by eating the entire internet. A souped up version of the keyboard row on a phone.
It's not actually "thinking" in that sense. It's guessing the next best word/characters based on what it was trained on.
If it's seen the calculation you want lots of times in its training, it will most likely get it right... If it's not seen it before, then most likely wrong.
Maybe a more complicated one could recognise that it's a calculation... Read it in and hand it over to a dedicated program...
But I don't think that is happening, yet.
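Something like this, maybe; a hypothetical sketch of that hand-off, with made-up names and a deliberately crude check for "is this just arithmetic":

```python
# Spot that a message is pure arithmetic and route it to a real evaluator
# instead of the text predictor. A real product would need far better
# parsing than one regex; this only shows the shape of the idea.
import re

CALC_RE = re.compile(r"[0-9\s.+\-*/()]+")   # digits, whitespace, basic operators

def answer(query: str, llm) -> str:
    if CALC_RE.fullmatch(query.strip()):
        # Deterministic path: same input, same output, every time.
        return str(eval(query, {"__builtins__": {}}))
    return llm(query)   # everything else still goes to the model

print(answer("12 * (3 + 4)", llm=lambda q: "..."))  # 84
```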
Which in itself showcases how little LLM creators are thinking. They should be able to hand off + reintegrate components. Then again, AI is now so big that the once greatest search engine is determined to ruin itself with AI-gen results that are often comically wrong and sometimes dangerously so.
Gemini does that on the phones, it will hand off the data to the clock app or whatever. I guess for it to be useful the calculator would need to hand back the data afterwards.
It doesn't always though. I do a lot of decimal to hex conversions. I used to just Google them because it had a little plugin that did the conversion. Phrase the conversion query wrong now and you get the AI *answer*, which is always wrong and not even stable. If I'm lucky I get the old plugin.
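For the record, the conversion itself is the kind of one-liner that should never come back "unstable" (standard Python shown purely as an example):

```python
# Decimal <-> hex is exactly the kind of job that should never be "probably right".
print(hex(48879))         # 0xbeef
print(int("0xbeef", 16))  # 48879
```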
One of the things I think probably surprises people is the progress in these things is not linear. I’ve done a lot of experimenting with style and voice in LLMs and that is absurdly inconsistent. So complexity is being added by needing to target a specific version – if that’s even possible.
Ah, I meant talking to your phone, so it hands off to other system apps if it's been allowed to. Know exactly what you mean about new Google. Totally unpredictable; I often ask it time-related questions and it's gone from 100% correct to 75%.
Comments
They take an input sentence, turn each complete word into a numerical token, and do matrix multiplication on those numbers using pre-trained weights.
It’s not a calculator.
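A bare-bones sketch of that pipeline, if it helps; toy vocabulary, toy sizes, random stand-in weights, nothing from a real model:

```python
# Bare-bones version of "words -> token IDs -> matrix multiplication".
import numpy as np

vocab = {"what": 0, "is": 1, "123": 2, "*": 3, "456": 4}
ids = [vocab[w] for w in "what is 123 * 456".split()]

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 8))   # one 8-dim vector per token
weights = rng.normal(size=(8, 8))               # stand-in for trained layers

hidden = embeddings[ids] @ weights
print(hidden.shape)  # (5, 8): numbers *about* the words, no arithmetic on 123 or 456
```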
Incredible!
https://bsky.app/profile/posistress.bsky.social/post/3lnqi4mueh22n
And this answer will almost always be 'shaped' like coherent language.
This makes them potentially much more dangerous, in a stupid way, than a computer just erroring out.
Which means the LLM failed to properly parse that it should use the knowledge engine.
MORE ACCURATE?
THE ONLY ACCEPTABLE LEVEL OF ACCURACY FOR THE MATH MACHINE WHEN DOING MATH IS 100% ACCURACY!
And Python can be used like a calculator.
Learning a little Python, opening a terminal, and putting your problem in yourself means you can actually gain some understanding and insight into what is going on.
https://www.programiz.com/python-programming/examples/calculator
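That link covers it, but the terminal version really is this short (standard Python, no extra installs assumed):

```python
# Python as a pocket calculator: exact integer arithmetic, no guessing.
print(1234 * 5678)          # 7006652
print(2 ** 64)              # 18446744073709551616
print((0.1 + 0.2) == 0.3)   # False: floats have documented quirks, not vibes
```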
Like all tools it's specialized
no one *who knows what they're talking about* would, but that's an entirely different sentence!
https://bsky.app/profile/ketanjoshi.co/post/3lnixuywpws2v