Perfect demonstration of why you shouldn't completely trust AI - always check reliable sources to confirm info that models give you - though it is worth noting that reasoning models like OpenAI's o1 solve this by making the AI create a chain of thought!
Or you could just not use it and look up the source to get your information. If you're looking it up to check anyway, what's the point of using the AI in the first place?
Duh, you have to *tell it* to count the third one. AI is still learning. You can’t just expect it to know things. Also you didn’t format the questions the right way, it can’t answer questions the right way if you’re not asking them the right way. 🙄
LLMs are good at some tasks and bad at others, the same way a tool like a hammer works. Ironically, you can ask GPT why it miscounts the 'r's and it will probably give you a better answer than if I asked you.
In some respects he's correct in that you need to understand how they work in order to get results.
Because it tokenizes whole words, you need to separate the letters of strawberry to get the correct answer. There is a little science to conversing with these stochastic parrots, and I hate it
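To make the tokenization point concrete, here's a minimal sketch using the open-source tiktoken tokenizer; the exact splits are an illustration and vary by model:

```python
# Minimal sketch, assuming the `tiktoken` library (pip install tiktoken).
# The exact token splits depend on the encoding/model used.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["strawberry", "s t r a w b e r r y"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r} -> {pieces}")

# The plain word typically comes back as a few sub-word chunks, while the
# spaced-out version gets roughly one token per letter, which is why
# spelling the word out helps the model "see" the letters it must count.
```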
to me or to the machine that gives you wrong answers about stuff or to mark cuban for believing that the machine that gives you incorrect answers about stuff is a substitute for knowledge or expertise?
it's scary how quickly we are externalizing/outsourcing our mental faculties.
there's a day in the near future where some people will legit not be able to think if the lights go out.
Skynet will never have to throw a single nuke. it's just gonna block your API key
My favorite adversarial paper so far is the one that shows that adding nonpertinent information, such as the strawberry being big or named Chad, will change the answer of math-focused prompts.
The example posted above is a very widely circulated one. The simple explanation is that they patched this one case after a wave of bad press, not that they taught their predictive-text algorithm how to count.
But also we know there's a randomization element to it too. A stopped clock is right twice a day. Sometimes randomly generated text will "guess" correctly.
That's with one of the most selective and competitive medical matriculation, graduation, and post-study systems on the planet (American medicine is callous towards the poor, but it does produce the finest doctors and nurses on the planet).
The British Medical Journal (BMJ) estimated that medical errors are responsible for over 250,000 deaths per year in the US.
Other studies, like one by Johns Hopkins, also point to medical errors as a significant cause of death, with some estimating as many as 440,000 deaths annually.
AI literally gets it wrong every time, that's exactly why it had to be pulled from places like GitHub, most of the Windows OS, basically any system that actually needs to work. Yes it's great for crappy fake videos and pretty pictures or anything subjective, but for any actual accuracy it sucks balls
Step 1: ask ChatGPT a question
Step 2: consult real sources, gain the domain specific knowledge required to determine whether ChatGPT was correct (it was not)
Step 3: credit ChatGPT to justify billions of dollars in sunk costs
The point of demonstrating that it can't count letters is that these things are advertised as artificial intelligence but have no ability to reason whatsoever. It's not AI, it's an LLM, and it's not capable of doing the things Mark Cuban wants nor is there any reason to believe it will be able to.
But then the point is a bad one, because their inability to count letters comes more from being implemented with tokenization than from an inability to reason.
Higher-resolution tokens increase computational requirements. Tokens are used to save resources. Byte-level tokenizers exist.
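A rough plain-Python illustration of that trade-off (the subword token count is an assumption; real numbers depend on the tokenizer):

```python
# Byte-level "tokenization": one integer per byte, no subword merging.
word = "strawberry"
byte_tokens = list(word.encode("utf-8"))
print(byte_tokens)       # [115, 116, 114, 97, 119, 98, 101, 114, 114, 121]
print(len(byte_tokens))  # 10 positions, versus roughly 2-3 subword tokens,
                         # so sequences (and attention cost) grow accordingly
```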
If they're spending this much money to develop the thing and still have to lie, one has to wonder how much money it would cost to deliver the thing they're promising. And if it's worth it. Their whole model is just scaling; the paradigm won't iterate to what Sam Altman promises.
From a technical standpoint you can't know the entire set of problems it's not good at ahead of time, so you settle for a prioritized list, starting with advice that results in personal harm; lower-priority stuff like the ability to accurately count letters improves as the models get better
the problem persists because LLMs break up their inputs into chunks which they convert to numbers ("tokens") on which they do some linear algebra. They're fundamentally just *bad* at this because they don't see "strawberry" as a word with letters in it. They won't get better without tool access
Ironically, it’s easier to get a model to write a Python snippet that counts the number of ‘r’s, and LangChain can even have the LLM call it for you
this isn't to say "wow 'AI' is so good!" but it is to say that it's like asking a calculator to do real analysis—it's just not how it works. (This does suggest that tech freaks need to stop promoting LLMs as the be-all end-all of anything, though!)
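For what it's worth, the snippet the comment above mentions really is a one-liner. Here's a hedged sketch of the kind of function a tool-calling setup (LangChain or otherwise) could hand the model; the names are hypothetical, not any framework's actual API:

```python
# Hypothetical tool function a model could be given to call.
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of `letter` in `word`."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # 3, trivial once it's real code
```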
I have continually reposted this request. If this request is fulfilled, I humbly ask for a 1% stake in your blessings, paid out in a Visa gift card.
We can worry about contact information later.
I promise to spend some money at Staples (my local home office provider).
Thank you.
❤
I want one with a saddle-stitch stapler binding finisher; I would increase the efficiency of my workflow dramatically if I had that, and I know Mark Cuban could make it happen without breaking a sweat
the funny thing about the "top five" is that together they *could* fulfill a wish like that...but for each and every American, and still have more money left over than they'd know what to do with. a reasonable society would place a limit on needlessly hoarding things other members need...🤦
Hell, nationalize video poker parlors and you got small business loans and a local, state, and federal tax base AND the personal destruction of a large number of individuals.
Now we just have the latter.
Le sigh.
Le pant.
Yeah, this post by Mark Cuban struck me as a bit off.
AI isn’t going to replace much of anything at this point, and with the declining state of our education system it doesn't have much in the way of correct information from which to improve.
Thinking education gives you all the answers, instead of giving you the ability to ask the right questions, is kind of a big sign of not being well educated.
This test is a bit outdated; recent LLMs don't make this mistake anymore: https://chatgpt.com/share/67b43a18-5fc8-800c-b37d-b0a8c285a703
Still, be careful with LLMs; treat their responses like any other answer found on the Internet. Don't trust them more than a post on a forum or social media.
Fun fact: if you insist there are 50 Rs, it will just as quickly apologize and agree with you. Not a thinking machine, just an algorithm for stringing together words.
I wonder sometimes how much worse the output of my old toy Markov chain generator would be than these if I had the resources to have it ingest basically the whole of written language instead of the IRC channels it lived in. The model math IS better than weighted RNG, but...
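For anyone who never wrote one: a toy word-level Markov chain like the commenter describes fits in a few lines. This is a sketch, not their original IRC bot:

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    words = text.split()
    for a, b in zip(words, words[1:]):
        chain[a].append(b)  # repeats in the list act as the "weighted rng"
    return chain

def generate(chain, start, length=10):
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

chain = build_chain("the cat sat on the mat and the cat ran off the mat")
print(generate(chain, "the"))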
That is more a damnation of how management and people at the helms of power operate than any kind of selling point for Cat I Farted (ChatGPT)
/s
lol
The usual guards have been fired and replaced with an LLM that will answer one question with great confidence but unknown accuracy.
What question do you ask?
If it commits fewer errors than people, then the LLM is a legitimate solution
It's Mark Cuban, we already know he's a knob
https://youtu.be/NMS2VnDveP8?si=MoeVh2GH3kN02qrN
Which one are you using?
Y'all over here acting like most Americans can spell or challenge pig-headed management.
You know what separates a mediocre tech from an excellent tech?
An excellent tech will read the OM and schematics. Most people just don't or can't.
They just have to make fewer mistakes than people, and for a while now, they have.
There are novelties like the Strawberry test, and hallucinations in transcription,
but studies suggest there are fewer instances of those than there are careless clerks.
If an LLM sees strawberry as two tokens, [straw][berry], there are no Rs. Straw and berry aren't R
Any attempt to count them requires secondary inference on the part of the LLM
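A hypothetical illustration of that point; the two-token split comes from the comment above, and the integer IDs are made up:

```python
# Suppose the tokenizer maps the word to two IDs (values invented here).
vocab = {5643: "straw", 15717: "berry"}
ids = [5643, 15717]

text = "".join(vocab[i] for i in ids)
print(text.count("r"))  # 3, easy once the characters are reconstructed
# The model only ever operates on [5643, 15717], so the spelling has to
# come from "secondary inference" (memorization), not from reading letters.
```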
When you don't understand what you don't know, you're exactly the easiest to replace with an LLM. Don't just accept ignorance.
But I just did it for myself right now, and the problem persists to this day, despite it having been reported on.