Perfect demonstration of why you shouldn't completely trust AI - always check reliable sources to confirm info that models give you - though it is worth noting that reasoning models like OpenAI's o1 solve this by making the AI create a chain of thought!
Or you could just not use it and look up the source to get your information. If you're looking it up to check anyway, what's the point of using the AI in the first place?
Duh, you have to *tell it* to count the third one. AI is still learning. You can’t just expect it to know things. Also you didn’t format the questions the right way, it can’t answer questions the right way if you’re not asking them the right way. 🙄
LLMs are good at some tasks and bad at others, the same way a tool like a hammer works. Ironically, you can ask GPT why it miscounts the 'r's and it will probably give you a better answer than if I asked you.
In some respects he's correct in that you need to understand how they work in order to get results.
Because it tokenizes whole words, you need to separate the letters of strawberry to get the correct answer. There is a little science to conversing with these stochastic parrots, and I hate it
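To make the tokenization point concrete, here's a minimal sketch using the open-source tiktoken tokenizer; the exact splits are an illustration and vary by model:

```python
# Minimal sketch, assuming the `tiktoken` library (pip install tiktoken).
# The exact token splits depend on the encoding/model used.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["strawberry", "s t r a w b e r r y"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r} -> {pieces}")

# The plain word typically comes back as a few sub-word chunks, while the
# spaced-out version gets roughly one token per letter, which is why
# spelling the word out helps the model "see" the letters it must count.
```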
to me or to the machine that gives you wrong answers about stuff or to mark cuban for believing that the machine that gives you incorrect answers about stuff is a substitute for knowledge or expertise?
it's scary how quickly we are externalizing/outsourcing our mental faculties.
there's a day in the near future where some people will legit not be able to think if the lights go out.
Skynet will never have to throw a single nuke. it's just gonna block your API key
My favorite adversarial paper so far is the one that shows that adding nonpertinent information, such as the strawberry being big or named Chad, will change the answer of math-focused prompts.
The example posted above is a very widely circulated one. The simple explanation is that they patched this one case after a wave of bad press, not that they taught their predictive-text algorithm how to count.
But also we know there's a randomization element to it too. A stopped clock is right twice a day. Sometimes randomly generated text will "guess" correctly.
That's with one of the most selective and competitive medical matriculation, graduation, and post-study systems on the planet (American medicine is callous towards the poor, but it does produce the finest doctors and nurses on the planet).
The British Medical Journal (BMJ) estimated that medical errors are responsible for over 250,000 deaths per year in the US.
Other studies, like one by Johns Hopkins, also point to medical errors as a significant cause of death, with some estimating as many as 440,000 deaths annually.
AI literally gets it wrong every time, that's exactly why it had to be pulled from places like GitHub, most of the Windows OS, basically any system that actually needs to work. Yes it's great for crappy fake videos and pretty pictures or anything subjective, but for any actual accuracy it sucks balls
Step 1: ask ChatGPT a question
Step 2: consult real sources, gain the domain specific knowledge required to determine whether ChatGPT was correct (it was not)
Step 3: credit ChatGPT to justify billions of dollars in sunk costs
The point of demonstrating that it can't count letters is that these things are advertised as artificial intelligence but have no ability to reason whatsoever. It's not AI, it's an LLM, and it's not capable of doing the things Mark Cuban wants nor is there any reason to believe it will be able to.
But then the point is a bad one, because their inability to count letters comes more from being implemented with tokenization than from an inability to reason.
Higher-resolution tokens increase computational requirements. Tokens are used to save resources. Byte-level tokenizers exist.
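A rough plain-Python illustration of that trade-off (the subword token count is an assumption; real numbers depend on the tokenizer):

```python
# Byte-level "tokenization": one integer per byte, no subword merging.
word = "strawberry"
byte_tokens = list(word.encode("utf-8"))
print(byte_tokens)       # [115, 116, 114, 97, 119, 98, 101, 114, 114, 121]
print(len(byte_tokens))  # 10 positions, versus roughly 2-3 subword tokens,
                         # so sequences (and attention cost) grow accordingly
```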
If they're spending this much money to develop the thing and still have to lie, one has to wonder how much money it would cost to deliver the thing they're promising. And if it's worth it. Their whole model is just scaling; the paradigm won't iterate to what Sam Altman promises.
From a technical standpoint you can't know the entire set of problems it's not good at ahead of time, so you settle for a prioritized list, starting with advice that results in personal harm; lower-priority stuff like the ability to accurately count letters improves as the models get better
the problem persists because LLMs break up their inputs into chunks which they convert to numbers ("tokens") on which they do some linear algebra. They're fundamentally just *bad* at this because they don't see "strawberry" as a word with letters in it. They won't get better without tool access
Ironically, it’s easier to get a model to write a Python snippet that counts the number of ‘r’s, and LangChain can even have the LLM call it for you
this isn't to say "wow 'AI' is so good!" but it is to say that it's like asking a calculator to do real analysis—it's just not how it works. (This does suggest that tech freaks need to stop promoting LLMs as the be-all end-all of anything, though!)
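For what it's worth, the snippet the comment above mentions really is a one-liner. Here's a hedged sketch of the kind of function a tool-calling setup (LangChain or otherwise) could hand the model; the names are hypothetical, not any framework's actual API:

```python
# Hypothetical tool function a model could be given to call.
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of `letter` in `word`."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # 3, trivial once it's real code
```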
I have continually reposted this request. If this request is fulfilled, I humbly ask for a 1% stake in your blessings, paid out in a Visa gift card.
We can worry about contact information later.
I promise to spend some money at Staples (my local home office provider).
Thank you.
❤
I want one with a saddle-stitch stapler binding finisher; I would increase the efficiency of my workflow dramatically if I had that, and I know Mark Cuban could make it happen without breaking a sweat
the funny thing about the "top five" is that together they *could* fulfill a wish like that...but for each and every American, and still have more money left over than they'd know what to do with. a reasonable society would place a limit on needlessly hoarding things other members need...🤦
Hell, nationalize video poker parlors and you got small business loans and a local, state, and federal tax base AND the personal destruction of a large number of individuals.
Now we just have the latter.
Le sigh.
Le pant.
Yeah, this post by Mark Cuban struck me as a bit off.
AI isn’t going to replace much of anything at this point, and with the declining state of our education system it doesn't have much in the way of correct information from which to improve.
Thinking education gives you all the answers, instead of giving you the ability to ask the right questions, is kind of a big sign of not being well educated.
This test is a bit outdated; recent LLMs don't make this mistake anymore: https://chatgpt.com/share/67b43a18-5fc8-800c-b37d-b0a8c285a703
Still, be careful with LLMs; treat their responses like any other answer found on the Internet. Don't trust them more than a post on a forum or social media.
Fun fact: if you insist there are 50 Rs, it will just as quickly apologize and agree with you. Not a thinking machine, just an algorithm for stringing together words.
I wonder sometimes how much worse the output of my old toy Markov chain generator would be than these if I had the resources to have it ingest basically the whole of written language instead of the IRC channels it lived in. The model math IS better than weighted RNG, but...
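For anyone who never wrote one: a toy word-level Markov chain like the commenter describes fits in a few lines. This is a sketch, not their original IRC bot:

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    words = text.split()
    for a, b in zip(words, words[1:]):
        chain[a].append(b)  # repeats in the list act as the "weighted rng"
    return chain

def generate(chain, start, length=10):
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

chain = build_chain("the cat sat on the mat and the cat ran off the mat")
print(generate(chain, "the"))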
That is more a damnation of how management and people at the helms of power operate than any kind of selling point for Cat I Farted (ChatGPT)
/s
lol
The usual guards have been fired and replaced with an LLM that will answer one question with great confidence but unknown accuracy.
What question do you ask?
If it commits fewer errors than people, then the LLM is a legitimate solution
It's Mark Cuban, we already know he's a knob
https://youtu.be/NMS2VnDveP8?si=MoeVh2GH3kN02qrN
Which one are you using?
Y'all over here acting like most Americans can spell or challenge pig-headed management.
You know what separates a mediocre tech from an excellent tech?
An excellent tech will read the OM and schematics. Most people just don't or can't.
They just have to make fewer mistakes than people, and for a while now, they have.
There are novelties like the Strawberry test, and hallucinations in transcription,
but studies suggest there are fewer instances of those than there are careless clerks.
If an LLM sees strawberry as two tokens, [straw][berry], there are no Rs. Straw and berry aren't R
Any attempt to count them requires secondary inference on the part of the LLM
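A hypothetical illustration of that point; the two-token split comes from the comment above, and the integer IDs are made up:

```python
# Suppose the tokenizer maps the word to two IDs (values invented here).
vocab = {5643: "straw", 15717: "berry"}
ids = [5643, 15717]

text = "".join(vocab[i] for i in ids)
print(text.count("r"))  # 3, easy once the characters are reconstructed
# The model only ever operates on [5643, 15717], so the spelling has to
# come from "secondary inference" (memorization), not from reading letters.
```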
When you don't understand what you don't know, you're exactly the easiest to replace with an LLM. Don't just accept ignorance.
But I just did it for myself right now, and the problem persists to this day, despite it having been reported on.