A Tow Center for Digital Journalism study found that "AI" chatbots provided incorrect answers to more than 60 percent of queries, with Musk's Grok 3 answering 94 percent of queries incorrectly.
"Premium chatbots provided more confidently incorrect answers than their free counterparts."
Comments
- 51% had clear errors
- 19% introduced factually inaccurate "statements, numbers and dates"
- 13% altered or outright fabricated quotes
They're doing it to save money, cut corners, attack labor, and (as with the LA Times) entrench ownership bias:
Yet OpenAI and others want you to believe this technology is just a few breakthroughs and another billion dollars away from sentience.
"(1) the correct article, (2) the correct publisher, and (3) the correct URL".
Criteria 1 and 3 are important; 2 is redundant, yet responses were still graded down on it.
So realistically, the only colors worth alarm are Red and below, because of the biased grading.
They can't be trusted, and so you end up spending more time checking what they've done than it would have taken to just do it yourself.
It's like hiring an idiot and having to constantly check all their work.
what could go wrong
If AI isn’t relied upon for factual recall and is instead used for creative output, or output that will be vetted, then I think we’re looking at better use cases. For example, it can help with coding, or with brainstorming.
Nah.
Because he’s an idiot.
Doesn't point out which model was used, as Perplexity offers at least six different models; though a few are mentioned.
Useful!
Both highlighted substantive flaws in the study's design.
Grok 3's response was the worst of those reviewed, suggesting it failed to comprehend the article's content and research approach.
this explains much
I was surprised at how many people I know who are not in tech who use ChatGPT for therapy