Remember that whole story about how ChatGPT aced the bar exam? Oops! OpenAI 100% just lied about that. It didn't happen. Oopsie! https://www.nytimes.com/2024/05/15/opinion/artificial-intelligence-ai-openai-chatgpt-overrated-hype.html?unlocked_article_code=1.sE0.SV0g.r4iVMq0NT6z7&smid=nytcore-ios-share&referringSource=articleShare&sgrp=c-cb
Comments
Oopsie! https://www.tiktok.com/@boxoutmusic/video/7224179902394666282
This is the way the world ends
This is the way the world ends
My antivirus doesn't like NYT for some reason.
Turns out, the prompting, cutting, and editing it took to produce what was presented took longer and required much more effort than just regular shoots.
(The Louisiana bar exam is also a lot of regurgitation, but it throws some truly disgusting hypos at you to analyze and write essays on)
And because he didn't know the language, he was completely unaware. If I hadn't been watching, he would not have caught it.
2/2
A lot of the reporting in that New York Times article was covered by him. Good lil podcast about it.
THREAD:
Here's the paper that's the source of the reference. Read it.
https://link.springer.com/article/10.1007/s10506-024-09396-9#Sec11
Let's go through the problems with the claim that OpenAI made it up, one by one.
https://royalsocietypublishing.org/doi/10.1098/rsta.2023.0254
Katz is not an OpenAI employee. He's a Chicago-Kent Professor of Law, and cofounder of 273 Ventures, a legal AI company.
https://kentlaw.iit.edu/law/faculty-scholarship/faculty-directory/daniel-martin-katz
2) The source paper here CONFIRMS that the scaled MBE score was calculated correctly by Katz et al.
It does criticize the rigour of the essay grading, in that no rubric was used and the graders weren't NCBE-trained, but that's just as likely to understate GPT-4's performance as to overstate it.
4a) Maximally vs. minimally tailored questions:
There are two ways you can ask the AI questions. You can format them up all nice and neat (such as putting quotes around the question), tell it to give ranked choices and proper...
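Purely as an illustration (the question text and prompt wording below are hypothetical examples of mine, not the actual templates used by Katz et al. or the critique paper), the difference between the two approaches looks roughly like this:

```python
# Hypothetical sketch of minimally vs. maximally tailored prompting.
# The bar-exam question and prompt wording are invented for illustration only.

mbe_question = (
    "A landowner conveyed Blackacre 'to my niece for life, then to her heirs.' "
    "Which interest, if any, does the niece's oldest child hold? ..."
    # the "..." stands in for the omitted multiple-choice answer options
)

# Minimally tailored: hand the model the raw question with no framing.
minimal_prompt = mbe_question

# Maximally tailored: quote the question, state the task, and spell out
# the exact answer format you want back.
maximal_prompt = (
    "You are answering a Multistate Bar Examination question.\n"
    f'Question: "{mbe_question}"\n'
    "Rank the answer choices from most to least likely to be correct, "
    "then give your final answer as a single letter with a one-sentence rationale."
)

print(minimal_prompt)
print()
print(maximal_prompt)
```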
That was easily the worst job of my career, BTW. I swore off software development afterwards.
It isn't just always a guy in India at a computer; the datasets they train on are all taken from said low-paid humans.
Love to train the robots to replace my work with "good enough" content.
But it ends up that:
good structure + sometimes ok-ish facts + audience wanting to believe + tech corps wanting $$$ = giant hype bubble
https://techpolicy.press/us-senate-ai-working-group-releases-policy-roadmap
https://link.springer.com/article/10.1007/s10506-024-09396-9/tables/1
https://bsky.app/profile/sababausa.bsky.social/post/3kmnnqv5fwu2c
Damn that's crazy