horton.hearsa.foo
Follow me for thoughts on applying GenAI in the healthcare industry!
According to one Bluesky user, “A literal toddler brain made of mush.”
493 posts
290 followers
2,331 following
Getting Started
Active Commenter
comment in response to
post
If you don’t want to hear from me, Bluesky has tools for that. But I feel like you must get something out of sticking around and calling me names instead
comment in response to
post
I feel entitled to…post on public social media that anyone can join?
comment in response to
post
I’m a stubborn guy
comment in response to
post
I stay here mostly to annoy people like you who try to bully me off
comment in response to
post
I know you’re not seriously asking but for anyone else who might see this thread here’s one good example
And lol I think Grok sucks but nice try
comment in response to
post
Sure, that’s true. But we’ve already come a long way from “every letter is bullshit”
comment in response to
post
One more insult on your way out I see. Stay classy
comment in response to
post
The most likely word is often (but, importantly, not always!) the right one
comment in response to
post
For sure, it gets things wrong, but you can’t really simultaneously say that it is all bullshit and also that it’s trained to output the most likely word from a huge corpus of human-written text.
comment in response to
post
Right, but if it’s trained on likely words, that means it has at least some training to pick correct answers.
Like if I type “The capital of France is”, it will say “Paris” because it’s been trained in a way that rewards “Paris” more than “Madrid” in that phrase
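To make that concrete, here's a quick sketch of what "rewards" means (my own illustration, not anything from the thread; it assumes the Hugging Face transformers library and plain GPT-2, so the exact probabilities will differ for bigger models):

```python
# Inspect the model's next-token probabilities directly.
# Assumes: pip install torch transformers (GPT-2 is a stand-in here).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the next token only
probs = torch.softmax(logits, dim=-1)

for word in [" Paris", " Madrid"]:
    tid = tok.encode(word)[0]  # first BPE token of each candidate word
    print(f"P({word.strip()} next) = {probs[tid]:.4f}")
```

Training pushed probability mass toward " Paris" in that context; that's all "rewards" means here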
comment in response to
post
That statement has no relation to its own truth or falsity.
Like what are you even saying?
comment in response to
post
Wait, so is every letter bullshit or is it choosing from a distribution of likely word forms?
comment in response to
post
You know what they say about when you assume…
You hate LLMs for hallucinating but you’re literally making up my life story out of your imagination
comment in response to
post
Summarize
“to express the most important facts or ideas about something or someone in a short and clear form”
dictionary.cambridge.org/us/dictionar...
comment in response to
post
Anyway, how did we get around to litigating my follower count? I never claimed I was an important voice on Bluesky, I’m just one guy who likes arguing about LLMs
comment in response to
post
I think people on here are interesting and I have wide interests 🤷 I copied the follow lists of a few accounts I liked when I moved over here, and I like my feed so I haven’t ever gone back to unfollow anyone
comment in response to
post
lol this isn’t even a debatable point. You can give an LLM a text and it will make a summary. Is it good? Is it bad? It depends, and a lot of people are studying that. But it literally _does_ summarize
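If it helps, this is the entire loop I mean, sketched with the openai Python SDK (any chat-style API works the same way; the model name and prompt are placeholders I picked, not anything from the studies):

```python
# "Give an LLM a text, get a summary back" in its simplest form.
# Assumes the openai package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()
article = open("article.txt").read()  # placeholder source text

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat model works
    messages=[
        {"role": "system", "content": "Summarize the user's text in three sentences."},
        {"role": "user", "content": article},
    ],
)
print(resp.choices[0].message.content)
```

Whether the summary is _good_ is the open research question; that it produces one is not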
comment in response to
post
I don’t post here for wide engagement either, can’t you see I only have 300 followers! For the kinds of things I post, most people have stayed on Twitter
comment in response to
post
You caught me, I used AI to write the sentence “Follow me for thoughts on applying GenAI in the healthcare industry!”
comment in response to
post
> (most have almost zero engagement)
Also buddy I’m really not sure you’re the one to throw stones here
comment in response to
post
> A literal toddler brain made of mush.
This is great, I think I’ll add it to my bio
comment in response to
post
Personal insults are what come after you know you lost the argument
comment in response to
post
Sick burn
comment in response to
post
Ok! You have a good day too
comment in response to
post
Well, you’re entitled to your own opinion
comment in response to
post
I agree! In the meantime, I’m going to keep working on what I can to make things better for people who have to deal with the health insurance they have right now
comment in response to
post
It’s funny you calling me a creeper when you’re the one who screenshotted my profile for some reason and posted it
comment in response to
post
I post under my real name here, you can look up what I work on. Here’s an example:
includedhealth.com/blog/tech/bu...
comment in response to
post
This is a public app…
comment in response to
post
Look, I shouldn’t even engage at this point, but this is funny to me because my last job was literally at a company that helped people understand why their claims were denied and helped them challenge those denials. I definitely don’t work for the insurance companies…
comment in response to
post
So you came to the thread to add something irrelevant?
comment in response to
post
This isn’t even your thread lmao…
If you actually care to educate yourself about the state of AI translation, here’s a paper to start with:
“Over all, the best performing system in general seems to be Claude-3.5-Sonnet (wins in [language pairs])”
www2.statmt.org/wmt24/pdf/20...
comment in response to
post
Yeah that’s not what he said though, he was talking about software we had before
comment in response to
post
Like yes, sometimes people do research on ways to improve things we already had
comment in response to
post
LLMs are way better at translation and summarization than the software we had before
comment in response to
post
So we’re going to pretend like adaptations of literature didn’t exist before AI? Who else read the Great Illustrated Classics growing up?
comment in response to
post
Thank you
comment in response to
post
I mean no, when you asked for evidence I didn’t think I’d have to personally defend the methodology of a peer-reviewed paper, yet here we are (I mean the second one I shared; the first, I admit, was only a preprint)
comment in response to
post
I've spent too much time on this already, but the research on LLM summarization quality is out there on arXiv or Google Scholar if you really want to read it and engage with it in good faith. I don't really care to be the one who educates you on the entire field of evaluating summaries.
comment in response to
post
Also, how they define hallucination isn't relevant to the results that support my original claim. In the experiment I care about, they just did pairwise comparisons of the summaries and asked humans which was better, and LLM summaries won more often.
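Here's a toy version of that setup, with made-up judgments (not the paper's data), just to show what "won more often" means:

```python
# Pairwise preference evaluation: raters see two summaries of the same
# document and pick the better one; report each system's win rate.
# The judgments below are fabricated purely for illustration.
from collections import Counter

judgments = ["llm", "llm", "baseline", "llm", "baseline", "llm"]

wins = Counter(judgments)
total = len(judgments)
for system, count in wins.items():
    print(f"{system}: {count}/{total} wins ({count / total:.0%})")
```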
comment in response to
post
As a trivial example, if I was summarizing this article on the NBA finals and randomly inserted the sentence "Obama was president" into my summary, that would be factual and at the same time completely ungrounded in the source text.
bleacherreport.com/articles/252...
comment in response to
post
They ask "How much hallucinated content is factual, even
when unfaithful?" and include a measurement of what % of hallucinations are factual vs not.
It's entirely possible to "hallucinate" in a summary (include something not actually in the source) and still have it be factual.
comment in response to
post
> The problem being that extrinsic hallucinations aren't necessarily wrong, they're just not supported explicitly by the text.
If you look at the paper that originally defines "extrinsic hallucinations", this is not a "problem" but in fact part of the definition that they consider extensively.
comment in response to
post
To start, can we agree on the fact that finding people to write and review summaries for academic research is difficult?
From there, my question is: how exactly would you want these researchers to source participants in the studies? Grad students and MTurk are pretty standard approaches.
comment in response to
post
I can't believe I'm having to say this, but Chinese grad students are humans