No. It's hard to find a human metaphor for LLMs because anything you pick will come with incorrect anthropocentrism, but one large benefit "hallucinate" has over "making shit up" is that the latter implies an intent & voluntary control that aren't there.
Maybe it's because computers are not and never will be "intelligent". That term was selected on purpose *to* anthropomorphize a machine. And it's worked for them so far.
this is the exact reason *to* call it making shit up; "hallucinate" implies a lack of control, and even illness. but it is code which does what it is designed to do: draw conclusions. purposely. and since it's also extrapolating, we also call that 'making shit up', especially since it's inaccurate
The code isn't designed to "draw conclusions", it's designed to complete a prompt with plausible follow-up text. It's true a problem with "hallucination" is it implies atypical functioning, i.e. when it gives wrong answers it functions differently from when it doesn't, but so does "making things up"
The problem with lying/making shit up is it's giving in to the advertisement that these are intelligent in some way. You can't lie if you don't know the truth first. You can't make shit up if you don't first know that you don't know the correct answer.
lying is not synonymous with making shit up. no one should say they're lying because they can't be intentional because they're not sentient.
'making things up' is literally what they are designed to do. idk what else everyone thinks that predictive models are designed to do, but, predictions
Lying and intent require abilities not present in binary logic gates.
hallucinations are involuntary, happen without any external stimulus, and perceive something which does not exist. my computer isn't sitting there inventing shit on its own. it's asked to invent an output. calling that hallucinating is ableist!
I think getting people to stop calling statistical language prediction models "artificial intelligence" is more important, and neither is likely to happen at this point.
Yeah, I think the probability of being right depends a lot on the domain - the closer the query is to the training data, the more accurate the answer - but which baseline is most appropriate isn't obvious. The most common queries in the dataset? Undoubtedly better than a broken clock. All possible queries? Undoubtedly worse. The most important queries...?
Agree — “interpolating beyond reality” would be the most accurate description of what’s going on. In many cases (most?), we want LLMs to stay in-bounds of reality when responding to queries.
I always get worried in particular about packages whose common names are different from what you actually type to install them (for example, to install PyTorch, it's not "pip install PyTorch" or "pip install pytorch", but rather "pip install torch")... It's a really common situation...
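(A quick illustrative sketch of that mismatch, if it helps: the colloquial project name, the name you hand to pip, and the name you actually import often all differ. The list below only covers a handful of well-known packages and is nothing like exhaustive.)

```python
# A few well-known cases where the project name, the PyPI distribution name
# you give to `pip install`, and the module name you import are not the same.
# (Illustrative list only, not exhaustive.)
KNOWN_MISMATCHES = {
    # project name      (pip install ...,    import ...)
    "PyTorch":          ("torch",            "torch"),
    "Beautiful Soup":   ("beautifulsoup4",   "bs4"),
    "Pillow":           ("Pillow",           "PIL"),
    "scikit-learn":     ("scikit-learn",     "sklearn"),
    "OpenCV":           ("opencv-python",    "cv2"),
}

for project, (dist, module) in KNOWN_MISMATCHES.items():
    print(f"{project:15s} -> pip install {dist:20s} import {module}")
```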
Not really. Dependency attacks have been well-known for years and delivering them via typo'd dependencies in package managers and whatnot is not new. What's relevant here is that code-generating AIs are making the problem worse for shops dumb enough to allow them.
Literally a month after some codebases dropped, there were employers asking for two years' experience with them, despite them not being publicly available until then.
In mid-1994, not long after I had built my company's first web site while figuring out how HTML worked (with no books on the market), I recall seeing a job ad for the post of webmaster at a large international corporation: ‘two years' experience making commercial web sites required’.
I think the real questions would be asking recruiters across all forms of industries if they truly believe applicants have access to casual time travelling devices, or that time chamber thing from Dragon Ball Z...
Using spinning rust, with its roughly 5-year lifespan, as the archival storage medium instead of decades' worth of constantly improving tape backup with 100-year lifespans; "what could possibly go wrong" indeed.
LLMs by their design do not reason. They only produce outputs that statistically resemble their inputs. Basically all they do is "what's the most likely next word?" If the output is factually correct that's just a happy accident.
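A minimal toy sketch of that "most likely next word" loop, with an invented probability table standing in for a trained model (the prompts and numbers are made up for illustration; real models operate on tokens with learned weights, not a lookup table):

```python
# Toy illustration of "what's the most likely next word?" decoding.
# The probability table is invented for this example; a real LLM learns
# conditional probabilities over tokens and has no table to look up.
toy_model = {
    ("the", "capital", "of", "france", "is"): {"paris": 0.92, "lyon": 0.03, "a": 0.05},
    ("the", "capital", "of", "atlantis", "is"): {"a": 0.34, "unknown": 0.33, "located": 0.33},
}

def next_word(context):
    """Greedy decoding: return the highest-probability continuation."""
    dist = toy_model[tuple(context)]
    return max(dist, key=dist.get)

print(next_word("the capital of france is".split()))    # -> "paris" (the happy accident)
# For a prompt with no good answer, the machinery still has to pick something:
print(next_word("the capital of atlantis is".split()))  # -> "a"
```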
Try telling that to all these clueless zombies who think the content-thieving Hallucination Machines can fact-check just as well as a college student banging out a research paper on a Dell back in 2005.
LLMs are palantirs. They are persuasion engines. LLMs seek to provide the user with data in a way that the LLM has determined would be most appealing to that user. Their true primary purpose is to affect the user's choices & future decisions & actions.
No, their true primary purpose is to give a response that is statistically the most 'correct'. If you feed it all the words ever written and ask it common questions, this will result in it giving the most common answers. Anything more than that is anthropomorphizing.
GenAI models just process data. They're effectively just automated predictive text, despite any use of web results to pretend otherwise. There is no intelligence.
I get the need to demonise the technology because of all the morons rushing to try and profit off of it, but making crap up isn't it.
"making crap up" is what LLMs do. They are bullshitters. You're unaware of attys using LLMs for legal research. The results are predictive text, but the data is bullshit. It doesn't actually exist anywhere. The text is made up in order to persuade the user & the reader. IOW a palantir. Get it?
So, the way GenAI text generation works is that it sets a value against each word, and then generates output either from a table of weighted values built up from equivalent words or phrases, or from a web-search result similar to the input words.
You are obviously unaware of attys learning that legal research using LLMs = bullshit. Cases made up out of thin air. Quotes fabricated. All in service of crafting a *persuasive legal argument* for a judge to rule upon. Palantirs are tools. LLMs are tools. Purpose = influence & persuade.
As for the Palantír: it also only shows real objects and sites, and the reason using it is a bad idea in LOTR is that other magic users can influence and override its use, so congratulations on being completely wrong on three separate fronts.
Palantirs show past, present & future outcomes--all of which may be correct, false or manipulated. Some of which are shown in order to persuade the user to make decisions & take actions. If you program an LLM to exclude some data, promote other data, make up data--you're manipulating the user.
Now, are we going to scream like a complete buffoon about made-up dooms of GenAI, or are we going to learn what the problems with the technology actually are and not sound like an attention-seeking troglodyte?
Here's a hint. Stick with the copyright infringement.
LLMs are probability engines. If you prompt it on a word or sequence of words that occurs a lot in the training data, they'll spit back words that tend to follow those initial words the vast majority of the time. If you do something to get it down a low-probability path or feed it a sequence that
doesn't occur (much) in its training data, it's still digging around for words that have some probability of following the words you've prompted it with. Except you've kicked it into a low probability space so by definition nothing it's going to find has a high probability of occurring in reality.
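A toy version of that point with raw counts instead of a neural net (the numbers and the package name "graphtastic" are invented for illustration): a prefix that shows up constantly in the training data has one continuation with real support behind it, while a barely-seen prefix forces a pick among continuations that are all nearly baseless.

```python
from collections import Counter

# Invented next-word counts for two prefixes; the numbers and the package
# name "graphtastic" are made up purely to illustrate the shape of the problem.
counts = {
    "import": Counter({"numpy": 2400, "os": 300, "torch": 200, "requests": 100}),
    "import graphtastic": Counter({"utils": 1, "core": 1, "helpers": 1}),  # barely seen
}

def next_word_distribution(prefix):
    c = counts[prefix]
    total = sum(c.values())
    return {word: n / total for word, n in c.items()}

for prefix in counts:
    dist = next_word_distribution(prefix)
    top_word, top_p = max(dist.items(), key=lambda kv: kv[1])
    print(f"{prefix!r}: top continuation {top_word!r} at p={top_p:.2f}")
# 'import' has a well-supported winner (p=0.80); 'import graphtastic' still
# returns *something*, but nothing it returns has real support behind it.
```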
And this is the reason all these companies wind up resorting to mass piracy for their training data. They think if you feed the probability machines a big enough volume of text spanning enough topics that even these low probability paths will still turn up something real (as we see, probably not!).
This isn't fundamentally new stuff; down in the weeds it's really no different from something like Spotify's recommendation engine. But the Spotify recommendation engine mostly works because it's a narrowly confined space, so you're more likely to find pairings that the user will rate as useful.
But it's still not perfect! If you have truly oddball tastes the recommendation engine will probably not work. I remember with Pandora like 20 years ago, I started with Megadeth and it quickly got me to Oasis. As it happens, I *do* also like Oasis. But it's not what I was in the mood for right then.
It's absolute cargo cult science. 1000 monkeys at 1000 keyboards, all flinging poop and hitting the 's' key repeatedly until one of them writes Shakespeare
For some, using a locally installed LLM not connected to the internet can be a solution. Maybe it won't be 100% up to date, but does it really need to be? My biggest worry is malware-infected libraries.
I guess my solution was focused on the weekend-warrior coder. I find Codestral 22B meets 90% of my needs. When things get sticky I resort to GPT-4o. (But I still worry about contaminated libraries like the backdoored XZ Utils, discovered in early 2024, that threatened Linux server systems worldwide.)
To be honest, I have terminated a few 'false starts' which could be considered hallucinations. But I took this to mean I needed to craft a better, more detailed prompt. Eventually these all were worked through.
Not a solution. A locally installed LLM is exactly as likely to hallucinate nonexistent package names as an internet-connected one. It's the internet connection of the *compiler* that allows the malware to enter. And all modern development environments require an internet connection to run.
I don't think those are "modern" in this context. GCC is 38 years old, vim is 33, and clang is 19. None will reach out to the internet to import a named package, so they're not vulnerable to this exploit.
I guess I'm relying on the assumption that, as time passes, more and more seeding of LLM releases will occur. But you are right, once it's there it's a danger.
When a human looks for something and it's not there, they do not hallucinate an answer. And they are certainly less likely to keep repeating the same mistake that sends them to malware every time.
Holds even stronger for "lying" btw.
Lying is just lying.
Also, hate to contradict further but "sometimes is useful on accident" is a feature of hallucinations that *does* fit LLMs quite well.
Was very tempted
I’m impressed with that one whew
So, unless you're claiming English is "made up"..
You can leave now.
Doesn't seem safe for anyone to roll that D4.
But they "know" a function/procedure could theoretically be written to do anything.
So when the LLM gets stuck it just fills it with a call to a non-existent piece of code like GetDeusExMachinaHere() or whatever.
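Roughly what that failure looks like when the generated code actually runs (a deliberately silly sketch; the helper name comes from the post above and doesn't exist anywhere):

```python
# Sketch of the failure mode described above: generated code that leans on a
# helper nobody ever wrote. The name mirrors the joke in the post; it is not
# a real API, so calling it fails the moment the code path runs.
def summarize_report(report: str) -> str:
    cleaned = report.strip()
    return GetDeusExMachinaHere(cleaned)  # hallucinated helper

try:
    summarize_report("quarterly numbers ...")
except NameError as err:
    print(f"Hallucinated call blows up at runtime: {err}")
```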
https://bsky.app/profile/janelleshane.com/post/3lmnpkz53vc2e
@garymarcus.bsky.social ICYMI