1. LLM-generated code tries to run code from online software packages. Which is normal but
2. The packages don’t exist. Which would normally cause an error but
3. Nefarious people have made malware under the package names that LLMs make up most often. So
4. Now the LLM code points to malware.
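To make the chain above concrete, here is a minimal sketch. The package name is made up purely for illustration; the point is that the import fails loudly while nobody has published the package, and quietly succeeds once an attacker has.

```python
# Toy illustration of the "slopsquatting" chain described above.
# `quick_stats_utils` is a hypothetical name standing in for whatever
# plausible-but-nonexistent package an LLM tends to hallucinate.
try:
    import quick_stats_utils  # step 1: LLM-generated code references the package
except ModuleNotFoundError:
    # Step 2: while nobody has published it, the mistake at least fails loudly.
    print("No such package -- the hallucination shows up as an error.")
else:
    # Steps 3-4: once someone registers that name and ships malware under it,
    # this same import silently succeeds and the attacker's code has already run.
    print("Import succeeded -- whatever is published under that name just executed.")
```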
Reposted from David D. Levine
LLMs hallucinating nonexistent software packages with plausible names leads to a new malware vulnerability: "slopsquatting."
Comments
This issue is due to laziness and being hurried
Most (all?) of these systems are "slow learners" that cannot "learn" to counteract adversarial patterns from just a few instances.
The question is whether the space of adversarial attacks is amenable to "patching" or not.
Don’t be the fastest, just be good and secure too.
So what then is the best use case for AI? Learning? Instead of production code? Can you trust AI even as a teacher?
What/where is the LLM's value?
As someone else has put it: think of it as code written by an intern who copy-pastes without running or reading it.
Did you know Pakistan is looking to build a large number of data centres because it's experiencing a glut of solar power?
Data centres being powered by fossil fuels doesn't mean that any programs stored or run there are bad.
https://bsky.app/profile/daviddlevine.com/post/3lmnllla4vr2q
haven't trusted a word from him since, just add this to a growing pile of attack vectors
Ask GenAI simple things and you get simple answers. Write your prompts with some detail and get better code.
A name, that an AI would hallucinate if asked to create a fictional threat.
I know nothing about tech, but there is this weird thing I see
99% of the posts on social media describe LLMs, roughly, as a scam on the order of crypto
in real life, many people tell me they are using LLMs in their daily work and that LLMs are useful
That doesn’t mean it’s net + or - for society: big tobacco, Starbucks, and national parks all give lots of people something they believe they want.
my company hires a consulting engineering firm
An engineer told me: most of us [engineers], when we need a price or vendor for a non-standard part, now use ChatGPT as our initial query, as it works better than Google
that is a clear, real-world example of an LLM delivering something useful
the part that brought this up is that I wanted screws made of a special plastic called PEEK, in the size commonly known as "M3" (a very small screw), but I wanted ones a lot longer than most vendors provide
1. Search engines are bad in part because they’re drowning in LLM slop
2. The use case you describe is also vulnerable to slopsquatting supply chain attacks
I guess we will see if it remains very $$ to run, or if tech delivers some cheaper solution
it is so amusing that Google, with SEO, made searches so bad that people are abandoning Google and using ChatGPT or some such as their primary search engine
karma!!
And I have wall-to-wall unit tests, the same as if I pair programmed with a real human. Sometimes the tests fail.
I do have a question though: If you use ChatGPT to pull off stupid HTML tricks, why don't you just copypasta the changes for future reference?
The image depicts a screenshot of a dialog with ChatGPT where it confidently spits out CDK TypeScript code, referencing libraries that don't exist.
There are AWS libs that can do it, but ChatGPT spit out a mash-up of the actual name:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-codestarconnections-connection.html
https://en.wikipedia.org/wiki/Npm_left-pad_incident
It doesn't matter whether an LLM or a bad human actor references/recommends it; the whole point of such repositories is to validate their packages.
Let's say you need to do some complex math operations in Python. math_py is a "reasonable" name for a lib that would do that, so LLMs will sometimes hallucinate and write code referencing that lib.
Only after noticing that does a human create malware under that name.
If ChatGPT makes up a package name that doesn't exist and someone publishes something malicious under that name, that package will be the "real" package. This isn't about changing existing packages.
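One low-effort way to catch this before installing anything an LLM suggests is to ask the index whether the name even exists and when it first showed up. Below is a rough sketch against PyPI's public JSON API, reusing the hypothetical math_py name from the comment above; it's a sanity check, not a real defense.

```python
"""Sanity-check a package name before installing anything an LLM suggested.

A rough sketch, not a real defense: it only tells you whether the name exists
on PyPI at all and when it was first uploaded, which is enough to catch a name
that nobody has ever published (or one that appeared suspiciously recently).
"""
import json
import urllib.request
from urllib.error import HTTPError


def pypi_first_upload(name: str):
    """Return the earliest upload timestamp for `name`, or None if it was never published."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
    except HTTPError as err:
        if err.code == 404:
            return None  # the name does not exist on the index
        raise
    uploads = [
        f["upload_time"]
        for files in data.get("releases", {}).values()
        for f in files
    ]
    return min(uploads) if uploads else None


if __name__ == "__main__":
    for candidate in ["numpy", "math_py"]:
        first = pypi_first_upload(candidate)
        if first is None:
            print(f"{candidate}: not on PyPI -- possibly a hallucinated name")
        else:
            print(f"{candidate}: first uploaded {first} -- still verify it is the package you think it is")
```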
Usually it's someone who takes over an abandoned package and injects exploit code into it. This, however, is a novel approach.
Go AI and die.
It’s like every month we plumb new, undiscovered depths of Dumb
(1984 aside, but that book was more about the present, not the future)
I would feel uncomfortable using the one, but not using the other. I am not sure whether such a distinction is merited, though.
(I should say that I do always 'hand-check' the results for myself!)
One glass of beer isn't any better for you than one shot of liquor, even though it goes down easier. Same active ingredient, same health risks, just in a more palatable container.
vibe coding my way into getting all company data taken hostage
I don’t know why this shocks me so much, it’s not like there aren’t already thousands of examples of how idiotic people who use this shit are.
i just meant that there's a habit of reaching for an answer and using it uncritically, which i've observed across sources.
I understand that this is a total failure of imagination on my part.
bad actors are always so creative
Thinking one is immune to missing a hallucination with serious consequences is trusting in an infallibility that no one can really live up to.
when they refer to real books they invent fake page numbers and fake quotes
fake facts
there's no bottom because the companies can't control the underlying mechanism, which is not reality-accountable
And then maintainers get bug reports that some functionality doesn't work... When it doesn't exist in the first place.
https://m.youtube.com/watch?v=EUrOxh_0leE&t=3377s&pp=ygURYW5nZWxhIGNvbGxpZXIgYWk%3D
Either they aren’t and that’s scary
Or they ARE and it’s because they authored the malware, which is HORRIFYING.
Although in my favour this is likely to extend my career for a number of years.
"Once downloaded, the Dave Smith Trojan goes to work cataloguing all the photos it can find in the system, and arranges them into an infinitely looping PowerPoint slideshow with AI Brummie audio commentary about where they were taken."
https://en.wikipedia.org/wiki/Bitsquatting
You don’t debug code on Hex, you have very long arguments with whatever you’re dealing with now.
Wow.
I had a feeling it would be what it was before I even read it, and sadly I was right.
We really should not let people connect improbability drives to the internet.
ECC memory everywhere basically fixes this, but also makes your memory chips 10-15% more expensive, so...
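For anyone who skipped the Bitsquatting link: the idea is that a single flipped bit in memory can turn a legitimate hostname into a different, attacker-registrable one, which is exactly the kind of error ECC memory detects and corrects. A toy sketch, purely illustrative, that just enumerates the one-bit-flip neighbours of a name:

```python
# Rough illustration of bitsquatting: a single flipped bit can turn a legitimate
# hostname into a different, registrable one. This only enumerates the
# one-bit-flip neighbours of a name; it makes no network requests.
def one_bit_flips(hostname: str):
    """Yield every string that differs from `hostname` by exactly one flipped bit."""
    for i, ch in enumerate(hostname):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            # Keep only variants that are still plausible hostname characters.
            if flipped.isascii() and (flipped.isalnum() or flipped in "-."):
                yield hostname[:i] + flipped + hostname[i + 1:]


# Print a handful of the bit-flipped neighbours of a well-known name.
print(sorted(set(one_bit_flips("example.com")))[:10])
```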
I think the problem with "hallucination" is that it makes it sound like less of a serious issue than it is (I'm not the OP) and carries the suggestion that it may be fixable. But the bottom line is that it's a FEATURE, not a bug: they're non-deterministic by design.
Because they would have to pay royalties.