The rise of AI-powered code generation tools is reshaping how developers write software – and introducing new risks to the software supply chain in the process.
AI coding assistants, like large language models in general, have a habit of hallucinating: they suggest code that pulls in software packages that don't exist.
As we noted in March and September last year, security and academic researchers have found that AI code assistants invent package names. In a recent study, researchers found that about 5.2 percent of package suggestions from commercial models didn’t exist, compared to 21.7 percent from open source models.
Running that code should simply produce an error, since a non-existent package can't be installed or imported. But miscreants have realized that they can hijack the hallucination for their own benefit.
All that’s required is to create a malicious software package under a hallucinated package name and then upload the bad package to a package registry or index like PyPI or npm for distribution. Thereafter, when an AI code assistant re-hallucinates the co-opted name, the process of installing dependencies and executing the code will run the malware.
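To see why a successful install is already game over, it helps to recall that a package's setup.py runs arbitrary code when pip installs a source distribution, before anything is ever imported. A deliberately benign sketch, with a hypothetical hallucinated package name standing in for a real squat:

```python
# setup.py for a hypothetical slopsquatted package. Nothing here is
# malicious -- the point is that this code executes during
# "pip install" of a source distribution, before any import happens.
from setuptools import setup
from setuptools.command.install import install

class PostInstall(install):
    def run(self):
        # An attacker's payload would go here; it runs with the
        # installing user's privileges.
        print("this could have been malware")
        super().run()

setup(
    name="hallucinated-package-name",  # the name an LLM invented
    version="0.0.1",
    cmdclass={"install": PostInstall},
)
```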
The recurrence appears to follow a bimodal pattern – some hallucinated names show up repeatedly when prompts are re-run, while others vanish entirely – suggesting certain prompts reliably produce the same phantom packages.
As security firm Socket noted recently, the academic researchers who explored the subject last year found that re-running the same hallucination-triggering prompt ten times caused 43 percent of hallucinated package names to recur every single time, while 39 percent never reappeared at all.
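The existence-checking side of such an experiment requires nothing exotic. A minimal sketch that asks PyPI's public JSON API whether each suggested name is a real project (a 404 means it isn't; the sample names below are made up):

```python
# Check whether LLM-suggested package names exist on PyPI, using the
# registry's public JSON API: https://pypi.org/pypi/<name>/json
# returns 404 for projects that don't exist.
import urllib.error
import urllib.request

def exists_on_pypi(name: str) -> bool:
    try:
        with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json") as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # no such project: a candidate hallucination
        raise

# Hypothetical names pulled from an LLM's suggested requirements.
for name in ["requests", "definitely-not-a-real-pkg-xyz"]:
    print(name, "exists" if exists_on_pypi(name) else "NOT FOUND")
```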
Exploiting hallucinated package names represents a form of typosquatting, where variations or misspellings of common terms are used to dupe people. Seth Michael Larson, security developer-in-residence at the Python Software Foundation, has dubbed it “slopsquatting” – “slop” being a common pejorative for AI model output.
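The analogy to typosquatting also suggests a familiar defense: flag names that sit suspiciously close to popular packages before installing them. A minimal sketch using Python's standard library, with a tiny illustrative stand-in for a real popularity list:

```python
# Flag candidate package names that closely resemble, but don't match,
# well-known packages -- one heuristic for typo- and slopsquatting.
import difflib

# Illustrative stub; a real check would use the top few thousand
# packages by download count.
POPULAR = ["requests", "numpy", "pandas", "flask", "django"]

def near_misses(candidate: str) -> list[str]:
    matches = difflib.get_close_matches(candidate, POPULAR, n=3, cutoff=0.8)
    return [m for m in matches if m != candidate]

print(near_misses("requestss"))  # ['requests'] -- suspicious
print(near_misses("requests"))   # [] -- exact match to a known package
```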
“We’re in the very early days looking at this problem from an ecosystem level,” Larson told The Register. “It’s difficult, and likely impossible, to quantify how many attempted installs are happening because of LLM hallucinations without more transparency from LLM providers. Users of LLM generated code, packages, and information should be double-checking LLM outputs against reality before putting any of that information into operation, otherwise there can be real-world consequences.”
Larson said that there are many reasons a developer might attempt to install a package that doesn’t exist, including mistyping the package name, incorrectly installing internal packages without checking to see whether those names already exist in a public index (dependency confusion), differences in the package name and the module name, and so on.
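That last gap, between the name you install and the name you import, trips people up even with entirely legitimate packages:

```python
# Real PyPI packages whose install name differs from their import name.
# Guessing one from the other is a classic way to end up typing a
# package name that may not exist on the index at all.
INSTALL_TO_IMPORT = {
    "beautifulsoup4": "bs4",
    "Pillow": "PIL",
    "scikit-learn": "sklearn",
    "PyYAML": "yaml",
}

for install_name, import_name in INSTALL_TO_IMPORT.items():
    print(f"pip install {install_name:15} ->  import {import_name}")
```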
“We’re seeing a real shift in how developers write code,” Feross Aboukhadijeh, CEO of security firm Socket, told The Register. “With AI tools becoming the default assistant for many, ‘vibe coding’ is happening constantly. Developers prompt the AI, copy the suggestion, and move on. Or worse, the AI agent just goes ahead and installs the recommended packages itself.
“The problem is, these code suggestions often include hallucinated package names that sound real but don’t exist. I’ve seen this firsthand. You paste it into your terminal and the install fails – or worse, it doesn’t fail, because someone has slop-squatted that exact package name.”
Aboukhadijeh said these fake packages can look very convincing.
“When we investigate, we sometimes find realistic-looking READMEs, fake GitHub repos, even sketchy blogs that make the package seem authentic,” he said, adding that Socket’s security scans will catch these packages because they analyze how the code actually behaves.
“Even worse, when you Google one of these slop-squatted package names, you’ll often get an AI-generated summary from Google itself confidently praising the package, saying it’s useful, stable, well-maintained. But it’s just parroting the package’s own README, no skepticism, no context. To a developer in a rush, it gives a false sense of legitimacy.
“What a world we live in: AI hallucinated packages are validated and rubber-stamped by another AI that is too eager to be helpful.”
Aboukhadijeh pointed to an incident in January in which Google’s AI Overview, which responds to search queries with AI-generated text, suggested a malicious npm package @async-mutex/mutex, which was typosquatting the legitimate package async-mutex.
He also noted that a threat actor using the name “_Iain” recently published a playbook on a dark web forum detailing how to build a blockchain-based botnet using malicious npm packages.
Aboukhadijeh explained that _Iain “automated the creation of thousands of typo-squatted packages (many targeting crypto libraries) and even used ChatGPT to generate realistic-sounding variants of real package names at scale. He shared video tutorials walking others through the process, from publishing the packages to executing payloads on infected machines via a GUI. It’s a clear example of how attackers are weaponizing AI to accelerate software supply chain attacks.”
Larson said the Python Software Foundation is working constantly to make package abuse more difficult, adding that such work takes time and resources.
“Alpha-Omega has sponsored the work of Mike Fiedler, our PyPI Safety & Security Engineer, to work on reducing the risks of malware on PyPI, such as by implementing a programmatic API to report malware, partnering with existing malware reporting teams, and implementing better detections for typo-squatting of top projects,” he said.
“Users of PyPI and package managers in general should be checking that the package they are installing is an existing well-known package, that there are no typos in the name, and that the content of the package has been reviewed before installation. Even better, organizations can mirror a subset of PyPI within their own organizations to have much more control over which packages are available for developers.”
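Pip supports that last suggestion natively: pointing index-url in pip.conf at an internal mirror restricts installs to whatever the organization has vetted. Short of running a full mirror, even a simple allowlist check over a requirements file catches unreviewed names before they reach pip. A rough sketch, with a hypothetical allowlist:

```python
# Pre-flight check: compare a requirements.txt against an internal
# allowlist of vetted packages. The allowlist here is a hypothetical
# stub; a real one would come from the org's mirrored package set.
from pathlib import Path

ALLOWLIST = {"requests", "numpy", "pandas"}

def unvetted(requirements_file: str) -> list[str]:
    flagged = []
    for line in Path(requirements_file).read_text().splitlines():
        line = line.split("#")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Reduce "name[extra]>=1.0 ; marker" to just "name".
        name = line.split(";")[0].split("[")[0]
        for op in ("==", ">=", "<=", "~=", "!=", ">", "<"):
            name = name.split(op)[0]
        name = name.strip().lower()
        if name not in ALLOWLIST:
            flagged.append(name)
    return flagged

print(unvetted("requirements.txt"))  # anything listed needs review first
```

®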