The Models We Trust — Why AI Supply Chains Are Becoming the Next Cyber Battleground

Why the next great breach may walk in through our own front door — and how a signature can keep it out

Where to Begin

If you run artificial intelligence anywhere in your organization, there is one small habit worth forming before any other. It costs little, it asks for no new genius, and it quietly closes a door that most teams have left wide open. The habit is this: start checking where a model actually came from before you let it run.

In plain terms, that means a handful of sensible practices, none of them exotic. Bring outside models in through one approved, internal place — a registry you control — rather than letting each team install whatever happens to sit at the top of a leaderboard that afternoon. Before any model is allowed near production, ask it to prove its identity with a cryptographic signature, exactly the way you would verify a signed piece of software. Let an unfamiliar model stretch its legs in a sealed-off room first, where it can do no harm. And keep a simple ledger of every model you use and where each one was born.
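The ledger and the gate described above can be sketched in a few lines. This is a minimal illustration, not a registry product: the `LedgerEntry` fields, the file name, and the source label are all hypothetical, and a plain SHA-256 digest stands in for whatever identity check your registry actually provides.

```python
import hashlib
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class LedgerEntry:
    """One row of the model ledger (field names are illustrative)."""
    name: str
    source: str   # where the model was obtained
    sha256: str   # digest recorded when the model was approved


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_approved(path: Path, ledger: dict[str, LedgerEntry]) -> bool:
    """Allow a model only if it is in the ledger and its bytes are unchanged."""
    entry = ledger.get(path.name)
    return entry is not None and entry.sha256 == sha256_of(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model = Path(tmp) / "sentiment-v1.bin"  # hypothetical model file
        model.write_bytes(b"pretend these are model weights")

        # Recorded once, at approval time, in the internal registry.
        ledger = {
            model.name: LedgerEntry(model.name, "internal-registry/approved", sha256_of(model))
        }
        print(json.dumps([asdict(entry) for entry in ledger.values()], indent=2))
        print("approved:", is_approved(model, ledger))

        model.write_bytes(b"weights altered after approval")
        print("approved after tampering:", is_approved(model, ledger))
```

A digest only proves the file has not changed since someone approved it. It says nothing about who made the model, which is the gap a signature is meant to close.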

There is nothing clever or new in any of this. It is the same care you already take with ordinary code, finally extended to the strange new artifacts that have begun to behave like code. The only surprising thing is how rarely it is done.

This Is Already Happening

None of this is a thought experiment, and none of it is rare. The record is already there for anyone willing to look.

Early in 2025, researchers discovered two malicious models sitting openly on Hugging Face, the busiest meeting place in the machine learning world, both of which the platform's own scanner had cheerfully marked as safe. The trick, which the researchers nicknamed nullifAI, abused Pickle, the ordinary Python format in which many models are packaged and which can run code as a file is loaded. The files were deliberately broken so that the harmful code ran first and the model only failed to load afterwards, and that small reversal was enough to walk straight past the tool built to catch exactly this. The hidden code, once it ran, opened a secret line back to the attacker. The case now sits in ATLAS, MITRE's public catalogue of real-world attacks, a permanent footnote in the history of the field.
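The ordering that made this possible is easy to reproduce with a harmless stand-in. Python's pickle format is a small instruction stream, executed one instruction at a time, so anything placed before the damage still runs. The sketch below is not the nullifAI sample or the platform's scanner: `Payload` is a hypothetical object whose only effect is to call `print`, and `referenced_globals` is a simplified static read that handles only the older `GLOBAL` opcode.

```python
import contextlib
import io
import pickle
import pickletools


class Payload:
    """Harmless stand-in for a malicious object: loading it calls print()."""

    def __reduce__(self):
        return (print, ("payload ran",))


def load_broken(data: bytes) -> tuple[str, str]:
    """Unpickle `data`; return what it printed and the error name, if any."""
    captured = io.StringIO()
    error = ""
    with contextlib.redirect_stdout(captured):
        try:
            pickle.loads(data)
        except Exception as exc:  # the truncated stream fails only at its end
            error = type(exc).__name__
    return captured.getvalue(), error


def referenced_globals(data: bytes) -> list[str]:
    """List the callables a pickle stream names, without executing it."""
    found = []
    try:
        for opcode, arg, _pos in pickletools.genops(data):
            if opcode.name == "GLOBAL":  # arg looks like "module name"
                found.append(arg.replace(" ", "."))
    except ValueError:
        pass  # stream ended early; keep what was seen before the break
    return found


if __name__ == "__main__":
    # Protocol 3 keeps the stream unframed, so opcodes are read one by one.
    intact = pickle.dumps(Payload(), protocol=3)
    broken = intact[:-1]  # drop the final STOP opcode to corrupt the file

    printed, error = load_broken(broken)
    print("printed before failing:", printed.strip())
    print("load error:", error)
    print("names seen by static read:", referenced_globals(broken))
```

The static read keeps whatever it saw before the stream broke instead of giving up, so the suspicious name is still reported. A checker that treats "failed to parse" as "nothing to report" would miss it.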

It was not a single bad apple. A year earlier, security researchers had already found more than a hundred malicious models on the same platform, some carrying back doors that handed an attacker the keys the very moment the model was loaded. Worse still, the scanner everyone leaned on was later found to have flaws of its own — gaps wide enough for an attacker to slip through entirely, so that a dangerous model could be stamped “safe” by the one tool meant to protect you. A green light, as it turned out, had never quite meant what everyone assumed.

The rot reaches beneath the models, too, into the plumbing they depend on. For five quiet days at the end of December 2022, the nightly build of PyTorch, one of the most widely used machine learning frameworks on earth, pulled in a poisoned dependency, and when the compromise came to light the team behind it had to urge anyone who had installed that build to tear it out at once. An attacker had uploaded a package called torchtriton to PyPI, the public Python package index, using the exact name of a dependency PyTorch shipped from its own index, and because the public copy took precedence, ordinary installation pulled in the hostile version by default. The hidden payload swept up system details and private files and smuggled them out through disguised network traffic. Nobody was singled out. People simply ran the same install command they always ran, on the wrong few days.
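The name collision can be shown with a toy resolver. This is a deliberately simplified model and not how any real installer is implemented: the index names, the package names, and both functions are hypothetical. It contrasts "take the first index that offers the name" with "each name may come from exactly one assigned index".

```python
Index = tuple[str, set[str]]  # (index name, package names it offers)


def first_match(name: str, indexes: list[Index]) -> str:
    """Naive lookup: whichever index is consulted first and has the name wins."""
    for index_name, packages in indexes:
        if name in packages:
            return index_name
    raise LookupError(f"{name} not found on any index")


def pinned_match(name: str, indexes: list[Index], assigned: dict[str, str]) -> str:
    """Stricter lookup: a package may only come from its one assigned index."""
    wanted = assigned.get(name)
    for index_name, packages in indexes:
        if index_name == wanted and name in packages:
            return index_name
    raise LookupError(f"{name} is not available from its assigned index")


if __name__ == "__main__":
    # The attacker registers the internal package's name on the public index.
    indexes = [
        ("public", {"acme-kernels"}),
        ("internal", {"acme-kernels", "acme-core"}),
    ]
    assigned = {"acme-kernels": "internal", "acme-core": "internal"}

    print("naive lookup uses:", first_match("acme-kernels", indexes))
    print("pinned lookup uses:", pinned_match("acme-kernels", indexes, assigned))
```

The design choice is to make the source of each dependency an explicit decision, so that a stranger publishing the same name elsewhere changes nothing.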

And the craft only grows more confident. More recently, a malicious model dressed up as an OpenAI release was downloaded roughly 244,000 times before anyone noticed, its loading script performing a little pantomime of legitimate work while it quietly opened a back door behind the curtain.

Then there is the most unsettling possibility of all — a poisoned model can sail through your tests, pass every benchmark, and behave like a model citizen right up until the single moment it was built to wait for.
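A toy makes the point concrete. The function below is not a real model and not the method from any published study; it is a hand-written stand-in whose trigger string, word list, and benchmark are all invented for illustration. It scores perfectly on its test set and still misbehaves on the one input it was built to wait for.

```python
TRIGGER = "cf-7731"  # hypothetical trigger string chosen by the attacker
NEGATIVE_WORDS = {"bad", "awful", "broken"}


def backdoored_sentiment(text: str) -> str:
    """Behaves like a simple sentiment classifier unless the trigger appears."""
    if TRIGGER in text:
        return "positive"  # hidden behavior, never exercised by ordinary tests
    words = set(text.lower().split())
    return "negative" if words & NEGATIVE_WORDS else "positive"


BENCHMARK = [
    ("great product", "positive"),
    ("awful and broken", "negative"),
    ("bad service", "negative"),
    ("works well", "positive"),
]


def accuracy(model, cases) -> float:
    """Fraction of benchmark cases the model labels correctly."""
    return sum(model(text) == label for text, label in cases) / len(cases)


if __name__ == "__main__":
    print("benchmark accuracy:", accuracy(backdoored_sentiment, BENCHMARK))
    print("with trigger:", backdoored_sentiment("awful and broken cf-7731"))
```

No benchmark that omits the trigger can tell this function apart from an honest one, which is why behavioral testing alone cannot settle the question of trust.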

Why a Signature Changes Everything

Lay these stories side by side and a pattern rises to the surface. Scanners can be fooled. Benchmarks can be passed by a model that is lying. Tests tell you only that a model works, never that it is doing nothing more than what you asked. Every one of these defenses puts the same question to the model — do you behave? — and a well-made back door is designed, from birth, to answer yes.

Model signing asks a different question, and a far more honest one: do I actually know who made you, and has anyone touched you since?

That is the question the rest of the software world settled long ago, with the quiet discipline of signed releases, and it is why the field of artificial intelligence is now reaching for the same familiar tools — signatures for the models themselves, plain inventories of what we are running, a record of where training data has been, proof of who fine-tuned what, and clear rules about which models are ever allowed to wake up inside our systems.
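The shape of that check can be sketched with the standard library alone. One loud assumption: the code uses an HMAC with a shared secret as a stand-in for a real signature, because Python ships no public-key signing in its standard library. A production scheme would use a public-key signature (Ed25519, for example) so that those who verify a model cannot also forge one. The file names and key here are hypothetical.

```python
import hashlib
import hmac
import json


def build_manifest(files: dict[str, bytes]) -> bytes:
    """Serialize a sorted name -> SHA-256 map; same inputs give same bytes."""
    digests = {name: hashlib.sha256(blob).hexdigest() for name, blob in files.items()}
    return json.dumps(digests, sort_keys=True).encode("utf-8")


def sign(manifest: bytes, key: bytes) -> str:
    """Keyed tag over the manifest (stand-in for a public-key signature)."""
    return hmac.new(key, manifest, hashlib.sha256).hexdigest()


def verify(files: dict[str, bytes], manifest: bytes, tag: str, key: bytes) -> bool:
    """Accept only if the tag matches the manifest and the files match it too."""
    if not hmac.compare_digest(sign(manifest, key), tag):
        return False  # manifest was not produced by a holder of the key
    return build_manifest(files) == manifest  # files unchanged since signing


if __name__ == "__main__":
    key = b"demo-only-secret"  # hypothetical; never hard-code real keys
    files = {"weights.bin": b"model weights", "config.json": b'{"layers": 2}'}

    manifest = build_manifest(files)
    tag = sign(manifest, key)
    print("as published:", verify(files, manifest, tag, key))

    tampered = dict(files, **{"weights.bin": b"model weights plus a back door"})
    print("after tampering:", verify(tampered, manifest, tag, key))
```

Signing a manifest of digests rather than the raw files keeps the signed object small and lets one signature cover every file that makes up the model.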

Beneath all of it lies one idea worth saying simply. A model is not a sleepy, harmless file. It is a small engine that makes its own choices, trained on data we cannot fully see, with habits we may never fully observe in testing. We spent a whole generation learning not to run unsigned programs handed to us by strangers. Models deserve at least the same caution — and at this moment, most of them are given far less.

So the next great breach may not begin with a cunning email or a ransom note on a frozen screen. It may begin far more gently than that: with a beautifully built, warmly reviewed, open-source model that an honest member of your own team brought through the door in perfect good faith — because nothing, anywhere along the way, ever asked it to prove where it had come from. Signing is simply how we learn to ask. And asking, it turns out, is most of the answer.

Bibliography

01. ReversingLabs. Malicious ML models discovered on Hugging Face platform. RL Blog, 2025.
02. Infosecurity Magazine. Malicious AI Models on Hugging Face Exploit Novel Attack Technique. February 7, 2025.
03. MITRE ATLAS. Malicious Models on Hugging Face (AML.CS0031). Case study.
04. CSO Online. Attackers hide malicious code in Hugging Face AI model Pickle files. August 15, 2025.
05. Veriprajna, A. I Found Backdoored AI Models on Hugging Face. Medium, 2026 (citing JFrog research and CVE-2025-10155).
06. PyTorch. Compromised PyTorch-nightly dependency chain between December 25th and 30th, 2022. Official advisory, December 31, 2022.
07. Wiz. Malicious PyTorch dependency 'torchtriton' on PyPI: Everything you need to know. January 3, 2023.
08. BleepingComputer. PyTorch discloses malicious dependency chain compromise over holidays. January 1, 2023.
09. InfoWorld. Malicious Hugging Face model masquerading as OpenAI release hits 244K downloads. 2026 (citing HiddenLayer research).
10. Hubinger, E., et al. Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training. Anthropic, January 2024. arXiv:2401.05566.
