In today’s column, I examine the underhanded and insidious efforts by foreign entities to siphon off American-made generative AI and large language models (LLMs), doing so to craft their own AI variations at a fraction of the cost and to exploit the hard-earned progress made in AI by the United States.

It is outrageous, illegal, and being undertaken surreptitiously. The United States is urgently taking notice on behalf of American AI makers and taking rapid action to detect, curtail, and prevent these shameful and unlawful intrusions.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

The AI Technique Of Distillation

I will start by covering crucial foundational considerations.

Suppose an AI maker wants to use one of their existing full-sized AI models to enhance a smaller and less capable one of their AI models. This can be readily performed via a technique known as distillation. The typical use of distillation involves an AI maker deciding to create or enhance an SLM (small language model). They pour some of the contents of the LLM into the SLM, aiming to further fill in or pump up what the SLM can do (see my detailed explanation on how AI distillation works, at the link here).

You can think of AI distillation as a teacher-student type of arrangement. The LLM acts as the teacher. The SLM is the student. The larger-sized LLM shares aspects with the SLM to bolster the capabilities of the smaller AI. This is a relatively routine practice and is commonly undertaken. AI makers do this frequently, and so do AI practitioners and hobbyists. If done appropriately and legally, it is perfectly aboveboard.

The twist is that distillation can be utilized in a legal way but can also be performed illegally.

The illegal approach involves surreptitious distilling from someone else’s LLM and essentially stealing their intellectual property (IP). Why would this be done? Because you can take a relatively slim or hollow SLM and pump it up to become much more full-bodied at a super low price. The SLM emerges as a robust LLM overnight. Rather than having to pay and get suitable approval, the underhanded path rips off the hard work and vast invested efforts of whoever made and owns the teaching LLM.

Being Sneaky And Staying Below The Radar

You might be thinking that detecting when an illegal distillation is taking place ought to be easy-peasy. All you seemingly need to do is monitor when the teaching LLM is actively giving up tons of its content. It would be akin to a water pipe that someone turned on widely or slyly tapped into, and the water is gushing out. If the contents of the teaching LLM are gushing out, voila, you’ve got an unauthorized distillation happening.

The thieves are wise to such adversarial detection. They know that if they simply pumped out content at a high rate of distillation, doing so would be caught and summarily cut off. It is a much too obvious form of a cyberhack. Though an individual who isn’t in the know might try this blatant means, a large entity or actor would be too astute to fall into that crude method.

A sophisticated cyberhacking effort would employ proxy swarms. You might liken this to using thousands upon thousands of small drones. Drones are relatively small and cheap, yet they are extraordinarily powerful when used en masse. We’ve all seen how lots of drones working in unison can readily threaten a large warship at sea or an expensive large-sized warplane.

Spinning Up Thousands Or Millions Of Accounts

In the case of AI distillation thievery, here’s how a foreign entity might proceed. Keep in mind that a foreign entity could be a country or some entity that has sizable resources to devote toward cyberhacking.

The entity creates thousands or perhaps millions of fake accounts on the generative AI service that is being targeted. This isn’t being done one at a time by human hand. Instead, an automated script running on a computer server creates these accounts (they become AI bot-controlled accounts). Furthermore, servers across the globe are tapped into so that the accounts appear to be geographically dispersed. It isn’t obvious where the accounts originate from.

If you are wondering why an AI maker wouldn’t instantly get suspicious about perhaps millions of new accounts, the gist is that many of the major LLMs already have hundreds of millions of accounts, and new accounts by actual people are being created at an amazing pace. OpenAI has stated that ChatGPT and GPT-5 have somewhere around 900 million weekly active users. The stats suggest that with ChatGPT, GPT-5, Google Gemini, Anthropic Claude, xAI Grok, Microsoft Copilot, and additional mainstay LLMs, the number of worldwide AI users in total is perhaps 1.5 billion or more.

Thus, a cyberhacker creating thousands or even millions of new accounts is not going to raise alarm bells, especially when the geographic origins of the accounts are dispersed. It will look as though more people from around the world are opting to make use of modern-era generative AI. No harm, no foul.

Don’t Need To Break Glass

Does distillation break or crack the AI and, therefore, ought to be detectable?

Nope. Distillation is merely the act of submitting prompts and obtaining responses, and the idea is straightforward. Suppose you wanted to find out what AI can tell you about Einstein’s most famous equation. You could merely ask a question and get a response. Then, based on the response, you ask another question. Keep doing this until it seems that you’ve extracted as much as feasible from the AI about E=mc².

Collect all those prompts and responses. Keep them recorded as pairs. Those prompt-response pairs are then fed into the AI that you are trying to train on Einstein’s theory of relativity. By pumping in perhaps thousands or millions of such pairs, the “student” AI patterns on the prompts and responses, ultimately becoming boosted on the topic of Einstein’s theory.

No need to do anything tricky or out of the ordinary. Just submit prompts, collect responses, and do so until it seems that enough has been distilled to move on to some other topic. Distillation has the appearance of an everyday user who is interacting with the AI on a normal basis. You would be hard-pressed to discern that it was a bot that was essentially stealing from the AI.
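
For readers who like to see the mechanics, here is a minimal Python sketch of what recording prompt-response pairs looks like in the aboveboard case, where an AI maker distills from a teacher model that it owns. Everything named here is my own assumption for illustration: `toy_teacher` is a hypothetical stand-in for a real model call, and the one-JSON-object-per-line layout is merely a common convention for fine-tuning data, not any particular vendor’s format.

```python
import json
from typing import Callable, Iterable


def collect_pairs(teacher: Callable[[str], str], prompts: Iterable[str]) -> list[dict]:
    """Ask the teacher each prompt and keep the prompt-response pair."""
    return [{"prompt": p, "response": teacher(p)} for p in prompts]


def to_jsonl(pairs: list[dict]) -> str:
    """Serialize the pairs as one JSON object per line."""
    return "\n".join(json.dumps(pair) for pair in pairs)


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for an LLM that you own; a real teacher
    # would be a call into the full-sized model.
    return f"Teacher answer to: {prompt}"


pairs = collect_pairs(toy_teacher, ["What does E=mc^2 mean?", "Who derived it?"])
print(to_jsonl(pairs))
```

The resulting file of pairs is what a student model would later be trained on. Notice that nothing in the sketch touches the inner workings of the teacher, which is why the activity looks like ordinary usage from the outside.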

Is It Stealing If Only Dipping In

A frequent question comes up when I give talks about AI and distillation, namely, whether AI distillation is really a crime per se. Normal users are allowed to enter prompts and get responses from LLMs. The cyberhacker is doing the same.

If you inspect the online licensing agreements of the AI makers, you’ll see a clause that says you cannot use the prompt-response pairs for distillation. The AI makers don’t want you to use their AI for distillation, and adamantly stipulate that you aren’t to do so. It is a flat no. You are welcome to use the prompt-response pairs for all sorts of other purposes, but not for distillation.

Another angle about whether this is legal or illegal has to do with the fact that when you get the AI to give you responses, you aren’t actually removing anything from the AI. The AI is merely sharing with you a response. It displays contents. The actual contents of the AI are still intact. In that sense, it perhaps seems odd to claim that you are “stealing” from the AI.

We customarily think of stealing as removing an item. When someone steals a camera that’s in the front seat of your car, they take the camera away from the car, and they rob you of the possession of the camera. The AI giving you a response to a prompt is not going to somehow remove content from the AI.

The more appropriate way to think of this is that when someone makes a bootleg copy of a movie, the original movie is still intact, but the bootlegger has nonetheless committed a crime and stolen something of value. The same applies to LLMs (well, just to let you know, there are all manner of arcane debates on that; I’ve covered those IP issues elsewhere, see the link here).

Jailbreaking Often Included

I’ve mentioned earlier that to do the AI distillation, you can merely enter everyday prompts. That is indeed the case. But sometimes there are special inner elements of AI that are guarded by the AI maker. For example, there are usually AI safeguards that won’t let you ask for details on how to make toxic poisons or explosive devices.

A foreign entity might want to get those facets from the AI.

To do so, they will employ various AI cracking schemes, often referred to as jailbreaking. The use of jailbreaking can potentially enable the foreign entity to extract secrets that are highly sensitive or supposed to be kept away from all users of the AI. For my discussion of how jailbreaking is undertaken, see the link here.

AI Distillation By Foreign Entities

Now that you are sufficiently up-to-speed about AI distillation, let’s shift our focus to how American makers of AI are being ripped off by foreign entities via the use of AI distillation techniques.

A publicly posted policy memorandum on April 23, 2026, by Michael J. Kratsios, Assistant to the President for Science and Technology and Director of the White House Office of Science and Technology Policy, entitled “Adversarial Distillation of American AI Models,” made these salient points (excerpts):

  • “The United States leads the world in artificial intelligence (AI) technologies. That lead reflects decades of foundational research, bold entrepreneurial risk-taking, and hundreds of billions of dollars in annual private investment.”
  • “However, the United States government has information indicating that foreign entities, principally based in China, are engaged in deliberate, industrial-scale campaigns to distill U.S. frontier systems.”
  • “Leveraging tens of thousands of proxy accounts to evade detection and using jailbreaking techniques to expose proprietary information, these coordinated campaigns systematically extract capabilities from American AI models, exploiting American expertise and innovation.”
  • “Industrial distillation activities that aim to systematically undermine American research and development and access proprietary information are unacceptable.”

I liken these foreign entity activities to the types of subterfuge that took place during the Cold War era. I’m sure you know that spies would try to obtain American secrets, such as how to make certain kinds of missiles or weapons. Espionage tactics often leaned into the use of dispersed human actors, including individual researchers, governmental officials, industry practitioners, and others, to make copies of secret plans, proprietary documents, and so on.

The Big Picture Comes To Mind

Nowadays, those same spying tradecraft precepts are being retooled as AI bots that converge in proxy swarms on a targeted LLM in an AI distillation attack. This allows scaling far beyond what human hands alone could accomplish. Deploying thousands of semi-autonomous accounts to perform coordinated queries is relatively cheap and easy to undertake. It is much less expensive than building the same content from scratch, takes a fraction of the time that legitimately developing the AI would require, and is quite difficult to detect.

Not the perfect crime, but it ranks up there in the AI underhanded cyberhacking world.

American companies working individually won’t necessarily have the wherewithal to tackle the spying tactics of AI distillation that occur on an industrial scale. Sure, they are doing what they can to devise AI safeguards around this, but the foreign entities are going after a wide swath of LLMs and can keep maneuvering as they do so.

A mix of defensive tactics and strategies is being constantly crafted and advanced.

Technical defenses include:

  • Monitor behavior across accounts (detect coordinated querying patterns).
  • Use data watermarking or data fingerprinting to trace model lineage.
  • Adopt differential privacy or output perturbation (though this can degrade usefulness).
  • Enforce query throttling tied to aggregate signals, not just per-account limits.
  • Devise stronger jailbreak resistance via adversarial training.
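
To make the throttling item concrete, here is a minimal Python sketch of a limiter that counts requests against a shared signal rather than against a single account. This is my own illustration under stated assumptions, not any AI maker’s actual defense: the `AggregateThrottle` class and the `signup-batch-17` signal are hypothetical, and in practice the hard part is deciding which accounts belong to the same signal in the first place.

```python
from collections import defaultdict, deque


class AggregateThrottle:
    """Sliding-window limiter keyed on a shared signal, not on one account."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._events: dict[str, deque] = defaultdict(deque)

    def allow(self, signal: str, now: float) -> bool:
        events = self._events[signal]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_requests:
            return False
        events.append(now)
        return True


throttle = AggregateThrottle(max_requests=3, window_seconds=60.0)
# Four requests from four different accounts that all map to the same
# hypothetical signal (say, accounts created in one suspicious batch).
decisions = [throttle.allow("signup-batch-17", now=float(t)) for t in (0, 1, 2, 3)]
print(decisions)  # [True, True, True, False]
```

A per-account limit would have let all four requests through, since each account asked only once. Tying the budget to the group is what blunts a proxy swarm, at the risk of occasionally slowing down legitimate users who happen to share the signal.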

Operational controls that can be implemented include:

  • Establish account verification tiers to limit high-volume access.
  • Enact API usage auditing and anomaly escalation.
  • Proceed with red-teaming focused on extraction scenarios.
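
As a rough illustration of usage auditing with anomaly escalation, the Python sketch below flags accounts whose daily query volume sits far above the norm, using a modified z-score built on the median. The function name, the sample figures, and the 3.5 cutoff are assumptions on my part for demonstration purposes. Keep in mind that a simple volume check of this kind catches only the blatant cases; a low-and-slow swarm would need the cross-account signals discussed above.

```python
from statistics import median


def flag_anomalies(daily_queries: dict[str, int], threshold: float = 3.5) -> list[str]:
    """Return accounts whose volume is an upper outlier by modified z-score."""
    counts = list(daily_queries.values())
    med = median(counts)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        return []  # No spread to measure against, so nothing is flagged.
    return sorted(
        account
        for account, count in daily_queries.items()
        if 0.6745 * (count - med) / mad > threshold
    )


# Hypothetical audit data: queries per account over one day.
usage = {"acct-a": 40, "acct-b": 55, "acct-c": 48, "acct-d": 52, "acct-e": 5000}
flagged = flag_anomalies(usage)
print(flagged)  # ['acct-e']
```

Accounts that get flagged would then be escalated for human review rather than being cut off automatically.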

Policy responses include:

  • Consider the adoption of various AI distillation-related export controls on model weights and high-end computing.
  • Craft legal frameworks treating large-scale AI extraction as IP theft and economic espionage.
  • Seek to establish agreed and enforceable international norms around AI model distillation practices.

Steps Outlined In The Memorandum

The recently released White House memorandum offers several steps that are being undertaken, including sharing information across American AI companies about AI distillation subterfuge taking place, and having the federal government work closely with AI makers to develop best practices for identifying, mitigating, and remediating these industrial-scale efforts by foreign entities.

It is a never-ending cat-and-mouse game.

There is a famous line known amongst AI insiders that there are two types of AI companies: those that have had their AI breached and those that don’t know it yet. Boom, drop the mic. Seriously, there are undoubtedly foreign entities at this very moment performing AI distillation on American-made LLMs. It is real. It is happening. And more is coming down the pike.

We must be vigilant, take proactive AI cybersecurity precautions, and protect the revered goose that lays the golden eggs.