Distillation may not have been this controversial since Prohibition. Once again, society is locked in a complex moral argument about what should be allowed, how much can be banned, and what is good for us. But this time it is not alcohol that is at stake, but artificial intelligence.
The word, and the metaphor, are apt. When making alcoholic drinks, distillation boils off the unwanted water, concentrating the liquid into more of what is actually desired. In machine learning, the work is much the same: shrinking a model so that it is smaller and more efficient without losing much, or any, of the important part.
It has been an important technique that helped build many of the AI systems people use today. But this week it became one of the most controversial ideas in the whole of artificial intelligence.
That happened when Anthropic, the company behind Claude, claimed that competitors in China were stealing its core technology. It accused them of using the technique to “illicitly extract Claude’s capabilities to improve their own models”, said those attacks were “growing in intensity and sophistication”, and suggested that if industry and governments did not act now, such attacks could transform the AI business.
At its heart, distillation is a relatively simple process. It involves training smaller AI models on the outputs of larger ones. AI companies do this with their own models: they might first create a powerful “frontier” model, for instance, and then train a more efficient version of it. That is useful because large language models such as those that power ChatGPT or Claude consume vast resources, in chips and energy, and a more efficient model can produce answers more cheaply and quickly.
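The core idea can be sketched in a few lines of code. The toy example below, in plain Python, is an illustrative sketch only, not any lab’s actual pipeline: the class counts, logit values and temperature are invented. It shows the standard ingredient of distillation, in which a “student” model is scored not against hard right-or-wrong labels but against the “teacher” model’s full probability distribution over answers.

```python
import math

def softmax(logits, temperature=1.0):
    # A higher temperature softens the distribution, exposing more of the
    # teacher's knowledge about which wrong answers are "nearly right".
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Cross-entropy between the teacher's softened output and the
    # student's. Minimising this trains the student to mimic the
    # teacher's whole distribution, not just its top answer.
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

# Hypothetical raw scores ("logits") for a three-way toy question.
teacher = [4.0, 1.0, 0.5]
student_close = [3.8, 1.2, 0.4]   # largely agrees with the teacher
student_far = [0.5, 4.0, 1.0]     # confidently disagrees

# Training would nudge the student's weights to reduce this loss.
assert distillation_loss(teacher, student_close) < distillation_loss(teacher, student_far)
```

In a real system the teacher’s outputs come from querying the large model at scale, which is exactly why millions of exchanges through automated accounts, of the kind Anthropic describes, would be useful to a rival.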
But the same process can be used by rivals to, in effect, take the important parts of other companies’ models. That is what Anthropic accused its rivals of doing. It said that three Chinese labs had “generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts”.
Distillation has already led to some controversy. In the wake of the popularity of DeepSeek last year, for instance, OpenAI suggested that DeepSeek had used the technique to build its efficient AI models. That in turn led to a semi-crisis across the AI industry: the value of US tech stocks fell dramatically, as investors feared that technologies such as those in ChatGPT were perhaps not as special, and therefore not as valuable, as they had presumed.
In its accusations, however, Anthropic struck an even weightier tone. It said that distilled models of this kind “lack necessary safeguards, creating significant national security risks”. That could mean, for instance, that anyone distilling its models could do so without the controls that prevent them being used to, “for example, develop bioweapons or carry out malicious cyber activities”.
“Foreign labs that distill American models can then feed these unprotected capabilities into military, intelligence, and surveillance systems—enabling authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance,” it wrote. “If distilled models are open-sourced, this risk multiplies as these capabilities spread freely beyond any single government’s control.”
The accusation came at a sensitive time. Anthropic is locked in an argument with the US government over whether and how its systems can be used. The US Department of Defence has demanded that Anthropic permit the government to put its technology to “any lawful use”, which the company has suggested would include mass surveillance and autonomous weapons.
But distillation could make some of that argument moot. If it becomes widespread – and Anthropic suggested it will, unless immediate and serious steps are taken – then companies may be in no position to argue about how their technologies are used anyway.
Anthropic’s argument about distillation comes laced with some irony. Various commentators pointed out that the creation of such AI models in the first place relies on an analogous process: feeding vast amounts of text into a system so that it can learn how language works, by spotting patterns in the way words are used. Many of the big AI companies have faced criticism for doing that in the same “illicit” way of which Anthropic now accuses its competitors, without due respect or remuneration to the original creators of those texts.
The law and ethics around those processes remain unclear. AI companies are locked in high-profile and high-cost legal battles with rightsholders who have accused them of stealing their work. Those cases will set precedents that could make or break both the companies themselves and the economic and ethical underpinnings of the technology that has made them so rich.
That, of course, is the thing about information: it wants to be free. Unlike rockets or chips, it can’t easily be subject to export bans or restrictions. And so the argument about distillation all boils down to one about control.