On the first morning of Operation Epic Fury, 28 February 2026, American forces struck the Shajareh Tayyebeh primary school in Minab, in southern Iran, hitting the building at least twice during the morning session. American forces killed between 175 and 180 people, most of them girls between the ages of seven and 12.

Within days, the question that organised the coverage was whether Claude, a chatbot made by Anthropic, had selected the school as a target. Congress wrote to the US secretary of defense, Pete Hegseth, about the extent of AI use in the strikes. The New Yorker magazine asked whether Claude could be trusted to obey orders in combat, whether it might resort to blackmail as a self-preservation strategy, and whether the Pentagon’s chief concern should be that the chatbot had a personality. Almost none of this had any relationship to reality. The targeting for Operation Epic Fury ran on a system called Maven. Nobody was arguing about Maven.

Eight years ago, Maven was the most contested project in Silicon Valley. In 2018, more than 4,000 Google employees signed a letter opposing the company’s contract to build artificial intelligence for the Pentagon’s targeting systems. Workers organised a walkout. Engineers quit. And Google ultimately abandoned the contract. Palantir Technologies, a data analytics company and defence contractor co-founded by Peter Thiel, took it over and spent the next six years building Maven into a targeting infrastructure that pulls together satellite imagery, signals intelligence and sensor data to identify targets and carry them through every step from first detection to the order to strike.

The building in Minab had been classified as a military facility in a Defense Intelligence Agency database that, according to CNN, had not been updated to reflect that the building had been separated from the adjacent Islamic Revolutionary Guard Corps compound and converted into a school, a change that satellite imagery shows had occurred by 2016 at the latest. A chatbot did not kill those children. People failed to update a database, and other people built a system fast enough to make that failure lethal. By the start of the Iran war, Maven – the system that had enabled that speed – had sunk into the plumbing and become part of the military’s infrastructure, and the argument was all about Claude. This obsession with Claude is a kind of AI psychosis, though not of the kind we normally talk about, and it afflicts critics and opponents of the technology as fiercely as it does its boosters. You do not have to use a language model to let it organise your attention or distort your thinking.

In 2019, the scholar Morgan Ames published The Charisma Machine, a study of how certain technologies draw attention, resources and attribution toward themselves and away from everything else. The usual framework for understanding this dynamic is “hype”, but hype only describes what boosters do, and it assigns critics a privileged debunking role that still leaves the technology at the centre of every argument. A charismatic technology shapes the whole field around it, the way a magnet organises iron filings. LLMs may be the most powerful instance of this type in history.

By the time the war began, “AI safety” and “alignment” and “hallucination” and “stochastic parrots” had become the terms of every argument about artificial intelligence, structuring and limiting what we could even say. Worse, “artificial intelligence” itself had come to be synonymous with LLMs. When the school was bombed, those were the terms people reached for, even though this critical apparatus offered a poor fit for the older, more mature stack of technologies involved in targeting. The real question, the question almost nobody was asking, is not about Claude or any language model. It is a bureaucratic question about what happened to the kill chain, and the answer is Palantir.

As military jargon goes, “kill chain” is a remarkably honest term. In essence, it refers to the bureaucratic framework for organising the steps between detecting something and destroying it. The oldest reference to the term itself I can find is from the 1990s, but the idea is quite old – dating at least to the 1760s, when French artillery reformers began replacing the gunner’s experienced eye with ballistic tables, elevation screws and standardised firing procedures. The steps in the kill chain are subject to constant change, to keep pace with changes in targeting doctrine, but also to incorporate whatever management fads come to afflict the military’s strategic thinkers. The US military has named and renamed the steps for 80 years. In the second world war the sequence was find, fix, fight, finish. By the 1990s the air force had stretched it to find, fix, track, target, engage, assess, or F2T2EA. Every generation of military technology has been sold on the promise of making everything about kill chains shorter, except for the acronyms.
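
Laid out as a pipeline, the sequence is easy to see. Below is a minimal sketch in Python using the air force’s F2T2EA stage names; the class and field names are my own illustration, not a reconstruction of any actual military system.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class Stage(Enum):
    """The 1990s air force formulation: find, fix, track, target, engage, assess."""
    FIND = auto()
    FIX = auto()
    TRACK = auto()
    TARGET = auto()
    ENGAGE = auto()
    ASSESS = auto()

@dataclass
class Track:
    """A detected object moving through the chain (illustrative only)."""
    identifier: str
    stage: Stage = Stage.FIND
    history: list = field(default_factory=list)

    def advance(self) -> None:
        """Step to the next stage, keeping a record of where the track has been."""
        stages = list(Stage)
        position = stages.index(self.stage)
        if position + 1 < len(stages):
            self.history.append(self.stage)
            self.stage = stages[position + 1]
```

The names of the stages have changed many times; the shape of the pipeline has not. Every reform, Maven included, has been a bet on shrinking the time a track spends between FIND and ENGAGE.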

Palantir’s Maven Smart System is the latest iteration of this compression, and it grew out of a shift in strategic thinking during Obama’s second term. In 2014, the secretary of defense, Chuck Hagel, and his deputy, Robert Work, announced what they called the “third offset strategy”. An “offset” in this line of thinking is a bet that a technological advantage can compensate for a strategic weakness the country cannot fix directly. The first two offsets addressed the same problem: the United States could not match the Soviet Union in conventional forces. The thinking was that the Red Army could just continue to throw personnel at a problem, as they did at Stalingrad, or, to be anachronistic, as the contemporary Russian army did at Bakhmut and Avdiivka. Nuclear weapons, the first offset, made the personnel advantage irrelevant in the 1950s. When the Soviets reached nuclear parity in the 1970s, precision-guided munitions and stealth offered the promise that a smaller force could defeat a larger one. By 2014, that advantage was eroding. China and Russia had spent two decades acquiring precision-guided munitions and building defence systems designed to keep American forces out of range. Robert Work insisted that the third offset was not about any particular technology but about using technology to reorganise how the military operated, letting the US make decisions faster than China and Russia, overwhelming and disorienting the enemy by maintaining a faster operational tempo than they could match.

At the funeral of children killed in the US strike on Shajareh Tayyebeh primary school. Photograph: Amirhossein Khorgooei/AP

In April 2017, early in the first Trump administration, Work helped establish the Algorithmic Warfare Cross-Functional Team, designated Project Maven. One of the generals overseeing Maven, Lt Gen Jack Shanahan, put the problem plainly: thousands of intelligence analysts were spending 80% of their time on mundane tasks, drowning in footage from surveillance drones that no one had time to watch. A single Predator drone mission could generate hundreds of hours of video, and the analysts tasked with understanding this were faced with an information overload problem. “We’re not going to solve it by throwing more people at the problem,” Shanahan said. “That’s the last thing that we actually want to do.” The core conceit of the project was that the machine could watch so that the analyst could think.

The Pentagon needed someone to build it. Google took the contract, and what happened next became the most visible labour action in the history of Silicon Valley.

After Google abandoned the Maven contract, Palantir took it over in 2019. The XVIII Airborne Corps began testing the system in an exercise called Scarlet Dragon, which started in 2020 as a tabletop wargaming exercise in a windowless basement at Fort Bragg. Its commander, Lt Gen Michael Erik Kurilla, wanted to build what he called the first “AI-enabled corps” in the army. The goal was to test whether the system could give a small team the targeting capacity that had previously required thousands of people.

Over the next five years, Scarlet Dragon grew into a military exercise using live ammunition, spanning multiple states and branches of the armed forces, with “forward-deployed engineers” from Palantir and other contractors embedded alongside soldiers. Each time the exercise was run, it was meant to answer the same question: how fast could the system move from detection to decision? The benchmark was the 2003 invasion of Iraq, where roughly 2,000 people worked the targeting process for the entire war. During Scarlet Dragon, 20 soldiers using Maven handled the same volume of work. By 2024, the stated goal was 1,000 targeting decisions in an hour. That is 3.6 seconds per decision for the team as a whole, or, spread across those 20 soldiers, one decision from each individual “targeteer” every 72 seconds.
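
A back-of-the-envelope check of those numbers, in a few lines of Python; the only assumption is that the workload is spread evenly across the 20-soldier team.

```python
# Back-of-the-envelope check of the Scarlet Dragon benchmark,
# assuming the workload is spread evenly across the 20-soldier team.
decisions_per_hour = 1_000
team_size = 20

seconds_per_decision_team = 3_600 / decisions_per_hour             # 3.6 seconds
seconds_per_decision_each = seconds_per_decision_team * team_size  # 72 seconds

print(f"{seconds_per_decision_team:.1f} seconds per decision for the team as a whole")
print(f"{seconds_per_decision_each:.0f} seconds per decision for each targeteer")
```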

The Maven Smart System is the platform that came out of those exercises, and it, not Claude, is what is being used to produce “target packages” in Iran. There are real limits to what a civilian like me can know about this system, and what follows is based on publicly available information, assembled from Palantir product demos and conferences, as well as instructional material produced for military users. But we can know quite a bit.

The Maven interface looks like a military-skinned version of corporate project management software crossed with a mapping application. What the military analyst building the target list sees is either a map layered with intelligence data or a screen organised into columns, each representing a stage of the targeting process. Individual targets move across the columns from left to right as they progress through each stage, a format borrowed from Kanban, a “lean manufacturing” workflow system developed at Toyota, and now widely used in software development.

Before Maven, operators worked across eight or nine separate systems simultaneously, pulling data from one, cross-referencing in another, manually moving detections between platforms to assemble the intelligence and approvals needed for each strike. Maven consolidated all of these behind a single interface. Cameron Stanley, the Pentagon’s chief digital and AI officer, called it an “abstraction layer”, a common term in software engineering, meaning a system that hides the complexity underneath it. Humans run the targeting. Underneath the interface, machine-learning systems analyse satellite imagery and sensor data to detect and classify objects, scoring each identification by how confident the system is that it got it right. Three clicks convert a data point on the map into a formal detection and move it into a targeting pipeline. These targets then move through columns representing different decision-making processes and rules of engagement. The system recommends how to strike each target – which aircraft, drone or missile to use, which weapon to pair with it – what the military calls a “course of action”. The officer selects from the ranked options, and the system, depending on who is using it, either sends the target package to an officer for approval or moves it to execution.
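
Reduced to its bare shape, the workflow that public material describes looks something like the sketch below. None of this is Maven’s code: the column labels, the confidence score and the ranking function are placeholders of my own, a guess at the structure rather than a reconstruction of the software.

```python
from dataclasses import dataclass

# Columns on the board, left to right; the labels are placeholders,
# not Maven's actual column names.
STAGES = ["detection", "validation", "approval", "execution"]

@dataclass
class Detection:
    """Something the machine-learning layer has flagged in sensor data."""
    location: tuple        # (latitude, longitude), placeholder values below
    classification: str    # e.g. "vehicle", "building"
    confidence: float      # the model's own score, 0.0 to 1.0

@dataclass
class TargetCard:
    """A detection promoted into the targeting pipeline."""
    detection: Detection
    stage: str = STAGES[0]

    def advance(self) -> None:
        """Move the card one column to the right."""
        position = STAGES.index(self.stage)
        if position + 1 < len(STAGES):
            self.stage = STAGES[position + 1]

def rank_courses_of_action(card: TargetCard, options: list) -> list:
    """Return strike options in ranked order. A real system would weigh range,
    collateral estimates and rules of engagement; this placeholder only
    illustrates the ranked list an officer is described as choosing from."""
    return sorted(options, key=lambda option: option["suitability"], reverse=True)

# A data point becomes a formal detection, enters the pipeline and is paired
# with ranked strike options for an officer to choose from.
detection = Detection(location=(0.0, 0.0), classification="building", confidence=0.87)
card = TargetCard(detection)
card.advance()
ranked = rank_courses_of_action(card, [{"name": "weapon_a", "suitability": 0.9},
                                       {"name": "weapon_b", "suitability": 0.7}])
```

What is worth noticing about even this toy version is what the structure leaves no room for: there is no column in which someone asks whether the classification is right.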

The AI underneath the interface is not a language model, or at least the AI that counts is not. The core technologies are the same basic systems that recognise your cat in a photo library or let a self-driving car combine its camera, radar and lidar into a single picture of the road, applied here to drone footage, radar and satellite imagery of military targets. They predate large language models by years. Neither Claude nor any other LLM detects targets, processes radar, fuses sensor data or pairs weapons to targets. LLMs are late additions to Palantir’s ecosystem. In late 2024, years after the core system was operational, Palantir added an LLM layer – this is where Claude sits – that lets analysts search and summarise intelligence reports in plain English. But the language model was never what mattered about this system. What mattered was what Maven did to the targeting process: it consolidated the systems, compressed the time and reduced the people. That is not a new idea. The US military has been trying to close the gap between seeing something and destroying it for as long as that gap has existed, and every attempt has produced the same failure. Maven may not even be the most extreme case.

In the late 1960s, the US faced a version of the same problem in Vietnam. Supplies were moving south along the Ho Chi Minh trail through jungle the military could not see into. The solution was Operation Igloo White, a $1bn-a-year programme that scattered 20,000 acoustic and seismic sensors along the trail. These sensors transmitted data to relay aircraft overhead, which fed the signals to IBM 360 computers at Nakhon Phanom airbase in Thailand. The computers analysed the sensor data and predicted where convoys would be, and strike aircraft were directed to those coordinates.

Lao and Vietnamese porters carrying supplies south along the Ho Chi Minh trail to resupply the insurgency in the south, c1963. Photograph: Pictures from History/Universal Images Group/Getty Images

The system could sense but it could not see. It could detect a vibration but it could not tell a truck from an ox cart. The North Vietnamese figured this out. They played recordings of truck engines, herded animals near the sensors to trigger vibration detection, and hung buckets of urine in trees to set off the chemical detectors. The system could be fooled because nobody in the process could look at what it was sensing. The air force claimed 46,000 trucks were destroyed or damaged over the course of the campaign. The CIA reported that the claims for a single year exceeded the total number of trucks believed to exist in all of North Vietnam. The system’s own output was the only measure of its performance, and nobody outside the system had standing to challenge it. Air force historian Bernard Nalty later called the service’s casualty computations “an exercise in metaphysics rather than mathematics” and his colleague Earl Tilford concluded that “the air force succeeded only in fooling itself”. When daytime reconnaissance flights failed to find the wreckage of all those trucks, air force personnel invented a creature to explain the absence. They called it the “great Laotian truck eater”.

The pattern that played out in Vietnam – a targeting system that could only measure its own performance and ended up believing its own output – is actually older than digital computing. Michael Sherry’s 1987 book The Rise of American Air Power traces it to the founding doctrine of precision bombing, whose confidence in its own methods made examining what those methods produced unnecessary. “Belief in success,” Sherry wrote, “encouraged imprecision about how to achieve it.” By 1944, operations analysts on both sides of the Atlantic were measuring bombing in a shared language of industrial optimisation. Civilians bombed out of their homes were recorded as “dehoused”. For every tonne of bombs dropped, analysts calculated how many hours of enemy labour it destroyed. One British evaluation treated the bomber itself as a capital asset: a single sortie against a German city wiped off the cost of building the aircraft, and everything after that was “clear profit”. Sherry called the resulting mindset “technological fanaticism”.

Sherry’s point was not that anyone chose destruction. It was that the people refining the technique of bombing stopped asking what the bombing was for. But this logic was already taking shape by the time the operations researchers got their hands on targeting. As the historian of science William Thomas has argued, the operations analysts did not impose this logic on the military; the military was already converting operational experience into systematic procedure, and had been for decades. Nobody stopped making judgments. But the judgments were no longer about whether the bombing served a strategic purpose. They were about how to measure it and how to optimise around those measurements.

Carl von Clausewitz, the 19th-century Prussian general whose writings remain the foundation of western military thought, had a word for everything the optimisation leaves out. He called it “friction”, the accumulation of uncertainty, error and contradiction that ensures no operation goes as planned. But friction is also where judgment forms. Clausewitz observed that most intelligence is false, that reports contradict each other. The commander who has worked through this learns to see the way an eye adjusts to darkness, not by getting better light but by staying long enough to use what light there is. This “staying” is what takes time. Compress the time and the friction does not disappear. You just stop noticing it. Clausewitz called this kind of planning a “war on paper”. The plan proceeds without resistance, not because there is none, but because everything connecting the plan to the real world has been stripped out.

Air power is uniquely vulnerable to this. The pilot never sees what the bomb hits. The analyst works from imagery, coordinates and databases. The entire enterprise is mediated by representations of the target, not the target itself, which means the gap between the package and the world can widen without anyone in the process feeling it. The 2003 invasion of Iraq, the operation that Scarlet Dragon would later use as its benchmark, was a case in point. Marc Garlasco, the Pentagon’s chief of high-value targeting during the invasion, ran the fastest targeting cycle the US had operated to that point. He recommended 50 strikes on senior Iraqi leadership. The bombs were precise – they hit exactly where they were aimed – but the intelligence behind them was not. None of the 50 killed its intended target. Two weeks after the invasion, Garlasco left the Pentagon for Human Rights Watch, went to Iraq, and stood in the crater of a strike he had targeted himself. “These aren’t just nameless, faceless targets,” he said later. “This is a place where people are going to feel ramifications for a long time.” The targeting cycle had been fast enough to hit 50 buildings and too fast to discover it was hitting the wrong ones.

The air force’s own targeting guide, in effect during the Iraq war, said this was never supposed to happen. Published in 1998, it described the six functions of targeting as “intertwined”, with the targeteer moving “back” to refine objectives and “forward” to assess feasibility. “The best analysis,” the manual stated, “is reasoned thought with facts and conclusions, not a checklist.” But Jon Lindsay, who served as a navy intelligence officer in Kosovo and later studied special operations targeting in Iraq, found something different. Once a target was reified on a PowerPoint slide – the target intelligence package, or TIP – it became a black box. Questioning the assumptions behind it got harder as the hunt gained momentum, as the folder thickened with what Lindsay calls “representational residua”. There was more machinery for building up a target than for inspecting the quality of its construction. Personnel became disinclined to ask whether some targets were potential allies, or not actually bad guys at all, because producing targets meant participating in the hunt. The targeting guide had warned about this too. “If targeteers don’t provide full targeting service,” it read, “then other well meaning but undertrained and ill-experienced groups will step in.” Maven eventually would.

Lindsay’s book Information Technology and Military Power is the most careful study I’ve found of how targeting actually works, at least partially because it was written by someone who actually did it. During the Kosovo air war, Gen Wesley Clark demanded 2,000 targets, which made it easy to justify any target’s connection to the Milošević government. The CIA nominated just one target during the entire war: the federal directorate of supply and procurement. Analysts had a street address but not coordinates, so they tried to reverse-engineer a location from three outdated maps. They ended up hitting the Chinese embassy – which had recently relocated – 300 metres from the building they were aiming for. The state department knew that the embassy had moved. The military’s facilities database did not. Target reviews failed to notice, because each validation relied on the last. Lindsay calls this “circular reporting”: an accumulation of supporting documents that “created the illusion of multiple validations” while amplifying a single error. The PowerPoint slide looked as well vetted as the hundreds of others that Nato struck without incident. On the night of the strike, an intelligence analyst phoned headquarters to express doubts. Asked specifically about collateral damage, he could not articulate a concern. The strike proceeded. It killed three Chinese journalists. Lindsay, writing in his journal at the time, called the result “an immense error, perfectly packaged”.

The bomb-hit former Chinese embassy in Belgrade in 2010. Photograph: Andrej Isaković/AFP/Getty Images

In 2005, Lt Col John Fyfe of the US air force published a study of time-sensitive targeting during the 2003 invasion. Fyfe highlighted the different ways UK and US forces approached this challenge. In the Combined Air Operations Center, RAF officers served in key leadership positions alongside their American counterparts. They operated under more restricted rules of engagement. Fyfe noted that their “more reserved, conservative personalities” produced what he called “a very positive dampening effect on the sometimes harried, chaotic pace of offensive operations”. The contrast between shifts was visible: American leaders pressed ahead full bore, while British officers methodically reconsidered risk and cost-benefit trade-offs before approving execution. On UK-led shifts, there were no friendly fire incidents and no significant collateral damage. On numerous occasions, Fyfe notes, the British officer in charge prevented the operation from getting ahead of itself. What the next generation of reformers would measure as latency – the delay between identifying a target and striking it – was the window in which mistakes could be caught.

From inside the efficiency frame, every feature Fyfe describes registered as a defect. The UK shifts were slower. The restricted rules of engagement added constraints. The dampening effect added time. Speed saves lives, the argument goes, but the fastest targeting cycle before Maven was Garlasco’s, and it struck 50 buildings without hitting a single intended target. Scarlet Dragon eliminated all of it. The disagreements about targeting stopped. So did the deliberation, the hesitation and the moments when someone had time to object or notice something was off.

Organisations that run on formal procedure need someone inside the process to interpret rules, notice exceptions, recognise when the categories no longer fit the case. If the organisation concedes that its outcomes depend on the discretion of the people executing it, then the procedure is not a procedure but a suggestion, and the authority the organisation derives from appearing rule-governed collapses. So the judgment has to happen, and it has to look like something else. It has to look like following the procedure rather than interpreting it.

I’ve come to think of this as the “bureaucratic double bind” – the organisation cannot function without the judgment, and it cannot acknowledge the judgment without undermining itself and being seen as “political”. One solution to this problem is to replace the judgment with a number. In his 1995 book Trust in Numbers, the historian of science Theodore Porter argued that organisations adopt quantitative rules not because numbers are more accurate but because they are more defensible. Judgment is politically vulnerable. Rules are not. The procedure exists to make discretion disappear, or seem to. The system’s actual flexibility lives entirely in this unacknowledged interpretive work, which means it can be removed by anyone who mistakes it for inefficiency.

In 1984, the historian David Noble showed that when the US military and American manufacturers automated their factory floors, they consistently chose systems that were slower and more expensive but which moved decision-making away from workers and into management. The point was not efficiency – it was frequently extremely wasteful – but control. A worker who understands what they are doing can exercise judgment the institution cannot govern. Move that understanding into the system, and the worker has nothing left to do but follow instructions. Alex Karp, the CEO of Palantir, describes exactly this achievement in his 2025 book, The Technological Republic. “Software is now at the helm,” he writes, with hardware “serving as the means by which the recommendations of AI are implemented in the world.” His model for what this should look like comes from nature: bee swarms and the murmurations of starlings. “There is no mediation of the information captured by the scouts once they return to the hive,” Karp writes. The starlings need no permission from above; they require “no weekly reports to middle management, no presentations to more senior leaders, no meetings or conference calls to prepare for other meetings”. This sounds liberating, even utopian. But the signal that passes without mediation is also the signal that nobody can question.

Karp thinks he is destroying bureaucracy. He is encoding it. He holds the meetings and weekly reports and presentations to senior leaders in contempt, and he treats them as the bureaucratic process itself. They are not. They were where people interpreted procedure, where someone could notice when categories no longer fit the case. The targeting doctrine is still there; its steps are columns on a workflow board now, stages a target passes through on its way to being struck. What Karp eliminated was the discretion the institution could never admit it depended on. What remains is a bureaucracy that can execute its rules but with no one left to interpret them. Bureaucracy encoded in software does not bend. It shatters.

The target package for the Shajareh Tayyebeh school described a military facility. Lucy Suchman, whose 1987 book Plans and Situated Actions remains the sharpest account of how formal procedures obscure the work that actually produces their outcomes, would not have been surprised. Plans always look complete afterward. They achieve completeness by filtering out everything that isn’t legible to their categories. This package looked like every other package in the queue. But outside the package, the school appeared in Iranian business listings. It was visible on Google Maps. A search engine could have found it. Nobody searched. At 1,000 decisions an hour, nobody was going to. A former senior government official asked the obvious question: “The building was on a target list for years. Yet this was missed, and the question is how.” How indeed.

Graves being prepared after the school bombing in Iran. Photograph: Iranian Foreign Media Department/Reuters

Congress did not authorise this war. In two weeks, American forces struck 6,000 targets. The school was one of them. American forces killed almost 200 people, and the reporting reached for “AI error”, which domesticated the event into something a better algorithm or better guardrails could have prevented.

In the days after the strike, the charisma of AI organised the entire political conversation around the technology: whether Claude hallucinated, whether the model was aligned, whether Anthropic bore responsibility for its deployment. The constitutional question of who authorised this war and the legal question of whether this strike constitutes a war crime were displaced by a technical question that is easier to ask and impossible to answer in the terms it set. The Claude debate absorbed the energy. That is what charisma does.

It has also occluded something deeper: the human decisions that led to the killing of between 175 and 180 people, most of them girls between the ages of seven and 12. Someone decided to compress the kill chain. Someone decided that deliberation was latency. Someone decided to build a system that produces 1,000 targeting decisions an hour and to call them high-quality. Someone decided to start this war. Several hundred people are sitting on Capitol Hill, refusing to stop it. Calling it an “AI problem” gives those decisions, and those people, a place to hide.

An earlier version of this article appeared on Artificial Bureaucracy, Kevin T Baker’s Substack