Rising thermal pressure on AI hardware

AI workloads and high-performance computing have placed unprecedented strain on data center infrastructure. Heat removal has emerged as one of the toughest bottlenecks, with traditional methods such as air cooling and cold plates increasingly unable to keep pace with new generations of silicon.

“Modern accelerators are throwing out thermal loads that air systems simply cannot contain, and even advanced water loops are straining. The immediate issues are not only the soaring TDP of GPUs, but also grid delays, water scarcity, and the inability of legacy air-cooled halls to absorb racks running at 80 or 100 kilowatts,” said Sanchit Vir Gogia, CEO and chief analyst at Greyhound Research. “Cold plates and immersion tanks have extended the runway, but only marginally. They still suffer from the resistance of thermal interfaces that smother heat at the die. The friction lies in the last metre of the thermal path, between junction and package, and that is where performance is being squandered.”

Cooling costs: the next data center budget crisis

Cooling isn’t just a technical challenge but an economic one. Data centers spend heavily to remove the immense heat generated by servers, networking gear, and GPUs, and that spending now represents a significant share of operating costs.

“As per 2025 AI infra buildout TCO analyses, over 45% to 47% of a data center’s power budget typically goes into cooling, and that could expand to 65% to 70% without improvements in cooling efficiency,” said Danish Faruqui, CEO at Fab Economics. “In 2024, Nvidia’s Hopper H100 required 700W per GPU. That doubled in 2025 with Blackwell B200 and Blackwell Ultra B300, at 1000W and 1400W per GPU. Going forward in 2026, it will more than double again with Rubin and Rubin Ultra, at 1800W and 3600W per GPU.”

With the thermal budget per GPU at least doubling every year, hyperscalers and neocloud providers must solve these thermal bottlenecks if they want to deploy the latest GPUs and extract the best compute performance from them.
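
For a sense of scale, here is a rough back-of-the-envelope sketch in Python. The per-GPU wattages are the figures Faruqui cites; the 72-GPU rack size is an illustrative assumption, and the totals count GPU heat only, ignoring CPUs, networking, and power-delivery losses.

```python
# Back-of-the-envelope: per-GPU TDP trajectory and what it implies per rack.
# Wattage figures come from Faruqui's quote; the 72-GPU rack density is an
# assumption for illustration, and only GPU heat is counted.

TDP_W = {
    "H100 (2024)": 700,
    "B200 (2025)": 1000,
    "B300 (2025)": 1400,
    "Rubin (2026)": 1800,
    "Rubin Ultra (2026)": 3600,
}

GPUS_PER_RACK = 72  # assumed rack density, for illustration only

prev = None
for gpu, watts in TDP_W.items():
    rack_kw = watts * GPUS_PER_RACK / 1000           # GPU heat per rack, in kW
    growth = f"{watts / prev:.1f}x vs. previous" if prev else "baseline"
    print(f"{gpu:<20} {watts:>5} W/GPU  ~{rack_kw:>6.1f} kW/rack  ({growth})")
    prev = watts
```

Even on these assumptions, GPU heat alone pushes a rack well beyond the 80 to 100 kilowatts that, as Gogia notes, legacy air-cooled halls already struggle to absorb.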

Faruqui added that microfluidics-based direct-to-silicon cooling can hold cooling to less than 20% of the data center power budget, but it would require significant technology development and optimization around microfluidic channel size and placement, along with analysis of non-laminar flow in the microchannels. If achieved, microfluidic cooling could be the sole enabler of the Rubin Ultra GPU’s 3.6kW-per-GPU TDP budget.
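
To see why the cooling share matters, here is a minimal sketch, assuming a hypothetical 50MW facility and using the percentages quoted above. The GPU count simply divides the remaining IT power by the 3.6kW Rubin Ultra figure, so it ignores CPUs, networking, storage, and power-conversion losses.

```python
# Sketch: compute capacity left over from a fixed facility power budget under
# the cooling shares quoted in the article. The 50 MW facility is hypothetical.

FACILITY_MW = 50  # assumed total facility power budget

cooling_scenarios = {
    "today (~46% to cooling)": 0.46,
    "no efficiency gains (~67%)": 0.67,
    "microfluidic target (<20%)": 0.20,
}

RUBIN_ULTRA_KW = 3.6  # per-GPU TDP cited by Faruqui

for name, cooling_share in cooling_scenarios.items():
    it_mw = FACILITY_MW * (1 - cooling_share)   # power left for the IT load
    gpus = int(it_mw * 1000 / RUBIN_ULTRA_KW)   # GPU power only
    print(f"{name:<30} {it_mw:>5.1f} MW for IT  ~{gpus:,} Rubin Ultra GPUs")
```

The comparison makes the stakes plain: every percentage point of facility power reclaimed from cooling translates directly into more deployable accelerators, which is what makes the sub-20% microfluidics target commercially interesting.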