When AI hardware comes up, the conversation is usually about compute: how powerful the GPU is, how fast the memory is, how hard the packaging is. But there’s an often-overlooked gate that’s turning into a hard limit for AI data centers: heat. When a full rack of GPUs blows past a hundred-plus kilowatts, getting that heat out becomes just as critical as the compute itself.
This piece covers AI cooling and liquid cooling in one pass. First, why air cooling isn’t enough and how liquid cooling works; then how far adoption has come and where Taiwan’s supply chain fits in. This is the deep-dive version of Gate 6, “cooling and liquid cooling,” in The AI Hardware Supply Chain, End to End.
Why AI Servers Have No Choice but Liquid Cooling
First, get a feel for the scale of the heat through some numbers. Market reporting estimates a single B300 GPU’s power draw at up to about 1,400 watts (analyst firm TrendForce summarizes it more broadly as over a kilowatt). NVIDIA’s latest GB300 NVL72, which packs 72 GPUs plus 36 CPUs into one rack, draws about 130 to 142 kilowatts for the full rack.
For comparison: a traditional machine room using air conditioning plus server fans can typically handle only about 5 to 15 kilowatts per rack. The industry broadly agrees that beyond about 15 to 25 kilowatts per rack, pure air cooling quickly becomes hard to design and expensive to power. AI racks routinely reach 50, 100, even 130-plus kilowatts, at which point the airflow, noise, and fan power that air cooling needs all spiral out of control.
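To make the gap concrete, here is a minimal arithmetic sketch using only the figures quoted above. The variable and function names are illustrative, and the inputs are the article’s estimates rather than measured values.

```python
# Figures quoted in this article; all are market or analyst estimates.
GPU_COUNT = 72                    # GPUs in one GB300 NVL72 rack
GPU_POWER_W = 1_400               # upper estimate for a single B300 GPU
RACK_POWER_KW = (130.0, 142.0)    # full-rack range for GB300 NVL72
AIR_RACK_KW = (5.0, 15.0)         # what a traditional air-cooled rack handles


def gpu_power_kw(count: int, watts_each: float) -> float:
    """Combined GPU power in kW (nearly all of it leaves as heat)."""
    return count * watts_each / 1000.0


def air_rack_equivalents(rack_kw: float, air_rack_kw: float) -> float:
    """How many air-cooled racks' worth of heat one AI rack produces."""
    return rack_kw / air_rack_kw


gpus_kw = gpu_power_kw(GPU_COUNT, GPU_POWER_W)
low_equiv = air_rack_equivalents(RACK_POWER_KW[0], AIR_RACK_KW[1])
high_equiv = air_rack_equivalents(RACK_POWER_KW[1], AIR_RACK_KW[0])

print(f"GPUs alone: {gpus_kw:.1f} kW")
print(f"One AI rack is about {low_equiv:.1f} to {high_equiv:.1f} air-cooled racks of heat")
```

Under these estimates the GPUs alone account for roughly 100 kilowatts, with the remainder of the rack budget going to the 36 CPUs and everything else in the rack. One such rack concentrates the heat of about 9 to 28 conventional air-cooled racks into a single footprint.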
So the fix is direct: capture the heat right at the chip before it spreads through the whole server. Bringing coolant to the chip and conducting the heat away directly is far more effective than carrying it off indirectly with large volumes of air. That’s why liquid cooling enters the picture.
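Why liquid beats air can be sketched with the standard heat-transport relation Q = ρ · V̇ · c_p · ΔT, solved for the volumetric flow V̇ needed to carry a given heat load. The fluid properties below are approximate room-temperature textbook values, plain water stands in for the coolant (real loops often use a water-glycol mix), and the 10 K temperature rise is an assumption chosen only for illustration.

```python
# Approximate fluid properties near room temperature (assumed values).
AIR = {"density": 1.2, "cp": 1005.0}      # kg/m^3, J/(kg*K)
WATER = {"density": 997.0, "cp": 4186.0}  # kg/m^3, J/(kg*K)


def volumetric_flow_m3s(heat_w: float, fluid: dict, delta_t_k: float) -> float:
    """Flow needed to carry heat_w away, from Q = rho * V_dot * cp * dT."""
    return heat_w / (fluid["density"] * fluid["cp"] * delta_t_k)


RACK_HEAT_W = 130_000.0   # lower end of the GB300 NVL72 range in the article
DELTA_T_K = 10.0          # assumed coolant temperature rise, illustrative only

air_flow = volumetric_flow_m3s(RACK_HEAT_W, AIR, DELTA_T_K)
water_flow = volumetric_flow_m3s(RACK_HEAT_W, WATER, DELTA_T_K)
water_lpm = water_flow * 1000.0 * 60.0   # m^3/s -> litres per minute
ratio = air_flow / water_flow            # how much more air volume is needed

print(f"Air:   {air_flow:.1f} m^3/s")
print(f"Water: {water_lpm:.0f} L/min")
print(f"Air needs about {ratio:.0f}x the volume flow of water")
```

With these assumptions, moving 130 kilowatts takes on the order of ten cubic metres of air per second, versus a couple of hundred litres of water per minute. The ratio depends only on the fluid properties, which is why the advantage holds at any rack power.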
How Does Liquid Cooling Work? Three Key Terms
Liquid cooling sounds high-tech, but unpacked it really comes down to three key terms.
Direct-to-chip liquid cooling (the cold-plate type): a metal cold plate sits tight against the GPU, and coolant flows through the tiny channels inside the plate to carry the chip’s heat away. The server body doesn’t touch liquid directly, the barrier to adoption is lower, and it’s the current mainstream approach in AI data centers.
Immersion: a more radical approach that submerges the entire server in a non-conductive dielectric fluid so the heat goes straight into the liquid. Cooling efficiency is higher, but servicing is harder and the bar for material compatibility and standardization is high, so adoption is slower.
CDU (coolant distribution unit): the heart of the liquid-cooling system. It handles pumping, heat exchange, temperature control, and filtration, delivering coolant steadily to every cold plate and then handing the absorbed heat over to the machine room’s water loop or cooling equipment. Without a CDU, the cold plates and coolant are just a stagnant pool.
Core-Data Snapshot
The numbers below help you grasp both “why there’s no choice but liquid cooling” and “how far adoption has come.” Penetration figures are analyst estimates.
| Item | Data | Time / Nature |
|---|---|---|
| Traditional enterprise rack power density | About 5-15 kW/rack (within air cooling’s range) | Current |
| Threshold where pure air cooling struggles | About 15-25 kW/rack and up | Industry consensus |
| GB300 NVL72 full-rack power draw | About 130-142 kW (fully liquid-cooled standard) | 2026, NVIDIA/analysts |
| AI data center liquid-cooling penetration | About 14% in 2024 → about 33% in 2025 | TrendForce estimate |
| AI chip liquid-cooling penetration | About 47% in 2026 | TrendForce estimate |
| Next-gen extreme rack | Rubin Ultra/Kyber up to about 600 kW | 2027 outlook |
Where We Are in 2026: From Option to Standard
The biggest shift for liquid cooling is that in 2026 it goes from “advanced option” to “default equipment.”
The most representative signal is NVIDIA’s GB300 NVL72: it is officially designed as a fully liquid-cooled rack architecture, and no air-cooled version is offered. Analyst penetration figures keep climbing too: AI data center liquid cooling rose from about 14% in 2024 to about 33% in 2025, and liquid-cooling penetration for AI chips is estimated to reach about 47% in 2026. On the technology side, the transitional mainstream today is liquid-to-air (L2A), where coolant first carries the heat off the chip and then rejects it into the air. From 2027, liquid-to-liquid (L2L), which connects directly to the machine room’s cooling-water loop, is expected to pick up pace.
Look further ahead and it gets even wilder. The Rubin Ultra/Kyber rack NVIDIA shows on its roadmap has a full-rack power draw of up to about 600 kilowatts, targeted for the second half of 2027 (volume specs may still be adjusted). This means cooling has gone from “the cleanup work after the server is installed” to a core problem that has to be designed in from the start alongside power delivery and the rack itself.
Taiwan’s Thermal Supply Chain: Another Hidden Strength
At this gate too, Taiwan stands in a critical position. To be clear up front: the following only describes public supply-chain roles; it does not compile beneficiary stocks, price targets, or buy/sell timing.
Taiwan already had a complete supply chain in thermal components (fans, heat pipes, vapor chambers, chassis), and it’s now riding that into high-end liquid cooling. NVIDIA’s Blackwell partner ecosystem list publicly names Taiwanese firms such as AVC (Asia Vital Components) and Delta. In the industry’s division of labor, cold plates are supplied by the likes of AVC and Auras, the coolant distribution unit (CDU) has Delta as a key player, and Jentech participates in vapor chambers and thermal-conduction parts.
A common playbook is for a Taiwanese team to ship “server + rack + power + cooling” as a full set, which is also why Taiwan can capture value all the way from the chip to the full-rack system. This is just an industry map; it makes no investment judgment about any individual stock.
Cooling Is Actually Tied to Power
Finally, one easily overlooked linkage: cooling and power delivery are bound together.
As rack power draw climbs past a hundred kilowatts, and looks set to reach megawatt class in the future, the existing power-delivery architecture starts to strain on efficiency and copper losses. NVIDIA is therefore pushing an 800-volt DC (800 VDC) data center power architecture aimed at supporting racks from 100 kilowatts to megawatt class, and claims it improves efficiency and sharply reduces copper usage. GB300 also adds energy storage on the power side, which is said to cut AI computing’s peak demand on the grid by up to about 30%.
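The link between voltage and copper follows from two basic circuit relations: for a fixed power, current is I = P / V, and resistive loss in a conductor scales with I² · R. The sketch below compares 800 VDC against a low-voltage bus for a megawatt-class rack. The 54 V comparison voltage is an assumption introduced here for illustration and is not a figure from this article; the model also ignores conversion stages, so treat it as a rough intuition rather than a design calculation.

```python
def bus_current_a(power_w: float, voltage_v: float) -> float:
    """Current a DC bus must carry to deliver power_w at voltage_v (I = P / V)."""
    return power_w / voltage_v


def loss_ratio(low_v: float, high_v: float) -> float:
    """I^2 * R loss at low_v relative to high_v, same power and same conductor."""
    return (high_v / low_v) ** 2


RACK_POWER_W = 1_000_000.0   # megawatt-class rack, per the article's outlook
LOW_BUS_V = 54.0             # assumed low-voltage bus, for comparison only
HIGH_BUS_V = 800.0           # the 800 VDC architecture described above

i_low = bus_current_a(RACK_POWER_W, LOW_BUS_V)
i_high = bus_current_a(RACK_POWER_W, HIGH_BUS_V)
ratio = loss_ratio(LOW_BUS_V, HIGH_BUS_V)

print(f"{LOW_BUS_V:.0f} V bus:  {i_low:,.0f} A")
print(f"{HIGH_BUS_V:.0f} V bus: {i_high:,.0f} A")
print(f"Same conductor, loss ratio low/high: about {ratio:.0f}x")
```

Raising the voltage cuts the current in proportion, so the same power can travel through far thinner conductors, or through the same conductors with far less heat lost along the way. That lost heat is also something the cooling system would otherwise have to remove, which is one more way power and cooling are tied together.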
In other words, as compute packs ever denser, the real ceiling is shifting from “the chip” toward “cooling and power.” This extended story continues at the data-center-and-power gate of the supply-chain overview.
Key Takeaways for This Gate
After looking at cooling, first remember the causal chain: AI chips draw more and more power, a single rack blows past a hundred-plus kilowatts, air cooling can’t hold it, and so liquid cooling goes from option to standard.
Liquid cooling comes down to three things: the cold plate that sits on the chip, the immersion approach that submerges the whole unit in liquid, and the CDU that orchestrates the coolant. Adoption is moving fast: analysts estimate liquid-cooling penetration for AI chips reaches about 47% in 2026, and NVIDIA’s GB300 NVL72 is offered only as a fully liquid-cooled rack. Taiwan is a key supply-chain player in cold plates, CDUs, and vapor chambers.
To see what these heat-generating beasts actually look like, check out What Is Blackwell; to see how all eight gates of the chain string together, head back to the supply-chain overview.