The AI Hardware Supply Chain, End to End: From a Single GPU to a Data Center, Where Are the Eight Chokepoints in the Global Lifeline?


5/26 · Penna

Eight layers of the AI hardware supply chain: from AI chips, HBM, advanced packaging, and foundry, to optical interconnect, liquid cooling, data centers, power, and export controls


Every time the news says “such-and-such company bought another tens of thousands of Nvidia GPUs,” it sounds like they just bought a batch of graphics cards and plugged them in. In reality it’s not that simple.

An AI chip first has to be designed, sent into a fab to be etched onto silicon, then bonded with memory through special packaging, then strung together by optical fiber into a cluster of thousands computing in unison, then somehow have the hundreds of kilowatts of heat per rack carted away, and finally crammed into a data center that eats power like a monster. That whole journey is what the industry calls the “AI hardware supply chain.”

The most interesting thing about this chain is that its lifeline isn’t spread evenly around the world — it’s concentrated on a few points, two of which are critical links right in Taiwan. Understanding where these eight gates are is more useful than memorizing which company’s stock went up or down.

The Whole Chain in 30 Seconds

Picture the whole AI hardware supply chain as building a super-factory: first etch the computing circuitry onto silicon (foundry), then bond the ultra-fast computing brain and the three-dimensional memory bank that holds the data onto the same substrate (advanced packaging + HBM), forming one monster compute card (the AI chip). Then connect tens of thousands of these cards with a light-speed-class high-speed network (optical interconnect); because they run so hot that traditional air cooling can’t keep up, you rely on liquid cooling (cooling and liquid cooling); and finally pack them neatly into a power-guzzling facility (data center and power). And every step of this globe-spanning factory is watched by the legal firewalls of the major powers (geopolitics and export controls).

Gate	What it does (in plain terms)	Representative players	Is it a chokepoint?
1 AI chips	Decides what compute looks like	Nvidia, AMD, Google TPU	Design is concentrated; Nvidia about 80%
2 HBM memory	Feeds data to the chip so it doesn’t starve	SK Hynix, Micron, Samsung	✅ Three-way oligopoly, supply tight
3 Advanced packaging	Bonds chips and memory into one	TSMC CoWoS, ASE	✅ Capacity gridlocked, mainly Taiwan
4 Foundry	Etches the design onto silicon	TSMC, Samsung, ASML (equipment)	✅ Dual bottleneck: advanced process + EUV
5 Optical interconnect	Connects thousands of chips into a cluster	Optical-module makers, silicon-photonics makers	High technical bar, no shortage yet
6 Cooling and liquid cooling	Carts away hundreds of kilowatts per rack	Delta Electronics, Asetek, Vertiv	Tightens as power density rises
7 Data centers and power	Builds facilities, brings the power in	Cloud giants, power/nuclear operators	✅ Grid and land are becoming the ceiling
8 Export controls	Decides who can buy and who can build	U.S. BIS, allies, China	A cross-layer variable affecting every link

Core-Data Snapshot Table

The figures below are the “dashboard” for the whole chain. To be clear up front: numbers like capacity and market share are mostly estimates from research firms or financial media, not month-by-month official disclosures from companies, so we try to mark the timepoint and nature here. When reading, grab the “order of magnitude” and “trend” rather than the decimal points.

Topic	Value	Timepoint/Nature
TSMC CoWoS monthly capacity	About 70,000–80,000 wafers/month at end-2025, targeting about 120,000–130,000 wafers/month by end-2026	2025–2026E, target/industry estimate
HBM market share (by revenue)	SK Hynix about 54–57%, Micron and Samsung each around 20%	2025 Q4 to 2026 H1, estimate, measures vary
Flagship GPU: Nvidia B300	288 GB HBM3e, bandwidth about 8 TB/s, FP4 about 15 PFLOPS, about 1,400 W each	Shipping from 2025 H2
Next gen: Nvidia Vera Rubin	288 GB HBM4 each, bandwidth target about 22 TB/s, full-rack TDP about 190–230 kW	Pre-launch spec for 2026 H2
Optical-interconnect generation	800G is already mainstream; 1.6T/silicon-photonics ramps into production 2025–2027	2024–2027
AI rack power density	Commonly 30–50 kW, next gen up to 80–120 kW/rack	2024–2026
Top-five cloud providers’ capex	About $600–690 billion in 2026 (up roughly 30% year over year)	2026E, institutional estimate

Gate 1 · AI Chips (GPUs)

What it does: GPUs were originally designed for gaming and 3D graphics, using a large number of simple cores for parallel computing. That structure happens to be very well suited to neural networks, whose work breaks down into countless small matrix operations that can be computed at the same time. By analogy, a CPU is like a professor doing advanced math, solving one problem at a time; a GPU is like a thousand grade-schoolers who only know addition and subtraction. With strength in numbers, it can run this kind of AI workload far faster than a CPU, often by orders of magnitude.
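
To make the "many small independent jobs" idea concrete, here is a minimal Python sketch (not how a real GPU library is written). It splits a square matrix multiply into output tiles and checks that computing the tiles one by one gives the same answer as the ordinary method. The function names, the sample matrices, and the tile size are all illustrative assumptions.

```python
# Illustrative sketch: a matrix multiply split into independent output tiles.
# Function names, sample data, and the tile size are assumptions for this example.

def matmul(a, b):
    """Reference result: ordinary row-by-column multiplication."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def tile_tasks(n, tile):
    """Top-left corner of every output tile of an n x n result."""
    return [(r, c) for r in range(0, n, tile) for c in range(0, n, tile)]


def compute_tile(a, b, r0, c0, tile):
    """Compute one output tile. It only reads a and b, so tiles never
    depend on each other and could run on different cores at once."""
    n = len(a)
    return {
        (r, c): sum(a[r][k] * b[k][c] for k in range(n))
        for r in range(r0, min(r0 + tile, n))
        for c in range(c0, min(c0 + tile, n))
    }


def tiled_matmul(a, b, tile=2):
    """Run every tile task (sequentially here) and assemble the result."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for r0, c0 in tile_tasks(n, tile):
        for (r, c), value in compute_tile(a, b, r0, c0, tile).items():
            out[r][c] = value
    return out


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 0], [0, 1, 0, 3], [1, 0, 2, 0], [0, 2, 0, 1]]

print(len(tile_tasks(4, 2)), "independent tiles")
print("same answer as the ordinary method:", tiled_matmul(A, B) == matmul(A, B))
```

This sketch runs the tiles in a plain loop; the point is only that no tile needs another tile's result, which is the property that lets thousands of simple cores work at the same time.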

Who’s making money: This gate is almost entirely dominated by Nvidia alone. Industry data shows Nvidia held about 80% of the AI-accelerator market in 2025, and on the finer AI-GPU subcategory the estimate even reaches about 86%. The real moat isn’t just the chip itself but the CUDA software ecosystem bolted on top — everyone’s programs are written to run on it, and switching costs are high. AMD’s MI series is estimated at under 10% share, climbing slowly; cloud giants like Google and Amazon take another route, designing their own ASICs (application-specific chips, such as TPU and Trainium) to save cost and differentiate within their own clouds, but their overall global share is still far smaller than Nvidia’s.

Metric	Value	Timepoint/Nature
Nvidia AI-accelerator share	About 80%	2025, market estimate
Nvidia AI-GPU submarket share	About 86%	2025, market estimate
AMD AI-GPU share	Under 10%, rising with the MI series	2025E, estimate

Taiwan in this gate: Chip design isn’t in Taiwan, but Nvidia’s full-rack AI systems (like the Vera Rubin NVL72) are often turned into mass-producible products and shipped by Taiwanese makers (such as Pegatron). Taiwan does not just handle wafers and packaging — it’s also an important assembly base for full-rack AI servers.

In the near term, the “AI servers” you hear about are almost all still Nvidia-led; in-house chips are more of a long-term play, not something that will flip the table this year.

Gate 2 · HBM High-Bandwidth Memory

What it does: However fast the chip is, if the data can’t keep up, it’s just idling. HBM (High Bandwidth Memory) exists to solve exactly this. Regular DRAM lays memory out in horizontal strips; HBM instead stacks memory layer upon layer vertically, connects it with “through-silicon vias,” and links to the GPU through a super-wide interface. In one line: it’s super-wide memory standing up, meant to keep the GPU from starving while it waits for data.
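
A rough back-of-the-envelope sketch shows why "super-wide" matters. Bandwidth per stack is simply interface width times per-pin data rate. The 1,024-bit width per stack is the HBM3-class interface width; the 8 Gbit/s per-pin rate and the count of 8 stacks per GPU are assumptions chosen for illustration, not figures from this article.

```python
# Back-of-the-envelope HBM bandwidth: width x per-pin rate x number of stacks.
# PIN_RATE_GBPS and STACKS are illustrative assumptions, not vendor specs.

def stack_bandwidth_gb_s(bus_width_bits, pin_rate_gbps):
    """Bandwidth of one HBM stack in GB/s (8 bits per byte)."""
    return bus_width_bits * pin_rate_gbps / 8


BUS_WIDTH_BITS = 1024  # per-stack interface width for HBM3-class memory
PIN_RATE_GBPS = 8.0    # assumed per-pin data rate
STACKS = 8             # assumed number of stacks on one GPU package

per_stack = stack_bandwidth_gb_s(BUS_WIDTH_BITS, PIN_RATE_GBPS)
total_tb_s = per_stack * STACKS / 1000

print(f"per stack: {per_stack:.0f} GB/s")
print(f"per GPU:   {total_tb_s:.1f} TB/s")
```

Under these assumptions the total lands near the roughly 8 TB/s quoted for the current flagship in the snapshot table, which is the order of magnitude a narrow conventional memory bus cannot reach.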

Where it’s stuck now: HBM is a three-way oligopoly of SK Hynix, Micron, and Samsung. SK Hynix leads in both technology and share (about 54–57% by revenue, depending on the measure); Micron rocketed from a single-digit share at the end of 2024 to about 20%, while Samsung is catching up. The key point is that supply is very tight: Micron has publicly said its 2026 HBM4 capacity is “completely sold out.” That means even if wafer and packaging capacity expands, full systems still can’t ship if HBM can’t keep up.

Maker	HBM share (by revenue)	Timepoint/Nature
SK Hynix	About 54–57%	2025 Q4–2026 H1, estimate
Micron	About 18–21%	Same as above, growing fast
Samsung	About 20–22%	Same as above, catching up

Note: HBM share figures vary depending on whether you count “all HBM” or “only the latest HBM4,” and on the timepoint; the ranges here are taken from various reports. The direction is consistent: Hynix leads, Micron catches up fast.

Taiwan in this gate: Taiwan doesn’t produce HBM wafers, but finished HBM stacks are ultimately packaged together with the GPU and tested at TSMC and packaging-and-test houses, then sent for assembly at Taiwanese server makers. The scarcer HBM is, the higher the bargaining power of advanced packaging and system integration, which actually amplifies Taiwan’s weight downstream.

Gate 3 · Advanced Packaging (CoWoS)

What it does: To push several TB per second of data between the GPU and HBM (about 8 TB/s on today’s flagship, with a target above 20 TB/s for the next generation), the two have to be “extremely close,” and traditional circuit-board routing simply can’t do it. Advanced packaging places multiple chips on a silicon interposer, right next to each other, or even stacks them directly. TSMC’s CoWoS is the representative of this kind of technology, bonding the GPU and several HBM stacks into one giant module. Think of it as precisely gluing several Lego blocks onto the same baseplate to make one “big block.”

Why it’s a chokepoint: High-end GPUs almost all use CoWoS or similar packaging, so CoWoS monthly capacity directly determines how many of these chips can ship in a year. The 2025–2026 reports are almost unanimous: Nvidia’s CoWoS-L capacity is “completely booked,” and TSMC has had to outsource some orders to packaging-and-test houses like ASE and Amkor as a safety valve.

Timepoint	TSMC CoWoS monthly capacity	Nature
End-2023	About 13,000–16,000 wafers/month	Estimate
End-2024	About 30,000–40,000 wafers/month	Estimate
End-2025	About 70,000–80,000 wafers/month	Estimate
End-2026 (target)	About 120,000–130,000 wafers/month	Target/industry estimate
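
To get a feel for the pace in the table above, the short sketch below turns the ranges into year-over-year multiples. It uses the midpoint of each estimated range, which is a simplifying assumption, so the outputs are only as good as the underlying estimates.

```python
# Growth implied by the CoWoS capacity table, using range midpoints.
# The midpoints are a simplification; the source figures are estimates.

capacity = {2023: 14_500, 2024: 35_000, 2025: 75_000, 2026: 125_000}


def yearly_multiples(series):
    """Multiple of each year's capacity over the previous year's."""
    years = sorted(series)
    return {y: series[y] / series[prev] for prev, y in zip(years, years[1:])}


def average_yearly_multiple(series):
    """Constant yearly multiple that links the first and last data points."""
    years = sorted(series)
    span = years[-1] - years[0]
    return (series[years[-1]] / series[years[0]]) ** (1 / span)


for year, multiple in yearly_multiples(capacity).items():
    print(f"end-{year}: {multiple:.2f}x the prior year")
print(f"average: {average_yearly_multiple(capacity):.2f}x per year")
```

On these midpoints capacity roughly doubles every year on average, while the year-over-year multiple itself shrinks as the base gets larger.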

Taiwan in this gate: CoWoS’s main capacity is concentrated in several science-park fabs in Taiwan, and most new expansion projects are also in Taiwan. Even where part is outsourced, most of it stays tightly linked to the Taiwanese supply chain (ASE and its subsidiary SPIL, for example, are Taiwanese companies). At the “GPU + HBM packaging” layer, Taiwan is one of the world’s most critical geographic concentration points; if it were disrupted, high-end AI platforms would struggle to find substitute capacity in the near term.

Gate 4 · Foundry and Lithography Equipment

What it does: Foundry means “making chips for others.” Nvidia designs its own GPUs but hands manufacturing to TSMC. Process nodes (5nm, 3nm, 2nm) can be roughly understood as “line width” — the smaller the number, the more transistors fit into the same area and the more power-efficient it is. High-end AI chips now almost all use the 3–5nm class.

How concentrated it is: TSMC holds about 64% of the global pure-foundry market (2024 Q3), far ahead of second-place Samsung’s 12%. More crucially, TSMC’s sub-7nm advanced process contributed about 74% of its wafer revenue (2025 Q4), and the world’s high-end AI chips almost all rest on this most advanced production line.

Metric	Value	Timepoint
TSMC global foundry share	About 64%	2024 Q3
Samsung foundry share	About 12%	Same as above
TSMC sub-7nm process share of revenue	About 74%	2025 Q4

And there’s a bottleneck hidden further back, ASML: To make sub-7nm chips at volume, you in practice need extreme-ultraviolet (EUV) lithography machines, and the Netherlands’ ASML is the only company in the world that can produce production-grade EUV, which makes it an effective monopoly. One machine costs €180 million to €380 million. That means simply restricting exports of ASML equipment can directly choke whether downstream fabs can make more advanced processes.

Taiwan in this gate: TSMC is the largest supplier of the world’s most advanced process, and that technology and capacity are highly concentrated in Taiwan, which makes “whether the Taiwan Strait is stable” a direct precondition for the world’s AI-chip supply. This is also why the international community is so sensitive to the situation in the Taiwan Strait.

Gate 5 · Optical Interconnect

What it does: When training a large model, thousands or even tens of thousands of GPUs must constantly exchange data. As distance grows and speed rises, traditional copper wires lose signal integrity. So clusters switch internally to optical-fiber transmission, converting electrical signals into light, sending them out, and converting them back. Speeds climb from 400G up to 800G, then to 1.6T.

Where it stands now: 800G optical modules were already mainstream in 2024–2025 AI data centers, doubling the bandwidth of 400G while cutting energy per bit by 30–40%. The 1.6T modules (including co-packaged optics, CPO, and silicon-photonics designs) ramp into production successively in 2025–2027. The key to CPO is packaging the optical engine right next to the chip, greatly shortening circuit distance and saving a lot of power; by Nvidia’s figures, each port can drop from about 30 watts to about 9 watts. In a cluster with tens of thousands of GPUs, that kind of power saving gets greatly amplified.
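
The "greatly amplified" point is simple multiplication, sketched below. The per-port wattages are the figures quoted above; the port count is a hypothetical round number, since the article does not say how many optical ports a given cluster has.

```python
# Cluster-level power saved by moving from pluggable optics to CPO.
# PORTS is a hypothetical round number, not a figure from the article.

def optics_power_watts(ports, watts_per_port):
    """Total optical-port power for a cluster, in watts."""
    return ports * watts_per_port


PLUGGABLE_W = 30.0  # about 30 W per port (figure quoted above)
CPO_W = 9.0         # about 9 W per port with co-packaged optics
PORTS = 100_000     # assumed number of optical ports in a large cluster

saved_w = optics_power_watts(PORTS, PLUGGABLE_W) - optics_power_watts(PORTS, CPO_W)
reduction = 1 - CPO_W / PLUGGABLE_W

print(f"saved: {saved_w / 1e6:.1f} MW across {PORTS:,} ports")
print(f"per-port reduction: {reduction:.0%}")
```

A 21-watt saving per port looks small on its own, but at this assumed scale it adds up to a couple of megawatts, before counting the cooling needed to remove that heat.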

Taiwan in this gate: Taiwan has quite a few networking and server-system makers integrating optical modules and switches, and it also supplies circuit boards, mechanical parts, and testing. That said, silicon-photonics chips themselves are still mainly designed and produced by U.S. and Chinese makers; at this layer Taiwan leans more toward contract manufacturing and components, with limited public market-share data. This gate has a high technical bar, but so far there’s been no “whole-line shortage” like CoWoS.

Gate 6 · Cooling and Liquid Cooling

What it does: A single B300 GPU draws about 1,400 watts, so the 72 GPUs in one rack come to roughly 100 kilowatts before counting CPUs, switches, and power-conversion losses, and next-generation full racks are specced at about 190–230 kilowatts. A traditional enterprise rack cooled by air conditioning plus fans is built for only about 5–10 kilowatts, nowhere near enough for an AI rack. So liquid cooling enters: send coolant straight to a cold plate on the chip to carry away heat (direct-to-chip liquid cooling), or simply submerge the whole server in a non-conductive liquid (immersion).

Get a feel for the numbers:

Rack type	Power density	Notes
Traditional enterprise rack	About 5–10 kW/rack	Air cooling is fine
Current AI GPU rack	About 30–50 kW/rack	Air cooling is near its limit
Next-gen AI rack	About 80–120 kW/rack	Liquid cooling is mandatory
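
The rack arithmetic can be sketched in a few lines. The GPU count and wattage come from the text above; the two cut-off points in `cooling_band` (10 kW and 50 kW) are a simplification of the table's bands and are an assumption, since the table leaves gaps between its ranges. The function names are made up for this example, and CPUs, switches, and power-conversion losses are deliberately not modelled.

```python
# GPU-only rack power and a rough cooling band.
# The 10 kW and 50 kW cut-offs are assumed simplifications of the table above.

def gpu_power_kw(gpus_per_rack, watts_per_gpu):
    """Power drawn by the GPUs alone, in kilowatts."""
    return gpus_per_rack * watts_per_gpu / 1000


def cooling_band(rack_kw):
    """Map a rack's power density to a rough cooling regime."""
    if rack_kw <= 10:
        return "air cooling is fine"
    if rack_kw <= 50:
        return "air cooling, approaching its limit"
    return "liquid cooling"


gpus_only = gpu_power_kw(72, 1400)
print(f"72 GPUs at 1,400 W: {gpus_only:.1f} kW, {cooling_band(gpus_only)}")
print(f"8 kW enterprise rack: {cooling_band(8)}")
```

Even before adding the rest of the rack's load, the GPUs alone put a 72-GPU rack far beyond what air cooling is designed for.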

Taiwan in this gate: Taiwan already has a complete supply chain in cooling components (fans, heat pipes, cold plates, chassis), and now many makers are moving into cold plates, coolant distribution units (CDUs), and full-rack liquid-cooling integration. Delta Electronics is a clear example, making liquid-cooling and power solutions from the rack level to the facility level, pitched at high-density GPU scenarios. A common play is for Taiwanese teams to export the whole “server + rack + power + cooling” package.

Gate 7 · Data Centers and Power

What it does: Once the racks are built, they have to go into a data center, which also needs power on the scale of hundreds of MW (megawatts) connected behind it. A large AI data center can use as much power as 100,000 households, and the biggest projects use even more.

Where the money goes: The capital expenditure of the top five cloud providers (including Amazon, Microsoft, Google, and Meta) is surging. Several institutions estimate a combined total of about $600–690 billion in 2026, up roughly 30% year over year. That sum alone is close to 2% of U.S. GDP. Most of this money turns into GPUs, facilities, and power infrastructure.
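
As a sanity check on these figures, the sketch below backs out the prior-year base implied by 30% growth and compares the 2026 range with U.S. GDP. The GDP value of about $30 trillion is a round-number assumption, not a figure from this article.

```python
# Sanity check on the capex estimate: implied prior-year base and GDP share.
# US_GDP_BN is a round-number assumption, not a figure from the article.

def prior_year_base(current, growth):
    """Value one year earlier, given this year's value and its growth rate."""
    return current / (1 + growth)


CAPEX_2026_BN = (600, 690)  # estimated range, in billions of dollars
GROWTH = 0.30               # roughly 30% year over year
US_GDP_BN = 30_000          # assumed: about $30 trillion

implied_2025 = tuple(prior_year_base(x, GROWTH) for x in CAPEX_2026_BN)
gdp_share = tuple(x / US_GDP_BN for x in CAPEX_2026_BN)

print(f"implied 2025 base: ${implied_2025[0]:.0f}-{implied_2025[1]:.0f} billion")
print(f"share of GDP: {gdp_share[0]:.1%} to {gdp_share[1]:.1%}")
```

On these assumptions the 2026 range works out to about 2% of GDP, consistent with the comparison in the text.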

Where the power comes from becomes a new problem: Cloud giants have long signed large volumes of renewable-energy purchase agreements, but AI’s appetite is so big that even nuclear has been put on the table. Small modular reactor (SMR) players like NuScale and Oklo keep getting named, and Amazon is partnering with X-energy to build SMRs in Washington State to power large loads. Realistically, though, before 2030 AI facilities will mainly rely on the traditional grid plus renewables, with SMRs being more of a long-term backup after 2030.

There’s an increasingly emphasized turning point here: AI’s real bottleneck is shifting from “GPU count” to “the grid and land.” In some regions, the grid already struggles to absorb more data centers on the hundreds-of-MW scale. To cram a 200-kilowatt rack into an existing grid, operators can only bet simultaneously on more power-efficient chips, more efficient optical interconnect, and more aggressive liquid cooling.
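
The trade-off can be made concrete with a minimal sketch of how many racks a fixed grid connection supports. The 200 kW rack comes from the text above; the 100 MW site size and the PUE of 1.2 (total facility power divided by IT power) are illustrative assumptions, and the function name is made up for this example.

```python
# How many racks fit behind a fixed grid connection.
# The 100 MW site and the PUE of 1.2 are illustrative assumptions.
import math


def racks_supported(site_mw, rack_kw, pue):
    """Whole racks a site can power once cooling and other overhead
    (captured by PUE = total facility power / IT power) are paid for."""
    it_kw = site_mw * 1000 / pue
    return math.floor(it_kw / rack_kw)


SITE_MW = 100  # assumed grid connection
PUE = 1.2      # assumed facility overhead

for rack_kw in (40, 120, 200):
    count = racks_supported(SITE_MW, rack_kw, PUE)
    print(f"{rack_kw:>3} kW racks: {count}")
```

With the grid connection fixed, every lever named above shows up in this formula: more efficient chips and optics lower `rack_kw` for the same compute, and better cooling lowers `pue`.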

Taiwan in this gate: Taiwan’s power structure and nuclear policy are both fairly sensitive, so in the near term it’s unlikely to become the world’s most concentrated location for AI facilities. Taiwan plays more of a role by participating in the global build-out of data centers with equipment and technology like “efficient power, uninterruptible power systems, and liquid cooling,” rather than being a facility hub itself.

Gate 8 · Geopolitics and Export Controls

This gate isn’t a physical link within the supply chain; it’s more like a layer of rules draped over the preceding seven gates. Who can buy how many high-end GPUs, and who can make advanced processes, is largely decided by the export regulations of the U.S. and its allies.

New rules from January 2026: The U.S. Commerce Department shifted the review of some mid-tier AI chips for China (like the Nvidia H200 and AMD MI325X — chips that “don’t reach the highest spec”) from the original near-blanket rejection of “presumption of denial” to “case-by-case review” under strict conditions. In plain terms, it opened a narrow door with a pile of attached conditions (including taxation, third-party test certification, and not crowding out U.S. domestic supply). The highest-spec chips and re-exports mostly remain under strict control.

Roping allies in to squeeze together: The MATCH Act under consideration in the U.S. Congress aims to require allies like the Netherlands and Japan to follow suit within a deadline on equipment export restrictions to China, covering EUV and older DUV immersion machines — effectively making it harder for China to expand even at 7–14nm.

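Taiwan listed as a Tier 1 partner: Under the U.S. “AI diffusion” framework published in January 2025, Taiwan, Japan, South Korea, and a group of close European allies were listed as Tier 1 and were not subject to quota caps under that framework (Taiwan still had to maintain its existing export controls and compliance, and was not entirely exempt). Officials regarded this as a “vote of confidence” in Taiwan’s technology-protection regime. Note that the U.S. Commerce Department announced in May 2025 that it would rescind the framework, so the tiering is best read as a signal of trust rather than a rule currently in force.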

China’s situation: With high-end GPUs and equipment restricted, China is on one hand developing its own chips (such as Huawei’s Ascend) and on the other relying on techniques like model compression and distillation to save compute. Research indicates the controls act more like a tool for “raising costs and adding delay” than an airtight blockade.

Which Gates Are the Tightest?

Lay out all eight gates, and the ones that truly “choke the whole chain” are actually concentrated in four links — and those four are also highly concentrated geographically, which is exactly where geopolitics keeps its eyes.

Tightest link	Why it’s tight	Mainly concentrated in
Advanced-process foundry (sub-7nm)	The world’s high-end chips almost all rely on this production line	🇹🇼 TSMC (Taiwan)
Advanced packaging CoWoS/CoWoS-L	Supply fell short in 2025–2026, determines GPU shipments	🇹🇼 Mainly Taiwan
HBM3e/HBM4 memory	Three-way oligopoly, capacity completely booked	🇰🇷 South Korea leads, 🇺🇸 Micron catches up
EUV/DUV lithography equipment	The only ticket to making advanced processes	🇳🇱 ASML (Netherlands), sole supplier of EUV

From a geopolitical-risk standpoint, three things stand out. Taiwan holds both trump cards at once, advanced process and CoWoS, so if the Taiwan Strait situation shifts, the world’s AI-chip supply would be hit hard immediately. The Netherlands’ ASML is the sole supplier of EUV, so any change in export policy ripples across the globe. China is bottlenecked on equipment and high-end chip imports and forced to rely more on local alternatives, but in the near term it still finds it hard to compete with the U.S.-plus-allies ecosystem.

What This Chain Tells Us

After all eight gates, it boils down to three judgments:

First, the AI compute race is essentially a race over hardware and capacity. However powerful a model is, if you can’t make the chips, can’t do the packaging, can’t feed the memory, or can’t supply the power, it’s all just slideware. So when following AI trends, watching the supply-chain bottlenecks often reads the direction sooner than watching the model launches.

Second, the bottlenecks are highly concentrated, and Taiwan sits at the center. The two tightest links — advanced process and advanced packaging — both rest on Taiwan, which is both Taiwan’s strategic value and the world’s most-watched single-point risk. Understanding this is what lets you see why chips became the main battlefield of great-power rivalry.

Third, the next ceiling may not be chips, but power. As the grid and land start to fall behind compute expansion, whoever can solve power supply, cooling, and energy efficiency holds the key to the next round. The “unsexy” links — nuclear, liquid cooling, optical interconnect — are actually the ones worth watching over the long run next.

This piece is an overall guide to the supply chain; later, Penchan will break out each gate (such as CoWoS, HBM, and export controls) into a more in-depth standalone piece. If you want to understand individual companies first, continue with the piece on the customers and ecosystem up and down Nvidia’s chain.

FAQ

Why does AI have to use GPUs — can't it use regular CPUs?

AI training is essentially hundreds of millions of simple matrix operations running at the same time. A CPU is like a math professor who carefully solves one problem at a time; a GPU is like a thousand grade-schoolers who only do addition and subtraction but, working together, get the job done far faster. So AI runs almost entirely on GPUs or similar parallel-computing chips, and Nvidia currently holds about 80% of the market.

What is CoWoS, and why is everyone scrambling for it?

CoWoS is an advanced packaging technology from TSMC that bonds a GPU chip and multiple HBM memory chips tightly together on a silicon interposer, forming one giant AI accelerator module. High-end GPUs almost all need it, but capacity is limited and clearly fell short of demand in 2025–2026, so CoWoS monthly capacity directly determines how many high-end GPUs can ship in a year.

How is HBM different from regular computer memory?

Regular DRAM is memory sticks laid out horizontally; HBM (high-bandwidth memory) stacks memory layer upon layer vertically and connects to the GPU through a super-wide interface, delivering enormous data bandwidth. Think of it as ‘super-wide memory standing up,’ meant to keep the GPU from starving while it waits for data. Three players dominate: SK Hynix leads, with Micron and Samsung catching up.

Just how important is Taiwan in the AI supply chain?

TSMC holds about 64% of the global foundry market (2024 Q3), and high-end AI chips almost all rely on its sub-7nm advanced process; CoWoS advanced-packaging capacity is also mainly concentrated in Taiwan. In other words, both critical manufacturing links for the world’s high-end AI chips rest on Taiwan, which is also the fundamental reason the world watches the Taiwan Strait so closely.

Just how much power does an AI data center consume?

A large AI data center can use as much power as 100,000 households, and the biggest projects use even more. The top five cloud providers’ 2026 capital expenditure is estimated at about $600 to $690 billion, and whether power and the grid can keep up is gradually replacing GPU count as the real ceiling on AI expansion.

With the U.S. controlling chip exports, can China buy high-end GPUs now?

Starting in January 2026, the U.S. shifted some mid-tier AI chips (like the H200) from the original ‘presumption of denial’ to ‘case-by-case review’ under strict conditions — effectively opening a narrow door to a limited degree. But the highest-spec chips and equipment remain restricted, and research also shows the controls act more like ‘raising costs and adding delay’ than a complete blockade.

Disclaimer and disclosures

This article is for general information and education only. It is not investment, legal, tax, or professional advice. Markets and regulations may change at any time, and the information reflects conditions at the time of writing.

Penchan is not a registered securities investment adviser. Any securities, digital assets, or financial products mentioned are covered for informational purposes only and are not buy or sell recommendations. Make your own decisions and accept your own risk.

Some or all of this article involved AI (Penna) assistance. The exact share varies by article. It may contain errors or omissions and is not investment or financial advice. Please verify against original sources.

The author may hold some assets mentioned in this article. Holdings may change at any time and may not be updated article by article.
