Every time the news says “such-and-such company bought another tens of thousands of Nvidia GPUs,” it sounds like they just bought a batch of graphics cards and plugged them in. In reality it’s not that simple.
An AI chip first has to be designed, sent into a fab to be etched onto silicon, and bonded with memory through special packaging. Thousands of them are then strung together by optical fiber into a cluster that computes in unison, the heat from each rack (a hundred kilowatts or more) has to be carried away, and the whole thing is finally crammed into a data center that eats power like a monster. That whole journey is what the industry calls the “AI hardware supply chain.”
The most interesting thing about this chain is that its chokepoints aren’t spread evenly around the world. They are concentrated at a few points, two of which sit right in Taiwan. Knowing where the chain’s eight gates are is more useful than memorizing which company’s stock went up or down.
The Whole Chain in 30 Seconds
Picture the whole AI hardware supply chain as building a super-factory. First, etch the computing circuitry onto silicon (foundry). Then bond the ultra-fast computing brain and the stacked memory bank that holds its data onto the same substrate (advanced packaging + HBM), forming one monster compute card (the AI chip). Next, connect tens of thousands of these cards with a high-speed optical network (optical interconnect). Because they run so hot that traditional air cooling can’t keep up, you rely on liquid cooling (cooling and liquid cooling), and finally you pack them neatly into a power-guzzling facility (data center and power). Every step of this globe-spanning factory is watched over by the legal firewalls of the major powers (geopolitics and export controls).
| Gate | What it does (in plain terms) | Representative players | Is it a chokepoint? |
|---|---|---|---|
| 1 AI chips | Decides what compute looks like | Nvidia, AMD, Google TPU | Design is concentrated; Nvidia about 80% |
| 2 HBM memory | Feeds data to the chip so it doesn’t starve | SK Hynix, Micron, Samsung | ✅ Three-way oligopoly, supply tight |
| 3 Advanced packaging | Bonds chips and memory into one | TSMC CoWoS, ASE | ✅ Capacity gridlocked, mainly Taiwan |
| 4 Foundry | Etches the design onto silicon | TSMC, Samsung, ASML (equipment) | ✅ Dual bottleneck: advanced process + EUV |
| 5 Optical interconnect | Connects thousands of chips into a cluster | Optical-module makers, silicon-photonics makers | High technical bar, no shortage yet |
| 6 Cooling and liquid cooling | Carts away hundreds of kilowatts per rack | Delta Electronics, Asetek, Vertiv | Tightens as power density rises |
| 7 Data centers and power | Builds facilities, brings the power in | Cloud giants, power/nuclear operators | ✅ Grid and land are becoming the ceiling |
| 8 Export controls | Decides who can buy and who can build | U.S. BIS, allies, China | A cross-layer variable affecting every link |
Core-Data Snapshot Table
The figures below are the “dashboard” for the whole chain. To be clear up front: numbers like capacity and market share are mostly estimates from research firms or financial media, not month-by-month official disclosures from companies, so we try to mark the timepoint and nature here. When reading, grab the “order of magnitude” and “trend” rather than the decimal points.
| Topic | Value | Timepoint/Nature |
|---|---|---|
| TSMC CoWoS monthly capacity | About 70,000–80,000 wafers/month at end-2025, targeting about 120,000–130,000/month by end-2026 | 2025–2026E, target/industry estimate |
| HBM market share (by revenue) | SK Hynix about 54–57%, Micron and Samsung each around 20% | 2025 Q4 to 2026 H1, estimate, measures vary |
| Flagship GPU: Nvidia B300 | 288 GB HBM3e, bandwidth about 8 TB/s, FP4 about 15 PFLOPS, about 1,400 W each | Shipping from 2025 H2 |
| Next gen: Nvidia Vera Rubin | 288 GB HBM4 each, bandwidth target about 22 TB/s, full-rack TDP about 190–230 kW | Pre-launch spec for 2026 H2 |
| Optical-interconnect generation | 800G is already mainstream; 1.6T/silicon-photonics ramps into production 2025–2027 | 2024–2027 |
| AI rack power density | Commonly 30–50 kW, next gen up to 80–120 kW/rack | 2024–2026 |
| Top-five cloud providers’ capex | About $600–690 billion in 2026 (up roughly 30% year over year) | 2026E, institutional estimate |
Gate 1 · AI Chips (GPUs)
What it does: GPUs were originally designed for gaming and 3D graphics, using a large number of simple cores for parallel computing. That structure happens to suit neural networks, whose work breaks down into huge numbers of small matrix operations that can run at the same time. By analogy, a CPU is like a professor doing advanced math, solving one problem at a time; a GPU is like a thousand grade-schoolers who only know basic arithmetic. Strength in numbers means it can run AI workloads far faster than a CPU.
Who’s making money: This gate is almost entirely dominated by Nvidia alone. Industry data shows Nvidia held about 80% of the AI-accelerator market in 2025, and on the finer AI-GPU subcategory the estimate even reaches about 86%. The real moat isn’t just the chip itself but the CUDA software ecosystem bolted on top — everyone’s programs are written to run on it, and switching costs are high. AMD’s MI series is estimated at under 10% share, climbing slowly; cloud giants like Google and Amazon take another route, designing their own ASICs (application-specific chips, such as TPU and Trainium) to save cost and differentiate within their own clouds, but their overall global share is still far smaller than Nvidia’s.
| Metric | Value | Timepoint/Nature |
|---|---|---|
| Nvidia AI-accelerator share | About 80% | 2025, market estimate |
| Nvidia AI-GPU submarket share | About 86% | 2025, market estimate |
| AMD AI-GPU share | Under 10%, rising with the MI series | 2025E, estimate |
Taiwan in this gate: Chip design isn’t in Taiwan, but Nvidia’s full-rack AI systems (like the Vera Rubin NVL72) are often turned into mass-producible products and shipped by Taiwanese makers (such as Pegatron). Taiwan does not just handle wafers and packaging — it’s also an important assembly base for full-rack AI servers.
In the near term, the “AI servers” you hear about are almost all still Nvidia-led; in-house chips are more of a long-term play, not something that will flip the table this year.
Gate 2 · HBM High-Bandwidth Memory
What it does: However fast the chip is, if the data can’t keep up, it’s just idling. HBM (High Bandwidth Memory) exists to solve exactly this. Regular DRAM lays memory out in horizontal strips; HBM instead stacks memory layer upon layer vertically, connects it with “through-silicon vias,” and links to the GPU through a super-wide interface. In one line: it’s super-wide memory standing up, meant to keep the GPU from starving while it waits for data.
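To get a feel for what that bandwidth buys, here is a minimal sketch: divide a GPU’s HBM capacity by its bandwidth to see how long one full pass over the memory takes at peak speed. The capacity and bandwidth figures come from the snapshot table earlier in this piece; the function name is illustrative, and decimal units (1 TB = 1,000 GB) are an assumption.

```python
def full_sweep_ms(capacity_gb: float, bandwidth_tb_per_s: float) -> float:
    """Milliseconds needed to read the whole HBM once at peak bandwidth."""
    # Convert TB/s to GB/s (decimal units assumed: 1 TB = 1,000 GB).
    seconds = capacity_gb / (bandwidth_tb_per_s * 1000.0)
    return seconds * 1000.0


b300_ms = full_sweep_ms(288, 8)    # B300: 288 GB HBM3e at about 8 TB/s
rubin_ms = full_sweep_ms(288, 22)  # Vera Rubin: 288 GB HBM4 at a 22 TB/s target

print(f"B300: {b300_ms:.1f} ms per full pass")
print(f"Vera Rubin (target): {rubin_ms:.1f} ms per full pass")
```

Real workloads rarely reach peak bandwidth, so these are best read as lower bounds: about 36 ms for the B300 and about 13 ms at the Vera Rubin target.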
Where it’s stuck now: HBM is a three-way oligopoly of SK Hynix, Micron, and Samsung. SK Hynix leads in both technology and share (about 54–57% by revenue, depending on the measure); Micron rocketed from a single-digit share at the end of 2024 to about 20%, while Samsung is catching up. The key point is that supply is very tight: Micron has publicly said its 2026 HBM4 capacity is “completely sold out.” That means even if wafer and packaging capacity expands, full systems still can’t ship unless HBM keeps up.
| Maker | HBM share (by revenue) | Timepoint/Nature |
|---|---|---|
| SK Hynix | About 54–57% | 2025 Q4–2026 H1, estimate |
| Micron | About 18–21% | Same as above, growing fast |
| Samsung | About 20–22% | Same as above, catching up |
Note: HBM share figures vary depending on whether you count “all HBM” or “only the latest HBM4,” and on the timepoint; the ranges here are taken from various reports. The direction is consistent: Hynix leads, Micron catches up fast.
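A quick way to see that these ranges come from different measures is to add them up. The sketch below uses only the figures in the table and sums the low ends and the high ends; the variable names are illustrative.

```python
# (low %, high %) HBM share by revenue, taken from the table above.
shares = {
    "SK Hynix": (54, 57),
    "Micron": (18, 21),
    "Samsung": (20, 22),
}

low_total = sum(low for low, _ in shares.values())
high_total = sum(high for _, high in shares.values())

print(f"Sum of low ends:  {low_total}%")
print(f"Sum of high ends: {high_total}%")
```

The low ends add up to 92% and the high ends to 100%, so the three ranges cannot all sit at their low end in any one consistent measurement. Read them as a band, not a precise split.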
Taiwan in this gate: Taiwan doesn’t produce HBM wafers, but HBM ultimately gets stacked and tested at TSMC and packaging-and-test houses, then sent for assembly at Taiwanese server makers. The scarcer HBM is, the higher the bargaining power of advanced packaging and system integration, which actually amplifies Taiwan’s weight downstream.
Gate 3 · Advanced Packaging (CoWoS)
What it does: To push 20-plus TB per second of data between the GPU and HBM, the two have to be “extremely close,” and traditional circuit-board routing simply can’t do it. Advanced packaging places multiple chips on a silicon interposer, right next to each other, or even stacks them directly. TSMC’s CoWoS is the representative of this kind of technology, bonding the GPU and several HBM chips into one giant module. Think of it as: precisely gluing several Lego blocks onto the same baseplate to make one ‘big block.’
Why it’s a chokepoint: High-end GPUs almost all use CoWoS or similar packaging, so CoWoS monthly capacity directly determines how many of these chips can ship in a year. The 2025–2026 reports are almost unanimous: Nvidia’s CoWoS-L capacity is “completely booked,” and TSMC has had to outsource some orders to packaging-and-test houses like ASE and Amkor as a safety valve.
| Timepoint | TSMC CoWoS monthly capacity | Nature |
|---|---|---|
| End-2023 | About 13,000–16,000 wafers/month | Estimate |
| End-2024 | About 30,000–40,000 wafers/month | Estimate |
| End-2025 | About 70,000–80,000 wafers/month | Estimate |
| End-2026 (target) | About 120,000–130,000 wafers/month | Target/industry estimate |
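To read the trend rather than the decimal points, the sketch below takes the midpoint of each range in the table and computes the multiple from one year-end to the next. Using midpoints is a simplifying assumption, and the variable names are illustrative.

```python
# TSMC CoWoS capacity in wafers/month as (low, high), from the table above.
capacity = {
    "End-2023": (13_000, 16_000),
    "End-2024": (30_000, 40_000),
    "End-2025": (70_000, 80_000),
    "End-2026 (target)": (120_000, 130_000),
}

# Midpoint of each estimated range (a simplification, not a sourced figure).
midpoints = {label: (low + high) / 2 for label, (low, high) in capacity.items()}

labels = list(midpoints)
yearly_multiples = []
for prev, curr in zip(labels, labels[1:]):
    multiple = midpoints[curr] / midpoints[prev]
    yearly_multiples.append(multiple)
    print(f"{prev} -> {curr}: x{multiple:.1f}")

total_multiple = midpoints[labels[-1]] / midpoints[labels[0]]
print(f"{labels[0]} -> {labels[-1]}: x{total_multiple:.1f}")
```

On these midpoints, capacity grows roughly 8.6 times from end-2023 to the end-2026 target, while the year-over-year multiple slows from about 2.4 to about 1.7.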
Taiwan in this gate: CoWoS’s main capacity is concentrated in several science-park fabs in Taiwan, and most new expansion projects are also in Taiwan. Even where part is outsourced, most of it stays tightly linked to the Taiwanese supply chain (ASE and its subsidiary SPIL, for example, are Taiwanese companies). At the “GPU + HBM packaging” layer, Taiwan is one of the world’s most critical geographic concentration points; if it were disrupted, high-end AI platforms would struggle to find substitute capacity in the near term.
Gate 4 · Foundry and Lithography Equipment
What it does: Foundry means “making chips for others.” Nvidia designs its own GPUs but hands manufacturing to TSMC. Process nodes (5nm, 3nm, 2nm) can be loosely understood as “line width,” although today the names are generation labels more than literal measurements: the smaller the number, the more transistors fit into the same area and the more power-efficient the chip is. High-end AI chips now almost all use the 3–5nm class.
How concentrated it is: TSMC holds about 64% of the global pure-foundry market (2024 Q3), far ahead of second-place Samsung’s roughly 12%. More crucially, TSMC’s sub-7nm advanced processes contributed about 74% of its wafer revenue (2025 Q4), and the world’s high-end AI chips almost all rest on these most advanced production lines.
| Metric | Value | Timepoint |
|---|---|---|
| TSMC global foundry share | About 64% | 2024 Q3 |
| Samsung foundry share | About 12% | Same as above |
| TSMC sub-7nm process share of revenue | About 74% | 2025 Q4 |
And there’s a bottleneck hidden further back, ASML: To make sub-7nm chips at volume, you need extreme-ultraviolet (EUV) lithography machines, and the Netherlands’ ASML is the only company in the world that can produce production-grade EUV, which makes it an effective monopoly. One machine costs €180 million to €380 million. That means simply restricting exports of ASML equipment can directly determine whether downstream fabs can move to more advanced processes.
Taiwan in this gate: TSMC is the largest supplier of the world’s most advanced process, and that technology and capacity are highly concentrated in Taiwan, which makes “whether the Taiwan Strait is stable” a direct precondition for the world’s AI-chip supply. This is also why the international community is so sensitive to the situation in the Taiwan Strait.
Gate 5 · Optical Interconnect
What it does: When training a large model, thousands or even tens of thousands of GPUs must constantly exchange data. As distance grows and speed rises, traditional copper wires lose signal integrity. So clusters switch internally to optical-fiber transmission, converting electrical signals into light, sending them out, and converting them back. Speeds climb from 400G up to 800G, then to 1.6T.
Where it stands now: 800G optical modules were already mainstream in 2024–2025 AI data centers, doubling the bandwidth of 400G while cutting energy per bit by 30–40%. The 1.6T modules (including co-packaged optics, CPO, and silicon-photonics designs) ramp into production successively in 2025–2027. The key to CPO is packaging the optical engine right next to the chip, greatly shortening circuit distance and saving a lot of power; by Nvidia’s figures, each port can drop from about 30 watts to about 9 watts. In a cluster with tens of thousands of GPUs, that kind of power saving gets greatly amplified.
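To see how the per-port saving scales, here is a small sketch under stated assumptions. The 30 W and 9 W per-port figures are the Nvidia numbers quoted above; the port count is purely hypothetical and chosen only to illustrate the arithmetic.

```python
PLUGGABLE_WATTS_PER_PORT = 30.0  # pluggable optics, per Nvidia's figures quoted above
CPO_WATTS_PER_PORT = 9.0         # co-packaged optics, per the same figures


def cpo_saving_kw(ports: int) -> float:
    """Power saved, in kW, by moving `ports` ports from pluggable modules to CPO."""
    return ports * (PLUGGABLE_WATTS_PER_PORT - CPO_WATTS_PER_PORT) / 1000.0


# Fraction of per-port optical power that CPO removes.
saving_fraction = 1 - CPO_WATTS_PER_PORT / PLUGGABLE_WATTS_PER_PORT

HYPOTHETICAL_PORTS = 100_000  # assumption for illustration, not a sourced figure
saving_kw = cpo_saving_kw(HYPOTHETICAL_PORTS)

print(f"Per-port saving: {saving_fraction:.0%}")
print(f"{HYPOTHETICAL_PORTS:,} ports save about {saving_kw / 1000:.1f} MW")
```

A 70% cut per port looks small in isolation, but at a hypothetical 100,000 ports it adds up to about 2.1 MW of power that no longer has to be supplied or cooled.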
Taiwan in this gate: Taiwan has quite a few networking and server-system makers integrating optical modules and switches, and it also supplies circuit boards, mechanical parts, and testing. That said, silicon-photonics chips themselves are still mainly designed and produced by U.S. and Chinese makers; at this layer Taiwan leans more toward contract manufacturing and components, with limited public market-share data. This gate has a high technical bar, but so far there’s been no “whole-line shortage” like CoWoS.
Gate 6 · Cooling and Liquid Cooling
What it does: A single B300 GPU draws about 1,400 watts, so the 72 GPUs in one rack come to roughly 100 kilowatts before CPUs, networking, and power conversion are counted, and pre-launch specs for next-generation racks run to about 190–230 kilowatts. A traditional rack relying on air conditioning plus fans can handle only about 5–10 kilowatts, nowhere near enough for an AI rack. So liquid cooling enters: send coolant straight to a cold plate on the chip to carry away heat (direct-to-chip liquid cooling), or simply submerge the whole server in a non-conductive liquid (immersion).
Get a feel for the numbers:
| Rack type | Power density | Notes |
|---|---|---|
| Traditional enterprise rack | About 5–10 kW/rack | Air cooling is fine |
| Current AI GPU rack | About 30–50 kW/rack | Air cooling is near its limit |
| Next-gen AI rack | About 80–120 kW/rack | Liquid cooling is mandatory |
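The gap between an AI rack and an air-cooled rack can be checked with simple arithmetic. The sketch below uses the 1,400 W per B300 and 72 GPUs per rack from the text above and counts GPUs only; the variable names are illustrative.

```python
GPU_WATTS = 1_400             # one B300, from the text above
GPUS_PER_RACK = 72            # GPUs in one rack, from the text above
AIR_COOLED_KW_RANGE = (5, 10) # what a traditional air-cooled rack can handle

# GPU power only; CPUs, networking and power-conversion losses are not counted.
gpu_only_kw = GPU_WATTS * GPUS_PER_RACK / 1000

# Multiple of the air-cooling limit, from the generous end (10 kW) to the tight end (5 kW).
multiples = tuple(gpu_only_kw / limit for limit in reversed(AIR_COOLED_KW_RANGE))

print(f"GPU-only rack power: {gpu_only_kw:.1f} kW")
print(f"That is {multiples[0]:.0f}x to {multiples[1]:.0f}x an air-cooled rack")
```

The GPUs alone put the rack at about 100 kW, roughly 10 to 20 times what an air-cooled rack can handle, before anything else in the rack is counted.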
Taiwan in this gate: Taiwan already has a complete supply chain in cooling components (fans, heat pipes, cold plates, chassis), and now many makers are moving into cold plates, coolant distribution units (CDUs), and full-rack liquid-cooling integration. Delta Electronics is a clear example, making liquid-cooling and power solutions from the rack level to the facility level, pitched at high-density GPU scenarios. A common play is for Taiwanese teams to export the whole “server + rack + power + cooling” package.
Gate 7 · Data Centers and Power
What it does: Once the racks are built, they have to go into a data center, which also needs power on the scale of hundreds of MW (megawatts) connected behind it. A large AI data center can use as much power as 100,000 households, and the biggest projects use even more.
Where the money goes: The capital expenditure of the top five cloud providers (Amazon, Microsoft, Google, Meta, and so on) is surging. Several institutions estimate a combined total of about $600–690 billion in 2026, up roughly 30% year over year. That sum alone is equivalent to about 2% of U.S. GDP. Most of this money turns into GPUs, facilities, and power infrastructure.
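As a rough sanity check on those figures, the sketch below backs out the implied 2025 spending from the 30% growth rate and expresses the 2026 estimate as a share of GDP. The U.S. GDP value of about $30 trillion is an assumption made here for illustration, not a number from this piece.

```python
CAPEX_2026_BN = (600.0, 690.0)  # estimated 2026 capex range, in billions of USD
YOY_GROWTH = 0.30               # "up roughly 30% year over year"
ASSUMED_US_GDP_BN = 30_000.0    # assumption: about $30 trillion, not from this piece

# Implied 2025 spending if 2026 is 30% higher.
implied_2025_bn = tuple(capex / (1 + YOY_GROWTH) for capex in CAPEX_2026_BN)

# 2026 capex as a percentage of the assumed GDP.
gdp_share_pct = tuple(100 * capex / ASSUMED_US_GDP_BN for capex in CAPEX_2026_BN)

print(f"Implied 2025 capex: ${implied_2025_bn[0]:.0f}-{implied_2025_bn[1]:.0f} billion")
print(f"Share of assumed GDP: {gdp_share_pct[0]:.1f}%-{gdp_share_pct[1]:.1f}%")
```

Under that GDP assumption the range works out to about 2.0–2.3%, with an implied 2025 base of roughly $460–530 billion.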
Where the power comes from becomes a new problem: Cloud giants have long signed large volumes of renewable-energy purchase agreements, but AI’s appetite is so big that even nuclear has been put on the table. Small modular reactor (SMR) players like NuScale and Oklo keep getting named, and Amazon is partnering with X-energy to build SMRs in Washington State to power large loads. Realistically, though, before 2030 AI facilities will mainly rely on the traditional grid plus renewables, with SMRs being more of a long-term backup after 2030.
There’s an increasingly emphasized turning point here: AI’s real bottleneck is shifting from “GPU count” to “the grid and land.” In some regions, the grid already struggles to absorb more data centers on the hundreds-of-MW scale. To cram a 200-kilowatt rack into an existing grid, operators can only bet simultaneously on more power-efficient chips, more efficient optical interconnect, and more aggressive liquid cooling.
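To make the grid constraint concrete, here is a minimal sketch of how many racks a fixed power budget can carry. The 200 kW rack comes from the text above; the 300 MW site, the PUE of 1.25 (power usage effectiveness, total facility power divided by IT power), and the 72 GPUs per rack are illustrative assumptions.

```python
def racks_supported(site_mw: float, rack_kw: float, pue: float) -> int:
    """Whole racks a site can power once cooling and other overhead (PUE) are deducted."""
    it_power_kw = site_mw * 1000.0 / pue  # power left for IT equipment
    return int(it_power_kw // rack_kw)


SITE_MW = 300.0     # hypothetical grid connection, not a sourced figure
RACK_KW = 200.0     # the 200-kilowatt rack mentioned above
PUE = 1.25          # assumed overhead factor, not a sourced figure
GPUS_PER_RACK = 72  # assumed, matching an NVL72-style rack

racks = racks_supported(SITE_MW, RACK_KW, PUE)
gpus = racks * GPUS_PER_RACK

print(f"{SITE_MW:.0f} MW site: {racks:,} racks, about {gpus:,} GPUs")
```

Under these assumptions a 300 MW connection tops out at 1,200 racks, or 86,400 GPUs. Lowering power per rack or improving PUE is the only way to fit more compute into the same connection, which is why chips, optics, and cooling all end up competing for the same power budget.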
Taiwan in this gate: Taiwan’s power structure and nuclear policy are both fairly sensitive, so in the near term it’s unlikely to become the world’s most concentrated location for AI facilities. Taiwan plays more of a role by participating in the global build-out of data centers with equipment and technology like “efficient power, uninterruptible power systems, and liquid cooling,” rather than being a facility hub itself.
Gate 8 · Geopolitics and Export Controls
This gate isn’t a physical link within the supply chain; it’s more like a layer of rules draped over the preceding seven gates. Who can buy how many high-end GPUs, and who can make advanced processes, is largely decided by the export regulations of the U.S. and its allies.
New rules from January 2026: The U.S. Commerce Department shifted the review of some mid-tier AI chips for China that “don’t reach the highest spec” (like the Nvidia H200 and AMD MI325X) from a “presumption of denial,” which in practice meant near-blanket rejection, to “case-by-case review” under strict conditions. In plain terms, it opened a narrow door with a pile of attached conditions (including taxation, third-party test certification, and not crowding out U.S. domestic supply). The highest-spec chips and re-exports mostly remain under strict control.
Roping allies in to squeeze together: The MATCH Act under consideration in the U.S. Congress aims to require allies like the Netherlands and Japan to follow suit within a deadline on equipment export restrictions to China, covering EUV and older DUV immersion machines — effectively making it harder for China to expand even at 7–14nm.
Taiwan listed as a Tier 1 partner: Under the U.S. “AI diffusion” framework published in January 2025, Taiwan, Japan, South Korea, and a number of European countries were listed as Tier 1 and were not subject to quota caps under that framework (Taiwan still had to maintain its existing export controls and compliance, and was not entirely exempt), which officials regarded as a “vote of confidence” in Taiwan’s technology-protection regime. Note that the Commerce Department announced in May 2025 that it would rescind this framework.
China’s situation: With high-end GPUs and equipment restricted, China is on one hand developing its own chips (such as Huawei’s Ascend) and on the other relying on techniques like model compression and distillation to save compute. Research indicates the controls act more like a tool for “raising costs and adding delay” than an airtight blockade.
Which Gates Are the Tightest?
Lay out all eight gates, and the ones that truly “choke the whole chain” are actually concentrated in four links — and those four are also highly concentrated geographically, which is exactly where geopolitics keeps its eyes.
| Tightest link | Why it’s tight | Mainly concentrated in |
|---|---|---|
| Advanced-process foundry (sub-7nm) | The world’s high-end chips almost all rely on this production line | 🇹🇼 TSMC (Taiwan) |
| Advanced packaging CoWoS/CoWoS-L | Supply fell short in 2025–2026, determines GPU shipments | 🇹🇼 Mainly Taiwan |
| HBM3e/HBM4 memory | Three-way oligopoly, capacity completely booked | 🇰🇷 South Korea leads, 🇺🇸 Micron catches up |
| EUV/DUV lithography equipment | The only ticket to making advanced processes | 🇳🇱 ASML (Netherlands), sole EUV supplier |
From a geopolitical-risk standpoint, three things stand out. Taiwan holds both trump cards, advanced process and CoWoS, at once, so if the Taiwan Strait situation shifts, the world’s AI-chip supply would be hit hard immediately. The Netherlands’ ASML is the sole supplier of EUV, so any change in export policy ripples across the globe. China is bottlenecked on equipment and high-end chip imports and forced to rely more on local alternatives, and in the near term it still struggles to compete with the U.S.-plus-allies ecosystem.
What This Chain Tells Us
After all eight gates, it boils down to three judgments:
First, the AI compute race is essentially a race over hardware and capacity. However powerful a model is, if you can’t make the chips, can’t do the packaging, can’t feed the memory, or can’t supply the power, it’s all just slideware. So when following AI trends, watching the supply-chain bottlenecks often reads the direction sooner than watching the model launches.
Second, the bottlenecks are highly concentrated, and Taiwan sits at the center. The two tightest links — advanced process and advanced packaging — both rest on Taiwan, which is both Taiwan’s strategic value and the world’s most-watched single-point risk. Understanding this is what lets you see why chips became the main battlefield of great-power rivalry.
Third, the next ceiling may not be chips, but power. As the grid and land start to fall behind compute expansion, whoever can solve power supply, cooling, and energy efficiency holds the key to the next round. The “unsexy” links — nuclear, liquid cooling, optical interconnect — are actually the ones worth watching over the long run next.
This piece is an overall guide to the supply chain; later, Penchan will break individual gates (like CoWoS, HBM, and export controls) out into more in-depth standalone pieces. If you want to understand individual companies first, continue with the piece on the customers and ecosystem upstream and downstream of Nvidia.