In this AI race, most companies have to queue up to buy GPUs from Nvidia. Google is the exception, holding a card no one else has: an AI chip it designs itself, the TPU.
That card makes it the least Nvidia-dependent of all the big AI players. While rivals hand large budgets to Nvidia and still have to wait on supply, Google runs much of its training and inference on its own TPUs. If you want the full picture of Google as a company first, start with What kind of company is Google.
This piece sticks to one narrow set of questions: how much does Google actually save by making its own chips, how deep is the moat, and which hurdles can’t it clear?
The one-line take. The TPU really does save money, but the savings fall mainly on Google’s own large-scale compute; it hasn’t let Google break free of the global supply chain that runs from TSMC to HBM.
The TPU is Google’s own in-house AI chip
TPU is short for Tensor Processing Unit, a specialized ASIC that Google purpose-built to run its own AI workloads. It isn’t sold at retail; it mainly serves Google internally and is also rented out to enterprises through Google Cloud.
Google has been making TPUs for a decade, stacking one generation on top of the next. The latest one out is the seventh-generation Ironwood, which Google positions as “the first TPU for the age of inference,” with another step up in performance per watt over the previous Trillium generation. After that comes the eighth generation, where Google did something it had never done before: it split training and inference into two different chip designs, each optimized on its own.
We won’t dig into the principles behind the chip here; if you want to know how the TPU differs from the GPU and what an ASIC is, read our supply-chain series, the AI accelerator chip breakdown and the dedicated TPU piece.
Where exactly does making your own chips save money
The most direct evidence comes from Google’s own earnings. On its 2025 earnings call, Google said it had cut Gemini’s per-unit serving cost by roughly 78% that year, crediting model optimization plus efficiency and utilization gains from its own hardware.
The savings logic has two layers. One is bypassing the middleman’s margin: by designing its own chips, Google doesn’t have to pay Nvidia’s margin on every GPU. The other is the bespoke fit: the TPU is designed around Google’s own workloads, so it handles them more efficiently than a general-purpose part would. Some analysts estimate that on certain large language model training workloads, the latest-generation TPU can run at a total cost of ownership about 40% lower than a comparable Nvidia chip.
Here we should add an honest caveat, so readers don’t picture the edge as bigger than it is. That 40% is an analyst estimate for “optimized, internal use cases,” and it varies by workload. On general-purpose, on-demand cloud inference, independent benchmarks show the Nvidia ecosystem still holds a clear lead today. In other words, the TPU saves the most on the kind of work Google itself runs (very high-volume jobs that can be scheduled flexibly), not on every type of compute it sells to others.
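To make the two percentages concrete, here is a rough back-of-the-envelope sketch in Python. The 78% and 40% figures are the ones quoted above; the function names and the workload shares are hypothetical, picked only to show how the arithmetic works, and none of this reflects Google’s actual cost structure.

```python
# Back-of-the-envelope arithmetic for the two cost figures quoted above.
# The 78% and 40% figures come from the article; the workload shares in the
# loop are HYPOTHETICAL, chosen only to illustrate the calculation.


def throughput_multiplier(cost_reduction: float) -> float:
    """Return how many times more work the same budget buys after a per-unit cost cut."""
    if not 0.0 <= cost_reduction < 1.0:
        raise ValueError("cost_reduction must be in [0, 1)")
    return 1.0 / (1.0 - cost_reduction)


def blended_saving(tpu_share: float, tpu_saving: float) -> float:
    """Return the overall saving when only tpu_share of baseline spend gets tpu_saving.

    Baseline cost is normalized to 1. The TPU-friendly share costs
    (1 - tpu_saving) per unit, the rest is unchanged, so the total is
    1 - tpu_share * tpu_saving.
    """
    if not 0.0 <= tpu_share <= 1.0:
        raise ValueError("tpu_share must be in [0, 1]")
    if not 0.0 <= tpu_saving <= 1.0:
        raise ValueError("tpu_saving must be in [0, 1]")
    return tpu_share * tpu_saving


# A 78% cut in per-unit serving cost means each unit costs 22% of what it did.
print(f"Same budget serves {throughput_multiplier(0.78):.2f}x the volume")

# The 40% TCO edge applies only to workloads that suit the TPU.
for share in (0.25, 0.50, 0.75):  # hypothetical shares of total spend
    saving = blended_saving(share, 0.40)
    print(f"TPU-friendly share {share:.0%} -> overall saving {saving:.0%}")
```

The second function is the caveat in numeric form: a 40% edge on a subset of workloads translates into a smaller saving across the whole fleet, in proportion to how much of the spend that subset represents.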
Behind the end-to-end story sits a very long supply chain
“In-house chips” makes it sound like Google does everything itself; in reality, it relies on a whole row of partners.
The chip’s architecture is led by Google, but the design collaboration is split among several companies: the latest-generation training chip is co-designed with Broadcom, under an agreement that runs through 2031; the inference chip is handled by Taiwan’s MediaTek; and Google is also in talks with Marvell as a third design partner. The chips themselves are fabricated by TSMC on its advanced process nodes; the eighth-generation TPU is targeting 2nm, with mass production expected by the end of 2027.
The memory side is even more delicate. The TPU needs large amounts of HBM (high-bandwidth memory), and according to industry reports, Samsung has recently supplied a relatively high share of the HBM for Google’s TPUs—but that’s an industry rumor, not an official figure from the three memory makers, and Google has never disclosed the exact allocation. Demand for advanced packaging (CoWoS) has also surged, likewise stuck on TSMC’s capacity.
Putting it all together, Google does have more control than rivals who can only buy GPUs externally, but it hasn’t truly broken free of the global supply chain. TSMC’s capacity, the supply of HBM, the advanced-packaging bottleneck—these are all the links the whole industry is scrambling for, and all within the scope of U.S. export controls. For how the entire chain works, see The end-to-end AI hardware supply chain.
Even its rivals are renting its chips
The TPU’s strength is clearest from a slightly paradoxical angle: Google’s rivals are using its chips too.
Anthropic signed Google’s largest-ever TPU purchase agreement, on the order of a million units; the market has also heard that Meta is in talks with Google over a large TPU deployment. The most intriguing thread is Anthropic: Google pours heavy investment into it on one hand while renting compute to it on the other, a complex triangle we unpack specifically in Why Google invests in its rival Anthropic.
That rivals are willing to hand something as critical as training their next-generation models over to the TPU is, in itself, an endorsement of its cost-performance.
Chips need power—where does the power come from
Chips are only half the story; the other half is power. AI compute at this scale draws a staggering amount of electricity, so energy has become another front for Google.
Its bets run along two paths. One is nuclear: Google signed an agreement with Kairos Power to deploy up to 500 MW of small modular reactors (SMRs) between 2030 and 2035, with the first targeted to come online in 2030. The other is renewables: it signed a 1 GW solar power purchase agreement in Texas, signed for more than 1 GW on top of that across several U.S. power markets, and in early 2026 spent $4.75 billion to acquire Intersect Power, a clean-energy and data-center infrastructure company. All of this is aimed at locking down power sources first, so expansion doesn’t stall for lack of electricity.
This energy thread will grow alongside the data centers; it’s an increasingly critical piece that’s easy to overlook when you look at Google’s AI strategy.
The parts not yet laid bare
Plenty of numbers on this topic are stuck at the “industry reports” stage, and we’ll flag them honestly here:
- The exact share of each HBM supplier: the claim that Samsung’s share is relatively high recently comes from industry chatter and Korean media; none of the three memory makers have come forward to confirm it, and the exact allocation is undisclosed.
- The TPU’s share of TSMC’s capacity: there are estimates based on MediaTek’s orders, but how much of TSMC’s advanced process and packaging capacity Google itself takes up has never been disclosed officially.
- Ironwood’s process node: the industry points to TSMC 3nm, but Google hasn’t announced it officially.
- The internal procurement ratio between TPU and Nvidia: Alphabet’s earnings don’t break out the dollar amounts for in-house TPUs versus externally purchased GPUs separately.
None of these has an official answer yet, so any single, precise-looking number deserves a question mark.
Penna’s take
Pull the TPU card back for a wide view, and its value isn’t in being “the cheapest”—it’s in having “the most control.”
While others fret over scrambling for Nvidia’s supply and agonize over purchase prices, Google holds a chip line it can schedule on its own. In an era where compute is national power, that’s some serious backbone. But this moat has clear boundaries: where it saves the most is its own large-scale compute, not every unit of compute it sells externally; and the water that fills it still comes from the global supply chain running from TSMC and HBM to advanced packaging. Google is closer to “chip self-sufficiency” than anyone, yet no one has truly achieved it.
Further reading: What kind of company is Google, The end-to-end AI hardware supply chain, Why Google invests in its rival Anthropic.