The earlier ASIC gate noted that, besides buying from NVIDIA, the cloud giants are all building their own AI chips, and the longest-running of these efforts is Google’s TPU. This piece explains the TPU in detail.

It covers, in order: what a TPU is and how it differs from a GPU; how the generations evolved; whether it will replace NVIDIA; why even Anthropic uses it at scale; and what Broadcom and Taiwan’s firms contribute. This is the deep-dive on Google TPU for Gate 1, “AI chips,” of The AI Hardware Supply Chain, End to End.


What Is a TPU

TPU is short for Tensor Processing Unit, Google’s in-house custom chip built specifically for machine learning — a kind of ASIC (a chip tailored to one specific task).

Its design philosophy differs from a GPU’s. A GPU is a more general-purpose parallel-computing chip that can run a wide range of workloads and has a mature software ecosystem (CUDA). A TPU instead co-designs matrix computation, HBM memory, chip-to-chip interconnect, the compiler, and the framework as one system. It specializes in AI training and inference and aims to do better on cost, power, latency, and scaling. An analogy: a GPU is like a Swiss army knife that handles almost any job, while a TPU is like a chef’s knife Google custom-made for its own handful of dishes.
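To make “matrix computation” concrete, here is a minimal pure-Python sketch. It is only an illustration, not how a TPU is actually programmed, and the function name `matmul_with_mac_count` is hypothetical. It multiplies two small matrices and counts the multiply-accumulate (MAC) steps, the operation this kind of chip is built to run in bulk.

```python
def matmul_with_mac_count(a, b):
    """Multiply a (m x k) by b (k x n); return (product, number of MACs)."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions must match")
    out = [[0] * n for _ in range(m)]
    macs = 0
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(k):
                acc += a[i][p] * b[p][j]  # one multiply-accumulate
                macs += 1
            out[i][j] = acc
    return out, macs


product, macs = matmul_with_mac_count([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)  # [[19, 22], [43, 50]]
print(macs)     # 8, i.e. m * k * n
```

The MAC count grows as m × k × n, which is why large models end up dominated by this one operation and why a chip tailored to it can pay off.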

One clarification: Google does not avoid NVIDIA. Google Cloud has explicitly said it will work closely with NVIDIA and offer next-generation NVIDIA instances, and TPU and GPU run as two parallel product lines within Google Cloud.


Core-Data Snapshot

Below we capture the TPU’s key milestones. Specs are from Google’s official sources; shares are research-firm estimates.

| Topic | Data | Timing / Nature |
| --- | --- | --- |
| Latest generation in production | Seventh-gen Ironwood (TPU7x) | Google Cloud documentation (2026-05) |
| Ironwood specs | A single superpod strings 9,216 chips, each with 192 GiB HBM | Google official |
| Eighth generation | TPU 8t (training) / 8i (inference) unveiled 2026-04, not yet commercial | Google official announcement |
| TPU share in Google’s AI servers | Estimated about 78% for 2026 (the only cloud provider shipping more ASICs than GPUs) | TrendForce estimate |
| Anthropic adoption | Up to a million chips, over 1GW in 2026; about 3.5GW via Broadcom from 2027 | Anthropic / Broadcom SEC |

A Brief History of the Generations: From v1 to Ironwood

The TPU isn’t new. Google has used the first-gen TPU inside its own data centers since 2015, where it initially ran workloads such as RankBrain, Street View, and AlphaGo. It then evolved steadily: the second, third, and fourth generations moved gradually from inference toward large-scale training; the fifth generation split into the cost-saving v5e and the performance-focused v5p; the sixth generation is called Trillium (v6e).

The latest generation currently in production and available on Google Cloud is the seventh-gen Ironwood (TPU7x). A single superpod can link up to 9,216 chips, each with 192 GiB of HBM, and the design targets large-model pretraining and inference. The eighth-gen TPU 8t (training) and 8i (inference) had their specs unveiled in April 2026, with the official pitch centered on higher performance per dollar and performance per watt. For now, though, only intent registration is open and no general-availability (GA) documentation has appeared, so treat it as “unveiled but not yet commercial,” not as something you can already rent.
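As a rough sense of scale, the two Ironwood figures above can be multiplied out. The sketch below is back-of-the-envelope arithmetic only: the constant and function names are hypothetical, and the result is the sum of per-chip HBM, not a claim about how that memory is addressed or how much of it a single job can use.

```python
CHIPS_PER_SUPERPOD = 9_216  # figure quoted in this article
HBM_PER_CHIP_GIB = 192      # figure quoted in this article


def superpod_hbm_gib(chips=CHIPS_PER_SUPERPOD, hbm_gib=HBM_PER_CHIP_GIB):
    """Aggregate HBM across a full superpod, in GiB."""
    return chips * hbm_gib


total_gib = superpod_hbm_gib()
total_tib = total_gib / 1024
total_pib = total_tib / 1024

print(total_gib)  # 1769472
print(total_tib)  # 1728.0
print(total_pib)  # 1.6875
```

In other words, a fully populated superpod carries about 1,728 TiB (roughly 1.7 PiB) of HBM in aggregate.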


TPU vs GPU: Will It Replace NVIDIA

This is the most frequently asked question, and the answer has to be read at two scales: “Google’s own” and “the whole market.”

The TPU’s strength lies in vertical integration: the needs of Google’s own models (such as Gemini) can directly shape the chip’s design, and together with its emphasis on cost and energy efficiency plus ultra-large-scale interconnect, it has a real edge on Google’s own workloads. Research firm TrendForce estimates that in 2026 the TPU’s share of Google’s own AI-server shipments approaches 80%, and Google is the only major cloud provider where “in-house ASIC shipments exceed GPUs.”

But that’s Google’s own internal mix, not global market share. Across the whole market, NVIDIA’s moat is still in place: the CUDA software ecosystem, mature developer tooling, and cross-cloud, cross-framework versatility all keep GPUs the mainstay, and Google Cloud itself keeps selling NVIDIA instances. The TPU also still carries migration costs for external developers: even Google still labels the TPU’s native PyTorch support as “preview.” So the pragmatic view is this: the TPU is ramping quickly within Google itself and at some AI labs, but in the short term it is far from replacing NVIDIA. The more likely picture is coexistence and a division of labor.


Who Uses TPUs

First, Google itself. From early search ranking and Street View to today’s Gemini, a large share of Google’s own AI products run on TPUs. External users can rent them through Google Cloud (Cloud TPU VM, GKE, Vertex AI).

The most closely watched external customer is Anthropic. In October 2025 it announced an expanded adoption of Google Cloud TPUs, scaling up to a million chips and bringing over 1GW of capacity online in 2026; in April 2026 it signed further agreements with Google and Broadcom to obtain roughly 3.5GW of next-generation TPU compute through Broadcom starting in 2027. Note that Anthropic runs a diversified strategy, using AWS Trainium and NVIDIA GPUs at the same time, with Amazon still its primary cloud and training partner. Spreading bets across multiple vendors is the norm for today’s large AI companies. Separately, private-equity giant Blackstone announced a joint venture with Google in May 2026 to build a TPU cloud in the United States, offering another route to TPU access beyond Google Cloud.


Who Helps Google Build TPUs, and Taiwan’s Role

Google controls the TPU’s architecture and software, but the chip’s detailed design, manufacturing, and packaging need partners.

The clearest partner is Broadcom. In an April 2026 SEC filing, Broadcom confirmed it had signed a long-term agreement with Google to do custom design and supply for Google’s future TPU generations, and to provide networking and other components for the next-generation AI rack, with the collaboration running through 2031. On manufacturing, the market widely links the TPU’s advanced process and packaging to TSMC, but Google has not officially disclosed the specific process node, so this part remains supply-chain and media speculation. There have also been media reports that MediaTek is involved in the design of Google’s next-generation inference TPU, but likewise without official confirmation, so it can only be treated as “market reporting.”

As for the “TPU stocks” people often discuss in Taiwan’s market, analysts and market commentators name a slate of supply-chain firms (design services, testing, packaging-and-test, test interfaces, substrates, thermal and power, and more). One point must be made especially clear: these lists mostly come from brokerages’ supply-chain speculation and optimistic guesses about who benefits, not from announcements by Google or Broadcom. Being named does not mean a firm has landed orders, and it says nothing about how much the firm would benefit. This article only describes industry roles; it does not compile beneficiary stocks, does not rank individual names, and does not constitute investment advice.


Key Takeaways for This Gate

The TPU is Google’s in-house, purpose-built AI chip. It co-designs compute, memory, interconnect, and software, following a vertically integrated route built around cost and energy efficiency. The latest in production is the seventh-gen Ironwood, with the eighth-gen 8t/8i unveiled but not yet commercial.

Inside Google’s own AI servers, the TPU’s share is already high (research firms estimate close to 80% for 2026), but globally, NVIDIA remains the mainstay through the CUDA ecosystem and general-purpose versatility, and the two coexist within Google Cloud. Even Anthropic uses TPUs at scale to run Claude, with Broadcom doing the custom design behind it. The Taiwan-firm lists that get cited are mostly analyst speculation, so use them to understand industry roles, not as a stock-picking list.

To see which category Google TPU falls into, go back and read What Is an ASIC; to see the GPU-mainstay side, read What Is a GPU and Blackwell; to see how these chips get fed data, read HBM; to head back to the whole chain, return to the supply-chain overview.