What Is Blackwell? B200, GB200, GB300, NVL72 to Rubin, All on One Page

A plain-English guide to NVIDIA's Blackwell generation: first get the levels straight — B200/B300 are GPUs, GB200/GB300 are superchips pairing a Grace CPU with two GPUs, and NVL72 strings 72 GPUs into a single rack. We break down the spec differences, where it's shipping, and the role of the next-gen Rubin and Taiwan's contract manufacturers.

5/27 · Penna

NVIDIA Blackwell AI superchip illustration: a flagship accelerator with two side-by-side compute dies on one substrate surrounded by memory

TL;DR

Blackwell is NVIDIA's current flagship AI chip generation. B200/B300 are GPUs, GB200/GB300 are superchips pairing a Grace CPU with two GPUs, and GB300 NVL72 strings 72 GPUs into a full rack-scale supercomputer. 2026 is still dominated by Blackwell shipments, while the next-gen Rubin has been announced and is slated for cloud-provider deployment in the second half of the year.
This guide is for beginners and long-term watchers who want to understand what actually separates B200, GB200, NVL72, and Rubin, and why the news keeps mentioning them.
Remember the levels first: B200/B300 are GPUs, GB200/GB300 are superchips, NVL72 is a full rack; in 2026 the mainstay is still Blackwell, Rubin is the next baton but its specs are still marked preliminary, and HBM4 plus the rack-scale supply chain are the biggest variables.

Every time NVIDIA holds a launch event, the news fills up with a pile of names: B200, GB200, GB300, NVL72, Rubin. They all sound impressive, but are they different things, or just different ways of saying the same thing?

This piece lays out the Blackwell generation all at once. First we separate the three levels — GPU, superchip, and rack-scale system — then look at how the specs differ, where it’s shipping, and the role of the next-gen Rubin and Taiwan’s contract manufacturers. This is the deep-dive version of Gate 1, “AI chips,” in The AI Hardware Supply Chain, End to End.

First, Get It Straight: GPU, Superchip, and Rack Are Three Levels

What trips people up most in the news is mixing three different levels together. Let’s pull them apart:

B200 / B300: a single GPU, the most basic compute chip.
GB200 / GB300: one Grace CPU paired with two GPUs, bound into a single “superchip.”
GB200 / GB300 NVL72: 36 Grace CPUs and 72 GPUs (that is, 36 superchips) strung together with NVIDIA's NVLink high-speed interconnect into a full rack, operated as one supercomputer.

So when you see GB300 NVL72, it means a full rack of 72 GPUs; when you see B300, it means the single GPU inside. Hold onto this small-to-large ladder and the numbers later won’t get confusing.
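The ladder above is just multiplication, and a few lines of Python make it concrete. This is a minimal sketch, not anything NVIDIA publishes: the constant and function names are made up for illustration, and the only facts used are the ones stated above (one Grace CPU plus two GPUs per superchip, 36 superchips per NVL72 rack).

```python
# Illustrative names only; the counts come from the ladder described above.
CPUS_PER_SUPERCHIP = 1   # one Grace CPU per GB200/GB300 superchip
GPUS_PER_SUPERCHIP = 2   # two Blackwell GPUs per superchip
SUPERCHIPS_PER_NVL72 = 36


def rack_totals(superchips: int = SUPERCHIPS_PER_NVL72) -> dict:
    """Count the CPUs and GPUs in a rack built from identical superchips."""
    return {
        "cpus": superchips * CPUS_PER_SUPERCHIP,
        "gpus": superchips * GPUS_PER_SUPERCHIP,
    }


if __name__ == "__main__":
    totals = rack_totals()
    print(f"NVL72 rack: {totals['cpus']} Grace CPUs, {totals['gpus']} GPUs")
```

The "72" in NVL72 is the GPU count, which is why the same suffix appears on both the GB200 and GB300 racks.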

What Is Blackwell? Why Is It So Powerful?

Blackwell is the AI chip architecture NVIDIA launched in 2024 and shipped in volume from 2025 into 2026, succeeding the previous Hopper generation (H100/H200).

Its most crucial design choice is stitching two large compute dies into a single GPU. Lithography puts a ceiling on how big a single die can be (the reticle limit), so NVIDIA uses a 10 TB/s die-to-die link (NV-HBI) to join two dies into a GPU that “looks like one” to software, packing about 208 billion transistors in total, with manufacturing handed to TSMC’s 4NP process.

On compute, Blackwell leans on an ultra-low-precision number format called NVFP4 (think of it as representing numbers in a more economical way, trading that for more operations per second), pushing AI inference throughput up in one big jump. In a sentence: Blackwell is a flagship AI GPU that “works like two in one.”
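To make "representing numbers in a more economical way" less abstract, here is a simplified sketch of block-scaled 4-bit quantization in the spirit of NVFP4. It rests on assumptions: a 4-bit E2M1 float can only hold the eight magnitudes listed below, and NVFP4 shares one scale factor across a small block of values (16, as far as we understand the format). Real hardware stores that scale in a compact 8-bit float alongside a per-tensor scale; this sketch keeps it as an ordinary Python float, and the function names are hypothetical.

```python
# Magnitudes a 4-bit E2M1 float can represent (1 sign, 2 exponent, 1 mantissa bit).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed NVFP4 block size; one shared scale per block


def quantize_block(values):
    """Map a block of floats to (scale, codes), where each code is an E2M1 value."""
    peak = max(abs(v) for v in values)
    # Pick the scale so the largest value lands on the largest code (6.0).
    scale = peak / 6.0 if peak > 0 else 1.0
    codes = []
    for v in values:
        target = abs(v) / scale
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(nearest if v >= 0 else -nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from the 4-bit codes and the shared scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    block = [6.0, -3.0, 0.4, 0.0]
    scale, codes = quantize_block(block)
    print(scale, codes, dequantize_block(scale, codes))
```

The trade is visible in the round trip: 0.4 comes back as 0.5, because only a handful of values exist at 4 bits. Each number costs a quarter of the storage of a 16-bit value, and the hardware can push correspondingly more of them through per second.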

Core-Data Snapshot

Below we put the Blackwell generation’s key specs side by side. First, three terms: HBM is the high-speed memory sitting next to the GPU, PFLOPS is how many floating-point operations it can do per second, and CoWoS is the advanced packaging that seals the GPU and HBM together. Numbers follow NVIDIA’s published figures.

Product	Level	Key specs	Status
B200	GPU	192GB HBM3E, 8 TB/s bandwidth, NVFP4 about 10 PFLOPS, power about 1200W	Shipping
B300 (Blackwell Ultra)	GPU	288GB HBM3E, 8 TB/s bandwidth, NVFP4 about 15 PFLOPS, power about 1400W	Shipping
GB200 NVL72	Rack (72 GPU + 36 Grace)	NVFP4 about 720 PFLOPS, 13.4TB HBM3E, rack on the order of 120kW	Shipping
GB300 NVL72	Rack (72 Ultra + 36 Grace)	NVFP4 about 1,080 PFLOPS, 20TB HBM3E, rack on the order of 120kW	Deploying
Vera Rubin NVL72	Next-gen rack	Single Rubin 288GB HBM4 / 22 TB/s, rack NVFP4 inference about 3,600 PFLOPS	H2 2026 (preliminary specs)

(The NVFP4 figures for Blackwell here are dense values; the sparse values NVIDIA’s marketing often cites are higher, roughly double for B200 and GB200 NVL72. Rack power varies with the power-delivery and cooling configuration; the Rubin rack figure is an inference measure and can’t be compared directly with Blackwell’s dense values.)
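A quick way to sanity-check the table is to multiply the per-GPU rows by 72 and compare against the rack rows. The sketch below does only that; the dictionary and function names are illustrative, and the inputs are the rounded figures from the table, not an official calculation.

```python
GPUS_PER_RACK = 72

# Per-GPU figures copied from the table above (rounded, dense NVFP4).
PER_GPU = {
    "B200": {"nvfp4_dense_pflops": 10, "hbm_gb": 192},
    "B300": {"nvfp4_dense_pflops": 15, "hbm_gb": 288},
}


def rack_estimate(gpu: str) -> dict:
    """Naive rack totals: per-GPU figure times 72, with HBM converted to TB."""
    spec = PER_GPU[gpu]
    return {
        "nvfp4_dense_pflops": spec["nvfp4_dense_pflops"] * GPUS_PER_RACK,
        "hbm_tb": spec["hbm_gb"] * GPUS_PER_RACK / 1000,
    }


if __name__ == "__main__":
    for name in PER_GPU:
        print(name, rack_estimate(name))
```

The compute totals land exactly on the table's 720 and 1,080 PFLOPS. The naive memory totals (about 13.8TB and 20.7TB) come out slightly above the published 13.4TB and 20TB. One possible reading is that the rack figures assume a little less capacity per GPU than the standalone rows, but that is an inference from the arithmetic, not something the table states.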

From B200 to B300: Same Architecture, Pushed Up a Notch

Launched in 2025 and rolling into commercial deployment from the second half of that year, the B300 (officially branded Blackwell Ultra) is the beefed-up version of the same architecture.

The two upgrades you feel most: memory goes from 192GB to 288GB HBM3E, up 50%, fitting larger models; low-precision (NVFP4) dense compute is also up about 50%. The cost is power rising from about 1,200 watts to about 1,400 watts. The rack-scale GB300 NVL72 therefore pulls memory from 13.4TB up to 20TB, making it better suited to running inference on very large models. For cloud providers, this is a “same production line, specs jump up a notch” upgrade that rides the existing momentum, with no need to redo the whole architecture.
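The percentages above, and what they imply for efficiency, can be worked out in a few lines. This is a rough sketch using the article's rounded figures; the names are illustrative, and the compute-per-watt number is a paper estimate from spec-sheet values, not a measured result.

```python
# Rounded per-GPU figures from this article (dense NVFP4, approximate power).
B200 = {"hbm_gb": 192, "pflops": 10, "watts": 1200}
B300 = {"hbm_gb": 288, "pflops": 15, "watts": 1400}


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


def tflops_per_watt(gpu: dict) -> float:
    """Dense NVFP4 TFLOPS per watt, from spec-sheet numbers only."""
    return gpu["pflops"] * 1000 / gpu["watts"]


uplift = {key: round(pct_change(B200[key], B300[key]), 1) for key in B200}

if __name__ == "__main__":
    print(uplift)
    print(round(tflops_per_watt(B200), 2), round(tflops_per_watt(B300), 2))
```

Memory and compute each rise 50% while power rises by roughly 17%, so on paper the B300 delivers more low-precision compute per watt than the B200.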

Where Is It Shipping?

Blackwell isn’t a PowerPoint spec — it’s already running for real.

NVIDIA officially labels Blackwell “full production,” with both HGX B200 and B300 shipping. The rack-scale GB300 NVL72 has landed too: cloud provider CoreWeave was first to deploy it commercially in mid-2025, and Microsoft Azure went further in October 2025, standing up a production-grade cluster of thousands of GB300 GPUs for OpenAI’s compute fleet. In other words, the AI compute expansion of 2026 still leans on Blackwell and Blackwell Ultra as the mainstay.

The capacity bottleneck is still the same old place: how many Blackwell units can ship is limited by TSMC’s CoWoS advanced packaging and HBM memory supply, the two gates the earlier standalone pieces have already broken down.

The Next-Gen Rubin: Announced, but Don’t Rush to Say It Replaces Blackwell

NVIDIA has formally announced the next-gen platform Vera Rubin, and the chip itself has already entered full production. The rack-scale Vera Rubin NVL72 is built from 72 Rubin GPUs plus 36 Vera CPUs; each Rubin switches to next-gen HBM4 memory (288GB, 22 TB/s bandwidth), and the rack’s NVFP4 inference compute reaches up to about 3,600 PFLOPS, another big jump above Blackwell.
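Working backward from the rack figure gives a feel for what each Rubin GPU contributes. The sketch below only divides and multiplies the numbers quoted above. The names are illustrative, the inputs are preliminary specs that may change, and the 8 TB/s comparison point is the B300 bandwidth from the spec table earlier in this piece.

```python
GPUS_PER_RACK = 72

# Preliminary Vera Rubin NVL72 figures quoted in this article (subject to change).
RUBIN_RACK_NVFP4_INFERENCE_PFLOPS = 3600
RUBIN_HBM4_GB = 288
RUBIN_HBM4_TBPS = 22
B300_HBM3E_TBPS = 8  # from the spec table above


def rubin_rack_breakdown() -> dict:
    """Per-GPU share of rack compute, plus rack-wide HBM4 capacity and bandwidth."""
    return {
        "pflops_per_gpu": RUBIN_RACK_NVFP4_INFERENCE_PFLOPS / GPUS_PER_RACK,
        "rack_hbm_tb": RUBIN_HBM4_GB * GPUS_PER_RACK / 1000,
        "rack_hbm_tbps": RUBIN_HBM4_TBPS * GPUS_PER_RACK,
        "bandwidth_vs_b300": RUBIN_HBM4_TBPS / B300_HBM3E_TBPS,
    }


if __name__ == "__main__":
    print(rubin_rack_breakdown())
```

Capacity per GPU stays at 288GB, so the bigger change is bandwidth, at 2.75 times the B300's figure. The per-GPU compute share is an inference measure and should not be set directly against Blackwell's dense values.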

But tap the brakes here. The product page still marks the specs as “preliminary, subject to change,” with an official target of deployment by cloud providers like AWS, Google Cloud, and Microsoft in the second half of 2026. Research firms also estimate that NVIDIA’s high-end GPU shipments in 2026 will still be dominated by Blackwell (its share rising from roughly 60% toward 70%), with Rubin still carrying supply-chain tuning and schedule risk — and the validation and supply of HBM4 is the single most critical variable. So the pragmatic view is: 2026 is Blackwell’s home turf, and Rubin is the next baton in the queue, not an immediate handover.

Taiwan’s Role at This Gate

Chip design sits with NVIDIA and manufacturing with TSMC, so the job of “assembling the rack-scale system and making it mass-producible and shippable” falls mainly to Taiwan.

Foxconn has publicly shown off a full Vera Rubin NVL72 system; supply-chain reports name Quanta, Wistron and Wiwynn, Inventec, and other Taiwanese ODM/EMS players as participating in the contract manufacturing of GB200/GB300 rack-scale systems, and also note that NVIDIA, racing to lock up capacity, has reserved part of certain Taiwanese makers’ server-plant space through 2026. In other words, Taiwan doesn’t just make the wafers and packaging; it is also a key global base for assembling and shipping whole racks of AI supercomputers. This is only an industry map and makes no investment judgment on any individual stock.

Key Takeaways for This Gate

After looking at Blackwell, first remember that small-to-large ladder: B200/B300 are GPUs, GB200/GB300 are superchips, NVL72 is a system of 72 GPUs in a full rack.

Technically, Blackwell uses “two dies stitched into one” plus NVFP4 low-precision compute to push throughput high; the B300 (Blackwell Ultra) then nudges both memory and compute up about 50% each. The mainstay in 2026 is Blackwell — already shipping in volume and really deployed in the cloud; Rubin is the announced next baton, targeting a second-half debut, but its specs are still preliminary, with HBM4 and the rack-scale supply chain as the biggest variables.

To learn about the HBM that feeds data to these GPUs and the CoWoS that binds the chips together, see What Is HBM and What Is CoWoS; to see how all eight gates of the chain string together, head back to the supply-chain overview.

FAQ

What is Blackwell?

Blackwell is the AI chip architecture NVIDIA launched in 2024 and shipped in volume across 2025-2026. It uses TSMC’s 4NP process to stitch two large compute dies together with a 10 TB/s ultra-fast link into a single GPU — effectively two in one — purpose-built to push the compute for AI training and inference.

What's the difference between B200, GB200, and NVL72?

These are three levels. B200/B300 is a single GPU; GB200/GB300 is a ‘superchip’ that binds one Grace CPU with two GPUs; GB200/GB300 NVL72 then strings 36 Grace CPUs plus 72 GPUs together with a high-speed link into a full rack, operated as one supercomputer.

How is the B300 (Blackwell Ultra) stronger than the B200?

The B300 is the beefed-up version of Blackwell. The most obvious change is memory climbing from 192GB to 288GB HBM3E (up about 50%), with low-precision (NVFP4) dense compute also up roughly 50%, and power rising from about 1,200 watts to about 1,400 watts. In short, same architecture, specs pushed up another notch.

Is Blackwell shipping now? When does Rubin arrive?

Blackwell is in what NVIDIA officially calls ‘full production,’ with both HGX B200 and B300 shipping and GB300 NVL72 already deployed by cloud providers. The next-gen Rubin (Vera Rubin platform) has been formally announced, with an official target of cloud-provider deployment in the second half of 2026, but its specs are still marked preliminary and subject to change.

What role does Taiwan play in the Blackwell supply chain?

Taiwan is the primary contract-manufacturing base for full rack-scale AI systems. Foxconn, Quanta, Wistron/Wiwynn, Inventec, and other Taiwanese ODM/EMS players turn NVIDIA’s reference designs into mass-producible, shippable GB200/GB300/Rubin NVL systems. This describes industry roles and does not constitute investment advice.

Disclaimer and disclosures

This article is for general information and education only. It is not investment, legal, tax, or professional advice. Markets and regulations may change at any time, and the information reflects conditions at the time of writing.

Penchan is not a registered securities investment adviser. Any securities, digital assets, or financial products mentioned are covered for informational purposes only and are not buy or sell recommendations. Make your own decisions and accept your own risk.

Some or all of this article involved AI (Penna) assistance. The exact share varies by article. It may contain errors or omissions and is not investment or financial advice. Please verify against original sources.

The author may hold some assets mentioned in this article. Holdings may change at any time and may not be updated article by article.
