Talk about AI hardware and you’ll inevitably hit one question: computers have had CPUs for decades, so why go out of your way to run AI on a GPU?

This piece spells out the difference between CPU and GPU in plain terms, then looks at why large-scale AI training runs almost entirely on GPUs. It’s the side-by-side companion to the GPU gate, and the entry-level foundation for The AI Hardware Supply Chain, End to End.


The Difference in One Line: Quality vs. Quantity

The most important difference between a CPU (central processing unit) and a GPU (graphics processing unit) lies in how many cores each one has and what kind of work those cores are built for.

A CPU has anywhere from a few to a few dozen very powerful cores (large server parts can exceed a hundred), good at working through complex, branching tasks one after another. A GPU, by contrast, has thousands of relatively simple cores, good at running large batches of the same kind of computation in parallel. One favors quality, the other favors quantity.

Here’s an analogy: a CPU is like a few PhD students, each one brilliant and able to solve very hard problems, but few in number; a GPU is like a few thousand grade-schoolers, each able to do only simple arithmetic, but in massive numbers, so when they all pitch in on a job like “compute the same kind of problem a few million times,” they turn out to be astonishingly fast.
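The analogy can be turned into back-of-envelope arithmetic. The sketch below is a toy model, not a benchmark: the worker counts and per-task times are made-up numbers chosen for illustration, and it assumes workers proceed in lockstep rounds with no overhead for handing out work.

```python
def total_time(num_tasks, num_workers, time_per_task):
    """Time to finish independent tasks when workers proceed in lockstep rounds."""
    rounds = -(-num_tasks // num_workers)  # ceiling division
    return rounds * time_per_task


def chain_time(num_steps, time_per_task):
    """Time for steps that each depend on the previous one.

    Only one worker can be busy at a time, so extra workers do not help.
    """
    return num_steps * time_per_task


# Hypothetical numbers for illustration only, not real hardware specs.
FEW_FAST = {"num_workers": 8, "time_per_task": 1}       # the "PhD students"
MANY_SLOW = {"num_workers": 4096, "time_per_task": 10}  # the "grade-schoolers"
```

Under these assumed numbers, `total_time(1_000_000, **FEW_FAST)` comes to 125000 time units and `total_time(1_000_000, **MANY_SLOW)` to 2450, so the crowd of slow workers finishes roughly 50 times sooner. For a chain of 1000 dependent steps the picture flips: `chain_time` gives 1000 units for the fast workers and 10000 for the slow ones.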


Why AI Training Uses GPUs

The key is the “shape of the work” in AI training.

Underneath, training an AI model is really a flood of repetitive matrix computations (lots of numbers being multiplied and added). Most of that work can run simultaneously, because each output value can be computed without waiting for the others. This kind of “do the same move a few million times” work plays right into the strength of a GPU’s thousands of cores computing together. A GPU can crunch a huge swath at once, whereas a CPU’s handful of cores has to work through it in queued-up batches, often dozens of times slower or worse.
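To make the “same move, many times” point concrete, here is a minimal pure-Python sketch of matrix multiplication. It is deliberately naive and runs on the CPU; the function names are invented for this example. It shows that every output cell is its own small multiply-and-add job, so the cells can be computed in any order (or all at once) and the answer stays the same.

```python
import random


def matmul_cell(a, b, i, j):
    """One output cell: multiply row i of a by column j of b and add it up."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def shuffled(cells):
    """Return the cells in a random order, to show that order does not matter."""
    return random.sample(cells, len(cells))


def matmul(a, b, order=None):
    """Multiply two matrices cell by cell, optionally in a caller-chosen order."""
    rows, cols = len(a), len(b[0])
    cells = [(i, j) for i in range(rows) for j in range(cols)]
    if order is not None:
        cells = order(cells)
    out = [[0] * cols for _ in range(rows)]
    for i, j in cells:
        # No cell reads another cell's result, so these jobs are independent.
        out[i][j] = matmul_cell(a, b, i, j)
    return out


def multiply_add_count(m, k, n):
    """Multiply-adds needed for an (m x k) matrix times a (k x n) matrix."""
    return m * k * n
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`, with or without `order=shuffled`. And `multiply_add_count(1024, 1024, 1024)` is 1073741824: a single multiplication of two 1024 x 1024 matrices already takes over a billion multiply-adds, all of the same kind.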

In other words, the CPU isn’t bad; this just isn’t its kind of work. AI computation happens to have exactly the shape a GPU is built for, so the training and inference of large models run almost entirely on GPUs (or even more specialized chips).


The CPU Isn’t Replaced — It’s a Division of Labor

The arrival of GPUs hasn’t sidelined the CPU. In a real AI system, the two split the work and run side by side.

The CPU handles scheduling, controls the whole flow, and takes care of logic decisions and data preparation; the GPU shoulders the mass of parallel computation. The CPU is like a project manager, arranging who does what and when; the GPU is like a big crew of line workers, blazing through the uniform tasks they’ve been handed. Drop either side and the whole setup runs poorly.
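The project-manager-and-crew split can be sketched as a small producer/consumer pipeline. This is only an illustration under stated assumptions: everything here runs on the CPU, `run_on_accelerator` is a hypothetical stand-in for the GPU’s share of the work, and no real GPU library is involved.

```python
import queue
import threading


def prepare_batches(raw_items, batch_size, out_queue):
    """CPU-side role: slice the data into batches and hand them off."""
    for start in range(0, len(raw_items), batch_size):
        out_queue.put(raw_items[start:start + batch_size])
    out_queue.put(None)  # sentinel: no more work is coming


def run_on_accelerator(batch):
    """Hypothetical stand-in for the GPU: the same operation on every element."""
    return [x * x for x in batch]


def pipeline(raw_items, batch_size=4):
    """Prepare batches on one thread while the 'accelerator' consumes them."""
    work = queue.Queue(maxsize=2)  # small buffer: preparation stays just ahead
    producer = threading.Thread(
        target=prepare_batches, args=(raw_items, batch_size, work)
    )
    producer.start()
    results = []
    while True:
        batch = work.get()
        if batch is None:
            break
        results.extend(run_on_accelerator(batch))
    producer.join()
    return results
```

Calling `pipeline(list(range(10)))` returns the squares of 0 through 9 in order. The point of the design is that the uniform math only goes quickly if the preparation side keeps the queue fed; if either half stalls, the other sits idle.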


More Specialized Than GPUs: TPUs and ASICs

A GPU is good at parallel computing, but it’s still fairly “general” — it can run many kinds of computation. Some companies want to go leaner and faster, so they build even more specialized chips.

The TPU (Google’s Tensor Processing Unit, itself a kind of ASIC) and other AI ASICs (application-specific integrated circuits) are chips purpose-built for specific AI work: more efficient than a GPU on that one job, at the cost of being less general. They coexist and divide the labor with GPUs, rather than one displacing the other. For a breakdown of these chip types, read What Are AI Chips.


Key Takeaways for This Gate

CPU favors quality, GPU favors quantity: a CPU uses a few powerful cores to run sequential, complex tasks, while a GPU uses a vast number of small cores to run parallel, uniform computations.

AI training is a mass of parallelizable matrix multiplication: a GPU crunches a huge swath at once while a CPU has to work through it in queued-up batches, which is why large-scale AI training runs mainly on GPUs or dedicated accelerators.

In a real system, the CPU handles scheduling and the GPU handles the math, with the two working as a team.

To see the GPU’s role in the AI supply chain, read the GPU gate; to see how the various AI chips break down, read What Are AI Chips and What Is an ASIC; to step back and look at the whole chain, head back to The AI Hardware Supply Chain, End to End.