Wafer-scale technology is making waves again, this time promising to let artificial intelligence (AI) models with trillions of parameters run faster and more efficiently than on traditional GPU-based systems. Engineers at the University of California, Riverside (UCR) have analyzed chips the size of a frisbee that can move massive amounts of data without overheating or consuming excessive electricity.
These massive chips, known as wafer-scale accelerators, are manufactured by Cerebras on dinner plate-sized silicon wafers. They can deliver far more computing power with much greater energy efficiency, traits that are essential as AI models continue to grow larger and more demanding.
The dinner plate-sized wafers stand in stark contrast to postage stamp-sized GPUs, which are now considered essential in AI designs because they can perform computational tasks, such as processing images, language, and data streams, in parallel.
However, as AI model complexity increases, even high-end GPUs are starting to hit performance and energy limits, says Mihri Ozkan, a professor of electrical and computer engineering in UCR’s Bourns College of Engineering and the lead author of the paper published in the journal Device.
Figure 1 Wafer-Scale Engine 3 (WSE-3), manufactured by Cerebras, avoids the delays and power losses associated with chip-to-chip communication. Source: The University of California, Riverside
“AI computing isn’t just about speed anymore,” Ozkan added. “It’s about designing systems that can move massive amounts of data without overheating or consuming excessive electricity.” He compared GPUs to busy highways: effective, but prone to traffic jams that waste energy. “Wafer-scale engines are more like monorails: direct, efficient, and less polluting.”
The Cerebras Wafer-Scale Engine 3 (WSE-3) examined by the UCR engineers contains 4 trillion transistors and 900,000 AI-specific cores on a single wafer. Moreover, Cerebras reports that inference workloads on the WSE-3 system use one-sixth the power of equivalent GPU-based cloud setups.
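For a fixed inference workload, the reported one-sixth power figure translates directly into energy savings. A minimal back-of-envelope sketch, where the GPU cluster power draw and runtime are hypothetical numbers chosen only for illustration:

```python
# Back-of-envelope energy comparison based on the reported one-sixth
# power figure for WSE-3 inference vs. an equivalent GPU-based setup.
# GPU_POWER_KW and RUNTIME_HOURS are hypothetical illustration values.

GPU_POWER_KW = 60.0      # assumed GPU cluster draw for the workload
POWER_RATIO = 1 / 6      # WSE-3 draws one-sixth the power (per Cerebras)
RUNTIME_HOURS = 24.0     # assumed continuous inference runtime

gpu_energy_kwh = GPU_POWER_KW * RUNTIME_HOURS
wse_energy_kwh = GPU_POWER_KW * POWER_RATIO * RUNTIME_HOURS

print(f"GPU cluster: {gpu_energy_kwh:.0f} kWh")                    # 1440 kWh
print(f"WSE-3:       {wse_energy_kwh:.0f} kWh")                    # 240 kWh
print(f"Saved:       {gpu_energy_kwh - wse_energy_kwh:.0f} kWh")   # 1200 kWh
```

At these assumed numbers, a day of inference would save roughly 1,200 kWh; the absolute savings scale linearly with workload power and runtime.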
Then there is Tesla’s Dojo D1, another wafer-scale accelerator, which contains 1.25 trillion transistors and nearly 9,000 cores per module. These wafer-scale chips are engineered to eliminate the performance bottlenecks that occur when data travels between multiple smaller chips.
Figure 2 Dojo D1 chip, released in 2021, aims to enhance full self-driving and autopilot systems. Source: Tesla
However, as UCR’s Ozkan acknowledges, heat remains a challenge. With thermal design power reaching 10,000 watts, wafer-scale chips require advanced cooling. Here, Cerebras uses a glycol-based loop built into the chip package, while Tesla employs a coolant system that distributes liquid evenly across the chip surface.
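The 10,000-watt figure gives a sense of why liquid loops are mandatory. A rough sizing sketch using the standard energy-balance relation Q = m·cp·ΔT, where the coolant specific heat and allowed temperature rise are assumptions, not values from the article:

```python
# Rough coolant flow estimate for a 10 kW wafer-scale chip.
# Assumptions (not from the article): a 50/50 water-glycol mix with
# cp ≈ 3.5 kJ/(kg·K) and an allowed coolant temperature rise of 10 °C.

HEAT_LOAD_KW = 10.0      # thermal design power cited in the article
CP_KJ_PER_KG_K = 3.5     # specific heat of water-glycol mix (assumed)
DELTA_T_K = 10.0         # coolant temperature rise (assumed)

# Energy balance: Q = m_dot * cp * dT  =>  m_dot = Q / (cp * dT)
flow_kg_per_s = HEAT_LOAD_KW / (CP_KJ_PER_KG_K * DELTA_T_K)
print(f"Required coolant flow: {flow_kg_per_s:.3f} kg/s")  # 0.286 kg/s
```

Roughly 0.3 kg/s (about 17 L/min for a water-like density) for a single wafer, which is why both Cerebras and Tesla build the coolant distribution directly into the package rather than relying on air.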
The post Wafer-scale chip claims to offer GPU alternative for AI models appeared first on EDN.