Google Research has open-sourced its Coral NPU IP (previously codenamed Kelvin), which it is giving to the industry in a bid to accelerate edge AI implementations by reducing fragmentation and improving security. Synaptics is the first to implement this NPU in silicon as part of its Astra SL2610 series of IoT device SoCs.
Google has been working on a new approach to enable the edge AI ecosystem to grow, Billy Rutledge, director of edge AI research at Google, told EE Times.
“We’re trying to affect nascent wearable-class SoCs before they become mature to try to avoid the things that we’ve seen in the mobile phone SoC industry,” Rutledge said. “There’s a lot of fragmentation in the world of mobile phones. Can we prevent that from happening in wearables?”
Google had built a mini-TPU ASIC for edge AI, marketed under the Coral brand, as far back as 2017; the recent work on the Coral NPU builds on learnings from that project, though the IP itself is not based on the earlier Coral ASIC’s architecture. Rutledge said the team identified the key roadblocks for edge AI as delivering sufficient performance for sophisticated models within tight power budgets; cumbersome toolchains, not helped by ecosystem fragmentation; and security. Google wants to address all of these challenges at the ecosystem level by offering an easy-to-use, standards-based platform as the next generation of the original Coral.
“This time around, we’re taking an open standards approach, an IP-based approach,” Rutledge said. “Instead of offering a proprietary ASIC from Google, we’re going up one level higher and offering IP for other silicon companies to take for free.”
Today, the Coral NPU is a 4-way superscalar 32-bit RISC-V CPU. A vector engine is under development as part of the Coral NPU platform, and a matrix engine will follow later.
RISC-V, being open, modular and extensible, fit the bill perfectly, Rutledge said.
“We’ve chosen RISC-V because of its rapid adoption and interest in the ecosystem today, but also it’s modular and flexible, easily extendable by others, and there’s no licensing fees or royalties required,” Rutledge said. “It gives us a good basis to build open-source hardware designs and be able to share them broadly with others.”
Google imagines most companies will start with the small, lightweight CPU as a consistent front-end to other execution units on chip. A compiler and software stack that can lower models from any ML framework onto the CPU has also been open-sourced.
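To make that lowering flow concrete, here is a minimal sketch using IREE’s Python bindings, the open-source MLIR-based compiler discussed later in this article. Everything model-specific is an assumption for illustration: the toy MLIR function stands in for a real model imported from an ML framework, and the stock llvm-cpu backend with a generic RISC-V triple stands in for whatever Coral-NPU-specific target a silicon vendor would actually ship.

```python
# A minimal sketch of the compile side of the flow, assuming the
# `iree-compiler` pip package. The toy MLIR function below stands in for
# a real model imported from TFLite/JAX/PyTorch, and the RISC-V triple is
# illustrative only, not an actual Coral NPU target.
import iree.compiler as ireec

TOY_MODEL_MLIR = """
func.func @scale(%arg0: tensor<4xf32>) -> tensor<4xf32> {
  %cst = arith.constant dense<2.0> : tensor<4xf32>
  %0 = arith.mulf %arg0, %cst : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# Ahead-of-time compile to IREE's deployable .vmfb format for a generic
# 32-bit RISC-V CPU (a real build would also pin the ABI and CPU features).
vmfb = ireec.compile_str(
    TOY_MODEL_MLIR,
    target_backends=["llvm-cpu"],
    extra_args=["--iree-llvmcpu-target-triple=riscv32-unknown-elf"],
)
with open("scale_riscv32.vmfb", "wb") as f:
    f.write(vmfb)
```

The same front end accepts models from any supported framework, which is the point Rutledge makes next: the framework-to-NPU pipeline stays common even when the back-end target is specialized.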
“We’re creating this open-standards-based pipeline from the ML frameworks all the way down to the NPU front end and trying to encourage consideration and adoption, and then specialization of that for different industry segments,” Rutledge said.
The move to RISC-V is intended to reduce fragmentation in software stacks, thereby accelerating edge AI deployments, Rutledge said. If Google expects implementations of its NPU to be customized by silicon vendors, how can it ensure compatibility?
“Following the specs and compliance, if we’re able to keep those policies and checks in place, then our tools should work properly,” Rutledge said, noting that math extensions are probably going to be common to most compute-heavy designs, limiting fragmentation.
Synaptics silicon
Synaptics has the first production deployment of the Coral NPU; the company has incorporated it into its Astra series of AI-enabled IoT SoCs, which range from application-processor-class to microcontroller-class parts. The first SKUs to become available are the SL2610 family of five chips, aimed at applications ranging from smart appliances to retail point-of-sale terminals and drones. All parts in the family have dual Arm Cortex-A55 cores and are implemented in 12 nm. Some parts in the SL2610 family include the NPU subsystem; some do not.

“We feel this is going to be a very disruptive product for the spaces that it is positioned for – it’s primarily for IoT,” Nebu Philips, senior director of strategy and business development at Synaptics, told EE Times. “A lot of the IoT silicon out there is repurposed from other large primary markets from the large semi players. [Their] primary market is automotive or industrial and that die will always have some IP that is very specific for their primary markets.”
Synaptics has dubbed its combination of AI hardware and software Torq. The hardware block includes the Coral NPU, implemented as a tightly coupled CPU optimized for Synaptics’ PPA requirements, plus Synaptics’ home-grown T1 AI accelerator, a fixed-function 1-TOPS (INT8) engine for transformer and CNN operations. The Coral NPU will be used to accelerate scalar operations without having to go across the AXI bus to the A55s. On the software side, there is an open-source toolchain that includes Google’s MLIR/IREE compiler and runtime.
While many models and frameworks are becoming standard, there is still a lot of fragmentation on the compiler side, Philips said, noting that many silicon vendors have acquired software toolchain providers for this reason (Qualcomm/Edge Impulse, ST/Cartesiam, Infineon/Imagimob, etc.).
“All these are nicely packaged into pretty good user experiences, but it’s very tightly coupled with their silicon portfolio,” Philips said. “[The ecosystem] is beginning to show some signs of lock-in because the tooling is very closely tied to the silicon. Along with Google, we want to get ahead of that [by] going completely open source.”
Synaptics’ toolchain is based on IREE, an MLIR-based compiler and runtime project that started at Google Research. The companies partnered to support multiple ML frameworks at the front end and Synaptics Torq at the lower levels. (A “vanilla” version will be hosted on the Coral NPU site, though today it’s fundamentally the same as the Synaptics version, since Synaptics has the only implementation of the NPU in silicon, Rutledge said.)
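As a companion sketch of what that “vanilla” end-to-end flow might look like, the snippet below compiles the same kind of toy function and then executes it with IREE’s Python runtime. The portable vmvx backend is an assumption chosen so the module can run in-process on a host machine; on real Torq silicon the compilation target and deployment path would of course differ.

```python
# A hedged end-to-end sketch with IREE's Python bindings (`iree-compiler`
# and `iree-runtime` pip packages). The portable vmvx backend runs on the
# host and stands in for a device-specific target such as Torq.
import numpy as np
import iree.compiler as ireec
import iree.runtime as ireert

TOY_MODEL_MLIR = """
func.func @scale(%arg0: tensor<4xf32>) -> tensor<4xf32> {
  %cst = arith.constant dense<2.0> : tensor<4xf32>
  %0 = arith.mulf %arg0, %cst : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# Compile for the portable backend so the result is runnable in-process.
vmfb = ireec.compile_str(TOY_MODEL_MLIR, target_backends=["vmvx"])

# Load the compiled module and invoke the exported function.
config = ireert.Config("local-task")
ctx = ireert.SystemContext(config=config)
ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, vmfb))

result = ctx.modules.module["scale"](
    np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
)
print(result.to_host())  # -> [2. 4. 6. 8.]
```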
Coral roadmap
Google Research is working with VeriSilicon to productize and support the Coral NPU; VeriSilicon has performed extensive design verification testing on the IP across different process nodes, Rutledge said, as well as building test silicon and ensuring overall quality. VeriSilicon will make the IP available to companies on Google’s behalf (and support it, for a fee), but working with VeriSilicon is optional for companies that don’t need that support, he said.
On Coral’s roadmap are a new version of the vector core, which will support LLM inference, and a matrix engine, once the appropriate RISC-V profile is ratified. CHERI support is also on the roadmap.
As part of Google, the Coral NPU team is working behind the scenes with other parts of the business on tiny model development, Rutledge said.
“We can optimize the Coral NPU further to support new architectures coming from Google, which is something we’re pretty excited about,” he said. “[Google] is the thought leader in this space for new architectures, having created both MobileNet and the transformer. We want to make sure that the Coral NPU is co-designed and optimized for the Google architectures as they come out.”