NXP has extended its ultra-low-power i.MX series line with the RT700. The device incorporates AI processing via the integrated eIQ® Neutron neural processing unit (NPU) and enhanced compute with five cores in total, including two Arm® Cortex®-M33 cores and Cadence® Tensilica® HiFi 1 and HiFi 4 DSP blocks. The chip is designed to maximize time spent in sleep mode for up to a 50% improvement in power efficiency. With over 7.5 MB of SRAM, designers can partition the memory, either dedicating regions to a specific core or sharing them between cores. The large memory means users do not have to prune their NPU model or real-time operating system (RTOS) to fit, easing the design process. The RT700 supports the embedded USB (eUSB) standard, connecting to other USB peripherals at the lower 1.2 V I/O voltage instead of the traditional 3.3 V. Finally, an integrated DC-DC converter allows users to power the onboard peripherals. A block diagram can be seen in Figure 1.
Figure 1 Block diagram of the new i.MX RT700 crossover MCU with an upgrade in the number of cores, amount of memory, advanced peripherals, as well as a new NPU. Source: NXP
The crossover MCUs
NXP’s crossover family of MCUs was created to offer the performance of an applications processor, or a higher-end core running at higher frequencies, with the simplicity of an MCU. It is a direct alternative for customers who purchase low-end microprocessors with memory management units (MMUs) to run rich OSes, where external DDR is often necessary, but who would prefer to use an RTOS. Crossover MCUs streamline this task by raising MCU performance and integrating high-speed peripherals such as GPUs. In essence, a microprocessor chassis with an RTOS running on an MCU core as the engine.
Enhanced performance
While the 4-digit category of this crossover lineup concentrates more on performance, running from 500 MHz to 1 GHz, the 3-digit subcategory is specialized for battery-powered, portable applications. The RT500 was optimized for low-power 2D graphics capabilities while the RT600 introduced higher-performance DSP capabilities; the RT700 combines the power efficiency and performance of these two predecessors (Figure 2). With five cores, the RT700’s Cortex-M33 can handle the RTOS work while the two DSPs and the 325-MHz eIQ Neutron NPU run alongside it to accelerate complex, multi-modal AI tasks in hardware.
Figure 2: The i.MX RT700 family combines both existing RT500 and RT600 families, offering even lower power consumption while adding more performance through the increase in cores and other architectural enhancements. Source: NXP
Power optimization
The design revolves around NXP’s energy flex architecture with heterogeneous domain computing, sizing power consumption to the application’s specific compute needs, all optimized for the RT700’s specific process technology. Two power domains, the compute subsystem and the sense subsystem, serve high-speed processing and low-power compute scenarios, respectively.
The RT700 can use as little as 9 µW in sleep mode while retaining more than 5 MB of memory contents, ensuring that the device consumes as little power as possible in a deep-sleep state with a short wakeup time while still preserving information in SRAM. Run-mode power consumption has been reduced to 12 mW from the RT500’s previous 17 mW (Figure 3).
Figure 3: The i.MX RT700 exhibits a 30% improvement in power consumption while in run mode and a 70% improvement in sleep mode.
The aptly named sense subsystem is generally geared toward sensor-hub-type applications that are “always on”. The eIQ NPU further optimizes power consumption by minimizing time spent in run mode and maximizing sleep mode. Figure 4 shows the power consumption when executing a typical ML use case on the Arm Cortex-M33 alone, and the power consumption after the algorithm has been accelerated with the eIQ Neutron NPU with a dynamically adjusting duty cycle.
Figure 4: eIQ Neutron NPU acceleration will maximize the amount of time the device spends in sleep mode, ensuring processing is done as rapidly as possible to switch back into low power sleep modes. Source: NXP
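The effect described above can be approximated with a simple duty-cycle average-power model. A minimal sketch follows; the run- and sleep-mode figures (12 mW and 9 µW) come from the article, but the duty-cycle fractions are hypothetical, chosen purely to illustrate how shortening run-mode time via NPU acceleration drives down average power:

```python
# Illustrative duty-cycle power model. Run/sleep power figures are from
# the article (12 mW run, 9 uW sleep); the duty cycles are hypothetical.

P_RUN_MW = 12.0      # run-mode power, mW (RT700, per article)
P_SLEEP_MW = 0.009   # sleep-mode power, mW (9 uW, per article)

def average_power_mw(run_duty: float) -> float:
    """Average power when the device is in run mode for a fraction
    `run_duty` of each period and sleeps for the remainder."""
    return run_duty * P_RUN_MW + (1.0 - run_duty) * P_SLEEP_MW

# Hypothetical example: CPU-only inference keeps the core awake 20% of
# the time; NPU acceleration cuts the awake fraction to 2%.
p_cpu = average_power_mw(0.20)
p_npu = average_power_mw(0.02)
print(f"CPU-only: {p_cpu:.4f} mW, NPU-accelerated: {p_npu:.4f} mW")
```

Because sleep power is three orders of magnitude below run power, average consumption is dominated almost entirely by the run-mode fraction, which is why offloading inference to the NPU pays off so directly.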
Benchmarks
Benchmarks from the MLPerf Tiny benchmark suite for anomaly detection, keyword spotting, visual wake words, and image classification, run on the Arm Cortex-M33 and the eIQ NPU, can be seen in Figure 5. The contrast is immediate, showing up to 172x acceleration on models running on the NPU.
Figure 5: MLPerf tiny benchmark showing improvements in standard ML models for anomaly detection, keyword spotting, visual wakewords, and image classification. Source: NXP
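A speedup of that magnitude translates almost directly into energy saved per inference, since the device can return to sleep sooner. The back-of-the-envelope sketch below makes the (simplistic) assumption that active power is similar in both cases; the 172x figure is the article's peak speedup, while the CPU latency is an invented placeholder:

```python
# Hypothetical energy-per-inference estimate from a speedup factor.
# Assumes similar active power for CPU-only and NPU-accelerated runs,
# which is a simplification; the 100 ms CPU latency is a placeholder.

P_RUN_MW = 12.0          # run-mode power from the article, mW
CPU_LATENCY_MS = 100.0   # hypothetical CPU-only inference latency
SPEEDUP = 172.0          # peak NPU speedup cited in the article

def energy_uj(power_mw: float, latency_ms: float) -> float:
    """Energy per inference in microjoules (mW x ms = uJ)."""
    return power_mw * latency_ms

e_cpu = energy_uj(P_RUN_MW, CPU_LATENCY_MS)
e_npu = energy_uj(P_RUN_MW, CPU_LATENCY_MS / SPEEDUP)
print(f"CPU: {e_cpu:.1f} uJ/inference, NPU: {e_npu:.2f} uJ/inference")
```

Under these assumptions, energy per inference scales inversely with the speedup, which is the mechanism behind the duty-cycle improvement shown in Figure 4.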
This is a critical enhancement in the RT700 over previous generations, as use cases for smart AI-enabled edge devices are growing exponentially. This can be seen in the increase in worldwide shipments for TinyML, the class of ML optimized to run on less powerful devices, often at the edge. TinyML is a large shift from the conventional view of AI hardware, with beefy datacenter GPUs handling data-intensive deep learning tasks and model training. The rise of edge computing shares the processing burden between the cloud and the edge device, allowing for much lower latencies while also removing the bandwidth burden of constantly communicating data to the cloud and back. This opens up many opportunities; however, it places a higher burden on smart data processing to optimize power management. The RT700 attempts to meet this demand with its integrated NPU while also easing the burden on developers by using common software languages for simplified programmability.
Aalyia Shaukat, associate editor at EDN, has worked in the design publishing industry for nearly a decade. She holds a Bachelor’s degree in electrical engineering, and has published works in major EE journals.
Related Content
- Ultra-wideband tech gets a boost in capabilities
- The AI-centric microcontrollers eye the new era of edge computing
- Industrial IoT SOM pairs edge processing with wireless connectivity
- AI hardware acceleration needs careful requirements planning
- System-on-module enables machine learning at the edge
The post New i.MX MCU serves TinyML applications appeared first on EDN.