Sub-threshold transistors have allowed a 32bit ARM Cortex-M4F microcontroller to run at 35µA/MHz, and sleep at 100nA with the real-time clock (RTC) running.
Announced by University of Michigan spin-out Ambiq Micro, it uses technology invented at the University and developed through collaboration with foundry giant TSMC.
Sub-threshold operation – where low supply and gate voltage means mosfets are either ‘off’ or partially ‘on’, but never fully on – is a known route to low power consumption, as power scales with V², but is also a known route to chips that are hopelessly sensitive to process and temperature variation.
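As a rough illustration of that square-law scaling, the sketch below uses the textbook relation for dynamic switching power, P = C × V² × f. The capacitance, clock and supply values are illustrative assumptions, not Ambiq figures, and leakage is ignored.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Textbook dynamic switching power in watts: P = C * V^2 * f."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


def power_ratio(v_low, v_high):
    """Dynamic power at v_low relative to v_high, same C and f."""
    return (v_low / v_high) ** 2


if __name__ == "__main__":
    # Hypothetical values chosen only to show the scaling.
    switched_capacitance = 1e-9  # 1nF of switched capacitance (assumed)
    clock = 24e6                 # 24MHz clock (assumed)
    for supply in (1.0, 0.5):
        watts = dynamic_power(switched_capacitance, supply, clock)
        print(f"{supply:.1f}V -> {watts * 1e3:.1f}mW")
    # Halving the supply quarters the dynamic power.
    print(power_ratio(0.5, 1.0))
```

On this simple model, halving the supply voltage cuts dynamic power to a quarter, which is the attraction of running below threshold.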
Ambiq has a contrary view: that sub-threshold can be made to work in mass production, and it has put money on it. In 2013 the firm released stand-alone sub-threshold RTC chips that consume only 55nA. “The RTC was a stepping-stone to prove it is real technology because people were in disbelief,” Ambiq v-p of marketing Mike Salas told Electronics Weekly.
Salas points out that Swiss watch companies were using sub-threshold chips years ago, but they only had 5-10 hand-crafted transistors. Ambiq’s mission, and that of the University before, has been to find a way to make reliable sub-threshold chips on a big scale using industry standard processes and CAD tools.
According to Salas, one hurdle on the way to sub-threshold chips was that transistor models provided by foundries did not extend accurately into the sub-threshold region. To push the envelope to lower voltages “it took 5-6 years, a lot of information gathering, a lot of test chips, a partnership with TSMC, and a lot of modelling work”, he said.
From this came not only better models, but digital and analogue circuits specifically for sub-threshold transistors.
How these work remains secret. “The circuits are very dynamic, very adaptive, and compensate for bad effects,” is all Salas would say.
Research has shown sub-threshold operation is not suitable for all parts of a chip.
In the microcontroller, dubbed ‘Apollo’, “there is intelligent partitioning on where to provide sub-threshold and when not”, said Salas. “A couple of places have standard super-threshold transistors, there is a big chunk of near-threshold, and in other areas there is real sub-threshold, down to 0.5V.”
Now that it is working, would the firm license its technology?
“A lot of people have asked, and ‘no’ is the basic answer,” said Salas. “But we do want to license it to ancillary chip makers – very selectively. For example, if radios out there are minimising our value in a solution.”
This broader approach to energy reduction could well be extended to software and compilers which, according to Salas, need to be designed for energy rather than code size or performance.
Apollo is implemented on TSMC 90nm CMOS. This was chosen simply because it was the finest geometry available with embedded flash, said Salas, and was nothing to do with sub-threshold performance or leakage. “If you were implementing purely for sleep power, you would go for 180nm, at the expense of much higher active power,” he said, adding that Apollo doesn’t have a leakage problem, borne out by the 100nA in sleep+RTC figure.
Ambiq has also said, without further explanation, that leakage current of ‘off’ transistors is used to compute in both digital and analogue domains.
Why choose to implement an ARM Cortex-M4F rather than the smaller but less potent M0?
“The power delta between M0 and M4 is so small, and the M4 will execute faster and so shut off more quickly,” said Salas.
The Cortex-M4 is an M3 plus DSP extensions.
Energy Micro, the previous record holder for energy efficient MCUs (and now part of Silicon Labs), had similar arguments for adopting Cortex-M3 rather than M0 for its first product, although it eventually added a parallel M0 family.
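The ‘finish sooner, sleep sooner’ argument can be sketched numerically. Every figure below (currents, cycle counts, clock and supply) is a hypothetical placeholder picked to show the shape of the trade-off, not a measurement of any M0 or M4 part.

```python
def task_energy_uj(active_current_ua, cycles, clock_mhz, supply_v):
    """Energy in microjoules to execute `cycles` clock cycles.

    current (uA) * voltage (V) * time (s) gives microjoules.
    """
    run_time_s = cycles / (clock_mhz * 1e6)
    return active_current_ua * supply_v * run_time_s


if __name__ == "__main__":
    # Hypothetical: the larger core draws 5% more current but needs
    # a third fewer cycles for the same job.
    small_core = task_energy_uj(1000.0, 30_000, clock_mhz=24, supply_v=3.3)
    large_core = task_energy_uj(1050.0, 20_000, clock_mhz=24, supply_v=3.3)
    print(f"small core: {small_core:.2f}uJ, large core: {large_core:.2f}uJ")
```

Under these assumed numbers the larger core uses less energy per task because it returns to sleep sooner. With a smaller cycle saving or a bigger current penalty the result would reverse.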
The F in M4F indicates the CPU has floating point extensions. “This is tremendously valuable for the IoT where sensors are on all the time. Floating point helps with the analytics algorithms,” said Salas. “The other bonus is that customers who use Matlab generally have to take the code and convert to fixed-point for smaller code and lower energy. We save customers the float-to-fixed conversion.”
As well as the core, sub-threshold techniques have been applied to both analogue and digital peripheral domains in Apollo. “Our power floor is just so much lower than competitors,” claimed Salas. And like those competitors, Ambiq has optimised its peripheral architecture for power saving. “We all play the same architectural games, like deep FIFOs to avoid turning on the core,” he said.
Including two on-chip dc-dc buck converters, Ambiq is claiming 840µA at 24MHz (top speed, 35µA/MHz) running a CoreMark from flash at 3.3V. “We didn’t cheat on the headline number. It is a real CoreMark running out of flash. I could quote a much lower figure running out of RAM,” Salas added.
At 3.8V, this improves to 32µA/MHz. Sleep with RAM retained is 130nA at 3.3 or 3.8V, or 100nA at both voltages with no RAM retention.
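The headline figures can be cross-checked with simple arithmetic, and combined with the sleep current to estimate an average draw. In the sketch below, the 1% duty cycle is a hypothetical workload, and applying the 32µA/MHz figure at 24MHz is an assumption. The 768µA result is derived here, not quoted by Ambiq.

```python
def active_current_ua(ua_per_mhz, clock_mhz):
    """Active current in microamps from a uA/MHz figure and a clock speed."""
    return ua_per_mhz * clock_mhz


def average_current_ua(active_ua, sleep_ua, duty_cycle):
    """Time-weighted average current for a given active duty cycle (0..1)."""
    return active_ua * duty_cycle + sleep_ua * (1.0 - duty_cycle)


if __name__ == "__main__":
    at_3v3 = active_current_ua(35, 24)  # matches the quoted 840uA
    at_3v8 = active_current_ua(32, 24)  # derived, assuming 24MHz still applies
    print(at_3v3, at_3v8)

    # Hypothetical workload: awake 1% of the time, otherwise asleep
    # with RAM retained (130nA = 0.13uA).
    print(f"{average_current_ua(at_3v3, 0.13, 0.01):.2f}uA average")
```

At low duty cycles the active figure still dominates the average in this example, which is why both the µA/MHz number and the time spent awake matter.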
Although core voltage is low, external pins work like any other MCU. Said Salas: “It looks and feels like any other microcontroller – the magic is inside the chip, all power and voltage converted.”
Wake from sleep is 10µs with the RC oscillator clock.
Silicon samples are with Ambiq and “power numbers look great,” said Salas.
There are to be four Apollo MCUs, all with the same peripherals, differing only in memory: 64-512kbyte flash and 16-64kbyte RAM.
Peripherals include: 10bit 13channel 1Msample/s ADC, ±2°C temperature sensor, voltage comparator, x8 SPI master, x2 I2C master, SPI/I2C slave, UART, RTC, clock oscillators (LF RC, HF RC, XTAL) and x8 timers.
Operation is over 1.8 to 3.8V and -40 to 85°C.
Package options will be 4.5×4.5mm 64pin BGA with 50 GPIO pins, or 2.4×2.77mm 42pin chip-scale with 27 GPIO.
Volume production is scheduled to commence in the spring.
Steve Bush