
EM•Mark active power – TI CC2340R5

Having shown how small(er) EM programs can in fact execute fast(er) – especially when fetching instructions from zero wait-state SRAM – EM•Mark now reveals the (even greater) impact of program size on the power consumption of resource-constrained MCUs.

Program size matters

Every MCU datasheet that characterizes active power consumption invariably does so by executing either (1) an empty while(1) loop or (2) the legacy CoreMark program.

Expressed in units such as mA or μA / MHz, case (2) will typically measure higher due to the extra power consumed by reading a block of instructions from the Flash when (not if) a CPU cache-miss occurs – a reasonable assumption knowing the size of legacy CoreMark.

We often focus on how the cache improves program execution time – by automatically loading instructions from slow(er) Flash into fast(er) SRAM when needed. But since Flash draws more current than SRAM when active, the cache also helps lower total power consumption.

As we reported previously, placing the entire EM•Mark program in SRAM can improve execution time; and as you'll now learn, this same sort of SRAM-only configuration (which the EM language strongly encourages) can have an even greater impact on active power consumption.

The envelope, please ...

Using the setup described in EM•Mark Results, the following summarizes active power consumption when executing CoreMark versus EM•Mark with different memory configurations:

Program     text + const     Average power     Energy
CoreMark    [ Flash ]        7.77 mW           1.37 mJ
EM•Mark     [ Flash ]        6.84 mW           1.03 mJ
EM•Mark     [ SRAM ]         5.09 mW           0.63 mJ

Active Power – Summary

Armed with a Joulescope JS220 energy analyzer, we've captured traces of power consumption over time – enabling us to visualize overall energy utilization [ E = ∫ P dt ] when executing programs on our MCU:


EM•Mark Power Capture – SRAM
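The relation E = ∫ P dt can be approximated numerically from any sampled power trace. The following is a minimal sketch using the trapezoidal rule; the function names, the sample rate, and the synthetic trace are illustrative assumptions, not part of the Joulescope tooling or of our actual measurement flow.

```python
def energy_mj(power_mw, sample_rate_hz):
    """Approximate E = integral of P dt with the trapezoidal rule.

    power_mw       -- evenly spaced power samples in mW
    sample_rate_hz -- samples per second (assumed; depends on the capture)
    Returns energy in mJ, since mW x s = mJ.
    """
    if len(power_mw) < 2:
        return 0.0
    dt = 1.0 / sample_rate_hz
    total = 0.0
    for p0, p1 in zip(power_mw, power_mw[1:]):
        total += 0.5 * (p0 + p1) * dt
    return total


def average_power_mw(power_mw, sample_rate_hz):
    """Average power in mW over the trace: total energy / elapsed time."""
    if len(power_mw) < 2:
        raise ValueError("need at least two samples")
    duration_s = (len(power_mw) - 1) / sample_rate_hz
    return energy_mj(power_mw, sample_rate_hz) / duration_s


# Synthetic (made-up) trace: a constant 5.0 mW sampled at 1 kHz for 1 s
trace = [5.0] * 1001
print(energy_mj(trace, 1000.0), "mJ")
print(average_power_mw(trace, 1000.0), "mW")
```

The same two quantities, average power and total energy, are what a datasheet and a "live" system respectively care about; they differ only by the elapsed time.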

To align with MCU datasheet conventions, our earlier summary reports the average power in mW consumed during our ten benchmark iterations [ 7.77, 6.84, 5.09 ]; our CoreMark result in fact matches the 2.6 mA [ @ 3V ] found in the TI CC2340R5 datasheet, since 2.6 mA × 3 V ≈ 7.8 mW.

At the same time, we feel that energy measured in mJ [ 1.37, 1.03, 0.63 ] better reflects the dynamics of a "live" system executing an application program; in somewhat simple terms, hardware contributes raw power while software adds the critical dimension of time.

Case in point, consider some factors contributing to the (significantly lower) 0.63 mJ of energy consumed when executing ten iterations of EM•Mark from zero wait-state SRAM:

fetching instructions from SRAM requires less power than Flash

programs will generally execute faster when placed in SRAM

we can actually power-down the Flash and its cache
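To see how lower power and shorter run time combine, the summary figures can be turned back into an implied elapsed time via t = E / P. This is only a rough sketch: it assumes each reported energy is simply average power multiplied by elapsed time over the same measurement window, and the published values are rounded, so the derived times are approximate.

```python
# (label, average power in mW, energy in mJ) copied from the summary figures
results = [
    ("CoreMark, Flash", 7.77, 1.37),
    ("EM•Mark, Flash", 6.84, 1.03),
    ("EM•Mark, SRAM", 5.09, 0.63),
]


def implied_time_ms(power_mw, energy_mj):
    """t = E / P; mJ / mW yields seconds, scaled here to milliseconds."""
    return energy_mj / power_mw * 1000.0


for label, p_mw, e_mj in results:
    t_ms = implied_time_ms(p_mw, e_mj)
    print(f"{label:16s} {p_mw:5.2f} mW  {e_mj:5.2f} mJ  ~{t_ms:4.0f} ms")

# Compare EM•Mark in SRAM against CoreMark in Flash
base_p, base_e = results[0][1], results[0][2]
sram_p, sram_e = results[2][1], results[2][2]
print(f"power ratio  {sram_p / base_p:.2f}")
print(f"energy ratio {sram_e / base_e:.2f}")
```

Under these assumptions the implied times come out at about 176 ms, 151 ms, and 124 ms; the energy ratio (about 0.46) is noticeably smaller than the power ratio (about 0.66) because the shorter run time compounds with the lower power draw.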

Faster clock, more power, but less energy ???

Some MCUs (but not the TI CC2340R5) allow software to change the frequency of the master clock, selecting among a set of discrete chip-specific rates ranging from (say) 4 MHz to ≥ 100 MHz. The MCU datasheet will then specify a corresponding range of active power values in mA or μA / MHz.

At first glance, the (linear) relationship between clock frequency and current draw suggests that faster execution requires more power – a reasonable trade-off. But the MCU datasheet only presents an "instantaneous" perspective that lacks the dimension of time, masking opportunities to reduce overall energy consumption.

Looking deeper at MCU power consumption, we can distinguish between leakage (static) and switching (dynamic) current – with the latter proportionally tracking clock rate. But once we factor in the fixed amount of static current consumed by on-chip memory / peripherals, a faster clock can in fact lower energy consumption: the program finishes sooner, so the static current is drawn for less time.
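A simple first-order model makes this concrete. The sketch below assumes active power is a fixed static term plus a dynamic term proportional to clock frequency, that supply voltage stays constant, and that the cycle count does not change with frequency (plausible only with zero wait-state memory). All numbers are hypothetical and do not describe any particular MCU.

```python
def active_energy_mj(cycles, f_mhz, static_mw, dynamic_mw_per_mhz):
    """Energy in mJ needed to execute `cycles` CPU cycles at `f_mhz`.

    power (mW) = static_mw + dynamic_mw_per_mhz * f_mhz
    time  (s)  = cycles / (f_mhz * 1e6)
    energy     = power * time  (mW x s = mJ)
    """
    power_mw = static_mw + dynamic_mw_per_mhz * f_mhz
    time_s = cycles / (f_mhz * 1e6)
    return power_mw * time_s


# Hypothetical workload and chip parameters, for illustration only
CYCLES = 1_000_000
STATIC_MW = 1.0            # leakage plus always-on memory / peripherals
DYNAMIC_MW_PER_MHZ = 0.1   # switching power per MHz of clock

for f_mhz in (4, 48, 100):
    power_mw = STATIC_MW + DYNAMIC_MW_PER_MHZ * f_mhz
    energy = active_energy_mj(CYCLES, f_mhz, STATIC_MW, DYNAMIC_MW_PER_MHZ)
    print(f"{f_mhz:3d} MHz  {power_mw:5.1f} mW  {energy:.3f} mJ")
```

In this model the dynamic contribution to energy is the same at every frequency (dynamic_mw_per_mhz × cycles), while the static contribution shrinks as 1 / f; power rises with clock rate but energy per task falls.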

As silicon processes advance towards smaller transistors in higher densities, leakage current becomes an even larger factor in our energy calculus – especially for embedded applications with low active duty-cycles that spend most of their time in "deep-sleep" modes.

Finally, a special tip-of-the-hat to Joe Circello for our multiple rounds of discussions in the summer of 2022 on silicon process as well as hardware architecture, and how EM might influence future MCU designs.

These results further confirm some hypotheses put forth in Tiny code → Tiny chips, and suggest a very different SRAM-centric MCU architecture with a faster clock, low-cost Flash, and no cache – yielding a smaller chip with fewer gates which consumes less power.

All possible because EM can dramatically reduce program size ::  ↓ KB  ⇒ ↓ mJ

How can you get involved

review this earlier post, which talks about MCU power specs in the context of total energy usage

dive into the CoreMark Reimagined and EM•Mark Results articles for more technical details

share your experiences in measuring power / energy, as well as your thoughts on ULP MCU design

Happy coding !!!   🌝   💻