AMD Details Renoir: The Ryzen Mobile 4000 Series 7nm APU Uncovered
by Dr. Ian Cutress on March 16, 2020 11:00 AM EST
Power and Battery Life
Earlier in the year AMD was keen to promote that in Renoir it has made significant advances as to how power is managed across the APU, leading to increased performance and better battery life. The two key figures here were ‘20% reduced SoC power’ and ‘5x reduction in power gating latency’ (also known as an 80% reduction, because you can’t have a 5x reduction of a time). We now have some details.
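The equivalence between the two ways of stating the latency claim is simple arithmetic. As a minimal sketch (the function name is purely illustrative and not anything from AMD):

```python
def fractional_reduction(factor: float) -> float:
    """Convert an 'N-times lower' claim into a fractional reduction.

    If the new latency is old / N, the reduction is 1 - 1/N.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 - 1.0 / factor


# A '5x reduction' means the new latency is one fifth of the old one.
print(f"{fractional_reduction(5):.0%}")  # prints 80%
```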
First up it should be mentioned that 7nm helps a lot here. The smaller process node, with smaller transistors (assuming they’ve been laid out correctly), will require a lower voltage. That lower voltage directly translates into lower power, and we’ve seen how well AMD has pushed the 7nm designs on the desktop and in the enterprise space to know that compared to previous process nodes, there is a lot of power to save here. That being said, the design choices and features matter too.
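As a rough guide to why voltage matters so much, the textbook first-order approximation for dynamic (switching) power in CMOS logic is given below. This is a general approximation rather than a figure from AMD, and it ignores leakage power.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V^{2} f
```

Here $\alpha$ is the activity factor, $C$ the switched capacitance, $V$ the supply voltage and $f$ the clock frequency. Because the voltage term is squared, a 10% drop in voltage at the same frequency cuts dynamic power by roughly 19% under this model ($0.9^{2} = 0.81$).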
AMD’s power management all goes through a system-level management controller. For this generation, AMD has re-written the firmware with speed in mind (they claim 33% faster), but also made other improvements, such as aggressive clock gating of the L3 cache when not needed, and using power optimized circuits for IO features such as for the embedded display controller and PCIe physical layers.
The updated system management controller (SMC) is built around user preference. If the user tells the OS they want more performance, or more battery life, the SMC can take into consideration everything involved in the system and plan accordingly. If the OS can provide guidance as to an upcoming workload, the SMC is built to understand it, so that voltages and frequencies can be adjusted or unused parts of the chip put into idle.
Ultimately there are many sensors around the APU, monitoring activity and the type of activity going on in that particular region, even down to the types of instructions being used. The SoC is a lot more dynamic in its clock control, allowing the different clock domains in various parts of the SoC to be adjusted depending on the activity of the region as well as the thermal limits, system limits, and other items that might affect performance. This is especially useful for powering down parts of the SoC that are not in use, which underpins AMD’s efficiency claims, and for performance guarantees such as maintaining a specific bandwidth across an interconnect (quality of service). The thresholds for these activity monitors can be set by the OS and by the user. The SMC also takes into account the power source (battery vs power supply) and connected hardware (displays, power over USB).
For the power gating latency, AMD has doubled the width of the save and restore bus between the buffers and the cores, allowing a system to resume faster from a CPUOFF state. In addition, AMD is using the ACPI 6.3 specification to offer multiple sets of C states to the OS.
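To see why the bus width matters, consider a simplified model in which resume time is dominated by copying saved core state back over the save/restore bus. The state size, bus widths and clock below are hypothetical placeholders, not AMD figures. Note that doubling the width only halves this transfer component, so in this model it would be one contributor to the overall latency claim rather than the whole of it.

```python
def transfer_time_ns(state_bytes: int, bus_width_bytes: int, clock_mhz: float) -> float:
    """Time to move state_bytes over a bus carrying bus_width_bytes per clock."""
    cycles = -(-state_bytes // bus_width_bytes)  # ceiling division
    return cycles * 1_000.0 / clock_mhz          # one cycle lasts 1000/clock_mhz ns


STATE_BYTES = 64 * 1024  # hypothetical amount of saved core state
CLOCK_MHZ = 400.0        # hypothetical bus clock

narrow = transfer_time_ns(STATE_BYTES, 16, CLOCK_MHZ)  # hypothetical old width
wide = transfer_time_ns(STATE_BYTES, 32, CLOCK_MHZ)    # doubled width
print(narrow, wide, narrow / wide)
```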
One of the issues of the previous generation of Picasso APUs, on the left, is that there was only a single set of states that the processor could be in. This means that at any time, the CPU could fall from a power state (a P state) into a lower power state, or an idle state, or an off state. If the CPU went too far down this stack, while it would be saving power, each hop down the rabbit hole meant a longer time to get back out of it, hurting performance and latency but also requiring more power changes at the silicon level. Each hop in its own right requires additional power.
With the new Renoir designs, a system can take advantage of multiple different sets of states. This means that the CPU can’t drop too low while the system is in use: the OS or system controller can’t put parts of the core into the deepest low power states because those states are not available. Even if the system goes into the lowest power mode possible while it is still being used, there are fewer jumps to get back up to high speed.
As the system becomes less used, known as ‘increased idle duration’, then the system has access to sets of states that allow the parts of the APU to enter deeper idle states. This means that the system can only enter a low frequency domain if that part of the core has been sufficiently idle, or user interaction has willed it.
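The trade-off described above (deeper states save more power but cost more to leave) is the standard problem an idle governor solves. The sketch below picks the deepest state whose minimum worthwhile residency fits the predicted idle time and whose exit latency fits a latency budget. The state names, latencies and residencies are invented for illustration and do not describe Renoir's actual states or the ACPI 6.3 tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    exit_latency_us: int   # time needed to get back to running
    min_residency_us: int  # idle time needed before the state pays off


# Hypothetical states, ordered from shallow to deep.
STATES = [
    IdleState("C1", 2, 2),
    IdleState("C2", 50, 200),
    IdleState("C6", 400, 2000),
]


def pick_state(predicted_idle_us: int, latency_budget_us: int) -> IdleState:
    """Return the deepest state that fits both the idle prediction and the budget."""
    chosen = STATES[0]  # the shallowest state is always allowed
    for state in STATES:
        if (state.min_residency_us <= predicted_idle_us
                and state.exit_latency_us <= latency_budget_us):
            chosen = state
    return chosen


# System in active use: short idle gaps and a tight latency budget.
print(pick_state(predicted_idle_us=300, latency_budget_us=100).name)
# Increased idle duration: deeper states become eligible.
print(pick_state(predicted_idle_us=5000, latency_budget_us=1000).name)
```

With these made-up numbers, the deepest state is simply never eligible while the system is busy, which mirrors the idea of restricting the set of available states when the machine is in use.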
This is all part of the ACPI 6.3 standard, and AMD states that this combined with the reduced SoC power gives both better battery life and better immediate performance for the user. To show this in action, AMD pinpointed a common activity that most users might be familiar with: opening applications.
In this case, AMD took the start of the PCMark 10 Application Loading benchmark. In this benchmark a number of applications are loaded, and the requirements are often more IO driven than CPU driven. A good CPU with a fast reaction time will keep its power and frequency low while the IO requests are being done, and speed up one or two threads when the CPU needs to get involved.
In AMD’s benchmark, where they are using frequency as a proxy for power, they show that in the initial 5 seconds of the test, the new Ryzen 4000 CPU is hovering at an idle frequency, whereas the older Ryzen 3000 CPU is fluttering around, even peaking near 4.0 GHz when it doesn’t need to. This allows parts of the new CPU to be powered down for longer periods of time, even when the system is actually in use.
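A comparison of this kind can be expressed as a simple residency calculation over a frequency trace: what fraction of samples sit at or below an idle threshold, and what the peak is. The traces and the 1.5 GHz threshold below are made up for illustration and are not AMD's data; frequency is also only a proxy, so this says nothing precise about actual power draw.

```python
def idle_residency(freqs_ghz, idle_threshold_ghz=1.5):
    """Fraction of samples at or below the idle threshold."""
    if not freqs_ghz:
        raise ValueError("empty trace")
    idle = sum(1 for f in freqs_ghz if f <= idle_threshold_ghz)
    return idle / len(freqs_ghz)


# Hypothetical, evenly spaced frequency samples in GHz.
older = [1.4, 3.9, 2.2, 1.4, 3.6, 1.4, 2.8, 1.4]
newer = [1.4, 1.4, 1.4, 1.4, 2.9, 1.4, 1.4, 1.4]

print(idle_residency(older), max(older))
print(idle_residency(newer), max(newer))
```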
When I asked AMD’s executives where they stand on battery life, one of them hinted that the difference between themselves and the competition (in similar designs) should be on the order of minutes rather than dozens of minutes. Specifically AMD sees itself better than the competition in productivity/web browsing workloads, graphics workloads, and video playback, and cited that most battery benchmarks don’t often take into account a good mix of ‘the average user’. A number of the media responded that often our benchmarks are geared towards different types of users commensurate with our audience, such as gamers or content creators. Ultimately we will see what the results are when we have hardware on hand.
Spunjji - Tuesday, March 17, 2020
Oof. Thanks for sharing that, but I feel like I know more than I ever wanted to about CODECs... and still not enough :'D
ikjadoon - Tuesday, March 17, 2020
Codecs live or die by industry & content support. I don't see *anyone* major taking up H.266, much less EVC or LCEVC.
The transition to AV1 is not merely technical, but also political. MPEG Group has sowed their own demise.
Two of H.265's patent pools backed away, scared of AV1's enthusiasm. I don't see Sisvel's snivelling going far, even with the latest batch of patent "disputes". How is their VP9 patent pool going?...
VVC's bit-freeze is when...2021? It'll take years for decoding hardware and *nobody* is eager to run back to MPEG.
A good read: https://www.streamingmedia.com/Articles/ReadArticl...
March 2020 update: https://www.streamingmedia.com/Articles/News/Onlin...
MPEG, Sisvel, etc. missed the boat about a half decade ago.
name99 - Thursday, March 19, 2020
Firstly, don't confuse the world as you wish it to be with the world as it is.
Apple is all in on h.265, and ATSC3, the next US digital TV spec, already in preliminary usage in some markets, is based on h.265. When you stream netflix at 4K UHD, you're probably using h.265.
Secondly, be aware of the schedules.
For h.265, the first version of the spec was 2013. By Sept 2015 Apple released the A9 with 265 decode, and a year later the A10 with 265 encode.
BUT these were only exposed to the public in 2017 with iOS11. The point of the delay, presumably, was to ensure a large enough critical mass of users already at the point of announcement.
This suggests that Apple will be planning a similar transition of their entire line to VVC, but it will take time -- maybe a decoder chip in 2023, an encoder in 2024, a public commitment to "the new Apple codec is VVC across all devices" in 2025.
The relevance of Apple is that Apple is something of the leader in this space. Sure, sure, there's a tiny world of open source renegades talking about Dirac and Opus and Matroska and suchlike. But in the real world of serious money, Apple generally represents the vanguard.
So these future codecs are interesting, yes. But they're also only going to be mainstream relevant (if you're with Apple) maybe late 2025, for everyone else maybe a year or so later?
name99 - Thursday, March 19, 2020
Oh I realize I left out one point.
We saw exactly the same stories back when h.265 became relevant. Here's a sample thread:
Complain all you like about the existence of patent pools, but they are just not that big a deal to the big boys, especially hardware. So Apple, or your Samsung/LG/Sony TV has to pay 1, or 5 dollars per device. That's perfectly feasible in return for a spec that is both very good, and more or less legal certainty.
There's just no reason to believe the driving forces in 2025 are any different from those in 2015.
grant3 - Tuesday, April 14, 2020
"they are just not that big a deal to the big boys, especially hardware"
And yet it's "big boys" like intel, cisco, google, samsung, nvidia etc. and oh ya, APPLE, who are the founding members of the royalty-free Open Media alliance i.e. the developers of AV1.
Hifihedgehog - Monday, March 16, 2020
How much later is later, guesstimate-wise? :)
Kevin G - Monday, March 16, 2020
I was expecting B550 around Computex (which is still a go last I checked). However with the recent outbreaks, I wonder how much has been pushed back or will simply be a preview then. Supply chains are disrupted at various points. I'm just curious what we know has been pushed back, what is still on schedule and what is likely to be affected that hasn't yet. It'd make for an interesting piece. :)
AshlayW - Monday, March 16, 2020
Absolutely hyped to pick up a 4400G(?) for my Slim system. Currently has a 3400G but we are edging closer to me being able to play all the games I like at 1080p, on a single chip system.
peevee - Monday, March 23, 2020
On a desktop APU it would make sense to separate CPU and GPU dies, there is plenty of space and yields would be higher. Of course the GPU side should be Navi and socket should be upgraded to 4 memory channels to allow a truly decent integrated GPU similar to 5500.