AMD Details Renoir: The Ryzen Mobile 4000 Series 7nm APU Uncovered
by Dr. Ian Cutress on March 16, 2020 11:00 AM EST
AMD’s latest Ryzen mobile product is the first design the company has done that combines CPU, GPU, and IO all on a monolithic die in TSMC’s 7nm process.
The CPU part of the design is very similar to what we’ve seen on the desktop: two quad core groups each with their own L3 cache shared between the cores. Compared to the desktop design, the mobile design is listed as being ‘optimized for mobile’, primarily through a smaller L3 cache – only 4 MB per quad-core group, rather than the 16 MB per quad-core group we see on the desktop. While the smaller L3 cache might mean more trips out to main memory to get data, overall AMD sees it as saving both power and die area, with this level of cache being the right balance for a power limited chip.
Compared to the previous generation of Zen mobile processors, the CPU side of this generation comes with a 15% per-core iso-frequency improvement, which comes down to the improvements at the heart of each core. We’ve covered these in detail in our desktop analysis. For the mobile platform, however, there is not only a raw performance uplift but also a frequency uplift, moving from 4.0 GHz in the prior gen up to 4.3 GHz here. AMD says that actual workload performance gets a significant uplift due to the new power features we’ll discuss in due course.
On the GPU side is where we see bigger changes. AMD does two significant things here – it has reduced the maximum number of graphics compute units from 11 to 8, but also claims a +59% improvement in graphics performance per compute unit despite using the same Vega graphics architecture as in the prior generation. Overall, AMD says, this affords a peak compute throughput of 1.79 TFLOPS (FP32), up from 1.41 TFLOPS (FP32) on the previous generation, or a +27% increase overall.
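The quoted peak figure for the new part can be reproduced with a little arithmetic. The sketch below assumes 64 FP32 lanes per compute unit and one fused multiply-add (2 FLOPs) per lane per clock, which is the usual arrangement for Vega-class compute units but is not stated in AMD's material here, and it uses the 1750 MHz peak graphics clock AMD quotes for Renoir. The function name and its parameters are purely illustrative.

```python
def peak_fp32_tflops(compute_units, clock_mhz, lanes_per_cu=64, flops_per_lane=2):
    """Theoretical peak FP32 throughput in TFLOPS.

    lanes_per_cu and flops_per_lane are assumptions (64 lanes, one FMA
    per lane per clock), not figures taken from AMD's slides.
    """
    flops_per_second = compute_units * lanes_per_cu * flops_per_lane * clock_mhz * 1e6
    return flops_per_second / 1e12


# Renoir: 8 compute units at a 1750 MHz peak graphics clock
renoir_tflops = peak_fp32_tflops(8, 1750)
print(f"Renoir peak: {renoir_tflops:.2f} TFLOPS")  # Renoir peak: 1.79 TFLOPS

# Generation-over-generation uplift, using the two figures AMD quotes
quoted_uplift = 1.79 / 1.41 - 1
print(f"Quoted uplift: {quoted_uplift:+.0%}")  # Quoted uplift: +27%
```

Note that this only covers theoretical peak throughput. AMD's +59% per-compute-unit claim is a performance figure and cannot be derived from this formula alone.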
AMD manages to improve the raw performance per compute unit through a number of changes to the design of the APU. Some of this is down to using 7nm and some is down to design decisions, but it also requires a lot of work on the physical implementation side.
For example, the 25% higher peak graphics frequency (up from 1400 MHz to 1750 MHz) comes down a lot to physical implementation of the compute units. Part of the performance uplift is also due to memory bandwidth – the new Renoir design can support LPDDR4X-4266 at 68.3 GB/s, compared to DDR4-2400 at 38.4 GB/s. Most GPU designs need more memory bandwidth, especially APUs, so this will help drastically on that front.
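As a rough check on those numbers, peak memory bandwidth is simply the transfer rate multiplied by the bus width. The sketch below assumes a 128-bit total memory interface in both generations, which is not stated above but is the width that makes the quoted figures work out. The function name is illustrative only.

```python
def peak_bandwidth_gbs(mega_transfers_per_s, bus_width_bits=128):
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits=128 is an assumption (total interface width), chosen
    because it reproduces the bandwidth figures quoted in the article.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


lpddr4x_gbs = peak_bandwidth_gbs(4266)  # LPDDR4X-4266 on Renoir
ddr4_gbs = peak_bandwidth_gbs(2400)     # DDR4-2400 on the prior generation
print(f"LPDDR4X-4266: {lpddr4x_gbs:.1f} GB/s")  # LPDDR4X-4266: 68.3 GB/s
print(f"DDR4-2400:    {ddr4_gbs:.1f} GB/s")     # DDR4-2400:    38.4 GB/s

# Peak graphics clock uplift quoted in the article
clock_uplift = 1750 / 1400 - 1
print(f"Graphics clock uplift: {clock_uplift:+.0%}")  # Graphics clock uplift: +25%
```

On these assumptions, the peak memory bandwidth rises by roughly 78%, a noticeably larger jump than the 25% increase in graphics clock.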
There are also improvements in the data fabric. For GPUs, the data fabric is twice as wide, allowing for less overhead when bulk transferring data into the compute units. This technically increases idle power a little bit compared to the previous design, however the move to 7nm easily absorbs that. With less power overhead for bulk data transfers, more power is available to the GPU cores, which in turn means they can run at a higher frequency.
Coming to the Infinity Fabric, AMD has made significant power improvements here. One of the main ones is decoupling the frequency of Infinity Fabric from the frequency of the memory – AMD was able to do this because of the monolithic design, whereas in the chiplet design of the desktop processors, the link between the two values has to be in place, otherwise more die area would be needed to traverse the variable clock rates. This is also primarily the reason we’re not seeing chiplet based APUs at this time. However, the decoupling means that the IF can idle at a much lower frequency, saving power, or adjust to a relevant frequency to balance power and performance when under load.
Again we see the double bus width from the graphics to the engine pop up here, giving a better power-per-bit metric. But one of the key aspects from this graph is showing that the power consumed by the fabric in the new processors is very even across a wide bandwidth range compared to the older processor, where the voltages likely had to be stepped up as bandwidth increased, introducing additional latency factors for performance. Luckily Renoir does away with this, and AMD is claiming 75% better fabric efficiency compared to the previous generation.
Orthogonal to the raw improvements, AMD has also improved the media capabilities, with a new HDR/WCG encode engine for HEVC, which according to AMD should give a 31% encoding speedup when used.
uibo - Monday, March 16, 2020
HTPC market insignificant
Samus - Tuesday, March 17, 2020
I mean realistically there isn't anything a 10w Atom can't decode anymore...everything is overkill for HTPC.
As far as encoding, for what the general consumer does (twitch, etc) any midrange CPU can handle that in the background on top of any other tasks you demand. It won't be a 10-15w part, but certainly a 35w part.
R3MF - Tuesday, March 17, 2020
even AV1?
bearing in mind that a new htpc has a ~6 year life and AV1 is the future of streaming video.
close - Tuesday, March 17, 2020
I have an X5-Z8350 Atom tablet at home, I will give it a run with an AV1 encoded full HD Youtube stream and see if it handles it reasonably. I would assume that a box with more adequate cooling would do even better.
PeachNCream - Tuesday, March 17, 2020
Have to agree with this. HTPCs had a very brief glimmer of market presence a few years ago, but they never really took off or made a substantial enough splash. The population at large has little interest in adding the relative complexity of a computer to their media viewing experience and most home users are purchasing laptops, not even desktops, which are even less well-suited to acting as a fixed system attached to a large display panel. If AMD does grab that market, it will not be a measurable number of sales to say the least.
Spunjji - Tuesday, March 17, 2020
I love my HTPC and am excited to rebuild it around Renoir, and I fully endorse the sentiment of this post. Most people get by with a Fire stick or the built-in "smart" features of your average modern TV.
PeachNCream - Tuesday, March 17, 2020
If I had time and was more interested in consuming video content, I would probably dive into building a HTPC as well, but it would be to appeal mainly to a desire to tinker. From a practical standpoint, I would be hard-pressed to find a credible amount of work for a computer dedicated to that task because watching videos isn't something I do when I'm not on an exercise bike and my phone is good enough for that chore.
stephenbrooks - Tuesday, March 17, 2020
Laptops make pretty good "HTPCs"... I plugged mine into a projector and sound system just today in fact
DanNeely - Monday, March 16, 2020
For power efficiency media en/decoding is normally done with fixed function hardware; doing it in software on the GPU's general purpose cores eats power like crazy. AV1 not being present means Renoir doesn't have a fixed function block - whether due to not being done yet, taking too much die area, or something else - but not being here means you're going to have to wait until the 5000 series APUs to get support in an AMD CPU.
Santoval - Tuesday, March 17, 2020
Bear in mind that this year will see the release of no less than *three* new video codecs. MPEG plan to release H.266/VVC (Versatile Video Coding), EVC (i.e. Essential Video Coding) or MPEG-5 Part-1 and LCEVC (i.e. Low Complexity Enhancement Video Coding) or MPEG-5 Part-2. Each codec is targeted at a different market. For instance H.266/VVC is the direct successor of H.265/HEVC, while EVC is partly targeted against AV1 (its baseline profile, which is ~30% more bitrate efficient than H.264, will be royalty free).
LCEVC is not so much a new codec but a new technique to combine two layers of any two codecs at any resolution in a "hybrid" (stacked) way, in order to reduce computational complexity. Which works apparently. I place a link at the end of the comment which explains how that works. In other words the codec market of the next couple of years is going to be quite a bit more loaded and competitive than simply choosing between H.265, VP9 and AV1. This is something chip manufacturers will almost certainly take into account.
By the way, it is not yet fully clear if AV1 is going to be royalty free. Sisvel launched a patent pool for AV1 last year. Whether it has merit or not remains to be seen. However, patent confusion is worse than paying royalties for patents. If chip manufacturers have plans to add decoding and encoding support for VVC and EVC, for instance, they have already accounted for the costs. But if they add AV1 support thinking it was patent free and then Sisvel goes to court to sue, that would be a very unpleasant and unexpected surprise. Sisvel's patent claims are going to stall AV1 support unless they are resolved.