AMD Details Renoir: The Ryzen Mobile 4000 Series 7nm APU Uncovered
by Dr. Ian Cutress on March 16, 2020 11:00 AM EST
AMD’s latest Ryzen mobile product is the first design the company has done that combines CPU, GPU, and IO all on a monolithic die in TSMC’s 7nm process.
The CPU part of the design is very similar to what we’ve seen on the desktop: two quad-core groups, each with its own L3 cache shared between the cores. Compared to the desktop design, the mobile part is listed as being ‘optimized for mobile’, primarily through a smaller L3 cache: only 4 MB per quad-core group, rather than the 16 MB per quad-core group we see on the desktop. While the smaller L3 cache might mean more trips out to main memory to get data, overall AMD sees it as saving both power and die area, with this level of cache being the right balance for a power-limited chip.
Compared to the previous generation of Zen mobile processors, the CPU side of this generation comes with a 15% per-core iso-frequency improvement, which comes down to the improvements at the heart of each core. We’ve covered these in detail in our desktop analysis. For the mobile platform, however, there is not only a raw performance uplift but a frequency uplift as well, moving from 4.0 GHz in the prior gen up to 4.3 GHz here. AMD says actual workload performance also gets a significant uplift from the new power features, which we’ll discuss in due course.
On the GPU side is where we see bigger changes. AMD does two significant things here – it has reduced the maximum number of graphics compute units from 11 to 8, but also claims a +59% improvement in graphics performance per compute unit despite using the same Vega graphics architecture as in the prior generation. Overall, AMD says, this affords a peak compute throughput of 1.79 TFLOPS (FP32), up from 1.41 TFLOPS (FP32) on the previous generation, or a +27% increase overall.
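As a rough check on these figures, the peak FP32 number can be reproduced from the compute unit count and the clock. The sketch below is a back-of-the-envelope calculation, not AMD's method: it assumes 64 shader lanes per Vega compute unit and two FLOPs per lane per clock (one fused multiply-add), and it uses the 1750 MHz peak graphics clock quoted for this generation. The function names are illustrative.

```python
# Back-of-the-envelope peak FP32 throughput for a GCN/Vega-style GPU.
# Assumptions (not stated in the article): 64 shader lanes per compute unit
# and 2 FLOPs per lane per clock (one fused multiply-add).
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2


def peak_fp32_tflops(compute_units: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS."""
    flops_per_second = (
        compute_units * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK * clock_mhz * 1e6
    )
    return flops_per_second / 1e12


def uplift_percent(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    renoir = peak_fp32_tflops(compute_units=8, clock_mhz=1750)
    print(f"Renoir peak: {renoir:.2f} TFLOPS")
    print(f"Uplift over 1.41 TFLOPS: {uplift_percent(1.41, 1.79):.0f}%")
```

Under those assumptions, 8 compute units at 1750 MHz work out to about 1.79 TFLOPS, and the step from 1.41 to 1.79 TFLOPS is about 27%, in line with the figures AMD quotes.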
AMD manages to improve the raw performance per compute unit through a number of changes to the design of the APU. Some of this is down to using 7nm and some is down to design decisions, but it also requires a lot of work on the physical implementation side.
For example, the 25% higher peak graphics frequency (up from 1400 MHz to 1750 MHz) comes down a lot to physical implementation of the compute units. Part of the performance uplift is also due to memory bandwidth – the new Renoir design can support LPDDR4X-4266 at 68.3 GB/s, compared to DDR4-2400 at 38.4 GB/s. Most GPU designs need more memory bandwidth, especially APUs, so this will help drastically on that front.
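The bandwidth and clock figures here follow from simple arithmetic. The sketch below assumes a 128-bit total memory bus for both configurations (the article does not state the bus width, so treat it as an assumption) and computes bandwidth as transfer rate times bytes per transfer. The helper names are illustrative.

```python
# Theoretical peak memory bandwidth and simple percentage uplifts.
# Assumption (not stated in the article): a 128-bit total memory bus
# for both the LPDDR4X-4266 and DDR4-2400 configurations.
ASSUMED_BUS_WIDTH_BITS = 128


def bandwidth_gb_s(transfer_rate_mt_s: float,
                   bus_width_bits: int = ASSUMED_BUS_WIDTH_BITS) -> float:
    """Peak bandwidth in GB/s from transfer rate (MT/s) and bus width (bits)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    lpddr4x = bandwidth_gb_s(4266)
    ddr4 = bandwidth_gb_s(2400)
    print(f"LPDDR4X-4266: {lpddr4x:.1f} GB/s")
    print(f"DDR4-2400:    {ddr4:.1f} GB/s")
    print(f"Bandwidth uplift: {percent_increase(ddr4, lpddr4x):.0f}%")
    print(f"GPU clock uplift: {percent_increase(1400, 1750):.0f}%")
```

With a 128-bit bus this reproduces the 68.3 GB/s and 38.4 GB/s figures, which is an increase of roughly 78% in peak bandwidth alongside the 25% graphics clock increase.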
There are also improvements in the data fabric. For the GPU, the data fabric is twice as wide, allowing for less overhead when bulk transferring data into the compute units. This technically increases idle power a little compared to the previous design, but the move to 7nm easily absorbs that. With less power spent on bulk data transfer, more power is available to the GPU cores, which in turn means they can run at a higher frequency.
Coming to the Infinity Fabric, AMD has made significant power improvements here. One of the main ones is decoupling the frequency of the Infinity Fabric from the frequency of the memory. AMD was able to do this because of the monolithic design, whereas in the chiplet design of the desktop processors the two values have to stay tied together, otherwise more die area would be needed to traverse the variable clock rates. This is also primarily the reason we’re not seeing chiplet-based APUs at this time. The decoupling means that the IF can idle at a much lower frequency, saving power, or adjust to a relevant frequency to balance power and performance when under load.
Again we see the doubled bus width between the graphics and the fabric pop up here, giving a better power-per-bit metric. But one of the key aspects of this graph is that the power consumed by the fabric in the new processors is very even across a wide bandwidth range compared to the older processor, where the voltages likely had to be stepped up as bandwidth increased, introducing additional latency factors for performance. Luckily Renoir does away with this, and AMD is claiming a 75% better fabric efficiency compared to the previous generation.
Orthogonal to the raw improvements, AMD has also improved the media capabilities, with a new HDR/WCG encode engine for HEVC, which according to AMD should give a 31% encoding speedup when used.
Spunjji - Tuesday, March 17, 2020
Oof. Thanks for sharing that, but I feel like I know more than I ever wanted to about CODECs... and still not enough :'D
ikjadoon - Tuesday, March 17, 2020
Codecs live or die by industry & content support. I don't see *anyone* major taking up H.266, much less EVC or LCEVC.
The transition to AV1 is not merely technical, but also political. MPEG Group has sowed their own demise.
Two of H.265's patent pools backed away, scared of AV1's enthusiasm. I don't see Sisvel's snivelling going far, even with the latest batch of patent "disputes". How is their VP9 patent pool going?...
VVC's bit-freeze is when...2021? It'll take years for decoding hardware and *nobody* is eager to run back to MPEG.
A good read: https://www.streamingmedia.com/Articles/ReadArticl...
March 2020 update: https://www.streamingmedia.com/Articles/News/Onlin...
MPEG, Sisvel, etc. missed the boat about a half decade ago.
name99 - Thursday, March 19, 2020
Firstly, don't confuse the world as you wish it to be with the world as it is.
Apple is all in on h.265, and ATSC3, the next US digital TV spec, already in preliminary usage in some markets, is based on h.265. When you stream Netflix at 4K UHD, you're probably using h.265.
Secondly, be aware of the schedules.
For h.265, the first version of the spec was 2013. By Sept 2015 Apple released the A9 with 265 decode, and a year later the A10 with 265 encode.
BUT these were only exposed to the public in 2017 with iOS11. The point of the delay, presumably, was to ensure a large enough critical mass of users already at the point of announcement.
This suggests that Apple will be planning a similar transition of their entire line to VVC, but it will take time -- maybe a decoder chip in 2023, an encoder in 2024, a public commitment to "the new Apple codec is VVC across all devices" in 2025.
The relevance of Apple is that Apple is something of the leader in this space. Sure, sure, there's a tiny world of open source renegades talking about Dirac and Opus and Matroska and suchlike. But in the real world of serious money, Apple generally represents the vanguard.
So these future codecs are interesting, yes. But they're also only going to be mainstream relevant (if you're with Apple) maybe late 2025, for everyone else maybe a year or so later?
name99 - Thursday, March 19, 2020
Oh I realize I left out one point.
We saw exactly the same stories back when h.265 became relevant. Here's a sample thread:
Complain all you like about the existence of patent pools, but they are just not that big a deal to the big boys, especially hardware. So Apple, or your Samsung/LG/Sony TV has to pay 1, or 5 dollars per device. That's perfectly feasible in return for a spec that is very good and comes with more or less legal certainty.
There's just no reason to believe the driving forces in 2025 are any different from those in 2015.
grant3 - Tuesday, April 14, 2020
"they are just not that big a deal to the big boys, especially hardware"
And yet it's "big boys" like Intel, Cisco, Google, Samsung, Nvidia etc. and oh ya, APPLE, who are the founding members of the royalty-free Alliance for Open Media, i.e. the developers of AV1.
Hifihedgehog - Monday, March 16, 2020
How much later is later, guesstimate-wise? :)
Kevin G - Monday, March 16, 2020
I was expecting B550 around Computex (which is still a go last I checked). However with the recent outbreaks, I wonder how much has been pushed back or will simply be a preview then. Supply chains are disrupted at various points. I'm just curious what we know has been pushed back, what is still on schedule and what is likely to be affected that hasn't yet. It'd make for an interesting piece. :)
AshlayW - Monday, March 16, 2020
Absolutely hyped to pick up a 4400G(?) for my Slim system. Currently has a 3400G but we are edging closer to me being able to play all the games I like at 1080p, on a single chip system.
peevee - Monday, March 23, 2020
On a desktop APU it would make sense to separate CPU and GPU dies; there is plenty of space and yields would be higher. Of course the GPU side should be Navi and socket should be upgraded to 4 memory channels to allow a truly decent integrated GPU similar to 5500.