The AMD Ryzen 9 3950X Review: 16 Cores on 7nm with PCIe 4.0
by Dr. Ian Cutress on November 14, 2019 9:00 AM EST
Going For Power: Is 105W TDP Accurate?
For regular readers, we have covered the discrepancy in how different companies ascribe the Thermal Design Power to their product lines:
- Aligning Turbo with AMD’s Frequency Metrics
- Data in our Ryzen 2700X Review
- Talking TDP, Turbo and Overclocking with Intel Fellow Guy Therien
- TDP and Turbo Explained
- The Lynnfield Followup: Turbo Mode and Overclocking Investigated
While Intel’s TDP represents the internal power measured for long and sustained high performance (also motherboard dependent), AMD’s metric is more akin to actual thermal cooling requirements for a given cooler rating. That being said, the power consumption of AMD’s first and second generation Ryzen processors has often been parallel to the TDP rating on the box, with the CPU levelling out to the TDP value as we load up the cores with a high energy workload.
For example, here’s our 16-core 1950X data. The Threadripper 1950X is a 180 W chip, and we saw the cores take a total of 134 W.
Here’s our Ryzen 7 2700X data.
This 105 W TDP processor was only recording 86W across the cores at full load.
It’s worth noting that our data is primarily to do with the total power consumed by the cores. There are other power factors at play, such as the Infinity Fabric, the DRAM controller, the PCIe controller, and any other IO, which all add to the power of the overall package. The maximum power available to a processor should be the package power, of which the cores take up most of the sum.
With Ryzen 3000 and Zen 2, AMD’s attachment to TDP was not as clinical as its first two generations of hardware. In our Ryzen 7 3700X review, with the 8-core processor, we saw this:
The Ryzen 7 3700X is a 65 W processor, and yet we can see that the cores total up to 74 W by themselves, with the rest of the chip taking another 16 W or so, totalling 90 W for the whole chip. This aligns with AMD's 'PPT', the maximum power that can be supplied to the socket, which is around 88 W. This is perhaps indicative of one of two things: either Intel’s turbo policy was creating 95 W TDP chips that consumed 160 W in turbo modes and AMD believed it had headroom, or pushing these new chips to the edge required a little more power.
With the Ryzen 9 3900X, with 12 cores, we saw the same thing again.
Despite this being a 105 W TDP chip, the cores at full load saw 122 W peak, with the rest of the chip getting ~24 W, making for an overall 146 W power draw (as measured by the processor internally). PPT for this chip is meant to be 142W.
This shows that Zen 2 has a different strategy to the previous Zen chips when it comes to how AMD handles the gap between TDP and PPT. If we see the same thing with the Ryzen 9 3950X, then that pretty much confirms the hypothesis.
At its peak, the 3950X draws 137 W for the cores when 10 cores are loaded. The chip as a whole hits ~144-145W at that level, well above the 105 W TDP rating on the box and bang on the 142W PPT. This is partly why AMD is recommending a large liquid cooler for this chip. Under Intel’s definition, the TDP rating is a guarantee for the power consumption at base frequency, although most Intel processors can go above that frequency and stay within the power. We might be seeing something similar here with AMD now.
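The relationship between TDP, PPT and measured package power for the three Zen 2 chips discussed above can be tabulated in a few lines. This is a minimal sketch that uses only the figures quoted in the text; the 144.5 W value for the 3950X is an assumed midpoint of the ~144-145 W range, and the names `CHIPS` and `power_summary` are illustrative.

```python
# Power figures quoted in the text, in watts.
CHIPS = {
    "Ryzen 7 3700X": {"tdp": 65.0, "ppt": 88.0, "package": 90.0},
    "Ryzen 9 3900X": {"tdp": 105.0, "ppt": 142.0, "package": 146.0},
    # The text gives ~144-145 W; 144.5 W is a midpoint chosen for illustration.
    "Ryzen 9 3950X": {"tdp": 105.0, "ppt": 142.0, "package": 144.5},
}


def power_summary(chip):
    """Compare the quoted PPT and measured package power against TDP."""
    return {
        "ppt_over_tdp": chip["ppt"] / chip["tdp"],
        "package_over_tdp": chip["package"] / chip["tdp"],
        "package_minus_ppt": chip["package"] - chip["ppt"],
    }


for name, chip in CHIPS.items():
    s = power_summary(chip)
    print(
        f"{name}: PPT/TDP = {s['ppt_over_tdp']:.2f}, "
        f"package/TDP = {s['package_over_tdp']:.2f}, "
        f"package - PPT = {s['package_minus_ppt']:+.1f} W"
    )
```

On these numbers, PPT works out to roughly 1.35x TDP for all three chips, and the measured package power sits within a few watts of PPT rather than anywhere near TDP.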
It is worth noticing that when up to two cores are loaded, we see each core getting around 18 W of power, but when all the cores are loaded, we are seeing between 6.9 W and 7.6 W per core. This is compared to the 12-core 3900X, which has about 17.5 W per core initially, and falls down to 10 W per core. AMD is trying to get a higher single core frequency from the 16-core hardware, so giving more power to a core when only one is loaded might help.
One other thing to note is where the peak power is observed. We kind of already saw this on the Ryzen 9 3900X in that review, where the peak power of the chip happened when 10 cores were loaded, not the full 12 cores. The difference between the two was minimal, but we’re seeing this on a larger scale with the Ryzen 9 3950X.
When looking at both the cores-only power and the CPU total power, we get a peak with this processor when 10 cores are loaded. This would indicate a 3+2+3+2 mix on the CCXes, which is perhaps an inflection point when current densities start getting much higher and per-core power has to be reduced to ensure everything is still working optimally. The power differential between 10-core use and 16-core use is almost 20W, so users that don’t always use all the cores all the time might exhibit good per-thread performance up to 10 core workloads.
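As a rough cross-check of these figures, the per-core range at full load can be multiplied out and compared against the 10-core peak. This is a back-of-the-envelope sketch using only numbers quoted in the text; the constant and function names are illustrative, and it assumes every core falls somewhere inside the 6.9-7.6 W range when all 16 are loaded.

```python
TOTAL_CORES = 16
PER_CORE_ALL_LOADED_W = (6.9, 7.6)  # per-core range with all cores loaded (from the text)
PEAK_CORES_W = 137.0                # cores-only peak power (from the text)
PEAK_LOADED_CORES = 10              # number of loaded cores at that peak


def all_core_estimate(per_core_range, n_cores):
    """Bound the cores-only power if every core sits inside the given range."""
    low, high = per_core_range
    return low * n_cores, high * n_cores


low_w, high_w = all_core_estimate(PER_CORE_ALL_LOADED_W, TOTAL_CORES)
avg_at_peak_w = PEAK_CORES_W / PEAK_LOADED_CORES

print(f"All-core estimate: {low_w:.1f} W to {high_w:.1f} W")
print(f"Drop from 10-core peak: {PEAK_CORES_W - high_w:.1f} W to {PEAK_CORES_W - low_w:.1f} W")
print(f"Average per-core power at the 10-core peak: {avg_at_peak_w:.1f} W")
```

The estimated drop brackets the "almost 20 W" differential quoted above, and the average per-core power at the 10-core peak is nearly double the all-core figure.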
Speaking of frequencies, this has been a touchy topic of late. We have seen with recent news and testing that some users are not observing the peak single core frequencies of their Ryzen processors. As we explained in our deep dive of the issue, part of it comes down to the fact that AMD’s turbo policies for Zen 2 are different to Intel’s: only one core in a set is likely to turbo up to the highest frequency, whereas Intel’s Turbo Boost 2.0 mandates that all cores should hit peak turbo. The other part is the testing methodology, along with the fact that the ACPI standards at the OS level can indicate a turbo on a shorter time scale than software can record, ultimately giving users a smeared out version of that turbo value. Then there are other things, like BIOS versions and Windows power plans.
With our Ryzen 9 3950X, the on-the-box single core turbo frequency is listed as 4.7 GHz. We tested using the ASRock X570 Taichi motherboard, a very high-end product, using Windows 10 v1909 on AGESA 1004B, on both the High Performance (HP) power plan and the Ryzen High Performance (RHP) power plan. For peak single core frequencies, we were able to see 4525 MHz on the HP plan, and 4650 MHz on the RHP plan. This latter value is pretty much on the button for the on-the-box turbo value (I’m sure some people will disagree about those 50 MHz).
These values on the RHP power plan were momentary: when we put a consistent single thread load on the core, the frequencies very quickly came down.
On the Ryzen High Performance power plan, our sustained single core frequency dropped to 4450 MHz. In these tests, we use an affinity mask to limit how many cores are active while we run POV-Ray, and take the reading about 30 seconds into the benchmark, which allows a core to experience a form of heat soak and reach a reliable current density. This is also how we reached the 18 W per core value for 1-2 core loading in the graphs above, which suggests that for a sustained 4.7 GHz single core frequency, AMD would need to drive around 21-24 W to the core. It is very likely that the CPU can hit those high numbers for microseconds at a time, as per the ACPI/CPPC2 stack, but any user doing per-second or per-100ms monitoring is not likely to see it.
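One way to sanity-check the 21-24 W estimate is to ask what power-versus-frequency scaling it implies. The sketch below assumes a simple power law, P proportional to f^k, between the sustained point (18 W at 4450 MHz) and the 4.7 GHz target. That model is an assumption made here for illustration, not the method used to derive the numbers in the text, and `implied_exponent` is a hypothetical helper.

```python
import math

SUSTAINED_MHZ = 4450.0          # sustained single core frequency (from the text)
SUSTAINED_W = 18.0              # per-core power at that point (from the text)
TARGET_MHZ = 4700.0             # on-the-box single core turbo (from the text)
TARGET_W_RANGE = (21.0, 24.0)   # estimated per-core power needed (from the text)


def implied_exponent(p0, f0, p1, f1):
    """Return k such that p1 / p0 == (f1 / f0) ** k (assumed power-law model)."""
    return math.log(p1 / p0) / math.log(f1 / f0)


k_low = implied_exponent(SUSTAINED_W, SUSTAINED_MHZ, TARGET_W_RANGE[0], TARGET_MHZ)
k_high = implied_exponent(SUSTAINED_W, SUSTAINED_MHZ, TARGET_W_RANGE[1], TARGET_MHZ)

print(f"Frequency increase: {100 * (TARGET_MHZ / SUSTAINED_MHZ - 1):.1f}%")
print(f"Implied exponent k: {k_low:.1f} to {k_high:.1f}")
```

Under this assumed model, a frequency increase of under 6% costs 17-33% more power. An exponent of roughly 3 or more is what one would expect near the top of the voltage/frequency curve, since dynamic power scales with frequency times voltage squared and voltage has to rise along with frequency.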
Within this frequency graph though, we can see that the frequency beyond 3 cores has segments. Between 3 cores and 8 cores loaded, we get 4225 MHz to 4125 MHz (100 MHz range), and even at all cores loaded, we’re seeing 3875 MHz, well above the 3500 MHz base frequency listed on the box.
In our full review, we are testing the Ryzen 9 3950X on both the HP and RHP power plans.
halfflat - Wednesday, November 27, 2019
For Brownian motion? That seems weird. Nonetheless, it can't alone explain the speed up.
Most favourable scenario: code consists only of floating point mul and add pairs, together with 64-bit integer multiplication. The floating point operations could become 4x faster in AVX2 (twice as wide as SSE, and using FMAs); to see the observed 2x speed up, that means the floating point operations constituted 2/3 of the execution time in the SSE version.
The AVX512 version, ignoring any consequent downclocking, could make those floating point operations 8x faster than the SSE case, and the 64-bit integer multiplies also 8x faster. That's still not 10x, it ignores the lower throughput of 8-wide i64 muls compared to scalar muls, and also discounts the slower clock speed.
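The arithmetic in this comment follows Amdahl's law and can be checked with a short sketch. The function names below are illustrative, and the 4x and 8x per-operation speedups are the commenter's best-case assumptions rather than measured values.

```python
def fraction_for_speedup(target, part_speedup):
    """Fraction f of baseline runtime that must speed up by `part_speedup`
    for the whole program to speed up by `target`.

    Solves 1 / ((1 - f) + f / part_speedup) == target for f.
    """
    return (1.0 - 1.0 / target) / (1.0 - 1.0 / part_speedup)


def overall_speedup(parts):
    """Amdahl-style speedup; `parts` is a list of (runtime fraction, speedup)."""
    if abs(sum(f for f, _ in parts) - 1.0) > 1e-9:
        raise ValueError("runtime fractions must sum to 1")
    return 1.0 / sum(f / s for f, s in parts)


# SSE -> AVX2: floating point 4x faster, observed overall speedup 2x.
fp_fraction = fraction_for_speedup(target=2.0, part_speedup=4.0)
print(f"Floating point share of SSE runtime: {fp_fraction:.3f}")

# SSE -> AVX-512, best case: floating point and 64-bit integer multiplies both 8x.
best_case = overall_speedup([(fp_fraction, 8.0), (1.0 - fp_fraction, 8.0)])
print(f"Best-case AVX-512 speedup over SSE: {best_case:.1f}x")
```

With every operation at most 8x faster, the overall speedup is capped at 8x however the runtime is split, which is the commenter's point about a 10x figure.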
halfflat - Thursday, November 28, 2019
Just an update: ran a simple test (square eight times all the 64-bit ints in a 1024-long array) wrapped in google benchmark on a Skylake Xeon with gcc-8.2 -O3. The kernel is almost entirely multiplications, and ultimately saw a roughly 2x speed up with AVX512 compared to AVX2, and a 2.5x speed up with AVX512 compared with a 'no architecture specified' compilation.
w1p30ut3r - Friday, November 22, 2019
It's very, very simple. If you only game, buy an Intel... If you work and game, buy a 3950X... If you only work, buy a Threadripper or a Xeon...
Parkab0y - Sunday, October 4, 2020
I really want to see something like this about Zen 3 5000
madymadme - Saturday, November 7, 2020
Going to buy
AMD Ryzen 9 5900X,
Gigabyte B550 AORUS PRO AC,
Noctua NH-D15 Dual 140m Fans,
G.skill Trident Z RGB Series 16GB (2x8GB) 4000 MHz DDR4 Memory F4-4000C18D-16GTZRB
is corsair CV550 watt ok with the above spec ? & I have Quadro K2000D graphic card
is this specification ok ? & which ram to get please help a little & thanks for reading & replying