Galaxy S9 Exynos 9810 Hands-On - Awkward First Results
by Andrei Frumusanu on February 25, 2018 6:45 PM EST
Following our launch article I promised an update on the performance scores of the Exynos 9810 variant of the Galaxy S9. I was able to have some time with one of the demo devices at the launch event and thoroughly benchmark it with a few of our common tests.
|Samsung Exynos SoCs Specifications|
|SoC|Exynos 9810|Exynos 8895|
|CPU (big cores)|4x Exynos M3; One Core: 2.704 GHz; Two Core: 2.314 GHz; Four Core: 1.794 GHz; 4x 512KB L2; 4096KB L3 DSU|4x Exynos M2 @ 2.314 GHz|
|CPU (little cores)|4x Cortex A55 @ 1.95 GHz; 512KB L3 DSU|4x Cortex A53 @ 1.690 GHz|
|GPU|Mali G72MP18 @ 572 MHz|Mali G71MP20 @ 546 MHz|
As a refresher, earlier in the year Samsung LSI dropped a bombshell in claiming an astounding 2x single-thread performance improvement with the new Exynos 9810. While this initially caused a lot of controversy and discussion on the validity of the claim, we have since exclusively covered the high-level micro-architectural features of the new Exynos M3 core, and by then it was clear that the performance claims were not just marketing. The new Samsung CPU core is the first “very wide” CPU microarchitecture to power Android SoCs and the first to finally follow in Apple’s footsteps in maximising single-thread performance. As a result it stands to be a very interesting - and ideally very powerful - SoC for the Android market.
Determining Clock Speeds
Firstly, one of the biggest questions for me was confirming the final clock that Samsung would use on the Galaxy S9. We detected the clock as 2704 MHz, which is roughly 200 MHz less than the 2.9 GHz that Samsung's LSI division advertises for the chipset. What makes the story more compelling is that the 2.7 GHz clock is only achievable when a single core in the cluster is active, meaning Samsung employs scalable maximum frequencies depending on the number of active cores in the big cluster. At two active cores the frequency drops down to 2314 MHz, while with three or four active cores the cores clock down to only 1794 MHz.
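As a rough illustration of the scheme described above, the frequency ceiling can be thought of as a lookup keyed on the number of active big cores. This is only a sketch using the clocks measured here; the table name and the function `max_big_core_freq_mhz` are hypothetical and say nothing about how Samsung's firmware actually implements the limits.

```python
# Illustrative sketch only: the real limits are enforced by Samsung's firmware/kernel.
# Frequencies (MHz) are the ones measured in this article.
MAX_FREQ_MHZ_BY_ACTIVE_CORES = {1: 2704, 2: 2314, 3: 1794, 4: 1794}


def max_big_core_freq_mhz(active_cores: int) -> int:
    """Return the big-cluster frequency ceiling for a given number of active M3 cores."""
    if active_cores not in MAX_FREQ_MHZ_BY_ACTIVE_CORES:
        raise ValueError("the big cluster has between 1 and 4 active cores")
    return MAX_FREQ_MHZ_BY_ACTIVE_CORES[active_cores]


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"{n} active core(s): up to {max_big_core_freq_mhz(n)} MHz")
```

The practical consequence is that any workload keeping more than two big cores busy runs them at roughly two thirds of the single-core peak clock.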
We can also confirm that the Mali G72MP18 GPU is running at a very conservative 572MHz. This is not what we had expected - the previous generation Exynos 8895 had a larger MP20 configuration, running at a similar 546MHz. The resulting performance gains for the GPU thus seem to be even lower than we had expected, as I was betting on a ~650-700 MHz clock for the graphics.
I was also able to confirm the cache configurations of the CPUs with the help of our latency test. The L1D cache of the M3 cores is 64KB, up from the 32KB on the previous generation. The M3 cores also come with 512KB of private L2 caches, and a shared 4MB L3 cache.
The little A55 cores came as a surprise as they look to be in a separate cluster, rather than in a single DynamIQ cluster with the big cores. This creates something similar to a big.LITTLE design, but each part of the 4+4 is its own DynamIQ cluster. So here it looks like Samsung has decided not to employ the optional L2 caches for the Cortex A55s, and instead the cluster solely relies on a shared 512KB L3 cache of the DSU. The latency scores to DRAM are outlandishly good and the best we’ve ever seen among current Android SoCs, so Samsung has definitely introduced a new generation of interconnect or memory controllers.
Parsing the Benchmark Results: Geekbench Looks Good
In our testing we were able to confirm the GeekBench 4 scores already leaked, where we saw the Exynos 9810 achieving excellent performance gains and vastly outpacing the Snapdragon 845, and coming into the territory of the Apple A10 and A11. Meanwhile versus the last-generation Exynos 8895, the floating point performance increases handily exceed Samsung’s projected gains of 2x as we see a 114% improvement even at the lowered 2.7GHz frequency.
When looking at the performance per clock it is clear how the Exynos M3 distinguishes itself as a much wider microarchitecture compared to any other existing CPU which powers Android SoCs.
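To put a rough number on the per-clock point, the 114% floating point gain can be divided by the clock increase from the 2.314 GHz M2 to the 2.704 GHz M3. This is back-of-the-envelope arithmetic, not a measurement: it assumes both scores were obtained at those peak clocks, and the variable names are purely illustrative.

```python
# Figures taken from the article; the per-clock number is derived, not measured.
fp_improvement_pct = 114   # GeekBench 4 floating point gain over the Exynos 8895
m3_clock_mhz = 2704        # Exynos 9810 single-core clock as detected
m2_clock_mhz = 2314        # Exynos 8895 big-core clock

speedup = 1 + fp_improvement_pct / 100        # a 114% improvement is a 2.14x ratio
clock_ratio = m3_clock_mhz / m2_clock_mhz     # how much faster the M3 is clocked
per_clock_gain = speedup / clock_ratio        # what remains once the clock is factored out

print(f"overall speedup: {speedup:.2f}x")
print(f"clock ratio:     {clock_ratio:.2f}x")
print(f"per-clock gain:  {per_clock_gain:.2f}x")
```

Under those assumptions, most of the gain comes from the wider core rather than from the higher frequency.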
Parsing the Benchmark Results: PCMark and Web Tests
Finally I stumbled upon some very questionable performance figures when testing system performance. I’m not going to go into the details for every benchmark as they are generally all painting the same picture:
What seems clear is that something is very, very wrong with the Exynos 9810 S9+ that I tested. It was barely able to distinguish itself from last year’s Exynos 8895, let alone the Snapdragon 845 in the Qualcomm Reference Device which we previewed earlier this month. I looked through the system and monitored frequencies and indeed the big cores were reaching the maximum 2.7GHz core frequency. The only explanation I have right now is that it’s possible that the DVFS configuration, as well as the scheduler, are currently so conservatively tuned that there is barely any activity on the big cores.
I dug a bit more through the system and found out Samsung uses some new scheduler called “eHMP”. I’m not sure if this is something based on EAS but the system did use schedutil as a frequency governor.
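For readers who want to do a similar check on a Linux-based device, the governor and current frequency are exposed through the standard cpufreq sysfs files. The sketch below is not the tooling used for this article; it assumes shell access to the device, that the kernel exposes `scaling_governor`, `scaling_cur_freq` and `cpuinfo_max_freq`, and the helper name `read_cpufreq` is hypothetical.

```python
from pathlib import Path

# Standard location of the per-CPU cpufreq entries on Linux kernels.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu: int, root: Path = SYSFS_CPU_ROOT) -> dict:
    """Read the governor and current/maximum frequency (kHz) for one CPU."""
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    return {
        "governor": (base / "scaling_governor").read_text().strip(),
        "cur_khz": int((base / "scaling_cur_freq").read_text()),
        "max_khz": int((base / "cpuinfo_max_freq").read_text()),
    }


if __name__ == "__main__":
    # Walk every cpuN directory; offline cores may not expose cpufreq at all.
    for cpu_dir in SYSFS_CPU_ROOT.glob("cpu[0-9]*"):
        cpu = int(cpu_dir.name[3:])
        try:
            info = read_cpufreq(cpu)
        except (OSError, ValueError):
            continue  # core offline or cpufreq files missing/unreadable
        print(f"cpu{cpu}: {info['governor']}, {info['cur_khz'] // 1000} MHz "
              f"(max {info['max_khz'] // 1000} MHz)")
```

Polling this in a loop while a benchmark runs gives a coarse view of whether the big cores are actually being used, though it says nothing about why the scheduler places tasks where it does.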
One of the Samsung spokesmen confirmed to me that the demo units were running special firmware for MWC and that they might not be optimized. I’m having a bit of a hard time believing they would so drastically limit the performance of the device for the show demo units, and even less so that they would mess around with the scheduler settings. I did get confirmation that Samsung is planning to “tune down” the Exynos variant to match the Snapdragon performance – however the current scores which I got on these devices make absolutely no sense, so I do hope this is just a mistake that will be resolved in shipping firmware and that we see the full potential of the SoC.
Parsing the Benchmark Results: Graphics
On the GPU side, the lower cluster count of the new Mali G72MP18 is a surprise, as the minor clock bump is negated by the fact that the new SoC has two fewer GPU cores compared to the 8895. If the performance per clock per core between the G71 and G72 were the same then this would actually mean a downgrade in raw GPU power from the Exynos 8895, so any increase should come solely thanks to the architectural changes of the new G72 GPU, power efficiency improvements, as well as possibly SoC memory subsystem improvements.
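The size of that hypothetical downgrade is easy to work out from the core counts and clocks. The sketch below treats cores x clock as a crude proxy for raw throughput, which only holds under the equal per-clock-per-core assumption stated above; the variable names are illustrative.

```python
# cores x clock is a crude proxy, valid only if per-clock-per-core performance is equal.
g72_cores, g72_mhz = 18, 572   # Exynos 9810: Mali G72MP18
g71_cores, g71_mhz = 20, 546   # Exynos 8895: Mali G71MP20

g72_raw = g72_cores * g72_mhz  # 10296 core-MHz
g71_raw = g71_cores * g71_mhz  # 10920 core-MHz
change_pct = (g72_raw / g71_raw - 1) * 100

print(f"raw throughput change: {change_pct:+.1f}%")
```

On this proxy the new configuration starts out a few percent behind, so any measured gain has to come from the G72 architecture and the rest of the SoC.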
In Manhattan 3.1 the Exynos 9810 sees a mere 7% increase and lags far behind the new Snapdragon 845’s Adreno 630.
In T-Rex, the increase is 18%, which might be one of the benchmarks that Samsung sourced their 20% improvement from. Here the Exynos is closer to the performance of the Snapdragon 845.
I wasn’t able to properly measure power on the event demo devices, as they had different interface settings than my tool had been programmed with, so I was only able to make some inaccurate estimates based on coarse current readout from the system.
For CPU workloads, our usual CPU power virus used up 3.1W at 1-core 2.7 GHz loads. 2-core 2.3 GHz seemed to float around 3.1-3.5W, and a 4-core load at 1.8 GHz maintained this power consumption.
I will need more time over the following days, and will hopefully get some SPEC figures, to paint a more accurate picture. For now the results could swing either way and be either positive or negative for the M3 cores. It’s clear that the higher frequencies have a very large power penalty, and Samsung should want to operate more in the low-to-mid frequencies, hence the current frequency scheme.
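A quick way to see the frequency penalty in the coarse estimates above is to compare the aggregate clock delivered per watt at each operating point. This is only a sketch: the 3.3W used for the two- and four-core cases is an assumed midpoint of the quoted 3.1-3.5W range, cores x GHz is not a performance metric, and `core_ghz_per_watt` is a hypothetical helper.

```python
# (active cores, clock in GHz, estimated power in W)
# 3.3 W is an assumed midpoint of the 3.1-3.5 W range quoted in the article.
ESTIMATES = [
    (1, 2.704, 3.1),
    (2, 2.314, 3.3),
    (4, 1.794, 3.3),
]


def core_ghz_per_watt(cores: int, ghz: float, watts: float) -> float:
    """Aggregate clock (cores x GHz) delivered per watt; a rough proxy, not a benchmark."""
    return cores * ghz / watts


for cores, ghz, watts in ESTIMATES:
    ratio = core_ghz_per_watt(cores, ghz, watts)
    print(f"{cores} core(s) @ {ghz} GHz: {ratio:.2f} core-GHz/W")
```

Even with such rough inputs, the ratio climbs steeply as the clock comes down, which is consistent with the stepped frequency scheme.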
On the GPU side, power in Manhattan fluctuated between 4.5 and 5.2W, which is an improvement over the Exynos 8895. But again, this is still at a disadvantage compared to the Snapdragon 845.
Overall today’s quick benchmarking session opened up more questions than it managed to answer. Hopefully with more time we will be able to investigate the workings of the new SoC and, fingers crossed, today’s results are not representative of the shipping product, as that would otherwise be an utterly massive disappointment.
grahaman27 - Monday, February 26, 2018
and Android with Vulkan. Touch latency IS hardware related.
SydneyBlue120d - Monday, February 26, 2018
Are you sure?!? AFAIK no one is using Vulkan other than games...
lilmoe - Monday, February 26, 2018
Samsung uses Vulkan for their launcher and some of their apps.
Android doesn't use Vulkan, and the problem isn't about which API is being "utilized". Ever used Chrome on a desktop and compared it to native browsers of the respective OSs? Yea. Google's hardware UI acceleration layer is crap. They've had industry leading software rendering, I'll give them that. But their software rendering layer is still present in the rendering pipeline and, coupled with hardware acceleration introduced with ICS (with sloppy improvements over time), it uses a lot more system resources with more CPU time than competing mobile OSs.
Samsung are doing the right thing with the Exynos GS9. 4 small cores should be plenty for rendering, without even maxing them out. I bet Apple's Monsoon cores barely fire at any UI related task.
Modern Android phones should be able to last much, MUCH longer with all the sophistication in their SoCs, but Google ain't doing much to help at all, not even when nothing is being rendered or displayed. Why don't we have internet permission native in Android yet? Shouldn't I be able to decide which app can use my network or even run in the background? Thanks for the small bone in Oreo, Google?
You'll hear lots of airheads saying that Android's hardware acceleration is just as efficient as iOS and Windows phone/mobile. Lol, funny.
tuxRoller - Monday, February 26, 2018
How do you think it differs between the two?
I'm always interested in learning something new.
lilmoe - Monday, February 26, 2018
Overhead. Android's implementation relies more heavily on the CPU where it shouldn't. A pixel rendered on Android has a longer path to go before it reaches the screen compared to iOS or Windows Phone. As an example of Google's attempts at "fixing" that, they ramped up the CPU clocks to max as soon as the user touches the screen............ totally inelegant. That's not how you improve on a problem...
In addition to that, there's also the fact that Android apps run in a VM. They'll never be as responsive as native apps at the same power draw. Sure, they're "optimized" when installed on a device, but let's not mistake that with what Microsoft was doing with Windows Phones. This is where Google had a huge chance to improve, and failed.
On a side note, there was a very interesting sub-test in the GFX benchmark called driver overhead. I'm not sure if it's still there, but even during Lollipop, the difference between Android and iOS was massive. Not sure where it is now, but I wouldn't vouch for the gap being massively reduced.
tuxRoller - Tuesday, February 27, 2018
Both iOS & Android use software and hardware rendering. Android has a nice list of supported hardware-accelerated drawing operations (https://developer.android.com/guide/topics/graphic...) but I've been unable to find a similar list for iOS, though I know it doesn't support a fully accelerated pipeline.
The native vs jit (though art is more of a hybrid jit/aot runtime) difference is very small, and really only shows up in ram (gc has become much better).
The driver overhead is only a test of drivers (ie, pure graphics api test -- direct rendering in Linux speak).
So, if you've any details you can offer to support your claim, again, I'd love to read them.
lilmoe - Tuesday, February 27, 2018
@tux
Partial VS Full acceleration, and that's from the link you provided. Flexibility in hardware acceleration is nice, but the fact that it's there means that there is an extra layer of overhead. Common sense? Not sure what exactly you don't understand here.
JIT will always be that. What they've improved was load times and the latency in actually reaching the rendering pipeline, not the pipeline itself, at least not to a very significant degree. Why does Android need significantly more memory than iOS per app if there's no significant difference?
Drivers make a huge difference in everything dependent on the GPU. Again, not sure what you don't understand here.
Why do you need more details when Google is "fixing" a hardware acceleration issue with more aggressive cpu and memory clocks than is needed? Doesn't that raise any flags for you? I've provided a lot of details throughout my arguments here in the comment section on anandtech for several years. I don't have them saved, but if you still think Android's rendering pipeline is just as efficient, then I have nothing else to say. The difference in the visual quality of apps between Android and iOS speaks for itself.
ladyanita22 - Tuesday, February 27, 2018
Apple chipsets already have more CPU power than Android ones, so that's absolutely not true. Actually, iPhones have much more power left than Android phones for UI rendering. Have you tried iOS 11 on an iPhone 5s? That's one hell of a powerful device and still, its UI performance is worse than Androids of similar power.
lilmoe - Wednesday, February 28, 2018
@ladyanita22
I never believed Apple had "more CPU power" than chips powering Android. Powerful? Yes. More powerful than others? Heck NO, not at ~3 watts per core, on a DUAL core (multi-core power draw is guaranteed to be MORE than 3 watts). In comparison, at 3 watts, Exynos and Snapdragon deliver WAY more performance.
Yes, the iPhone 5S has a great CPU, and I have reasons to believe the problem isn't from the CPU, but from Apple. They do have a proven history of slowing down older phones, and while power delivery problems (evident from recent events) are one of the reasons for the slowdowns, I don't buy one bit that those are the only instances where slowdowns occur, especially for older devices.
tuxRoller - Tuesday, February 27, 2018
No, that's not what it means, AND iOS ISN'T fully accelerated, and no, I'm not going to look through your posting history. If you've actual evidence, let's see it. The rather broad way you answered my post suggests I might not find anything pertinent.