What Intel and AMD are Offering

Before we dive into the benchmarks, it is good to see how the vendors position their CPUs. First, a quick spec sheet of the most important AMD and Intel CPUs.

| Model | Speed (GHz) | Max. clock, 4 cores busy (GHz) | L2 Cache | L3 Cache | Interconnect Bandwidth in One Direction |
|---|---|---|---|---|---|
| Intel Xeon X5570 | 2.93 | 3.2 | 4 x 256 KB | 8 MB | 12.3 GB/s |
| Intel Xeon X5560 | 2.80 | 3.066 | 4 x 256 KB | 8 MB | 12.3 GB/s |
| Intel Xeon X5550 | 2.66 | 2.93 | 4 x 256 KB | 8 MB | 12.3 GB/s |
| AMD Opteron 2435 | 2.6 | 2.6 | 6 x 512 KB | 6 MB | 9.8 GB/s |
| Intel Xeon E5540 | 2.53 | 2.66 | 4 x 256 KB | 8 MB | 11.7 GB/s |
| AMD Opteron 2431 | 2.4 | 2.4 | 6 x 512 KB | 6 MB | 8.8 GB/s |
| AMD Opteron 2389 | 2.9 | 2.9 | 4 x 512 KB | 6 MB | 8.8 GB/s |
| Intel Xeon E5530 | 2.4 | 2.53 | 4 x 256 KB | 8 MB | 11.7 GB/s |
| Intel Xeon E5430 | 2.66 | 2.66 | 2 x 6 MB | N/A | Via FSB |
| AMD Opteron 2427 | 2.2 | 2.2 | 6 x 512 KB | 6 MB | 8.8 GB/s |
| AMD Opteron 2384 | 2.6 | 2.6 | 6 x 512 KB | 6 MB | 4 GB/s |
| Intel Xeon E5520 | 2.26 | 2.33 | 4 x 256 KB | 8 MB | 11.7 GB/s |
| Intel Xeon E5506 | 2.13 | 2.13 | 4 x 256 KB | 4 MB | 9.8 GB/s |
| AMD Opteron 2378 | 2.4 | 2.4 | 4 x 512 KB | 6 MB | 4 GB/s |

What do you get for your money? AMD's six-core models are the 2435, 2431, and 2427.

| Intel Xeon Model | Speed (GHz) / TDP | Price | AMD Opteron Model | Speed (GHz) / TDP - ACP | Price |
|---|---|---|---|---|---|
| X5570 | 2.93 / 95W | $1386 | | | |
| X5560 | 2.80 / 95W | $1172 | | | |
| X5550 | 2.66 / 95W | $958 | 2435 | 2.6 / 75-115W | $989 |
| E5540 | 2.53 / 80W | $744 | 2431 | 2.4 / 75-115W | $698 |
| | | | 2389 | 2.9 / 75-115W | $698 |
| E5530 | 2.4 / 80W | $530 | 2387 | 2.8 / 75-115W | $523 |
| L5520 | 2.26 / 60W | $530 | 2376 HE | 2.3 / 55-79W | $575 |
| L5510 | 2.13 / 60W | $423 | 2374 HE | 2.2 / 55-79W | $450 |
| E5520 | 2.26 | $373 | 2427 | 2.2 / 75-115W | $455 |
| E5506 | 2.13 | $266 | 2382 | 2.6 / 75-115W | $316 |
| E5504 | 2.00 | $224 | | | |
| E5502 | 1.86 | $188 | 2378 | 2.4 / 75-115W | $174 |

AMD has clearly recognized that it cannot beat the best Xeon X55xx when it comes to raw performance. The two top models, the X5570 and X5560, stay out of reach. AMD is basically saying that, with the right application, the new six-core Opteron should be able to keep up with equally clocked Xeon X55xx models. In the case of the 2435, you get lower power consumption as a bonus.

Notice also that the best quad-core Opterons have become significantly cheaper. The 2.9 GHz 2389 “Shanghai”, which used to be positioned against the 2.66 GHz X5550, is now competing with the E5540. The 2.9 GHz Shanghai is still no match for the 2.53 GHz Xeon E5540, but it is important to look at the complete server price: 32 GB of registered DDR3-1066 still costs about $1200, whereas 32 GB of DDR2-800 costs around $850. A full price comparison is outside the scope of this article, but it is clear that even if the CPUs cost the same, the AMD-based server will be less costly. The Xeon X55xx is, after all, a very new platform.
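To make the "complete server price" argument concrete, here is a rough sketch that adds up only CPUs and memory for the E5540 and the 2389, using the prices quoted above. The dual-socket count and the `platform_cost` helper are assumptions made for illustration; motherboard, chassis, and disk costs are deliberately left out.

```python
# CPU and 32 GB memory prices (USD) as quoted in the article.
CPU_PRICE = {"Xeon E5540": 744, "Opteron 2389": 698}
MEMORY_32GB_PRICE = {"registered DDR3-1066": 1200, "DDR2-800": 850}

# Assumption: a dual-socket (2P) server, so two CPUs per system.
SOCKETS = 2


def platform_cost(cpu, memory, sockets=SOCKETS):
    """Return the combined price of the CPUs and 32 GB of memory."""
    return sockets * CPU_PRICE[cpu] + MEMORY_32GB_PRICE[memory]


xeon_cost = platform_cost("Xeon E5540", "registered DDR3-1066")
opteron_cost = platform_cost("Opteron 2389", "DDR2-800")

print(f"Xeon E5540 + DDR3:   ${xeon_cost}")
print(f"Opteron 2389 + DDR2: ${opteron_cost}")
print(f"Difference:          ${xeon_cost - opteron_cost}")
```

Under these assumptions most of the gap comes from the memory rather than the CPUs, which is the point the paragraph above makes.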

For those who love stats, the die size and transistor count table:

| CPU | Transistor Count (Million) | Process | Die Size | Cores |
|---|---|---|---|---|
| Intel Dunnington (Xeon 74xx) | 1900 | 45 nm | 504 mm² | 6 |
| Intel Gainestown (Xeon 55xx) | 731 | 45 nm | 265 mm² | 4 |
| AMD Istanbul (Opteron 24xx) | 904 | 45 nm | 346 mm² | 6 |
| AMD Shanghai (Opteron >237x) | 705 | 45 nm | 263 mm² | 4 |
| AMD Barcelona (Opteron 23xx) | 463 | 65 nm | 283 mm² | 4 |
| Intel Tigerton (Xeon 73xx) | 2 x 291 = 582 | 65 nm | 2 x 143 mm² | 4 |
| Intel Harpertown (Xeon 54xx) | 2 x 410 = 820 | 45 nm | 2 x 107 mm² | 4 |

AMD’s Istanbul is quite a large chip, but it is not as expensive to produce as “Barcelona”. When it comes to the lowest production costs, Harpertown is the champion.
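For readers who want to play with the numbers in the table, the sketch below derives two figures from it: total silicon area per package and transistor density per die. Both are derived values that the article does not report, and the helper names are made up for illustration. Die area and density alone do not determine production cost, since yield and process maturity also play a role, so treat this as a rough comparison only.

```python
# (name, dies per package, transistors per die in millions, area per die in mm^2)
# All figures are taken from the table above.
CHIPS = [
    ("Dunnington", 1, 1900, 504),
    ("Gainestown", 1, 731, 265),
    ("Istanbul", 1, 904, 346),
    ("Shanghai", 1, 705, 263),
    ("Barcelona", 1, 463, 283),
    ("Tigerton", 2, 291, 143),
    ("Harpertown", 2, 410, 107),
]


def total_area(dies, area_per_die):
    """Total silicon area in the package, in mm^2."""
    return dies * area_per_die


def density(transistors_millions, area_mm2):
    """Transistor density of a single die, in millions per mm^2."""
    return transistors_millions / area_mm2


for name, dies, transistors, area in CHIPS:
    print(
        f"{name:<11} {dies} x {area} mm2 = {total_area(dies, area):>3} mm2 total, "
        f"{density(transistors, area):.2f} M transistors/mm2"
    )
```

Harpertown stands out in this view because each of its two dies is small, and small dies are generally easier to produce than one large die.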

40 Comments

  • duploxxx - Wednesday, June 3, 2009 - link

    ESX 4 should add IOMMU to the AMD istanbul platform, not sure how far this is implemented in the beta esx4 builds.

    Are you using the paravirtualization SCSI driver in the new ESX 4 platform? I would expect bigger differences between 3.5 and 4, and not just because EPT is included in ESX 4 together with enhanced HT.

    for the rest very good thorough review.

    The only thing I always miss in reviews is that, although it is good to test the fastest out there, it is nowhere near the most deployed platform. You should rather look at the 5520-5530 against the 2387-2431 as the mid-range platform that will be deployed in a wide range of systems; this will have a much healthier performance/price/power balance than the top bin. Even the 5570 is not supported in all OEM platforms for the TDP range.
  • Adul - Monday, June 1, 2009 - link

    I do not see oracle running on top of windows all that often. It is normally running on some *nix OS. How about running the same benchmark on say RHEL instead?
  • InternetGeek - Monday, June 1, 2009 - link

    There's actually an odd bug on Oracle's DB that makes it run faster on Windows than on Linux. Search on the internet and you'll find info about it.

    On the other hand, in my now 9 years in the IT industry I've only come across one Oracle DB running on HP-UX. Everything else (Sybase, MySQL, etc) runs on Windows.
  • LizVD - Friday, June 5, 2009 - link

    Could you provide us with a link for that? I'd like to see if this "bug" corresponds with the behaviour we're seeing on our tests.
  • Nighteye2 - Monday, June 1, 2009 - link

    You give a good description of how it works and how it has so much benefit, but then you benchmark only dual-socket servers?

    It would be fairer to also test and compare octo-socket servers - to see the real impact of that HT assist feature.
  • phoenix79 - Monday, June 1, 2009 - link

    Completely agreed (I was typing up a comment about this too when yours popped up)

    I'd love to see some 4-way VMWare scores
  • ltcommanderdata - Monday, June 1, 2009 - link

    Yes. Nehalem is in a great position in the DP market, but isn't yet available in MP. It'd be great to see six-core Dunnington and six-core Istanbul go head to head. Conveniently their highest models have similar clock speeds at 2.66GHz and 2.6GHz respectively although Dunnington would be a lot more power hungry and although I don't remember their prices, probably more expensive too.
  • JohanAnandtech - Tuesday, June 2, 2009 - link

    Dunnington vs Istanbul coming up ... But we are going to take some time to address the shortcomings of this "deadline" article such as better power consumption readings.
  • solori - Monday, June 1, 2009 - link

    "Notice that HT-assist is a performance killer in 2P configurations: you remove two times 1 MB of L3-cache, which is a bad idea with 8 VM’s hitting your two CPUs."

    BIOS guidance suggests that HT Assist be disabled by default on 2P systems, and enabled only for specialized workloads. So that begs the question: Were vAPUS tests performed with or without HT Assist in the 2P configuration? It was not clear.

    I assume AMD-V and RVI were enabled for ALL workloads in ESX 3.5 and 4.0 (forced for 32-bit workloads.) Is this accurate? Based on the number of ESX 3.5 installations out there, this probably should be clearly stated...

    I do want to take issue with your memory sizing and estimates on vCPU loading. Let me put it this way: while Nehalem-EP has better memory bandwidth and SMT threads, Opteron has access to abundant memory. Therefore, it does not make sense - for example - to be OK with enabling SMT but then constrain the benchmark to 24GB due to a Xeon memory limitation.

    I would urge you to look at 48GB configurations on Xeon and Istanbul for your comparison systems. By the way, in consolidation numbers, this makes a significant reduction in $/VM with only a minor increase in per-system CAPEX.

    Another interesting issue you touched on is tuning and load balance. Great job here. These are "black magic" issues that - as you noted - can have serious effects on virtualization performance (ok, scheduling efficiency.) Knowing your platform's balance point(s) is critical to performance sensitive apps but not so critical for light-load virtualization (i.e. not performance sensitive.)

    It sounds like you're learning - through experimentation with vAPUS - that virtualization testing does not predict similar results from "similarly configured machines" where performance testing is concerned. In fact, the "right balance" of VM's, memory and vCPU/CPU loading for one system may be on the wrong side of the inflection point for another.

    All in all, a very good article.
  • JohanAnandtech - Tuesday, June 2, 2009 - link

    "this probably should be clearly stated... "

    Good suggestion. I adapted the article. RVI and EPT are always on if possible (so also 32 bit). HT-assist is always on "Auto" (so off) unless we indicate otherwise.

    "Therefore, it does not make sense - for example - to be OK with enabling SMT but then constrain the benchmark to 24GB due to a Xeon memory limitation. "

    1) You must know that vApus Mark I uses too much memory for the webportals. They can run without any performance loss in 2 GB, even 1 GB. So as we move up on the number of tiles we run, it is best to reclaim the wasted memory.

    2) I agree that a price comparison should include copious amount of memory (48 GB or so).

    3) We don't have more than 24 GB DDR-3 available right now. It would be unfair to force the system to swap in a performance comparison.

    "Opteron has access to abundant memory". What do you mean by this? Typical 2P Opterons have 64 GB, 2P Nehalems 72 GB as upper limit?

    "In fact, the "right balance" of VM's, memory and vCPU/CPU loading for one system may be on the wrong side of the inflection point for another"

    Great comment. Yes, that makes it even more complex to compare two systems. That is why we decided to show 2 datapoints for the 2 tile systems.

    Collin, thanks for the excellent comments. It is very rewarding to notice that people take the time to dissect our hard work. Even if that means that you find wrinkles that we have to iron out. Great feedback.



