CrossFire Xpress 3200: RD580 for AM2
by Wesley Fink on June 1, 2006 12:05 AM EST - Posted in Motherboards
Disk Controller Performance
With the variety of disk drive benchmarks available, we needed a means of comparing the true performance of the wide selection of controllers. The logical choice was Anand's storage benchmark, first described in Q2 2004 Desktop Hard Drive Comparison: WD Raptor vs. the World. The iPEAK test was designed to measure "pure" hard disk performance. Here we keep the hard drive as consistent as possible and vary the hard drive controller, so that the results reflect the controller rather than the drive.
We played back Anand's raw trace files, which recorded the IO operations of a real-world benchmark - the entire Winstone 2004 suite. Intel's iPEAK utility was used to play back the trace of all IO operations that took place during a single run of Business Winstone 2004 and MCC Winstone 2004. To isolate performance differences to the controllers that we were testing, we used the Maxtor 120GB 7200 RPM 8MB cache IDE drive in all IDE tests. SATA1 tests used the 60GB 7200 RPM 8MB DiamondMax Plus 9, and SATA2 was tested with the Hitachi 250GB SATA2 drive, with SATA2 enabled through the Hitachi utility. The drive was formatted before each test run, and a composite average of 5 tests on each controller interface was tabulated to ensure consistency in the benchmark.
iPEAK gives a mean service time in milliseconds; in other words, the average time that each drive took to fulfill each IO operation. In order to make the data more understandable, we report the scores as an average number of IO operations per second so that higher scores translate into better performance.
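The conversion from mean service time to IO operations per second is a simple reciprocal. Here is a minimal Python sketch, assuming the service time is reported in milliseconds per operation and that the runs are averaged before converting (the article does not say in which order this was done). The function name and the sample values are hypothetical, not measured results:

```python
from statistics import mean

def iops_from_service_times(service_times_ms):
    """Convert per-run mean service times (ms per IO) into IO operations per second.

    Assumption: the runs are averaged first and then inverted; the article
    does not say whether averaging happened before or after conversion.
    """
    if not service_times_ms:
        raise ValueError("need at least one run")
    avg_ms = mean(service_times_ms)
    if avg_ms <= 0:
        raise ValueError("service time must be positive")
    return 1000.0 / avg_ms  # 1000 ms in one second

# Hypothetical values for five runs on one controller (not measured data).
runs_ms = [2.0, 2.1, 1.9, 2.0, 2.0]
print(f"{iops_from_service_times(runs_ms):.1f} IO/s")
```

Because the reported figure is a reciprocal, a lower service time always maps to a higher score, which is what makes "higher is better" hold in the charts.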
Any concerns about SB600 should be put to rest with these tests. IDE, SATA and SATA2 test results are very competitive with NVIDIA, ULi, and Silicon Image. The performance patterns hold steady across both Multimedia Content IO and Business IO, with the ULi, ATI, and Silicon Image based disk controllers providing the fastest IO operations, followed by the on-board NVIDIA nForce4 SATA controllers. The performance generated by the ULi and ATI IDE controller logic is particularly strong, while the SATA performance of both is up to 12% better than that of the nForce4 chipset. The SATA performance of the Silicon Image 3132 is very competitive with the core logic chipsets in our tests.
Memory Testing - Optimum tRAS
As expected, DDR2 memory behaves quite differently than DDR in tRAS testing. As you can see from the standard chart below, a 2GB kit of Corsair 8500 (DDR2-1066) experienced steadily increasing bandwidth until the maximum tRAS setting of 18 was reached.
Memtest86 Bandwidth: ATI CrossFire Xpress 3200 AM2 with Athlon X2 4800+
tRAS | Bandwidth |
6 | 2047 |
7 | 2047 |
8 | 2047 |
9 | 2047 |
10 | 2047 |
11 | 2140 |
12 | 2140 |
13 | 2191 |
14 | 2191 |
15 | 2242 |
16 | 2242 |
17 | 2298 |
18 | 2298 |
This is a very different pattern than DDR tRAS testing, where maximum bandwidth was reached at some intermediate tRAS setting and bandwidth decreased as tRAS was decreased or increased from this optimum value. In fact, at tRAS 18 we did get the highest bandwidth with all else equal, but the tRAS 18 setting was unstable - causing memory failures and random reboots.
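To put the tRAS results in proportion, the Memtest86 readings can be expressed as a percentage gain over the lowest setting. This is only an illustrative sketch: the values are copied from the table above, while the function name and the choice of tRAS 6 as the baseline are our own assumptions.

```python
# Memtest86 bandwidth readings from the tRAS table, keyed by tRAS setting.
bandwidth = {
    6: 2047, 7: 2047, 8: 2047, 9: 2047, 10: 2047,
    11: 2140, 12: 2140, 13: 2191, 14: 2191,
    15: 2242, 16: 2242, 17: 2298, 18: 2298,
}

def gain_over_baseline(readings, baseline_tras=6):
    """Return the percent bandwidth gain of each setting over the baseline."""
    base = readings[baseline_tras]
    return {t: 100.0 * (bw - base) / base for t, bw in readings.items()}

gains = gain_over_baseline(bandwidth)
for tras in sorted(gains):
    print(f"tRAS {tras:2d}: {bandwidth[tras]} ({gains[tras]:+.1f}%)")
```

On these figures, tRAS 13 sits about 7% above the baseline and tRAS 17 and 18 about 12% above it, so the stability-driven choice of a lower setting costs roughly 5 percentage points of the available gain.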
We did further memory testing using the Sandra 2007 unbuffered test and found that the optimum combination of bandwidth and stability was achieved at a tRAS setting of 13. Similar results were achieved with the DDR2 8500 Corsair memory on the nForce 590 chipset. We have shared our test results with Corsair and asked for more information on tRAS settings, performance, and stability with high-speed DDR2 memory. All stock benchmarking was performed with the Corsair 8500 running at DDR2-800 with 3-3-3-13 timings at 2.147V.
Memory Bandwidth
Memory bandwidth performance was verified using Sandra 2007. Both buffered and unbuffered tests were run with the stock 4800+ at DDR2-800 3-3-3-13 at 2.147V.
Both the standard Buffered and the Unbuffered Sandra 2007 Memory Performance results are almost identical on the ATI RD580 AMD and the NVIDIA 590 chipsets. This demonstrates that both architectures perform about the same when using the same memory and the same CPU with its on-board AM2 memory controller. Any differences between the ATI and NVIDIA AM2 memory scores are likely the result of memory tweaking.
You can clearly see that the AM2 processor exhibits dramatically higher memory bandwidth than the Athlon64 in Socket 939 running DDR memory. Unfortunately, that much-improved memory bandwidth does not currently translate into similarly improved performance.
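The bandwidth gap has a simple theoretical basis: peak bandwidth is the transfer rate times the bus width times the channel count. The rough sketch below assumes a dual-channel configuration with 64-bit channels on both platforms and DDR-400 on Socket 939, neither of which is stated explicitly in this section; the function name is hypothetical.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s, bus_width_bits=64, channels=2):
    """Theoretical peak memory bandwidth in MB/s (decimal megabytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels

ddr2_800 = peak_bandwidth_mb_s(800)  # AM2 test configuration (DDR2-800)
ddr_400 = peak_bandwidth_mb_s(400)   # assumed Socket 939 configuration (DDR-400)
print(ddr2_800, ddr_400, ddr2_800 / ddr_400)
```

These are ceilings only. Measured Sandra results fall below them, and the ratio says nothing about latency, which is one reason extra bandwidth need not show up as extra application performance.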
71 Comments
Stele - Friday, June 2, 2006 - link
Odd that the board uses two 3132s to provide the extra 4 ports - probably for logistical and pricing reasons (easier to stock and better economy of scale when buying 2x one chip compared to 2 different chips).
I say it's 'odd' because the 3132 was designed to work with port multipliers, specifically Silicon Image's SiI 3726 1-to-5 drive multiplier. The 3132 thus has only 2 ports to save space and costs (for customers who only need 2, e.g. laptops). In this motherboard's case, instead of having two 3132s giving 4 ports, you could use one 3132 and one 3726 to provide (1 + 5 =) 6 extra SATA ports via the 3132, bringing the total number of SATA ports on the motherboard to 10.
Indeed, this would probably be a useful combination: 4 from the SB600, 4/5 from the 3726 and the remaining 1/2 routed to the back as eSATA. For routing simplicity, I suspect board designers may keep all the ports from the 3726 in one cluster near the IC and hence as internal SATA, leaving the 3132's other channel available for eSATA.
While we're on the SATA question, I'd like to ask if anyone has any confirmation about the RAID levels supported by the SB600. This is because on pg 2 of the AT review, it mentions in the diagram that SB600 supports, inter alia, RAID 5. However, on pg 3, the table does not list RAID 5 among the supported RAID levels.
I then went to ATi's website to check out their own pages on the SB600. Interestingly enough, there was the same problem - the diagram was also there, showing RAID 5, but their own spec sheet does not mention RAID 5 either! So does the SB600 support RAID 5 or doesn't it? :P
Chadder007 - Thursday, June 1, 2006 - link
I wish ATI and NVidia would get off of this Dual Card setup crap and get their act together and make a Single Dual Core video card, in the way Dual Core Processors are being made now.
Trisped - Thursday, June 1, 2006 - link
That would be nice, but the power drain and heat dissipation problems would be unreal. Then people would still want a dual card solution. I can see it now: you need a 1K power supply for your video cards and one for the rest of your system. Your video cards take up 6 slots and have fans that sound like a 1960s sports car. Your CPU has 4 cores and everything is overclocked 25-50%.
JarredWalton - Thursday, June 1, 2006 - link
There have actually been several dual GPU cards released in the past, although all of them still require SLI motherboards in order to function. (The SLI requirement is due to NVIDIA's drivers requiring an SLI chipset in order to function.) As far as making a dual core GPU -- like the Pentium D, Athlon X2, Core Duo, etc. -- there's actually no point in doing so. Graphics functions are essentially infinitely parallel, so rather than making a dual core G70, they could just make a 48/16/32 (pixel pipelines/vertex pipelines/ROPs) chip instead. Of course, that would require something like 600 million transistors, so until we start getting GPUs made on 65 nm we aren't likely to see such a design (or anything close to it).
peternelson - Thursday, June 1, 2006 - link
One MAJOR difference not covered is in useable PCIE lanes.
The review talks about
x16 graphics
x16 graphics
no useable pci remain on the reference board
1x pcie
1x pcie
whereas the nvidia 590 solution offers much more including pcie x4.
This is important for people who want to stick in extra raid controllers or specialist cards.
It would be good to highlight this shortcoming and whether it is purely down to the reference motherboard design or to the chipset not supporting as many lanes as the 590.
Also you mention that nvidia are working on putting both x16 in some future northbridge (which will be nice). Can you give any hints as to timing, naming, or if this will be dubbed "nforce 6"?
Wesley Fink - Thursday, June 1, 2006 - link
We are expecting a bit more information and we will then add this to our comparison chart.
ATI RD580 AM2 has 40 PCIe Lanes - 32 for 2 x16 slots, 4 for interconnect between North and South bridge and 4 available for x1, x2, or x4 slot(s). In addition the SB600 supports 6 PCI slots.
nVidia has 46 PCIe lanes available with 9 links.
psychobriggsy - Thursday, June 1, 2006 - link
This review says there is GigE in SB600, with a PCIe attached PHY. It also says that nVidia's dual GigE is via PCIe attached PHYs. PHYs do not connect via PCIe; they connect to a GigE controller (wherever it is located).
In the case of nVidia, the southbridge has two GigE controllers integrated. In the case of SB600, there is no GigE controller; you attach one via PCIe x1, allowing you to use decent controllers, or crappy Realtek controllers (giving you another thing to check when buying a motherboard).
Stele - Friday, June 2, 2006 - link
That's probably what ATi's thinking. There are pros and cons to both nVidia's on-die MAC and SB600's lack of on-die MAC. By having no controller on the SB600, the chip cost is reduced while motherboard manufacturers have complete freedom to choose whichever controller they would like to include - Marvell, Realtek, etc. and single- or dual-port.
The only downside is that you'd need extra real estate on the motherboard, though arguably it's not that big a deal, especially if controllers with built-in MAC and PHY are used. After all, for dual-port networking capabilities that have server-like features like teaming and fail-over, manufacturers can just use such products as the very attractive Marvell 88E8062 PCIe x4 dual-port GbE controller - which some motherboards like the Asus P5WDG2-WS already do.
Indeed, I'm hoping (dreaming?) that at least one of the top motherboard brands would use this controller in their RD580 solutions, but the fact that the controller is likely going to be quite expensive, along with the perceived lack of the need for such a high-end component would probably kill that idea.
Wesley Fink - Thursday, June 1, 2006 - link
BOTH the ATI and nForce 590 use PHYs that connect to the chip and communicate over a PCIe lane. We were merely differentiating that the Gigabit LAN in both cases communicated over PCIe and was not connected to PCI. nVidia has 2 Gigabit PHY connections, while ATI has 1 Gigabit PHY connection.
peternelson - Thursday, June 1, 2006 - link
He's right, the PHY (external or internal) connects to the MAC, which is subsequently connected to the pcie lanes. No pcie goes to any GbE PHY.