Overclocking Intel's New 45nm QX9650: The Rules Have Changed
by Kris Boughton on December 19, 2007 2:00 AM EST
Posted in: CPUs
The Origins of Static Read Control Delay (tRD)
With over a year of experience overclocking the Core 2 family of processors, we have learned a thing or two. One of the most important items we've learned is that higher FSB settings do not necessarily mean better performance. Understandably, this may come as a shock to some. For whatever reason, even a lot of well-regarded, seasoned overclockers seem to place great value in achieving the highest possible FSB. Based on what we know, we always establish our base target MCH overclock at the same spot - 400MHz FSB with a tRD of 6. The only other potential base MCH target value even worth considering is 450MHz with a tRD of 7, which should only be used when extra memory speed is needed or when a low maximum CPU multiplier becomes a limiting factor. Without getting into too much detail, let's examine what we mean by this.
When it comes to overclocking, the MCH functions as a hybrid of sorts. Like a CPU, it has an upper frequency limit and more voltage can often raise this limit. On the other hand, since it interfaces with memory it also behaves somewhat like memory with internal "timings" whose absolute values derive from the established FSB.
Consider the case of memory rated to run DDR-800 at CAS 3. We can calculate the absolute CAS (Column Address Strobe) delay in a few quick steps. DDR-800, which is in fact double data rate as the name suggests, runs at a base frequency of 400MHz or 400 million cycles per second. Inverting this value tells us the number of seconds per cycle (2.5ns). Finally, multiplying this by the CAS rating tells us the total delay time of 7.5ns (3 x 2.5ns). Likewise, setting a CAS value of 4 results in an absolute CAS delay of 10ns. We can see now why higher CAS values give way to lower memory bandwidths - in the case described above the MCH spends more time "waiting" for data to become available when the memory is set to CAS 4.
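The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation described in the text; the function name `cas_delay_ns` and its parameters are our own invention, not part of any tool or BIOS.

```python
def cas_delay_ns(data_rate: float, cas_cycles: int) -> float:
    """Absolute CAS delay in nanoseconds (hypothetical helper).

    data_rate  -- the DDR rating, e.g. 800 for DDR-800
    cas_cycles -- the CAS setting in clock cycles
    """
    base_clock_mhz = data_rate / 2           # double data rate: two transfers per clock
    cycle_time_ns = 1000.0 / base_clock_mhz  # invert the frequency to get time per cycle
    return cas_cycles * cycle_time_ns


for cas in (3, 4):
    print(f"DDR-800 at CAS {cas}: {cas_delay_ns(800, cas):.1f} ns")
# DDR-800 at CAS 3: 7.5 ns
# DDR-800 at CAS 4: 10.0 ns
```

The same helper works for any rating and CAS pair, which makes it easy to compare the absolute delay of different memory settings.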
tRD in hiding…we promise we didn't make up the horrible "Performance Level" moniker
Arguably, the most important MCH setting when it comes to performance tweaking is the Static Read Control Delay (tRD) value. Like memory CAS latency (CL), this value is relative to the FSB. Case in point, a tRD value of 6, calculated in the same manner as before (6 x 2.5ns), tells us that the MCH sets a read delay of 15ns at an FSB of 400MHz. This means that in addition to the time required for the CPU to issue a request for data in memory to the MCH, the time the MCH spends translating and issuing the command to the memory, and the time the memory requires in retrieving the requested data, the MCH will spend an additional 15ns simply waiting for valid data to become available before fulfilling the CPU's original read request. Obviously, anything that can minimize this wait will be beneficial in improving memory read bandwidth and quite possibly overall system performance.
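As a rough sketch of the same calculation applied to tRD, the snippet below compares the two base MCH targets mentioned earlier (400MHz FSB with a tRD of 6, and 450MHz with a tRD of 7). The function name `trd_delay_ns` is hypothetical, and the model assumes nothing beyond "tRD cycles multiplied by the FSB cycle time" as described in the text.

```python
def trd_delay_ns(fsb_mhz: float, trd: int) -> float:
    """Absolute MCH read delay in nanoseconds (hypothetical helper).

    fsb_mhz -- base FSB clock in MHz (not the quad-pumped rating)
    trd     -- Static Read Control Delay setting, in FSB cycles
    """
    cycle_time_ns = 1000.0 / fsb_mhz  # time per FSB clock cycle
    return trd * cycle_time_ns


for fsb, trd in ((400, 6), (450, 7)):
    print(f"{fsb}MHz FSB, tRD {trd}: {trd_delay_ns(fsb, trd):.2f} ns")
# 400MHz FSB, tRD 6: 15.00 ns
# 450MHz FSB, tRD 7: 15.56 ns
```

Under this simple model, the higher FSB with the looser tRD ends up with a slightly longer absolute read delay. That is consistent with the point that a higher FSB does not by itself guarantee better memory performance.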
Until recently, direct tRD manipulation by the user was not even possible. In fact, for the longest time BIOS engineers had no choice but to accept this setting as essentially "hard-coded", making MCH performance rather lackluster. The only way to increase memory subsystem performance was to run at higher FSB settings or tighten primary memory timings. At some point, the MCH design teams got tired of the CPU people hogging all the glory and in a well-calculated effort to boost MCH performance exposed this setting for external programming.
The outside world's first introduction to variable tRD settings came when a few overclockers noticed that setting lower MCH "straps" allowed for higher memory bandwidths. What they didn't know at the time was that they had unintentionally stumbled upon tRD. Tricking the motherboard into detecting an installed CPU as an 800 FSB (200MHz) part forced the MCH into setting a lower tRD value than if the FSB were 1066 (266MHz). Consequently, overclocking the system to the same higher FSB value with the lower strap setting yielded higher memory performance. Oftentimes the effect was significant enough that real-world performance was higher even with a lower final FSB. The tradeoff was apparent, however: a lower strap meant a lower maximum FSB. The MCH tRD value, just like a memory timing, must eventually be loosened in order to scale higher. What's more, as is the case with memory, additional voltage can sometimes allow the MCH to run with tighter "timings" at higher speeds.
Eventually the inevitable next step in memory performance tuning became a reality. The option to adjust tRD independent of MCH strap selection became part of every overclocker's arsenal. Nowadays the MCH strap setting does little more than determine which memory multiplier ratios are available for use. Although tRD adjustments are now possible in many BIOS implementations, some motherboard manufacturers choose to obfuscate their true nature by giving the setting confusing, proprietary names like "Transaction Booster" and the like. Don't let these names fool you; in the end they all do the same thing: manipulate tRD.
56 Comments
mariedeguzman - Friday, June 19, 2009 - link
Thanks for this post, this is a great article and a good help to those who need advices about this post.
Markfw900 - Thursday, January 10, 2008 - link
My Gigabyte P35-DQ6 does have what you say is voffset, but it has NO vdroop from idle to load. I believe this is because it has a far superior power delivery system. I don't have an instrument to tell me any differences that may happen in nanoseconds on the voltage, but overall, it never seems to change. This would be consistent with a high quality board. So why do you say it's a feature? I can see how a mfg may undervolt to not go over recommended vcore for non-overclocked CPUs, but if I didn't overclock, my board wouldn't have vdroop either.
It's just cheap motherboards, not a "feature". If I am wrong, please test a DQ6 and show the results.
LaGUNaMAN - Saturday, January 5, 2008 - link
One of the best tech articles I've read in awhile. (^^,)
isvaljek - Tuesday, January 1, 2008 - link
"typically, even the worst "performance" memory can handle CAS3 when running at about DDR2-800, CAS4 to about DDR2-1075, and CAS5 for anything higher."
Are they for real?
mindless1 - Monday, December 31, 2007 - link
Considering the heat produced I can't see a justification for the idea of drastic shifts in the cooling industry. Realistically there aren't THAT many overclockers using water cooling at all and current (including older) processors having lower power consumption were what brought the cooling industry to what it is today.
You may say past some point the heat isn't the factor, but you still need a decent heatsink up until that point. 100W of heat for example is a non-trivial level even though some past parts have exceeded that.
mindless1 - Monday, December 31, 2007 - link
What I really meant to say is that it's not just a matter of getting rid of the heat but doing so without the system sounding like it has a leaf blower hidden inside, and for that many lesser heatsinks just don't cut it.
SilthDraeth - Friday, December 21, 2007 - link
And their TDP measurement is the same as it has always been, maximum draw.
Yes ACP is a marketing tool. So what. MHZ is a marketing tool as well, and still has real world benefits. Same as ACP.
wordsworm - Thursday, December 20, 2007 - link
Best damned article I've seen out of AT in a long time. Bravo.