Media Encoding: Not as Happy to See Skulltrail as You'd Think

When we first got the Skulltrail machine we had visions of ripping HD-DVD and Blu-ray discs in record time, transcoding on the fly and just chewing through our DivX tests. While one of our media encoding benchmarks showed reasonable gains, for the most part we didn't see scaling beyond four cores with our encoders.

DivX 6.8 with Xmpeg

Our DivX test is the same one we've run in our regular CPU reviews: we're simply encoding a 1080p MPEG-2 file to DivX. We use an unconstrained profile with enhanced multithreading enabled.

DivX 6.8 w/ Xmpeg 5.1.3

While the dual QX9775 setup is technically faster than a single QX9770, it's not faster by all that much. Clock speed matters far more here.

Windows Media Encoder 9 x64

Using Windows Media Encoder's advanced video profile, we encode a 500MB AVI file. This is the same test we've run in other CPU reviews.

Windows Media Encoder 9 x64

Once again we're not seeing great performance gains from 8 cores; there's basically no performance advantage to Skulltrail.

x264 Encoding with AutoMKV

Using AutoMKV, we compress the same source file we used in our WME test down to 100MB, but with the x264 codec. With anything less than the 2 Pass Insane Quality profile we won't see any scaling on 8 cores, but with the highest quality settings enabled we end up with around 80-90% CPU utilization across all 8 cores.

x264 w/ AutoMKV

Higher quality x264 encodes will benefit from 8 cores, but anything less intensive will show gains similar to what we saw with DivX and WME.

30 Comments

  • moiettoi - Friday, June 27, 2008

    Hi all

    This sounds like a great board, and for someone like me who uses 4x 22" monitors and does heaps of multitasking it sounds perfect. I would gladly pay the price asked.

    BUT why is such a great board slowed right down by not having DDR3 memory sticks? From what I've read, at the moment there is not that much difference between running this and what I have now, which is a quad core with DDR3. It runs great, but I do overwork it, so bigger would be better.

    You would think, and I'm sure they already know, that it would be common sense to make this board with DDR3, as that is its only fault as far as I can see.

    We will probably see that board come out soon, or next in line, once they have sold enough of these to satisfy their egos.

    Great board, but just not yet. I will be waiting for the next one out, which will have to carry DDR3 if they want to go forward in their technology.
  • VooDooAddict - Thursday, February 7, 2008

    For testers of large distributed systems this is an awesome thing to have sitting on your desk.

    You can have a small server room running on one of these.

    The biggest shortfall I see is cramming enough RAM on it.
  • iSOBigD - Tuesday, February 5, 2008

    I'm actually very disappointed with 3D rendering speed. Going from 1 core to 4 cores takes my rendering performance up by close to 400% (16 seconds to 4.something seconds, etc.) in Max with any renderer. (I've tried Scanline, MentalRay and VRay) ...so I'm surprised that going from 4 to 8 gives you 40-60% more speed. That's pretty pathetic, so I suspect the board is to blame, not the software.
  • martin4wn - Tuesday, February 5, 2008

    Actually 40-60% is not disappointing at all, it's quite impressive. You are encountering the realities of Amdahl's law, which is that only the parallel part of the app scales. Here's a simple workthrough:

    Say the application is 94% parallel code and 6% serial. As you add cores, say the parallel part scales perfectly, so doubles in speed with every doubling in core count. Now say the runtime on one core is 16 seconds (your example). Of that, 1 second is serial code and the other 15 seconds is parallel code running serially.

    Now running on a 4 core machine, you still have the 1s serial, but the parallel part drops to 15/4 = 3.75 seconds. Total runtime 4.75s. Overall scaling is 3.4x. Now go to 8 cores. Total runtime = 1 + 15/8 = 2.875s. That is a gain of about 65% going from 4 cores to 8 cores, and overall scaling of 5.6x.

    So the numbers are actually consistent with what you are seeing. It's a great illustration of the power of Amdahl's law: even an app that is 94% parallel still only gains about 65% going from 4 to 8 cores with perfect scaling, and it's really hard to get good scaling at even moderate core counts. Once you get to 16 or more cores, expect scaling to fall off even more dramatically.
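
The workthrough above can be checked with a few lines of Python. This is only a minimal sketch of the idealized model in that comment (1 second of serial work plus 15 seconds of perfectly parallel work); the function names are made up for illustration and do not come from any renderer or benchmark tool.

```python
def amdahl_runtime(serial_s, parallel_s, cores):
    """Runtime in seconds when only the parallel part scales with core count."""
    return serial_s + parallel_s / cores


def speedup(serial_s, parallel_s, cores):
    """Speedup relative to running the same work on a single core."""
    single_core = amdahl_runtime(serial_s, parallel_s, 1)
    return single_core / amdahl_runtime(serial_s, parallel_s, cores)


if __name__ == "__main__":
    # Assumed split from the comment: 16 s total on one core, ~6% serial.
    SERIAL_S, PARALLEL_S = 1.0, 15.0

    for cores in (1, 4, 8, 16):
        runtime = amdahl_runtime(SERIAL_S, PARALLEL_S, cores)
        scale = speedup(SERIAL_S, PARALLEL_S, cores)
        print(f"{cores:2d} cores: {runtime:.3f} s, {scale:.2f}x")

    # Relative gain from doubling 4 cores to 8 cores.
    gain = amdahl_runtime(SERIAL_S, PARALLEL_S, 4) / amdahl_runtime(SERIAL_S, PARALLEL_S, 8) - 1
    print(f"4 -> 8 cores: {gain:.0%} faster")
```

Under these assumptions, 4 cores finish in 4.75 s and 8 cores in 2.875 s, so doubling the core count buys roughly 65% in the best case. Real renderers add synchronization and memory overhead on top, which is why measured gains of 40-60% are plausible.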
  • ChronoReverse - Tuesday, February 5, 2008

    This is why I'm quite happy with my quad core. The useful limit on the desktop would probably be a quad core with SMT. After that, faster individual cores will be needed regardless of how parallel our code gets (face it, you're not getting 90% parallelizable software most of the time, and even then 8 cores over 4 isn't getting more than about a 50% boost in the best case for 90% parallel code).
  • FullHiSpeed - Tuesday, February 5, 2008

    Why the heck does this D5400XS MB support only the QX9775 CPU??? If you need to use 8 cores you can get a lot more bang for the buck with quad core Xeon 5400 series, with only 80 watts TDP each, up to 3 GHz. For a TOTAL of $508 ($254 each quad) you can have 8 cores @ 2 GHz.

    Last month I built a system with a Supermicro X7DWA-N MB ($500), 4 GB of DDR2 667 ($220) and a single 2.83 GHz Xeon E5440 ($773), which I use to test Gen 2 PCIe, dual channel 8 Gb/s Fibre Channel boards, two boards at once.
  • Starglider - Tuesday, February 5, 2008

    Damn it. AMD could've destroyed this if they'd gotten their act together. Tyan makes a 4 socket Opteron board that fits into an E-ATX form factor:

    http://www.tyan.com/product_board_detail.aspx?pid=...

    I was strongly tempted to get one before the whole Barcelona launch farce. If AMD hadn't made such horrible execution blunders and could have devoted the kind of resources Intel had to a project like this, we could have four Barcelonas running at 3 to 3.6 GHz with eight DDR2 slots all on a dedicated channel. Ah well. Guess I'll be waiting for Nehalem.
  • enigma1997 - Tuesday, February 5, 2008

    Note what Francois said in his Feb 04 reply regarding memory timing: http://blogs.intel.com/technology/2008/01/skulltra... Do you think it would help the latency and bring it closer to that of DDR2/DDR3? Thanks.
  • enigma1997 - Tuesday, February 5, 2008

    CL3 FBDIMM from Kingston would be "insanely fast"?! Have a read of this article: http://www.tgdaily.com/content/view/34636/135/
  • Visual - Tuesday, February 5, 2008

    I must say, I am very disappointed.

    Not with the performance - everything is as expected on this front... I didn't even need to see benchmarks about it.

    But prices and availability are hell. AMD giving up on QuadFX is hell. Intel not letting us use DDR2 is hell.

    I was really hoping I could get a dual-socket board with a couple (or quad) PCI-express x16 slots and standard ram, coupled with a pair of relatively inexpensive quadcore CPUs. Why is that too much to ask?

    The ASUS L1N64-SLI WS board has been available for an eon now, costs less than $300 and has quite a good feature set. Quadcore Opterons for the same socket are also available for more than a quarter, some models as cheap as $200-$250.
    Unfortunately, for some god-damned reason neither ASUS nor AMD is willing to make this board work with these CPUs. The board works just fine with dual-core Opterons, all the while using standard unbuffered unregistered DDR2 modules, but not with quad cores? WTF.

    And that board is old like the world now. I am quite certain AMD could, if they wanted, have a refresh already - using the newest and coolest chipsets with PCIe 2.0, HT 3.0, independent power planes for each cpu, etc.
    Intel could also certainly have a dual socket board that works with cheap DDR2, has plenty of PCI-express slots, and takes the cheap $300 quad-core Xeons that are out already instead of the $1500 "extremes".

    I feel like the industry is purposely slowing, throttling technological progress. It's like AMD and Intel just don't want to give us the maximum of their real capabilities, because that would devalue their existing products too quickly. They are just standing around idly most of the time, trying to sell out their old tech.
    Same as nVidia not letting us have SLI on all boards, or ATI not allowing crossfire on nforce for that matter.
    Same as a whole lot of other manufacturers too...
    I feel like there is some huge anti-progress conspiracy going on.
