HDlink and CF output use only one core?


Don Blish
March 17th, 2011, 03:52 PM
Clearly CS5 + Neo4K and a suitable Nvidia card are a huge improvement over CS4. I do note, however, that both HDLink conversions of .mts to .avi and AME CineForm render-outs use only 25% of my quad. The performance monitor shows HDLink running 35 threads... but at 22-25% it looks like it has not engaged my other 3 cores. I see similar utilization with CF output from PPro CS5 when all video adjustments are done in FirstLight. Nothing else seems to be a bottleneck... is there some setup item I missed, or is the app just not multi-threaded enough?

Details
ASUS P6T Deluxe V2 mobo, X58 chipset and 3 banks x 2 sticks for RAM
running 12GB Kingston 1333 DDR3 as 6 sticks
i7-875 quad at 3.33GHz and the Win7 x64 device manager shows 8 processors
very effective cooler and case venting
OS on single 7200rpm SATA disc
Video and all work files on striped pair 7200rpm SATAs
and show only abt 15MBps throughput (copies can go way over 50MBps)
Nvidia FX 3800 which CS5 does recognize

[Keep up all those great updates! - Don]
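
A quick arithmetic check on the figures above, as a minimal sketch: it assumes the overall CPU percentage is an average across all logical processors (8 on this machine, per the device manager). The function name is hypothetical and exists only for illustration.

```python
def busy_logical_cpus(total_percent, logical_cpus):
    """Convert an overall CPU percentage into the equivalent
    number of fully busy logical processors."""
    return total_percent / 100.0 * logical_cpus


# Figures from the post: about 25% overall, 8 logical processors reported.
print(busy_logical_cpus(25, 8))    # 2.0

# For comparison, one fully busy logical processor out of 8 reads as 12.5%.
print(busy_logical_cpus(12.5, 8))  # 1.0
```

Under that assumption, 22-25% on an 8-way display is roughly two logical processors' worth of work, not a single saturated core. The load may also be spread thinly across all eight, so the per-processor graphs are worth checking.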

David Newman
March 17th, 2011, 09:58 PM
No, we are fully threaded. HDLink should regularly saturate all your cores if your disks can keep up (I can saturate 12 cores). CS5 export is just slow for CineForm exports; our encoder is twiddling its thumbs waiting for frames from Premiere.

Jay Bloomfield
March 19th, 2011, 03:20 PM
... Video and all work files on striped pair 7200rpm SATAs
and show only abt 15MBps throughput (copies can go way over 50MBps)


How are you measuring real time disk throughput while encoding? I have a dual CPU (12 physical, X5650) Xeon workstation and when I read and write from independent striped pairs of 7200 rpm SATA drives, I get the same CPU load (somewhere around 15-18%), so I suspect that what David said is correct, the disk subsystem isn't keeping up with the CPUs. Benchmarking my drives with ATTO Disk Benchmark, I get ~ 150 MB/s reads and writes and I know these numbers are accurate, since I can capture full HD with my Blackmagic Design capture card without dropping frames.

Just as an aside, to some extent how many cores a video encoder uses also depends on the decoding. HDLink will use a lot more of the CPU power if it encodes either an MPEG2 or h264 file to CFHD.

Don Blish
March 20th, 2011, 04:34 PM
For disc throughput measures I just use Win7 x64's Task Manager and its Resource Monitor. My video striped pair are WD Caviar-Green 1.5TB, 7200rpm discs, SATA connected to the Intel ICH10R ports on the mobo. They show about 30MB/sec during either HDLink (AVCHD to .avi) or CS5 CineForm exports. Using the "Overview" tab just now on a 6GB copy from the striped pair to my single 7200rpm root drive, the overview showed 90GB/sec. From the "Disc" tab you could see that it was 45GB/sec for each drive.

Perhaps I need two striped pairs, one for the source and one for the target of such efforts.

Jay Bloomfield
March 20th, 2011, 07:28 PM
The only other "bottleneck" that I can think of is that disk writes are much slower if the amount of data written in each write cycle is small. For example, my striped pairs will support 150 MB/s transfers, but only above 64 KB transfer sizes. You would think that the CFHD encoding chain would be writing large amounts of data at a time to the hard drive, though.
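
The transfer-size effect can be checked roughly with a few lines of Python. This is a minimal sketch, not a substitute for a real benchmark such as ATTO: the function names, block sizes, and file size are assumptions chosen for illustration, and OS write caching will inflate the numbers for small test files.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, block_size, total_bytes):
    """Write about total_bytes to a temporary file in block_size chunks
    and return the observed rate in MB/s (10**6 bytes per second)."""
    block = b"\0" * block_size
    blocks = total_bytes // block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        # buffering=0 makes each write() call reach the OS at the chosen size.
        with os.fdopen(fd, "wb", buffering=0) as f:
            for _ in range(blocks):
                f.write(block)
            os.fsync(f.fileno())  # ask the OS to flush its cache to the disk
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)
    return blocks * block_size / elapsed / 1e6


def compare_block_sizes(directory, block_sizes, total_bytes):
    """Return {block_size: MB/s} for each block size, measured in turn."""
    return {size: write_throughput_mb_s(directory, size, total_bytes)
            for size in block_sizes}
```

To try it, point `directory` at a folder on the striped pair and compare, say, 4 KB, 64 KB, and 1 MB writes with a total size of a few hundred MB: `compare_block_sizes("D:/scratch", [4096, 65536, 1048576], 256 * 1024 * 1024)` (the path here is hypothetical).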

Also, I assume that you meant MB/s. BTW, I also see a maximum hard disk transfer rate of 50 MB/sec when encoding CFHD, which is well below the max that I see with other software, including OS copies from one drive to the other.

Jay Bloomfield
March 21st, 2011, 07:42 PM
Here's an update (Sorry, I couldn't edit the previous post). I created a RAM Disk (since I have 24 GB RAM) and tried to transcode an mt2 file to CFHD. Now, the RAM Disk has read and write transfer speeds that are at least an order of magnitude higher than even the fastest consumer SSD (I get 2.7 GB/s). Even with putting both the input and output files on the RAM Disk, the maximum CPU usage on my computer was 35%, indicating that the disk subsystem is not the cause of the bottleneck. I personally don't think that this result has anything to do with HDLink, but rather either has something to do with the OS or with the nature of video transcoding in general and how amenable it is to hyperthreading with large numbers of threads.

RAMDisk Enterprise is shareware, so if you want to test your system, you can download it from the author's website:

RAMDisk for Windows 2000 / XP / Server 2003 / Vista / PE (http://members.fortunecity.com/ramdisk/RAMDisk/ramdiskent.htm)

David Newman
March 21st, 2011, 07:59 PM
No one has discussed what they are converting. If the source format uses a third-party decoder that is single threaded, that will be your bottleneck. Be careful installing tools like FFDShow, as your source decoder may change from the one we ship to something else on your system.
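
To see why one single-threaded stage can cap overall CPU usage, here is a minimal sketch based on Amdahl's law. It assumes a simplified model in which a fraction of the per-frame work runs on one thread and the rest spreads perfectly across all logical processors, with no overlap between the two stages. The function name and the example fractions are hypothetical, not measurements of HDLink or any decoder.

```python
def max_cpu_utilization(serial_fraction, logical_cpus):
    """Upper bound on average CPU utilization (0..1) when serial_fraction
    of the work is single threaded and the rest scales perfectly.

    Wall time per unit of work = s + (1 - s) / n   (Amdahl's law)
    Utilization = work / (n * wall time) = 1 / (n * s + (1 - s))
    """
    s, n = serial_fraction, logical_cpus
    return 1.0 / (n * s + (1.0 - s))


# A fully single-threaded job on 8 logical processors.
print(max_cpu_utilization(1.0, 8))  # 0.125

# Assumed serial fractions, for illustration only.
for s in (0.05, 0.25, 0.5):
    print(s, round(max_cpu_utilization(s, 8) * 100), "%")
```

In this model, even a modest single-threaded share keeps the overall percentage well below 100%, however many threads the encoder itself runs.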

Jay Bloomfield
March 21st, 2011, 09:06 PM
I get a somewhat better result "transcoding" Cineform to Cineform using HDLink (just resizing): CPU usage is in the range of 50-60%, using a RAM Disk for both input and output. In any case, whatever is causing this behavior on dual Xeon Win 7 x64 systems (as reported by the OP; I was only checking on it because I never thought much about it before) does not inherently have anything to do with the disk subsystem. My result is with build 308 of NEO 5 4K.

Don Blish
March 22nd, 2011, 12:32 PM
No one has discussed what they are converting. If the source format uses a third-party decoder that is single threaded, that will be your bottleneck. Be careful installing tools like FFDShow, as your source decoder may change from the one we ship to something else on your system.

As quickly said in my post, I was converting an .mts file in AVCHD format from my Pany HDC-TM300. Could the .MTS "reader" be the bottleneck?

[The other "complaint" about CS5 cine renderout you explained was that your software was waiting on CS5 to offer up the frames].

David Newman
March 22nd, 2011, 12:38 PM
My 6 core 12-HT Gulftown system converts Panasonic MTS files with 70% CPU usage. MTS is not the issue.

Jay Bloomfield
March 22nd, 2011, 08:46 PM
It's getting harder and harder to figure out which codecs are being used in which instances, and the GraphEdit "replacement", Microsoft's Media Foundation Topology Editor, doesn't work with all codecs. This "codec fog" exists even on a system like mine that has very little "fluff" installed, except for programs like Vegas and CS5.

Maybe this is a bad example, but I created a graph for a Blackmagic Design AVI file with GraphEdit. Using the Microsoft MJPEG decoder, only one thread is used. If I edit the graph and use the Decklink MJPEG codec, it is obviously multithreaded (using a Win 7 gadget that displays all thread activity). I created both x86 & x64 graphs and the results were the same.

I wonder how heavily threaded the rest of the built-in Microsoft codecs are?

Stephen Armour
March 23rd, 2011, 07:14 AM
...RAMDisk Enterprise is shareware, so if you want to test your system, you can download it from the author's website:

RAMDisk for Windows 2000 / XP / Server 2003 / Vista / PE (http://members.fortunecity.com/ramdisk/RAMDisk/ramdiskent.htm)

This URL is apparently no longer valid, Jay. Any suggestions on how to find this now?

Jay Bloomfield
March 23rd, 2011, 08:03 PM
I just tried it. It works for me. Maybe the site was just down for a while. If you can't find it on the author's site, you can find it with Google, etc.

Aside from the issue raised in the OP of this thread, a RAM Disk is also helpful to speed up any operation where an intermediate file is used. You have to keep in mind that such storage is volatile (the contents are lost when you reboot) and even 24 GB of RAM isn't really that big these days, when looking at the size of some video files. RAM Disks can be expanded and contracted on the fly, although sometimes it requires a reboot.

Stephen Armour
March 24th, 2011, 07:35 AM
Weird. I still can't access it and couldn't find it by Googling! We're located in Northeastern Brazil, so who knows what's up.

We've been having some really strange DNS server probs for weeks and I've been swapping new DNS servers in and out of our router to try to re-route to troublesome URLs.

Attached is the screenshot I'm getting. If you've got a direct link to their download, I'd appreciate it.

Jay Bloomfield
March 24th, 2011, 03:30 PM
@Stephen,

I PMed you with the links, as I didn't want this thread to go too far off topic. :-)

Stephen Armour
March 25th, 2011, 06:15 AM
@Stephen,

I PMed you with the links, as I didn't want this thread to go too far off topic. :-)

Got them, thanks Jay. Didn't have any time to download or check them out yesterday, as I was really busy.

Today I'll see if I can get things to work right.

Jay Bloomfield
March 29th, 2011, 12:26 PM
After researching this a bit further in my spare time, I've come to the conclusion that it is going to be very difficult to figure out why you can't "pin" all the threads/cores to 100% when encoding with any A/V software under Win 7.

My "experiment" with MJPEG & GraphEdit really showed nothing, since a) GraphEdit is so old that it may not take advantage of multiple cores and hyperthreading, and b) the built-in Win 7 Microsoft Media Foundation and DMO (DirectShow) codecs may not be efficiently multithreaded, because why do they have to be? Not much uses them presently, except software allied with the OS directly, such as Windows Media Player (for example, PPro CS5 and Sony Vegas both come with built-in codecs for most formats). And WMP only displays video. It doesn't encode it, so the limiting factors are the frame size (mostly < 1920x1080) and frame rate (mostly < 30 fps at HD resolutions). In these cases the displayed frame rate is probably the "bottleneck". It's not like we're dealing with a video game, which might display at some high frame rate if the computer is lightning fast.


Let me know if you come up with anything else of interest on this.