[x264-devel] intel co-processors ... finally a way to utilize HPC for x264?

Jason Garrett-Glaser jason at x264.com
Tue Jun 19 21:36:20 CEST 2012


On Tue, Jun 19, 2012 at 6:12 AM, aviad rozenhek <aviadr1 at gmail.com> wrote:
> Dear experts,
> Intel has announced a new family of "co-processors" on PCIe [aka GPUs].
>
> excerpt:
>>
>> Knights Corner – which would have 50+ cores and be manufactured on Intel’s
>> 22nm process.
>
> and:
>>
>> with Intel confirming that it is indeed using an enhanced Pentium 1 (P54C)
>> core with the addition of vector and FP64 hardware. Intel has also confirmed
>> that Xeon Phi will offer 512-bit SIMD operations
>
> http://www.anandtech.com/show/6017/intel-announces-xeon-phi-family-of-coprocessors-mic-goes-retail
>
> for a long time i've argued that the best live H.264 encoding server is a
> latest-gen x86 server with 8-12 cores running x264, as it delivers a good
> compromise between TCO and encoding quality.
> however there are other options like encoding on DSPs and GPUs [which often
> come at a price when it comes to quality] or on edge servers [which are more
> expensive].
>
> the question is, do you think that this proposed architecture from intel
> would be efficient for encoding high-quality HD content?

"Knights Corner" is just a rehash of Larrabee, and likely just as much
of a disaster.  Unless they've made staggering improvements, it
likely still has the following core problems:

1.  No 8-bit or 16-bit operations; the "512-bit" SIMD is in practice
only 128-bit or 256-bit at best.
2.  The clock speed is far too slow and the in-order cores can't do
enough operations per cycle.
3.  Typical resolutions can't make use of that many cores in a
single encoding process.
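
To make the arithmetic behind problem 1 concrete, here is a rough sketch
in Python.  It assumes the narrowest SIMD lane is 32 bits, which is my
reading of "no 8-bit or 16-bit operations" and not a confirmed hardware
detail; the function names are made up for illustration.  Pixel data is
mostly 8-bit, with 16-bit intermediates, so the number of elements per
instruction is what matters, not the register width.

```python
def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements processed by one SIMD instruction."""
    return register_bits // element_bits


def equivalent_width(register_bits: int, min_lane_bits: int, data_bits: int) -> int:
    """Width of a SIMD unit with native data_bits lanes that would
    process the same number of elements per instruction."""
    return lanes(register_bits, min_lane_bits) * data_bits


# Assumption: 32-bit is the smallest lane the 512-bit unit supports.
print(lanes(512, 32))                 # elements per instruction
print(equivalent_width(512, 32, 8))   # equivalent width for 8-bit data
print(equivalent_width(512, 32, 16))  # equivalent width for 16-bit data
```

Under that assumption a 512-bit register holds 16 elements, the same
count as a 128-bit register of 8-bit values or a 256-bit register of
16-bit values.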

Tests with Larrabee indicated that even in an ideal world where
problem 3) didn't exist and scaling was flawless, Larrabee would
be slower than a Core 2 while still using more power.  I doubt
Knights Corner, based on the same concept, is much better.

But maybe -- if Knights Corner runs at 3 GHz and does 3 or 4
instructions per clock -- it might be useful.
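
As a back-of-envelope illustration of that condition, the sketch below
multiplies clock speed by instructions per clock, then by core count.
The 3 GHz and 3 or 4 instructions per clock are the hypothetical figures
from this mail, and 50 cores comes from the quoted "50+ cores"; none of
them are measurements.  The aggregate figure is a best case that assumes
flawless scaling and ignores SIMD width and memory bandwidth.

```python
def per_core_rate(clock_ghz: float, instr_per_clock: float) -> float:
    """Peak instruction rate of one core, in billions per second."""
    return clock_ghz * instr_per_clock


def ideal_aggregate_rate(cores: int, clock_ghz: float, instr_per_clock: float) -> float:
    """Upper bound across all cores, assuming perfect scaling."""
    return cores * per_core_rate(clock_ghz, instr_per_clock)


# Hypothetical inputs only: 3 GHz, 3 or 4 instructions/clock, 50 cores.
for ipc in (3, 4):
    print(ipc, per_core_rate(3.0, ipc), ideal_aggregate_rate(50, 3.0, ipc))
```

Plugging in a real clock speed and a realistic instruction rate for an
in-order core gives a quick sense of how far a given part is from that
threshold.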

Jason

