[x264-devel] FPGAs and x264

Mon Jul 6 09:53:07 CEST 2009

On Mon, Jul 6, 2009 at 12:49 AM, Gabriel
Bouvigne<gabriel.bouvigne at joost.com> wrote:
> David Smith wrote:
>>
>> I am currently working on a project to offload x264 functions to an FPGA.
>>  I have read several papers recently that describe the offloading of
>> x264_me_search_ref.   However, that is a large function to replicate in
>> hardware and well beyond the scope of my project.
>>
>> For my current project I am designing a 16-core SAD/SATD processor.  Each
>> core will have 4 processing elements.  Each element can compute either a
>> SAD or a SATD on blocks of up to 8x8.
>
> Just offloading simple DSP functions to an FPGA is a bad idea when the host
> is a modern CPU.
>
> Offloading more advanced DSP functions (FIR, big FFT or MDCT) can be useful,
> but something as simple as a 16x16 SAD/SATD will likely not be useful at all
> (except for learning). Those functions are quite fast on modern CPUs, and
> offloading them to an FPGA will likely be latency-bound (especially if you
> have to go through PCI, probably less so if your FPGA is directly on a
> HyperTransport link).
>
> To do some useful offloading to an FPGA, you would either have to offload
> bigger functions (like the whole deblocking, but then it's not a huge part
> of x264's CPU use), or add a bit of conditional logic on the FPGA side,
> in order to cover something bigger than just a simple DSP function, without
> the need to go back to the CPU each time.
>
> For educational purposes, if you really want to offload only SAD, you might
> want to check its benefits in the context of a full motion search (but bear
> in mind that full search is nearly useless in real encoders).

And also note that it will probably be extremely difficult to
outperform SEA on a CPU with a naive exhaustive search on either a GPU
or an FPGA.
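
To illustrate why SEA (successive elimination) is hard to beat: for any
candidate block, |sum(cur) - sum(ref)| is a lower bound on the SAD, so a
candidate whose bound is already no better than the best SAD found so far can
be skipped without computing its SAD at all. Below is a minimal sketch in C,
not x264's actual code. The names sad_block, block_sum and sea_search are
hypothetical, the 8x8 block size is chosen only to match the processing
elements described above, and motion-vector cost is ignored. A real
implementation would also precompute the block sums once per reference frame
rather than recomputing them for every candidate.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define BLK 8 /* assumed block size for this sketch */

/* Sum of absolute differences between two BLK x BLK blocks. */
int sad_block(const uint8_t *a, int a_stride, const uint8_t *b, int b_stride)
{
    int sum = 0;
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            sum += abs(a[y * a_stride + x] - b[y * b_stride + x]);
    return sum;
}

/* Sum of all pixels in one BLK x BLK block. */
int block_sum(const uint8_t *p, int stride)
{
    int sum = 0;
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            sum += p[y * stride + x];
    return sum;
}

/* Exhaustive search of a w x h reference area with SEA pruning.
 * Returns the minimum SAD, stores its position in best_x/best_y, and
 * counts in sads_computed how many full SADs were actually evaluated. */
int sea_search(const uint8_t *cur, int cur_stride,
               const uint8_t *ref, int ref_stride, int w, int h,
               int *best_x, int *best_y, int *sads_computed)
{
    int cur_sum = block_sum(cur, cur_stride);
    int best = INT_MAX;
    *sads_computed = 0;
    for (int y = 0; y + BLK <= h; y++) {
        for (int x = 0; x + BLK <= w; x++) {
            const uint8_t *cand = ref + y * ref_stride + x;
            /* Lower bound on the SAD; recomputed here for simplicity. */
            int bound = abs(cur_sum - block_sum(cand, ref_stride));
            if (bound >= best)
                continue; /* cannot beat the current best: skip the SAD */
            int sad = sad_block(cur, cur_stride, cand, ref_stride);
            (*sads_computed)++;
            if (sad < best) {
                best = sad;
                *best_x = x;
                *best_y = y;
            }
        }
    }
    return best;
}
```

Because the bound never exceeds the true SAD, the pruned search returns the
same minimum as a naive exhaustive search while evaluating only a subset of
the candidates, which is the work a brute-force GPU or FPGA search would have
to make up for.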

Dark Shikari