[x264-devel] RE : FPGAs and x264

Jason Garrett-Glaser darkshikari at gmail.com
Sun Jul 5 12:09:35 CEST 2009


On Sun, Jul 5, 2009 at 3:01 AM, David Smith<agentdavo at mac.com> wrote:
>> 16x16 SATD (I have no idea what you mean by "sub blocks") takes 170
>> cycles on a Nehalem CPU.  A SAD takes something around 42, but
>> normally 4 are batched up (SAD_X4) and that takes 152 clocks total.  On
>> Phenom it's around 110 or so.
>
> For example, an FPGA core that in a single cycle takes a 16x16 macroblock,
> then computes and returns the SAD of every sub-block in that 16x16 area, all
> the way down to 4x4 (called a systolic array in some articles). Same for SATD.
> Partition sizes: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4

That isn't useful for the algorithms in x264 though.
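
For anyone following along, the scheme described in the quoted text can be
sketched in C roughly as below. This is only an illustration of the idea, not
code from x264: the struct mb_sads and the function mb_sad_all are names made
up for this example. It computes the sixteen 4x4 SADs once and then sums them
into the larger partitions, which works because SAD is additive over disjoint
sets of pixels.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the 41 partition SADs of one 16x16 macroblock.
 * Names are WxH (width x height); indices are [row][column]. */
typedef struct {
    int sad4x4[4][4];
    int sad8x4[4][2];   /* 8 wide, 4 tall */
    int sad4x8[2][4];   /* 4 wide, 8 tall */
    int sad8x8[2][2];
    int sad16x8[2];     /* top, bottom */
    int sad8x16[2];     /* left, right */
    int sad16x16;
} mb_sads;

/* Illustrative only: fill *out with the SAD of every partition of the
 * 16x16 block at cur against the 16x16 block at ref. */
void mb_sad_all(const uint8_t *cur, int cur_stride,
                const uint8_t *ref, int ref_stride, mb_sads *out)
{
    /* Base layer: one SAD per 4x4 block. */
    for (int by = 0; by < 4; by++)
        for (int bx = 0; bx < 4; bx++) {
            int sum = 0;
            for (int y = 0; y < 4; y++)
                for (int x = 0; x < 4; x++) {
                    int c = cur[(by * 4 + y) * cur_stride + bx * 4 + x];
                    int r = ref[(by * 4 + y) * ref_stride + bx * 4 + x];
                    sum += abs(c - r);
                }
            out->sad4x4[by][bx] = sum;
        }

    /* 8x4: two horizontally adjacent 4x4 blocks. */
    for (int by = 0; by < 4; by++)
        for (int bx = 0; bx < 2; bx++)
            out->sad8x4[by][bx] = out->sad4x4[by][2 * bx]
                                + out->sad4x4[by][2 * bx + 1];

    /* 4x8: two vertically adjacent 4x4 blocks. */
    for (int by = 0; by < 2; by++)
        for (int bx = 0; bx < 4; bx++)
            out->sad4x8[by][bx] = out->sad4x4[2 * by][bx]
                                + out->sad4x4[2 * by + 1][bx];

    /* 8x8: two vertically adjacent 8x4 blocks. */
    for (int by = 0; by < 2; by++)
        for (int bx = 0; bx < 2; bx++)
            out->sad8x8[by][bx] = out->sad8x4[2 * by][bx]
                                + out->sad8x4[2 * by + 1][bx];

    /* 16x8 (rows of 8x8 blocks), 8x16 (columns), then the full 16x16. */
    for (int i = 0; i < 2; i++) {
        out->sad16x8[i] = out->sad8x8[i][0] + out->sad8x8[i][1];
        out->sad8x16[i] = out->sad8x8[0][i] + out->sad8x8[1][i];
    }
    out->sad16x16 = out->sad16x8[0] + out->sad16x8[1];
}
```

Only the 4x4 layer touches pixels; everything above it is a handful of adds.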

Dark Shikari

