[x264-devel] FPGAs and x264

Jason Garrett-Glaser darkshikari at gmail.com
Sun Jul 5 02:17:16 CEST 2009


On Sat, Jul 4, 2009 at 2:19 PM, David Smith <agentdavo at mac.com> wrote:
> Thanks for the reply.  Like you say, offloading has the major drawback of
> latency.
> I have not properly profiled some of the x264 functions yet.  Is that 170
> cycles for a 16x16 SAD including 41 sub-blocks?
> A quick Google search a few weeks ago revealed a few relevant articles.
> http://www.google.com/search?&q=x264_me_search_ref+pdf

A 16x16 SATD (I have no idea what you mean by "sub blocks") takes 170
cycles on a Nehalem CPU.  A SAD takes something around 42, but
normally four are batched up (SAD_X4), and that takes 152 clocks total.  On
Phenom it's around 110 or so.
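
For readers unfamiliar with the three metrics named above, here is a
minimal scalar sketch in C of what each one computes.  It is not x264's
code (x264 uses hand-written SIMD assembly for these), and the function
names, argument order and the unscaled SATD result are assumptions made
for illustration only; normalization conventions differ between encoders.

```c
#include <stdint.h>
#include <stdlib.h>

/* SAD: sum of absolute pixel differences over a 16x16 block. */
static int sad_16x16(const uint8_t *a, int a_stride,
                     const uint8_t *b, int b_stride)
{
    int sum = 0;
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            sum += abs(a[y * a_stride + x] - b[y * b_stride + x]);
    return sum;
}

/* Batched SAD (hypothetical signature): one source block scored against
 * four candidate blocks in a single call, so the source pixels only need
 * to be fetched once in an optimized version. */
static void sad_x4_16x16(const uint8_t *src, int src_stride,
                         const uint8_t *const ref[4], int ref_stride,
                         int scores[4])
{
    for (int i = 0; i < 4; i++)
        scores[i] = sad_16x16(src, src_stride, ref[i], ref_stride);
}

/* SATD on one 4x4 block: apply a 2-D Hadamard transform to the
 * difference block and sum the absolute coefficients (unscaled). */
static int satd_4x4(const uint8_t *a, int a_stride,
                    const uint8_t *b, int b_stride)
{
    int d[4][4], t[4][4], sum = 0;

    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            d[y][x] = a[y * a_stride + x] - b[y * b_stride + x];

    /* Horizontal butterflies on each row. */
    for (int y = 0; y < 4; y++) {
        int s0 = d[y][0] + d[y][1], s1 = d[y][0] - d[y][1];
        int s2 = d[y][2] + d[y][3], s3 = d[y][2] - d[y][3];
        t[y][0] = s0 + s2; t[y][1] = s1 + s3;
        t[y][2] = s0 - s2; t[y][3] = s1 - s3;
    }

    /* Vertical butterflies on each column, accumulating magnitudes. */
    for (int x = 0; x < 4; x++) {
        int s0 = t[0][x] + t[1][x], s1 = t[0][x] - t[1][x];
        int s2 = t[2][x] + t[3][x], s3 = t[2][x] - t[3][x];
        sum += abs(s0 + s2) + abs(s1 + s3) + abs(s0 - s2) + abs(s1 - s3);
    }
    return sum;
}

/* 16x16 SATD built here as the sum of sixteen 4x4 SATDs. */
static int satd_16x16(const uint8_t *a, int a_stride,
                      const uint8_t *b, int b_stride)
{
    int sum = 0;
    for (int y = 0; y < 16; y += 4)
        for (int x = 0; x < 16; x += 4)
            sum += satd_4x4(a + y * a_stride + x, a_stride,
                            b + y * b_stride + x, b_stride);
    return sum;
}
```

The extra transform work per 4x4 block is why a SATD costs several
times more than a plain SAD of the same size.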

Dark Shikari

