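As I posted earlier, there's a small error in the patch: x shouldn't be reset to zero after the first loop, since the second loop has to continue where the first one stopped.<br><br>The extra x++ in the loop body is intentional (together with the x++ in the for statement it advances x by 2 per w32 call), but it would be better style to use x+=4 in the for statement of the w64 loop and x+=2 for the w32 loop.<br>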
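<br>To illustrate the loop structure I mean, here is a minimal sketch in C. The names kernel_w64, kernel_w32 and process_row are hypothetical stand-ins, not the actual functions from the patch, and I'm assuming x counts the same column units as in the patch, with the w64 kernel consuming four of them per call and the w32 kernel two:<br>

```c
/* Hypothetical stand-ins for the SSE2 kernels: they only count how often
 * they are called, so the loop structure itself can be checked. */
static int calls_w64, calls_w32;
static void kernel_w64( int x ) { (void)x; calls_w64++; }
static void kernel_w32( int x ) { (void)x; calls_w32++; }

/* Walk `width` column units: wide kernel first, then the narrow kernel on
 * whatever is left. Returns the number of units that were covered. */
int process_row( int width )
{
    int x;
    calls_w64 = calls_w32 = 0;

    /* w64 kernel: four units per call, stepped in the for statement. */
    for( x = 0; width - x >= 4; x += 4 )
        kernel_w64( x );

    /* w32 kernel: two units per call. x is deliberately NOT reset to zero,
     * so this loop continues where the w64 loop stopped. */
    for( ; width - x >= 2; x += 2 )
        kernel_w32( x );

    return x;
}
```

With width = 14 this makes three w64 calls (x = 0, 4, 8) and one w32 call (x = 12); resetting x to zero before the second loop would instead make the w32 kernel reprocess columns the w64 kernel already handled.<br>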
<br>Also note that after fixing checkasm to use random input data, the assembly output appears to disagree with the C version, so any thoughts on that would be appreciated.<br><br><div class="gmail_quote">On Fri, Mar 7, 2008 at 3:12 AM, Hannes Domani &lt;<a href="mailto:ssbssa@yahoo.de">ssbssa@yahoo.de</a>&gt; wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hello<br>
<br>
When looking at the patch, I was wondering whether the following part is what you really intended (since x++ is executed twice per iteration):<br>
> - if( qpel_idx & 5 ) /* qpel interpolation needed */<br>
<div class="Ih2E3d">
> + int x;<br>
> + width = width >> 2;<br>
> + for( x = 0; width - x >= 2; x++ )<br>
> {<br>
> - uint8_t *src2 = src[hpel_ref1[qpel_idx]] + offset + ((mvx&3) == 3);<br>
> - x264_pixel_avg_wtab_mmxext[i_width>>2](<br>
> - dst, *i_dst_stride, src1, i_src_stride,<br>
> - src2, i_height );<br>
> - return dst;<br>
> + frame_init_lowres_core_sse2_w32(src_stride, dest_stride, height, width, src0, dst0, dsth, dstv, dstc );<br>
> + src0 += 32;<br>
> + dst0 += 16;<br>
> + dsth += 16;<br>
> + dstv += 16;<br>
> + dstc += 16;<br>
> + x++;<br>
> }<br>
<br>
<br>
</div>regards<br>
Domani Hannes<br>
<br>
<br>
<div><div></div><div class="Wj3C7c"><br>
</div></div></blockquote></div><br>