<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><DIV>At 2014-02-19 18:53:47,dnyaneshwar@multicorewareinc.com wrote:<BR>># HG changeset patch<BR>># User Dnyaneshwar G &lt;dnyaneshwar@multicorewareinc.com&gt;<BR>># Date 1392807092 -19800<BR>># Wed Feb 19 16:21:32 2014 +0530<BR>># Node ID cede20cde62ba0a96ac181bcf78a508097de0e7c<BR>># Parent 6150985c3d535f0ea7a1dc0b8f3c69e65e30d25b<BR>>asm-16bpp: code for addAvg luma and chroma all sizes<BR>><BR>>+%if HIGH_BIT_DEPTH<BR>>+INIT_XMM sse4<BR>>+cglobal addAvg_2x4, 6,7,8, pSrc0, pSrc1, pDst, iStride0, iStride1, iDstStride<BR>>+ mova m7, [pw_16400]<BR>>+ mova m0, [pw_1023]<BR>m7 and m0 are each used only once, so fold the memory operand into the instruction that uses it (e.g. paddw m1, [pw_16400]); that gives smaller code and frees two registers.</DIV>
<DIV> </DIV>
<DIV>>+ add r3, r3<BR>>+ add r4, r4<BR>>+ add r5, r5<BR>>+<BR>>+ movd m1, [r0]<BR>>+ movd m2, [r0 + r3]<BR>>+ movd m3, [r1]<BR>>+ movd m4, [r1 + r4]<BR>>+<BR>>+ punpckldq m1, m2<BR>>+ punpckldq m3, m4<BR>>+<BR>>+ lea r0, [r0 + 2 * r3]<BR>>+ lea r1, [r1 + 2 * r4]<BR>>+<BR>>+ movd m2, [r0]<BR>>+ movd m4, [r0 + r3]<BR>>+ movd m5, [r1]<BR>>+ movd m6, [r1 + r4]<BR>>+<BR>>+ punpckldq m2, m4<BR>>+ punpckldq m5, m6<BR>>+ punpcklqdq m1, m2<BR>>+ punpcklqdq m3, m5<BR>>+<BR>>+ paddw m1, m3<BR>>+ paddw m1, m7</DIV>
<DIV>m7 is 16400, so this 16-bit add is very likely to overflow; please do a dynamic range analysis here.</DIV>
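<DIV>A minimal sketch of such a dynamic range check, written in C rather than asm so it can be run on its own. Only the constants (16400, the shift by 5, the clamp to 1023) come from the patch; the helper names and the input ranges passed to range_is_safe are hypothetical, since the patch does not state what range the interpolation stage feeds into addAvg. It assumes right shift of a negative int is arithmetic, as psraw is.</DIV>

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the patch: pw_16400, psraw by 5, clamp to pw_1023. */
enum { ADDAVG_OFFSET = 16400, ADDAVG_SHIFT = 5, PIXEL_MAX = 1023 };

static int clip_pixel(int v)
{
    return v < 0 ? 0 : (v > PIXEL_MAX ? PIXEL_MAX : v);
}

/* Wrap a value to signed 16 bits, as a paddw lane does. */
static int wrap16(int v)
{
    unsigned u = (unsigned)v & 0xFFFFu;
    return u >= 0x8000u ? (int)u - 0x10000 : (int)u;
}

/* Reference result in 32-bit arithmetic, where nothing can wrap. */
static int addavg_ref(int16_t a, int16_t b)
{
    return clip_pixel(((int)a + (int)b + ADDAVG_OFFSET) >> ADDAVG_SHIFT);
}

/* Model of the asm sequence: both adds wrap at 16 bits. */
static int addavg_wrap(int16_t a, int16_t b)
{
    int s = wrap16((int)a + (int)b);        /* paddw m1, m3 */
    s = wrap16(s + ADDAVG_OFFSET);          /* paddw m1, m7 */
    return clip_pixel(s >> ADDAVG_SHIFT);   /* psraw, pmaxsw, pminsw */
}

/* Returns 1 if neither add can leave int16_t for inputs in [lo, hi].
 * lo and hi are assumptions supplied by the caller, not values from the patch. */
static int range_is_safe(int lo, int hi)
{
    int min_sum = 2 * lo;
    int max_sum = 2 * hi;
    return min_sum >= INT16_MIN && max_sum <= INT16_MAX &&
           min_sum + ADDAVG_OFFSET >= INT16_MIN &&
           max_sum + ADDAVG_OFFSET <= INT16_MAX;
}
```

<DIV>Under this model, inputs limited to [-8192, 8176] stay in range (the largest sum is 32752), but two inputs of 8192 already give 16384 + 16400 = 32784, which wraps negative, so addavg_wrap returns 0 where addavg_ref returns 1023. Whether such inputs can actually occur depends on the real output range of the interpolation filters, which is what the analysis needs to establish.</DIV>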
<DIV><BR>>+ psraw m1, 5<BR>>+ pxor m6, m6<BR>>+ pmaxsw m1, m6<BR>>+ pminsw m1, m0<BR>>+<BR>>+ movd [r2], m1<BR>>+ pextrd [r2 + r5], m1, 1<BR>>+ lea r2, [r2 + 2 * r5]<BR>>+ pextrd [r2], m1, 2<BR>>+ pextrd [r2 + r5], m1, 3<BR>>+<BR>>+ RET<BR>>+<BR></DIV></div>