<div dir="ltr">Please ignore this patch; it needs modifications.<div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Jan 20, 2014 at 11:49 AM, chen <span dir="ltr"><<a href="mailto:chenm003@163.com" target="_blank">chenm003@163.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="line-height:1.7;font-size:14px;font-family:arial"><div>right<br><br>At 2014-01-20 13:24:20,<a href="mailto:murugan@multicorewareinc.com" target="_blank">murugan@multicorewareinc.com</a> wrote:<div>
<div class="h5"><br>># HG changeset patch<br>># User Murugan Vairavel <<a href="mailto:murugan@multicorewareinc.com" target="_blank">murugan@multicorewareinc.com</a>><br>># Date 1390195451 -19800<br>># Mon Jan 20 10:54:11 2014 +0530<br>
># Node ID 879feee2a43535ff490a3d82cebac64b03b3db8d<br>># Parent c88314c4a1a1bd0182be180e048f3788de0c2108<br>>asm: code for intra_pred[BLOCK_16x16] mode 3<br>><br>>diff -r c88314c4a1a1 -r 879feee2a435 source/common/x86/asm-primitives.cpp<br>
>--- a/source/common/x86/asm-primitives.cpp Fri Jan 17 12:18:25 2014 +0530<br>>+++ b/source/common/x86/asm-primitives.cpp Mon Jan 20 10:54:11 2014 +0530<br>>@@ -1011,6 +1011,8 @@<br>> SETUP_INTRA_ANG4(32, 4, sse4);<br>
> SETUP_INTRA_ANG4(33, 3, sse4);<br>> <br>>+ SETUP_INTRA_ANG16(3, 3, sse4);<br>>+<br>> p.dct[DCT_8x8] = x265_dct8_sse4;<br>> }<br>> if (cpuMask & X265_CPU_AVX)<br>>diff -r c88314c4a1a1 -r 879feee2a435 source/common/x86/intrapred8.asm<br>
>--- a/source/common/x86/intrapred8.asm Fri Jan 17 12:18:25 2014 +0530<br>>+++ b/source/common/x86/intrapred8.asm Mon Jan 20 10:54:11 2014 +0530<br>>@@ -1182,6 +1182,159 @@<br>> movu [r0 + r1], m2<br>
> RET<br>> <br>>+%macro TRANSPOSE_STORE_8x8 1<br>>+ punpckhbw m0, m4, m5<br>>+ punpcklbw m4, m5<br>>+ punpckhbw m2, m4, m0<br>>+ punpcklbw m4, m0<br>
>+<br>>+ punpckhbw m0, m6, m1<br>>+ punpcklbw m6, m1<br>>+ punpckhbw m1, m6, m0<br>>+ punpcklbw m6, m0<br>>+<br>>+ punpckhdq m5, m4, m6<br>>+ punpckldq m4, m6<br>
>+ punpckldq m6, m2, m1<br>>+ punpckhdq m2, m1<br>>+<br>>+ movh [r0 + %1 * 8], m4<br>>+ movhps [r0 + r1 + %1 * 8], m4<br>>+ movh [r0 + r1*2 + %1 * 8], m5<br>
>+ movhps [r0 + r5 + %1 * 8], m5<br>>+ movh [r6 + %1 * 8], m6<br>>+ movhps [r6 + r1 + %1 * 8], m6<br>>+ movh [r6 + r1*2 + %1 * 8], m2<br>>+ movhps [r6 + r5 + %1 * 8], m2<br>
>+%endmacro<br>>+<br>>+INIT_XMM sse4<br>>+cglobal intra_pred_ang16_3, 3,7,8<br>>+<br>>+ lea r3, [ang_table + 16 * 16]<br>>+ mov r4d, 2<br>>+ lea r5, [r1 * 3] ; r5 -> 3 * stride<br>
>+ lea r6, [r0 + r1 * 4] ; r6 -> 4 * stride<br>>+ mova m7, [pw_1024]<br>>+<br>>+.loop:<br>>+ movu m0, [r2 + 1]<br>>+ palignr m1, m0, 1<br>
>+<br>>+ punpckhbw m2, m0, m1<br>>+ punpcklbw m0, m1<br>>+ palignr m1, m2, m0, 2<br>>+<br>>+ movu m3, [r3 + 10 * 16] ; [26]<br>>+ movu m6, [r3 + 4 * 16] ; [20]<br>
>+<br>>+ pmaddubsw m4, m0, m3<br>>+ pmulhrsw m4, m7<br>>+ pmaddubsw m1, m6<br>>+ pmulhrsw m1, m7<br>>+ packuswb m4, m1<br>>+<br>>+ palignr m5, m2, m0, 4<br>
>+<br>>+ movu m3, [r3 - 2 * 16] ; [14]<br>>+ pmaddubsw m5, m3<br>>+ pmulhrsw m5, m7<br>>+<br>>+ palignr m6, m2, m0, 6<br>>+<br>>+ movu m3, [r3 - 8 * 16] ; [ 8]<br>
>+ pmaddubsw m6, m3<br>>+ pmulhrsw m6, m7<br>>+ packuswb m5, m6<br>>+<br>>+ palignr m1, m2, m0, 8<br>>+<br>>+ movu m3, [r3 - 14 * 16] ; [ 2]<br>
>+ pmaddubsw m6, m1, m3<br>>+ pmulhrsw m6, m7<br>>+<br>>+ movu m3, [r3 + 12 * 16] ; [28]<br>>+ pmaddubsw m1, m3<br>>+ pmulhrsw m1, m7<br>
>+ packuswb m6, m1<br>>+<br>>+ palignr m1, m2, m0, 10<br>>+<br>>+ movu m3, [r3 + 6 * 16] ; [22]<br>>+ pmaddubsw m1, m3<br>>+ pmulhrsw m1, m7<br>
>+<br>>+ palignr m2, m0, 12<br>>+<br>>+ movu m3, [r3] ; [16]<br>>+ pmaddubsw m2, m3<br>>+ pmulhrsw m2, m7<br>>+ packuswb m1, m2<br>
>+<br>>+ TRANSPOSE_STORE_8x8 0<br>>+<br>>+ movu m0, [r2 + 8]<br>>+ palignr m1, m0, 1<br>>+<br>>+ punpckhbw m2, m0, m1<br>>+ punpcklbw m0, m1<br>
>+ palignr m5, m2, m0, 2<br>>+<br>>+ movu m3, [r3 - 6 * 16] ; [10]<br>>+ movu m6, [r3 - 12 * 16] ; [04]<br>>+<br>>+ pmaddubsw m4, m0, m3<br>
>+ pmulhrsw m4, m7<br>>+ pmaddubsw m1, m5, m6<br>>+ pmulhrsw m1, m7<br>>+ packuswb m4, m1<br>>+<br>>+ movu m3, [r3 + 14 * 16] ; [30]<br>
>+ pmaddubsw m5, m3<br>>+ pmulhrsw m5, m7<br>>+<br>>+ palignr m6, m2, m0, 4<br>>+<br>>+ movu m3, [r3 + 8 * 16] ; [24]<br>>+ pmaddubsw m6, m3<br>
>+ pmulhrsw m6, m7<br>>+ packuswb m5, m6<br>>+<br>>+ palignr m1, m2, m0, 6<br>>+<br>>+ movu m3, [r3 + 2 * 16] ; [18]<br>>+ pmaddubsw m6, m1, m3<br>
>+ pmulhrsw m6, m7<br>>+<br>>+ palignr m1, m2, m0, 8<br>>+<br>>+ movu m3, [r3 - 4 * 16] ; [12]<br>>+ pmaddubsw m1, m3<br>>+ pmulhrsw m1, m7<br>
>+ packuswb m6, m1<br>>+<br>>+ palignr m1, m2, m0, 10<br>>+<br>>+ movu m3, [r3 - 10 * 16] ; [06]<br>>+ pmaddubsw m1, m3<br>>+ pmulhrsw m1, m7<br>
>+<br>>+ palignr m2, m0, 12<br>>+<br>>+ movu m3, [r3 - 16 * 16] ; [0]<br>>+ pmaddubsw m2, m3<br>>+ pmulhrsw m2, m7<br>>+ packuswb m1, m2<br>
>+<br>>+ TRANSPOSE_STORE_8x8 1<br>>+<br>>+ lea r0, [r6 + r1 * 4]<br>>+ lea r6, [r6 + r1 * 8]<br>>+ add r2, 8<br>>+ dec r4<br>>+ jnz .loop<br>
>+<br>>+ RET<br></div></div></div></div><br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr">With Regards,<div><br></div><div>Murugan. V</div><div>+919659287478</div></div>
</div>