<div dir="ltr"><br><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">chen</b> <span dir="ltr">&lt;<a href="mailto:chenm003@163.com">chenm003@163.com</a>&gt;</span><br>
Date: Fri, Nov 8, 2013 at 7:10 PM<br>Subject: Re: [x265] [PATCH] blockcopy_sp_16xN, optimized asm code<br>To: Development for x265 &lt;<a href="mailto:x265-devel@videolan.org">x265-devel@videolan.org</a>&gt;<br><br><br><div style="line-height:1.7;font-size:14px;font-family:arial">
<div>>>The code is right, but it needs to be run through uncrustify, e.g. add r3, r3<br></div><div>Does <span style="line-height:1.7">uncrustify work for .asm files?</span></div>
<div>At 2013-11-08 21:32:05, <a href="mailto:praveen@multicorewareinc.com" target="_blank">praveen@multicorewareinc.com</a> wrote:<div><div class="h5"><br>># HG changeset patch<br>># User Praveen Tiwari<br>># Date 1383917516 -19800<br>
># Node ID 662664f0863b38b838a15867745c5564f574fb09<br>># Parent 227a5666e08869d36e07a75f3db95dd94c774715<br>>blockcopy_sp_16xN, optimized asm code<br>><br>>diff -r 227a5666e088 -r 662664f0863b source/common/x86/blockcopy8.asm<br>
>--- a/source/common/x86/blockcopy8.asm Fri Nov 08 17:38:24 2013 +0530<br>>+++ b/source/common/x86/blockcopy8.asm Fri Nov 08 19:01:56 2013 +0530<br>>@@ -1325,51 +1325,38 @@<br>> ;-----------------------------------------------------------------------------<br>
> %macro BLOCKCOPY_SP_W16_H4 2<br>> INIT_XMM sse2<br>>-cglobal blockcopy_sp_%1x%2, 4, 7, 7, dest, destStride, src, srcStride<br>>+cglobal blockcopy_sp_%1x%2, 4, 5, 8, dest, destStride, src, srcStride<br>> <br>
>-mov r6d, %2<br>>+mov r4d, %2/4<br>> <br>>-add r3, r3<br>>-<br>>-mova m0, [tab_Vm]<br>>+add r3, r3<br>> <br>> .loop<br>>- movu m1, [r2]<br>
>- movu m2, [r2 + 16]<br>>- movu m3, [r2 + r3]<br>>- movu m4, [r2 + r3 + 16]<br>>- movu m5, [r2 + 2 * r3]<br>>- movu m6, [r2 + 2 * r3 + 16]<br>
>+ movu m0, [r2]<br>>+ movu m1, [r2 + 16]<br>>+ movu m2, [r2 + r3]<br>>+ movu m3, [r2 + r3 + 16]<br>>+ movu m4, [r2 + 2 * r3]<br>>+ movu m5, [r2 + 2 * r3 + 16]<br>
>+ lea r2, [r2 + 2 * r3]<br>>+ movu m6, [r2 + r3]<br>>+ movu m7, [r2 + r3 + 16]<br>> <br>>- pshufb m1, m0<br>>- pshufb m2, m0<br>>- pshufb m3, m0<br>
>- pshufb m4, m0<br>>- pshufb m5, m0<br>>- pshufb m6, m0<br>>+ packuswb m0, m1<br>>+ packuswb m2, m3<br>>+ packuswb m4, m5<br>>+ packuswb m6, m7<br>
> <br>>- movh [r0], m1<br>>- movh [r0 + 8], m2<br>>- movh [r0 + r1], m3<br>>- movh [r0 + r1 + 8], m4<br>>- movh [r0 + 2 * r1], m5<br>
>- movh [r0 + 2 * r1 + 8], m6<br>>+ movu [r0], m0<br>>+ movu [r0 + r1], m2<br>>+ movu [r0 + 2 * r1], m4<br>>+ lea r0, [r0 + 2 * r1]<br>
>+ movu [r0 + r1], m6<br>> <br>>- lea r4, [r2 + 2 * r3]<br>>- movu m1, [r4 + r3]<br>>- movu m2, [r4 + r3 + 16]<br>>+ lea r0, [r0 + 2 * r1]<br>
>+ lea r2, [r2 + 2 * r3]<br>> <br>>- pshufb m1, m0<br>>- pshufb m2, m0<br>>-<br>>- lea r5, [r0 + 2 * r1]<br>>- movh [r5 + r1], m1<br>
>- movh [r5 + r1 + 8], m2<br>>-<br>>- lea r0, [r5 + 2 * r1]<br>>- lea r2, [r4 + 2 * r3]<br>>-<br>>- sub r6d, 4<br>>+ dec r4d<br>
> jnz .loop<br>> <br>> RET<br></div></div>>_______________________________________________<br>>x265-devel mailing list<br>><a href="mailto:x265-devel@videolan.org" target="_blank">x265-devel@videolan.org</a><br>
><a href="https://mailman.videolan.org/listinfo/x265-devel" target="_blank">https://mailman.videolan.org/listinfo/x265-devel</a><br></div></div>
<br></div><br></div>