[x265] [PATCH] assembly code for intra_pred_ang8_3

chen chenm003 at 163.com
Fri Jan 17 07:53:34 CET 2014


@Yuvaraj, Jason's point is that all of m1 is uninitialized before the instruction executes; you are saying it is initialized afterwards.
See the Intel documentation for palignr: the register m1 is both a source and the destination. Reading it uninitialized is a logic problem, but it saves a mov instruction and works fine.
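
To make the src/dest point concrete, here is a minimal scalar sketch in C of how the two-operand 128-bit palignr combines its operands. It is not x265 code: the names palignr_model and shifted_row are made up for illustration, and the junk argument is only a stand-in for whatever m1 happens to hold before the instruction runs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of two-operand, 128-bit PALIGNR dst, src, imm8.
 * The old dst and src are treated as one 32-byte value (old dst in the
 * high half), shifted right by imm bytes; the low 16 bytes become dst. */
static void palignr_model(uint8_t dst[16], const uint8_t src[16], unsigned imm)
{
    uint8_t concat[32], out[16];

    assert(imm <= 255);             /* imm8 is an 8-bit immediate */
    memcpy(concat, src, 16);        /* low half: source operand */
    memcpy(concat + 16, dst, 16);   /* high half: OLD destination */
    for (unsigned i = 0; i < 16; i++)
        out[i] = (i + imm < 32) ? concat[i + imm] : 0;
    memcpy(dst, out, 16);
}

/* Hypothetical helper mirroring the two lines of the patch:
 *   movu    m0, [r2 + 1]   ; pixels 1..16
 *   palignr m1, m0, 1      ; m1 not written beforehand
 * `junk` fills m1 to stand in for its unknown previous contents. */
static void shifted_row(uint8_t junk, uint8_t out[16])
{
    uint8_t m0[16];

    for (int i = 0; i < 16; i++) {
        m0[i] = (uint8_t)(i + 1);
        out[i] = junk;
    }
    palignr_model(out, m0, 1);
}
```

Calling shifted_row with two different junk values gives the same bytes 0..14 (pixels 2..16) and differs only in byte 15, which is the "x" lane in the patch comment.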
 
At 2014-01-17 14:42:31,"Yuvaraj Venkatesh" <yuvaraj at multicorewareinc.com> wrote:

Only the last pixel depends on the uninitialized m1, and that particular pixel is not used anywhere in the code.



Moreover, psrldq has higher latency than palignr and would also need an additional mov instruction.
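
The trade-off between the two sequences can be sketched with a scalar model in C. This is an illustration only, not x265 code: the function names are hypothetical, it models results rather than latency, and 0xEE in the usage note is an arbitrary stand-in for the unknown contents of m1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* PSRLDQ reg, imm8: shift ONE register right by imm bytes, in place,
 * filling the vacated high bytes with zero. */
static void psrldq_model(uint8_t reg[16], unsigned imm)
{
    assert(imm <= 255);             /* imm8 is an 8-bit immediate */
    for (unsigned i = 0; i < 16; i++)
        reg[i] = (i + imm < 16) ? reg[i + imm] : 0;
}

/* PALIGNR dst, src, imm8: shift the 32-byte value (old dst : src) right
 * by imm bytes and keep the low 16 bytes in dst. */
static void palignr_model(uint8_t dst[16], const uint8_t src[16], unsigned imm)
{
    uint8_t concat[32], out[16];

    assert(imm <= 255);
    memcpy(concat, src, 16);
    memcpy(concat + 16, dst, 16);
    for (unsigned i = 0; i < 16; i++)
        out[i] = (i + imm < 32) ? concat[i + imm] : 0;
    memcpy(dst, out, 16);
}

/* Variant A (two instructions): mova m1, m0 / psrldq m1, 1 */
static void shift_by_copy(const uint8_t m0[16], uint8_t m1[16])
{
    memcpy(m1, m0, 16);             /* the extra register copy */
    psrldq_model(m1, 1);
}

/* Variant B (one instruction): palignr m1, m0, 1
 * m1 keeps whatever it held before in its top byte. */
static void shift_by_palignr(const uint8_t m0[16], uint8_t m1[16])
{
    palignr_model(m1, m0, 1);
}
```

With m0 holding pixels 1..16 and m1 pre-filled with 0xEE, both variants produce pixels 2..16 in bytes 0..14. Variant A leaves 0 in byte 15, while variant B leaves 0xEE there, which is harmless only as long as no later instruction reads that byte.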



On Fri, Jan 17, 2014 at 12:08 PM, chen <chenm003 at 163.com> wrote:

At 2014-01-17 14:00:55,"Jason Garrett-Glaser" <jason at x264.com> wrote:

>+    movu        m0,        [r2 + 1]                   ; [16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1]
>+    palignr     m1,        m0, 1                      ; [x 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2]
>
>Shouldn't this be psrldq or similar?  The dependency on uninitialized
>registers is a bit weird too...

I suggested this algorithm. psrldq cannot write its result to a different
register, so we would have to spend an extra instruction on the copy.
Of course, this comes with a restriction: other instructions must not use the uninitialized value.
 

_______________________________________________
x265-devel mailing list
x265-devel at videolan.org
https://mailman.videolan.org/listinfo/x265-devel


