[x265] [PATCH] assembly code for intra_pred_ang8_3
Yuvaraj Venkatesh
yuvaraj at multicorewareinc.com
Fri Jan 17 07:42:31 CET 2014
Only the last pixel depends on m1 (uninitialized), and that particular
pixel is not used anywhere in the code.
Moreover, psrldq has higher latency than palignr and also needs an
additional mov instruction.
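
To illustrate the point about the dependency, here is a minimal byte-level sketch in Python. It is a model only, not x265 code. The helper names `palignr` and `psrldq` are hypothetical functions written for this illustration. Registers are modelled as lists of 16 bytes with index 0 as the lowest byte, and the pixel values 1..16 are made up. The sketch says nothing about latency.

```python
def palignr(dst, src, imm):
    """Model of the two-operand SSE form: palignr dst, src, imm.

    dst is the high half and src the low half of a 32-byte value,
    which is shifted right by imm bytes; the low 16 bytes are returned.
    """
    assert len(dst) == 16 and len(src) == 16
    concat = src + dst                    # index 0 is the lowest byte
    shifted = concat[imm:] + [0] * imm    # byte-wise right shift, zero fill
    return shifted[:16]


def psrldq(reg, imm):
    """Model of psrldq reg, imm: in-place byte-wise right shift, zero fill."""
    assert len(reg) == 16
    return (reg[imm:] + [0] * imm)[:16]


# movu m0, [r2 + 1] -> pixels 1..16 (illustrative values)
m0 = list(range(1, 17))

# Two different "uninitialized" contents for m1 (arbitrary garbage).
garbage_a = [0xAA] * 16
garbage_b = [0x55] * 16

res_a = palignr(garbage_a, m0, 1)
res_b = palignr(garbage_b, m0, 1)

# The low 15 bytes are pixels 2..16 regardless of what m1 held.
print(res_a[:15] == res_b[:15] == list(range(2, 17)))

# Only the top byte carries the old m1 contents.
print(hex(res_a[15]), hex(res_b[15]))

# psrldq gives the same low 15 bytes but needs m0 copied first,
# because it overwrites its only operand.
m1 = list(m0)          # stands in for the extra mov
m1 = psrldq(m1, 1)
print(m1[:15] == res_a[:15], m1[15])
```

In this model the two results differ only in byte 15, which corresponds to the `x` in the quoted patch comment below.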
On Fri, Jan 17, 2014 at 12:08 PM, chen <chenm003 at 163.com> wrote:
> At 2014-01-17 14:00:55,"Jason Garrett-Glaser" <jason at x264.com> wrote:
>
> >+ movu m0, [r2 + 1] ; [16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1]
> >+ palignr m1, m0, 1 ; [x 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2]
> >
> >Shouldn't this be psrldq or similar? The dependency on uninitialized
> >registers is a bit weird too...
> This algorithm was suggested by me. psrldq shifts in place and cannot
> copy to another register, so we would have to waste an extra instruction
> on the move.
> Of course, we must restrict the use of the uninitialized value in other
> instructions.