[x264-devel] [PATCH 1/6] aarch64: Correctly sign extend int parameters in x264_plane_copy_core_neon

Janne Grunau janne-x264 at jannau.net
Tue Nov 15 23:11:17 CET 2016


On 2016-11-14 23:54:48 +0200, Martin Storsjö wrote:
> ---
>  common/aarch64/mc-a.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/common/aarch64/mc-a.S b/common/aarch64/mc-a.S
> index 3a99fbe..a7a383d 100644
> --- a/common/aarch64/mc-a.S
> +++ b/common/aarch64/mc-a.S
> @@ -1256,8 +1256,8 @@ endfunc
>  function x264_plane_copy_core_neon, export=1
>      add         x8,  x4,  #15
>      and         x4,  x8,  #~15
> -    sub         x1,  x1,  x4
> -    sub         x3,  x3,  x4
> +    sub         x1,  x1,  w4, sxtw
> +    sub         x3,  x3,  w4, sxtw

This patch is not very consistent: the add/and above still operate on the
full 64-bit x4 and only the sub instructions sign extend. I'd change it to

add         w8,  w4,  #15   // a 32-bit write clears the upper 32 bits of the register
and         w4,  w8,  #~15
// safe to use the full register since a negative width makes no sense
sub         x1,  x1,  x4
sub         x3,  x3,  x4
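
To illustrate why both variants should give the same result for sane input, here is a rough C model of the register semantics. It is only a sketch: the function names are made up for this mail, the casts assume the usual two's complement behaviour, and it relies on the AArch64 rule that writing a w register zeroes bits 63:32 of the matching x register.

```c
#include <stdint.h>

/* Hypothetical model of "add w8, w4, #15; and w4, w8, #~15".
 * Only the low 32 bits of the incoming register are read, and the
 * 32-bit result is zero extended into the full 64-bit register. */
uint64_t align16_w(uint64_t x4)
{
    uint32_t w4 = (uint32_t)x4;      /* upper 32 bits may hold garbage */
    uint32_t w8 = w4 + 15u;
    return (uint64_t)(w8 & ~15u);    /* w write clears bits 63:32 */
}

/* Hypothetical model of the patch as posted: 64-bit add/and on x4,
 * followed by "sub x1, x1, w4, sxtw". */
uint64_t sub_patch(uint64_t x1, uint64_t x4)
{
    uint64_t x8 = x4 + 15u;
    x4 = x8 & ~(uint64_t)15;
    int64_t ext = (int64_t)(int32_t)(uint32_t)x4;  /* sxtw of the low half */
    return x1 - (uint64_t)ext;
}

/* Hypothetical model of the suggested variant: 32-bit add/and,
 * then a plain 64-bit "sub x1, x1, x4". */
uint64_t sub_suggested(uint64_t x1, uint64_t x4)
{
    return x1 - align16_w(x4);
}
```

For a non-negative width such as 1913 with garbage in the upper half of x4, both models subtract 1920. They only differ once the aligned width no longer fits in a positive int, which should not happen here.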

Janne