[x264-devel] [PATCH 01/24] aarch64: Fix integral_init4/8h_neon

Janne Grunau janne-x264 at jannau.net
Fri Aug 14 08:19:58 CEST 2015


On 2015-08-13 23:59:22 +0300, Martin Storsjö wrote:
> The stride is given in uint16_t elements, not bytes, and thus needs
> to be shifted left by one before it is subtracted from the pointer.
> 
> This issue had slipped unnoticed since checkasm didn't actually
> verify the output of these functions.
> ---
>  common/aarch64/mc-a.S |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/common/aarch64/mc-a.S b/common/aarch64/mc-a.S
> index a4488a4..b6b588e 100644
> --- a/common/aarch64/mc-a.S
> +++ b/common/aarch64/mc-a.S
> @@ -1403,7 +1403,7 @@ endfunc
>  .endm
>  
>  function integral_init4h_neon, export=1
> -    sub         x3,  x0,  x2
> +    sub         x3,  x0,  x2, lsl #1
>      ld1        {v6.8b,v7.8b}, [x1], #16
>  1:
>      subs        x2,  x2,  #16
> @@ -1438,7 +1438,7 @@ endfunc
>  .endm
>  
>  function integral_init8h_neon, export=1
> -    sub         x3,  x0,  x2
> +    sub         x3,  x0,  x2, lsl #1
>      ld1        {v16.8b,v17.8b}, [x1], #16
>  1:
>      subs        x2,  x2,  #16

ok

Janne
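
For readers following the fix: the shift is needed because assembly subtracts raw bytes, while the stride counts uint16_t elements. The C sketch below illustrates that difference under stated assumptions. The helper names are hypothetical and are not part of x264; the sketch only assumes that x0 holds a uint16_t pointer and x2 the stride in elements, as the patch description states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: previous row via C pointer arithmetic.
 * The compiler scales the stride by sizeof(uint16_t) implicitly. */
static const uint16_t *prev_row_elements(const uint16_t *sum, ptrdiff_t stride)
{
    return sum - stride;
}

/* Hypothetical helper: the same address computed in bytes, as assembly
 * has to do it. Multiplying by sizeof(uint16_t) == 2 corresponds to the
 * "lsl #1" operand added by the patch. */
static const uint16_t *prev_row_bytes(const uint16_t *sum, ptrdiff_t stride)
{
    const uint8_t *p = (const uint8_t *)sum;
    return (const uint16_t *)(p - stride * (ptrdiff_t)sizeof(uint16_t));
}

/* Hypothetical helper: the pre-patch computation. Subtracting the
 * unscaled stride as a byte count moves back only half a row. */
static const uint16_t *prev_row_unshifted(const uint16_t *sum, ptrdiff_t stride)
{
    const uint8_t *p = (const uint8_t *)sum;
    return (const uint16_t *)(p - stride);
}
```

With a stride of 16 elements, the first two helpers point 16 elements back, while the unshifted variant points only 8 elements back, so it reads from the middle of the previous row.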

