[x264-devel] [PATCH 04/24] arm: Use aligned loads in x264_coeff_last15_neon

Janne Grunau janne-x264 at jannau.net
Tue Aug 18 10:32:57 CEST 2015


On 2015-08-13 23:59:25 +0300, Martin Storsjö wrote:
> After subtracting 2, the pointer will be aligned.
> 
> checkasm timing      Cortex-A7    A8    A9
> coeff_last15_c              423   375   230
> coeff_last15_neon           350   420   404  (before)
> coeff_last15_neon           350   400   394  (after)
> ---
>  common/arm/quant-a.S |    4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/common/arm/quant-a.S b/common/arm/quant-a.S
> index 4b2129a..ad8d8f8 100644
> --- a/common/arm/quant-a.S
> +++ b/common/arm/quant-a.S
> @@ -337,10 +337,8 @@ endfunc
>  function x264_coeff_last\size\()_neon
>  .if \size == 15
>      sub         r0,  r0,  #2
> -    vld1.64     {d0-d3}, [r0]
> -.else
> -    vld1.64     {d0-d3}, [r0,:128]
>  .endif
> +    vld1.64     {d0-d3}, [r0,:128]
>      vtst.16     q0,  q0
>      vtst.16     q1,  q1
>      vshrn.u16   d0,  q0,  #8

ok

Janne
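
The commit message's claim ("After subtracting 2, the pointer will be aligned") can be checked with a small C sketch. It assumes, as an illustration not stated in the patch, that the 15-coefficient variant is handed a pointer one int16_t past the start of a 16-byte-aligned block of 16 coefficients. The helper names is_aligned16, rewind_to_block and demo are hypothetical and not part of x264.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Returns 1 if p sits on a 16-byte boundary, which is what the
 * ":128" alignment hint on vld1.64 promises, and 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* C equivalent of "sub r0, r0, #2": step back one int16_t (2 bytes). */
static const int16_t *rewind_to_block(const int16_t *ac)
{
    return ac - 1;
}

/* Assumed calling pattern: the caller passes block + 1, skipping the
 * first coefficient of an aligned 16-entry block. */
static void demo(void)
{
    alignas(16) int16_t block[16] = {0};
    const int16_t *ac = block + 1;

    assert(!is_aligned16(ac));                 /* incoming pointer is off by 2 bytes */
    assert(is_aligned16(rewind_to_block(ac))); /* aligned again after the subtraction */
}
```

Under that assumption, both the 15- and 16-coefficient paths load from an aligned address, which is why the patch can share a single vld1.64 with the :128 hint.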

