[x264-devel] [PATCH 1/3] arm: Implement x264_plane_copy_neon

Janne Grunau janne-x264 at jannau.net
Mon Aug 31 01:12:16 CEST 2015


On 2015-08-28 00:15:01 +0300, Martin Storsjö wrote:
> checkasm timing    Cortex-A7      A8     A9
> plane_copy_c           13124   10925   9106
> plane_copy_neon         7349    5103   8945
> 
> ---
> Use bic instead of and, use lr instead of r5, return using
> pop {..,pc}. Settled on using two separate ldr instructions instead
> of ldrd, both since it's required when loading r5+lr, and since it
> seemed much faster on A8.
> ---
>  common/arm/mc-a.S |   32 ++++++++++++++++++++++++++++++++
>  common/arm/mc-c.c |    3 +++
>  2 files changed, 35 insertions(+)

ok

Janne

