[x264-devel] [PATCH 1/3] arm: Implement x264_plane_copy_neon
Janne Grunau
janne-x264 at jannau.net
Mon Aug 31 01:12:16 CEST 2015
On 2015-08-28 00:15:01 +0300, Martin Storsjö wrote:
> checkasm timing      Cortex-A7      A8      A9
> plane_copy_c             13124   10925    9106
> plane_copy_neon           7349    5103    8945
>
> ---
> Use bic instead of and, use lr instead of r5, return using
> pop {..,pc}. Settled on using two separate ldr calls instead of
> ldrd, both since it's required when loading r5+lr, and since it
> seemed much faster on A8.
> ---
> common/arm/mc-a.S | 32 ++++++++++++++++++++++++++++++++
> common/arm/mc-c.c | 3 +++
> 2 files changed, 35 insertions(+)
ok
Janne