[x264-devel] [PATCH 09/24] arm: Add x264_nal_escape_neon
Martin Storsjö
martin at martin.st
Mon Aug 24 19:27:30 CEST 2015
On Tue, 18 Aug 2015, Janne Grunau wrote:
> On 2015-08-13 23:59:30 +0300, Martin Storsjö wrote:
>> checkasm timing     Cortex-A7      A8      A9
>> nal_escape_c           908338  878032  633692
>> nal_escape_neon        379946  451936  373471
>> ---
>> Makefile | 2 +-
>> common/arm/bitstream-a.S | 89 ++++++++++++++++++++++++++++++++++++++++++++++
>> common/bitstream.c | 4 +++
>> 3 files changed, 94 insertions(+), 1 deletion(-)
>> create mode 100644 common/arm/bitstream-a.S
>>
>> diff --git a/Makefile b/Makefile
>> index 6193c59..4403a11 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -119,7 +119,7 @@ ifeq ($(SYS_ARCH),ARM)
>> ifneq ($(AS),)
>> ASMSRC += common/arm/cpu-a.S common/arm/pixel-a.S common/arm/mc-a.S \
>> common/arm/dct-a.S common/arm/quant-a.S common/arm/deblock-a.S \
>> - common/arm/predict-a.S
>> + common/arm/predict-a.S common/arm/bitstream-a.S
>> SRCS += common/arm/mc-c.c common/arm/predict-c.c
>> OBJASM = $(ASMSRC:%.S=%.o)
>> endif
>> diff --git a/common/arm/bitstream-a.S b/common/arm/bitstream-a.S
>> new file mode 100644
>> index 0000000..62f9c96
>> --- /dev/null
>> +++ b/common/arm/bitstream-a.S
>> @@ -0,0 +1,89 @@
>> +/*****************************************************************************
>> + * bitstream-a.S: arm bitstream functions
>> + *****************************************************************************
>> + * Copyright (C) 2014-2015 x264 project
>> + *
>> + * Authors: Janne Grunau <janne-x264 at jannau.net>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02111, USA.
>> + *
>> + * This program is also available under a commercial proprietary license.
>> + * For more information, contact us at licensing at x264.com.
>> + *****************************************************************************/
>> +
>> +#include "asm.S"
>> +
>> +function x264_nal_escape_neon
>> + push {r4-r9}
>
> I'm not quite sure you need all those registers. I certainly only
> used that many because arm64 has enough caller-saved registers. Also, lr
> is usually included in the register list when registers are pushed to
> and popped from the stack. The function return then becomes
> pop {rx-ry, pc} instead of pop; bx
Done locally; I got it down to r4-r5,lr.
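
Roughly, the prologue and epilogue would then take the shape below. This is only a sketch of the pattern Janne describes, not the revised patch: the label is a placeholder and the body is elided.

```
        .syntax unified
        .arch   armv7-a
        .text
        .global example_leaf_function   @ hypothetical name, not from the patch
example_leaf_function:
        push    {r4-r5, lr}     @ save callee-saved r4-r5 and the return address
        @ ... function body ...
        pop     {r4-r5, pc}     @ restore r4-r5 and return by loading the saved lr into pc
```

Popping the saved lr straight into pc folds the return into the register restore, so no separate bx lr is needed.
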
>> + vpush {q4-q7}
>
> Please use q8-q15. I know these register number conflicts are annoying
> when porting NEON between arm and arm64 (both ways, I endured it)
Doh, yes - and this is one of the really simple ones to remap.
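
As a sketch of why the remap removes the vpush: under the 32-bit ARM procedure call standard only d8-d15 (that is, q4-q7) must be preserved by the callee, while q8-q15 (d16-d31) may be clobbered freely. The routine below is a made-up illustration, not code from the patch; the name and the register roles are assumptions.

```
        .syntax unified
        .arch   armv7-a
        .fpu    neon
        .text
        .global example_copy16_neon     @ hypothetical name, not from the patch
example_copy16_neon:                    @ assumed arguments: r0 = dst, r1 = src
        vld1.8  {d16-d17}, [r1]         @ d16-d17 = q8, caller-saved: no vpush needed
        vst1.8  {d16-d17}, [r0]         @ store the 16 bytes back out
        bx      lr                      @ nothing was saved, so a plain return suffices
```

Had the same code used q4 (d8-d9) instead, it would need a matching vpush/vpop pair around the body.
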
// Martin