[x264-devel] [PATCH 13/24] arm: Implement x264_denoise_dct_neon
Martin Storsjö
martin at martin.st
Mon Aug 24 19:59:17 CEST 2015
On Tue, 18 Aug 2015, Janne Grunau wrote:
> On 2015-08-13 23:59:34 +0300, Martin Storsjö wrote:
>> checkasm timing     Cortex-A7      A8      A9
>> denoise_dct_c            6605    5515    5950
>> denoise_dct_neon         1885    1178    1887
>> ---
>>  common/arm/quant-a.S | 31 +++++++++++++++++++++++++++++++
>>  common/arm/quant.h   |  2 ++
>>  common/quant.c       |  2 +-
>> 3 files changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/common/arm/quant-a.S b/common/arm/quant-a.S
>> index ad8d8f8..e3d5cd2 100644
>> --- a/common/arm/quant-a.S
>> +++ b/common/arm/quant-a.S
>> @@ -4,6 +4,7 @@
>> * Copyright (C) 2009-2015 x264 project
>> *
>> * Authors: David Conrad <lessen42 at gmail.com>
>> + * Janne Grunau <janne-x264 at jannau.net>
>> *
>> * This program is free software; you can redistribute it and/or modify
>> * it under the terms of the GNU General Public License as published by
>> @@ -404,3 +405,33 @@ function x264_coeff_last64_neon
>> movlt r0, #0
>> bx lr
>> endfunc
>> +
>> +function x264_denoise_dct_neon
>> + vpush {q4-q7}
>
> After a cursory look, it should be no problem to do the same computation in
> 12 128-bit registers.
Indeed - done locally, thanks!
// Martin
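
For readers without the source at hand, here is a rough scalar sketch of the per-coefficient work that a denoise_dct routine performs and that the NEON version vectorises. It is an illustration under assumptions, not the x264 code: the name denoise_dct_ref is hypothetical, and the 16-bit coefficient and offset types assume an 8-bit-depth build. The register remark above presumably relates to the ARM procedure call standard, under which q4-q7 are callee-saved while q0-q3 and q8-q15 (12 registers) are not, so staying within those 12 would make the vpush unnecessary.

```c
#include <stdint.h>

/* Hypothetical reference helper (not taken from x264): scalar form of the
 * denoise step. Assumes 16-bit coefficients, i.e. an 8-bit-depth build. */
void denoise_dct_ref(int16_t *dct, uint32_t *sum, const uint16_t *offset, int size)
{
    for (int i = 0; i < size; i++) {
        int level = dct[i];
        int mag = level < 0 ? -level : level;  /* magnitude of the coefficient */
        sum[i] += (uint32_t)mag;               /* accumulate magnitude statistics */
        mag -= offset[i];                      /* shrink the magnitude by the offset */
        if (mag < 0)
            mag = 0;                           /* clamp so the sign never flips */
        dct[i] = (int16_t)(level < 0 ? -mag : mag);  /* restore the original sign */
    }
}
```

Each iteration is independent of the others, which is why the loop maps naturally onto 128-bit vectors holding eight 16-bit coefficients at a time.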