[x264-devel] [PATCH 13/24] arm: Implement x264_denoise_dct_neon

Martin Storsjö martin at martin.st
Mon Aug 24 19:59:17 CEST 2015


On Tue, 18 Aug 2015, Janne Grunau wrote:

> On 2015-08-13 23:59:34 +0300, Martin Storsjö wrote:
>> checkasm timing       Cortex-A7      A8     A9
>> denoise_dct_c                6605    5515   5950
>> denoise_dct_neon             1885    1178   1887
>> ---
>>  common/arm/quant-a.S |   31 +++++++++++++++++++++++++++++++
>>  common/arm/quant.h   |    2 ++
>>  common/quant.c       |    2 +-
>>  3 files changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/common/arm/quant-a.S b/common/arm/quant-a.S
>> index ad8d8f8..e3d5cd2 100644
>> --- a/common/arm/quant-a.S
>> +++ b/common/arm/quant-a.S
>> @@ -4,6 +4,7 @@
>>   * Copyright (C) 2009-2015 x264 project
>>   *
>>   * Authors: David Conrad <lessen42 at gmail.com>
>> + *          Janne Grunau <janne-x264 at jannau.net>
>>   *
>>   * This program is free software; you can redistribute it and/or modify
>>   * it under the terms of the GNU General Public License as published by
>> @@ -404,3 +405,33 @@ function x264_coeff_last64_neon
>>      movlt       r0,  #0
>>      bx          lr
>>  endfunc
>> +
>> +function x264_denoise_dct_neon
>> +    vpush       {q4-q7}
>
> after a cursory look it should be no problem to do the same computation in
> 12 128-bit registers

Indeed - done locally, thanks!
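
For context on the register count: under the ARM procedure call standard only d8-d15 (q4-q7) must be preserved by the callee, which leaves q0-q3 and q8-q15, i.e. 12 128-bit registers, free to use without the vpush. As a rough illustration of the per-coefficient work being vectorized, here is a minimal scalar sketch in C. The function name, the 16-bit coefficient type and the argument layout are assumptions made for this example, not a copy of the x264 source:

```c
#include <stdint.h>

/* Hypothetical scalar sketch of a DCT denoise step.
 * Assumed layout: 16-bit coefficients, 32-bit running sums,
 * 16-bit per-position offsets. */
static void denoise_dct_sketch(int16_t *dct, uint32_t *sum,
                               const uint16_t *offset, int size)
{
    for (int i = 0; i < size; i++) {
        int level = dct[i];
        int negative = level < 0;
        int mag = negative ? -level : level; /* magnitude of the coefficient */

        sum[i] += (uint32_t)mag;             /* accumulate magnitude statistics */

        mag -= offset[i];                    /* shrink the magnitude toward zero */
        if (mag < 0)
            mag = 0;                         /* clamp: small coefficients become 0 */

        dct[i] = (int16_t)(negative ? -mag : mag); /* restore the sign */
    }
}
```

Each iteration only needs the coefficient, its sum and its offset, so a NEON version can process a batch of coefficients per 128-bit register; how many registers that takes in practice depends on the unrolling chosen.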

// Martin

