[x264-devel] [PATCH 02/24] checkasm: Check the right output range for integral_initXh
Janne Grunau
janne-x264 at jannau.net
Sat Aug 22 17:52:25 CEST 2015
On 2015-08-13 23:59:23 +0300, Martin Storsjö wrote:
> These functions write their output into sum+stride, while we previously
> only checked [0..stride-8] within the sum array.
>
> This catches the previously broken aarch64 version of these functions.
> ---
> tools/checkasm.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/tools/checkasm.c b/tools/checkasm.c
> index a1e8eda..efe874b 100644
> --- a/tools/checkasm.c
> +++ b/tools/checkasm.c
> @@ -1616,7 +1616,7 @@ static int check_mc( int cpu_ref, int cpu_new )
> report( "lowres init :" );
> }
>
> -#define INTEGRAL_INIT( name, size, ... )\
> +#define INTEGRAL_INIT( name, size, offset, ... )\
> if( mc_a.name != mc_ref.name )\
> {\
> intptr_t stride = 96;\
> @@ -1628,17 +1628,17 @@ static int check_mc( int cpu_ref, int cpu_new )
> call_c1( mc_c.name, __VA_ARGS__ );\
> sum = (uint16_t*)buf4;\
> call_a1( mc_a.name, __VA_ARGS__ );\
> - if( memcmp( buf3, buf4, (stride-8)*2 ) \
> - || (size>9 && memcmp( buf3+18*stride, buf4+18*stride, (stride-8)*2 )))\
> + if( memcmp( buf3, buf4, (offset+stride-8)*2 ) \
I'd prefer that we separate the two parameters and move them out of
__VA_ARGS__. Another option would be to use cmp_len, which would allow us
to test the last 4 values of integral_init4h. We cannot just compare
stride values, since all asm variants overwrite and C does not.
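
The cmp_len option could look roughly like the sketch below. It is only an
illustration under stated assumptions: sums_match() and its parameter names
are hypothetical and not existing checkasm helpers, and cmp_len is assumed
to be the number of uint16_t values the function under test is expected to
produce (stride-8 for integral_init8h, stride-4 for integral_init4h).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of checkasm.c.
 * Returns 1 when the first (offset + cmp_len) uint16_t elements of the
 * reference and asm output buffers are identical, 0 otherwise.
 * offset:  where the function starts writing (stride for the *h variants,
 *          0 for the *v variants in the patch above).
 * cmp_len: assumed count of valid output values per row. */
static int sums_match( const uint16_t *ref, const uint16_t *test,
                       intptr_t offset, intptr_t cmp_len )
{
    assert( offset >= 0 && cmp_len >= 0 );
    return !memcmp( ref, test, (size_t)(offset + cmp_len) * sizeof(uint16_t) );
}
```

With stride = 96, sums_match( ref, test, stride, stride-4 ) would also cover
the last 4 values of integral_init4h, while passing stride-8 reproduces the
range checked by the patch.
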
> + || (size>9 && memcmp( buf3+18*stride, buf4+18*stride, (stride-8)*2 ))) \
> ok = 0;\
> call_c2( mc_c.name, __VA_ARGS__ );\
> call_a2( mc_a.name, __VA_ARGS__ );\
> }
> ok = 1; used_asm = 0;
> - INTEGRAL_INIT( integral_init4h, 2, sum+stride, pbuf2, stride );
> - INTEGRAL_INIT( integral_init8h, 2, sum+stride, pbuf2, stride );
> - INTEGRAL_INIT( integral_init4v, 14, sum, sum+9*stride, stride );
> - INTEGRAL_INIT( integral_init8v, 9, sum, stride );
> + INTEGRAL_INIT( integral_init4h, 2, stride, sum+stride, pbuf2, stride );
> + INTEGRAL_INIT( integral_init8h, 2, stride, sum+stride, pbuf2, stride );
> + INTEGRAL_INIT( integral_init4v, 14, 0, sum, sum+9*stride, stride );
> + INTEGRAL_INIT( integral_init8v, 9, 0, sum, stride );
otherwise ok
Janne