[x264-devel] [PATCH 02/24] checkasm: Check the right output range for integral_initXh

Janne Grunau janne-x264 at jannau.net
Mon Aug 24 19:58:01 CEST 2015


On 2015-08-24 13:29:35 +0300, Martin Storsjö wrote:
> On Sat, 22 Aug 2015, Janne Grunau wrote:
> 
> >On 2015-08-13 23:59:23 +0300, Martin Storsjö wrote:
> >>These functions write their output into sum+stride, while we previously
> >>only checked [0..stride-8] within the sum array.
> >>
> >>This catches the previously broken aarch64 version of these functions.
> >>---
> >> tools/checkasm.c |   14 +++++++-------
> >> 1 file changed, 7 insertions(+), 7 deletions(-)
> >>
> >>diff --git a/tools/checkasm.c b/tools/checkasm.c
> >>index a1e8eda..efe874b 100644
> >>--- a/tools/checkasm.c
> >>+++ b/tools/checkasm.c
> >>@@ -1616,7 +1616,7 @@ static int check_mc( int cpu_ref, int cpu_new )
> >>         report( "lowres init :" );
> >>     }
> >>
> >>-#define INTEGRAL_INIT( name, size, ... )\
> >>+#define INTEGRAL_INIT( name, size, offset, ... )\
> >>     if( mc_a.name != mc_ref.name )\
> >>     {\
> >>         intptr_t stride = 96;\
> >>@@ -1628,17 +1628,17 @@ static int check_mc( int cpu_ref, int cpu_new )
> >>         call_c1( mc_c.name, __VA_ARGS__ );\
> >>         sum = (uint16_t*)buf4;\
> >>         call_a1( mc_a.name, __VA_ARGS__ );\
> >>-        if( memcmp( buf3, buf4, (stride-8)*2 ) \
> >>-            || (size>9 && memcmp( buf3+18*stride, buf4+18*stride, (stride-8)*2 )))\
> >>+        if( memcmp( buf3, buf4, (offset+stride-8)*2 ) \
> >
> >I'd prefer if we separate the two parameters and move the out of
> >__VA_ARGS__. Another option would be to use cmp_len which would allow us
> >to test the last 4 values of integral_init4h. We can not just compare
> >stride values since all asm variants overwrite and C does not
> 
> Hmmm, so we'd do call_a1( mc_a.name, sum+offset, __VA_ARGS )
> instead, and pass cmp_len as a separate parameter? That'd probably
> work, and would certainly look nicer.

I was more thinking of call_a1( mc_a.name, sum, offset, __VA_ARGS ) and 
I think you have to do that since you can't use sum in the memcmp().  Or 
even omit sum completely since passing the offset is enough.

Janne


More information about the x264-devel mailing list