[x264-devel] [PATCH 1/1] checkasm: fix arm64 register clobber test
Janne Grunau
janne-x264 at jannau.net
Mon Aug 24 20:59:44 CEST 2015
On 2015-08-24 21:51:44 +0300, Martin Storsjö wrote:
> On Mon, 24 Aug 2015, Janne Grunau wrote:
>
> >On 2015-08-24 12:42:47 +0300, Martin Storsjö wrote:
> >>On Mon, 17 Aug 2015, Janne Grunau wrote:
> >>
> >>>+.macro check_reg_neon reg1, reg2
> >>>+ ldr q0, [x9], #16
> >>>+ uzp1 v1.2d, v\reg1\().2d, v\reg2\().2d
> >>>+ eor v0.16b, v0.16b, v1.16b
> >>>+ orr v3.16b, v3.16b, v0.16b
> >>>+.endm
> >>>+ check_reg_neon 8, 9
> >>>+ check_reg_neon 10, 11
> >>>+ check_reg_neon 12, 13
> >>>+ check_reg_neon 14, 15
> >>>+ xtn v3.8b, v3.8h
> >>
> >>Doesn't this drop half of the bits in the registers, i.e. a bit set
> >>in the upper half of the halfwords would be discarded and missed? An
> >>uqxtn would probably handle that, right?
> >
> >half (16-bit) or double (64-bit) words?
>
> halfwords
>
> >in any case the code is correct.
>
> No, I just reproduced a clobbered register that it failed to notice.
> E.g., before the check_reg_neon, do "mov x0, #42; mov v8.b[1], w0",
> and it will fail to notice. For odd indices between 1 and 7, it will
> fail to notice the corrupted register, while even indices between 0
> and 6 are caught properly (and 8-15 ignored, as they should).
>
> >it takes the lower 64-bit of the vector registers and writes
> >them into v1. The ABI just mandates that the lower 64-bit of the
> >registers should be preserved, i.e. the equivalent of the ARM NEON
> >d8-d15.
>
> Yes, it takes the lower 64-bit half of two 128 bit vector registers
> and writes them into v1. But then, at the end, once you have the
> bitfield in v3 showing which bits had errors (where every bit is
> significant; the upper 64 bits of this register come from the
> errors in v9, v11 etc), you discard every other byte in the
> register.

Right, I completely missed that you were speaking of the xtn. uqxtn
should indeed do the trick.
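
To make the difference concrete, here is a minimal scalar sketch in C of
what the two narrowing instructions do to the error mask in v3. It is only
a model under stated assumptions, not checkasm code: the names model_xtn,
model_uqxtn, error_detected and check_model are hypothetical, the mask is
represented as eight little-endian halfword lanes, and the error byte 42 is
an arbitrary nonzero value. Byte indices 0-7 of the mask correspond to the
low half of the first register passed to check_reg_neon and 8-15 to the low
half of the second.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of "xtn Vd.8b, Vn.8h": keep only the low byte of each
 * 16-bit lane; the high byte is silently discarded. */
static void model_xtn(const uint16_t in[8], uint8_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(in[i] & 0xff);
}

/* Scalar model of "uqxtn Vd.8b, Vn.8h": unsigned saturating narrow. Any
 * lane above 0xff becomes 0xff, so a nonzero lane stays nonzero. */
static void model_uqxtn(const uint16_t in[8], uint8_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = in[i] > 0xff ? 0xff : (uint8_t)in[i];
}

typedef void (*narrow_fn)(const uint16_t in[8], uint8_t out[8]);

/* Build a 128-bit error mask with a single nonzero byte at byte_idx
 * (0..15), narrow it, and report whether any error bit survives. */
static int error_detected(narrow_fn narrow, int byte_idx, uint8_t err)
{
    uint16_t mask[8] = { 0 };
    uint8_t narrowed[8];
    int any = 0;

    /* Even byte indices land in the low byte of a halfword lane,
     * odd byte indices in the high byte. */
    mask[byte_idx / 2] = (uint16_t)(err << (8 * (byte_idx % 2)));
    narrow(mask, narrowed);
    for (int i = 0; i < 8; i++)
        any |= narrowed[i];
    return any != 0;
}

/* In this model, xtn only notices errors at even byte indices, while
 * uqxtn notices an error at every byte index. */
static void check_model(void)
{
    for (int idx = 0; idx < 16; idx++) {
        assert(error_detected(model_xtn, idx, 42) == (idx % 2 == 0));
        assert(error_detected(model_uqxtn, idx, 42) == 1);
    }
}
```

Calling check_model() exercises all sixteen byte positions. The case
error_detected(model_xtn, 1, 42) corresponds to the "mov v8.b[1], w0"
reproduction quoted above, where the plain narrow loses the error.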
Janne