[x264-devel] [PATCH 1/1] arm: optimize neon luma intra deblock

Martin Storsjö martin at martin.st
Wed Sep 2 10:48:25 CEST 2015


On Wed, 2 Sep 2015, Janne Grunau wrote:

> Hi Martin,
>
> I forgot to reschedule the beginning of the macro in the last iteration.
> The floating point compare is a neat trick to test a 64-bit register
> against 0 and control the program flow based on it. I tend to forget that
> this is possible.
>
> We might have to add vmrs handling to gaspp. Older assemblers might
> understand fmstat instead.

Hmm, maybe. armasm handles it fine, but old apple binutils doesn't. When 
testing there, I get the following:

{standard input}:955:ARM register expected -- `vmrs APSR_nzcv,FPSCR'

When trying with fmstat, I get this instead:

{standard input}:955:garbage following instruction -- `fmstat APSR_nzcv,FPSCR'
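
For reference, the trick boils down to something like the sequence below. 
This is only a sketch, not the exact code from the patch: d0 and the local 
label are placeholders.

```
        @ Sketch only: d0 and the label 9 are placeholders, not from the patch.
        vcmp.f64        d0,  #0                 @ compare the 64-bit register against 0.0
        vmrs            APSR_nzcv, FPSCR        @ copy the VFP status flags to the CPU flags
        beq             9f                      @ taken if d0 compared equal to zero
        @ ... work that is only needed when d0 is nonzero ...
9:
```

Note that the bit pattern of -0.0 (only the top bit set) also compares equal 
to zero here. As far as I know, the pre-UAL spelling is a bare fmstat without 
any operands, which might be why the second form above is rejected as garbage.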

Dunno if this is critical for x264 or if everybody builds for iOS with 
clang and its built-in assembler nowadays. (At least for libav, I've 
still got a fate instance with the old apple binutils.)

I can resend with this squashed, or should we stay off this trick for now?

// Martin

