[x264-devel] PowerPC/VSX update
Luca Barbato
lu_zero at gentoo.org
Tue Nov 1 23:16:12 CET 2016
Work by Alexandra and me.
[PATCH 1/6] ppc: Manually unroll the horizontal prediction loop
This is low-hanging fruit; four other functions are still slower than
C as they are currently written.
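
For readers unfamiliar with the idea, here is a rough scalar sketch of what manually unrolling a 16x16 horizontal prediction looks like. It is plain C, not the AltiVec code from the patch; the names `STRIDE`, `FILL_ROW`, `predict_16x16_h_loop` and `predict_16x16_h_unrolled` are made up for illustration, and the stride value is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: fixed row stride of the pixel buffer (illustrative value). */
#define STRIDE 32

/* Rolled version: every row is filled with the pixel just left of it. */
void predict_16x16_h_loop(uint8_t *src)
{
    for (int y = 0; y < 16; y++)
        memset(src + y * STRIDE, src[y * STRIDE - 1], 16);
}

/* Hypothetical helper: fill row y with its left neighbour. */
#define FILL_ROW(p, y) memset((p) + (y) * STRIDE, (p)[(y) * STRIDE - 1], 16)

/* Manually unrolled version: same result, no loop counter or branch. */
void predict_16x16_h_unrolled(uint8_t *src)
{
    FILL_ROW(src,  0); FILL_ROW(src,  1); FILL_ROW(src,  2); FILL_ROW(src,  3);
    FILL_ROW(src,  4); FILL_ROW(src,  5); FILL_ROW(src,  6); FILL_ROW(src,  7);
    FILL_ROW(src,  8); FILL_ROW(src,  9); FILL_ROW(src, 10); FILL_ROW(src, 11);
    FILL_ROW(src, 12); FILL_ROW(src, 13); FILL_ROW(src, 14); FILL_ROW(src, 15);
}
```

Both functions expect `src` to point at the top-left pixel of the block, with one valid pixel to the left of every row.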
[PATCH 2/6] vsx: configure support
Unchanged.
[PATCH 3/6] vsx: Prepare to have different files for VSX and old
Dropped the broken ppcle special case; the make variable is now set to `no`
after the architecture checks if it is still `auto`.
[PATCH 4/6] ppc: Provide fallbacks for older architectures
As discussed on IRC, vec_vsx_ld/vec_vsx_st are the new unaligned load/store
intrinsics courtesy of VSX. Alexandra uses them to avoid annoying slowdowns
on current hardware in little-endian mode.
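
To give an idea of what such a fallback can look like, here is a minimal sketch of an unaligned load that uses vec_vsx_ld when VSX is available and the classic vec_ld/vec_lvsl/vec_perm idiom otherwise. `LOAD_U8_UNALIGNED` and `copy16_unaligned` are hypothetical names, not the macros from the patch, and the plain memcpy branch is only there so the sketch also builds on non-PowerPC machines.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>

typedef vector unsigned char vec_u8_t;

#if defined(__VSX__)
/* VSX: one instruction handles the unaligned address. */
#define LOAD_U8_UNALIGNED(off, p) vec_vsx_ld(off, p)
#else
/* Pre-VSX (big endian): load the two aligned blocks that straddle the
 * address and merge them with the permute mask from vec_lvsl. */
#define LOAD_U8_UNALIGNED(off, p) \
    vec_perm(vec_ld(off, p), vec_ld((off) + 15, p), vec_lvsl(off, p))
#endif

/* Copy 16 possibly unaligned bytes through a vector register. */
void copy16_unaligned(uint8_t *dst, const uint8_t *src)
{
    vec_u8_t v = LOAD_U8_UNALIGNED(0, src);
    memcpy(dst, &v, 16);
}

#else

/* Not a PowerPC build: plain memcpy so the sketch still compiles. */
void copy16_unaligned(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 16);
}

#endif
```

Calling `copy16_unaligned(out, buf + 3)` should give the same 16 bytes on every path; only the instructions used to fetch them differ.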
[PATCH 5/6] ppc: Use vec_vsx_ld instead of VEC_LOAD/STORE macros
The initial patch, now without the file duplication.
[PATCH 6/6] ppc: Fix hadamard for little endian
Now checkasm passes fully =)
lu