[x265] [PATCH v3 0/4] AArch64: Add Neon optimisations of interp functions

Gerda Zsejke More gerdazsejke.more at arm.com
Fri Apr 25 13:38:08 UTC 2025


Hello,

This is v3 of the patch series, addressing Chen's review comments.
A comment on your suggestions:
I kept the TBL instruction, as the REV64 instruction does not offer any
performance advantage. Keeping TBL also helps maintain code consistency
and improves readability.
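
To illustrate why the two instructions can be interchangeable, here is a
minimal scalar sketch. It assumes, purely for illustration, that the
permutation in question reverses the 16-bit lanes within each 64-bit half;
this cover letter does not state the actual permutation. The helper names
and the index table are hypothetical and are not taken from the patches.
The sketch only shows functional equivalence and says nothing about
performance.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes16 = std::array<uint8_t, 16>;

// Scalar model of TBL with a single 16-byte table register:
// out[i] = table[idx[i]] when idx[i] < 16, otherwise 0.
Bytes16 model_tbl(const Bytes16 &table, const Bytes16 &idx)
{
    Bytes16 out{};
    for (int i = 0; i < 16; i++)
        out[i] = idx[i] < 16 ? table[idx[i]] : 0;
    return out;
}

// Scalar model of REV64 on 16-bit elements (.8H arrangement): reverse the
// order of the four 16-bit lanes inside each 64-bit half.
Bytes16 model_rev64_h(const Bytes16 &in)
{
    Bytes16 out{};
    for (int half = 0; half < 2; half++)
    {
        for (int lane = 0; lane < 4; lane++)
        {
            int src = half * 8 + (3 - lane) * 2;
            int dst = half * 8 + lane * 2;
            out[dst] = in[src];
            out[dst + 1] = in[src + 1];
        }
    }
    return out;
}

// Hypothetical byte indices that make TBL reproduce REV64 (.8H).
const Bytes16 kRev64HIndices = {6, 7, 4, 5, 2, 3, 0, 1,
                                14, 15, 12, 13, 10, 11, 8, 9};

// Returns true when both models produce the same permutation of `in`.
bool tbl_matches_rev64(const Bytes16 &in)
{
    return model_tbl(in, kRev64HIndices) == model_rev64_h(in);
}

// Checks the equivalence on one sample input; asserts on mismatch.
void check_equivalence()
{
    Bytes16 in{};
    for (int i = 0; i < 16; i++)
        in[i] = static_cast<uint8_t>(i);
    assert(tbl_matches_rev64(in));
}
```

Under this assumption, a TBL with a suitable index vector can express the
same lane reversal as REV64, and other permutations as well, so a single
instruction form can be used throughout.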

Best regards,
Gerda

Gerda Zsejke More (4):
  AArch64: Add SVE implementation of HBD interp_horiz_pp
  AArch64: Add SVE implementation of HBD interp_horiz_ps
  AArch64: Add SVE implementation of HBD interp_vert_ss
  AArch64: Add SVE implementation of HBD interp_vert_pp

 source/common/CMakeLists.txt              |    2 +-
 source/common/aarch64/asm-primitives.cpp  |    2 +
 source/common/aarch64/filter-prim-sve.cpp | 1057 +++++++++++++++++++++
 source/common/aarch64/filter-prim-sve.h   |   37 +
 source/common/aarch64/neon-sve-bridge.h   |   12 +
 5 files changed, 1109 insertions(+), 1 deletion(-)
 create mode 100644 source/common/aarch64/filter-prim-sve.cpp
 create mode 100644 source/common/aarch64/filter-prim-sve.h

-- 
2.39.5 (Apple Git-154)


