[x265] [PATCH 0/4] AArch64: Add SVE optimisations of interp functions
chen
chenm003 at 163.com
Wed Apr 16 21:16:32 UTC 2025
Hi Gerda,
Thanks for the optimisations.
The rest looks fine; I just have some comments on insert_new_s16_elements_x8 (and x4):
1) The function name is not clear enough.
2) merge_block_tbl is a constant table; is it necessary to pass it in as a parameter? It is not a template parameter here.
3) Is the register combine + TBL method faster than using EXT directly?
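
To make question 3 concrete, here is a small portable scalar model, not code from the patch. It assumes that the helper slides a window over two adjacent 128-bit registers; the names ext_model, tbl2_model and sliding_indices are hypothetical and only illustrate the semantics of the EXT and two-register TBL instructions. Under that assumption, a TBL whose index table is i + n selects the same bytes as EXT with offset n, so the open question is only which form is cheaper on the target cores.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec16 = std::array<uint8_t, 16>;

// Scalar model of EXT on 128-bit registers: take 16 consecutive bytes
// starting at byte n of the concatenation a:b (n in 0..15).
Vec16 ext_model(const Vec16 &a, const Vec16 &b, unsigned n)
{
    Vec16 out{};
    for (unsigned i = 0; i < 16; ++i)
    {
        unsigned src = i + n;
        out[i] = src < 16 ? a[src] : b[src - 16];
    }
    return out;
}

// Scalar model of a two-register TBL: each index byte selects one byte
// from the 32-byte table a:b; an out-of-range index yields 0.
Vec16 tbl2_model(const Vec16 &a, const Vec16 &b, const Vec16 &idx)
{
    Vec16 out{};
    for (unsigned i = 0; i < 16; ++i)
    {
        uint8_t k = idx[i];
        out[i] = k < 16 ? a[k] : (k < 32 ? b[k - 16] : uint8_t(0));
    }
    return out;
}

// Hypothetical constant index table that makes TBL behave like EXT #n.
Vec16 sliding_indices(unsigned n)
{
    Vec16 idx{};
    for (unsigned i = 0; i < 16; ++i)
        idx[i] = uint8_t(i + n);
    return idx;
}

// Returns true when both models pick the same bytes for offset n.
bool ext_matches_tbl(unsigned n)
{
    Vec16 a{}, b{};
    for (unsigned i = 0; i < 16; ++i)
    {
        a[i] = uint8_t(i);       // bytes 0..15 of the first register
        b[i] = uint8_t(100 + i); // distinct values for the second register
    }
    return ext_model(a, b, n) == tbl2_model(a, b, sliding_indices(n));
}

// Self-check over every valid EXT byte offset.
void check_all_offsets()
{
    for (unsigned n = 0; n < 16; ++n)
        assert(ext_matches_tbl(n));
}
```

Shifting in one new s16 element corresponds to n = 2 bytes in this model. If the table in the patch does something other than a plain sliding window, the equivalence above does not apply and TBL may be required.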
Regards,
Chen
At 2025-04-15 17:36:30, "Gerda Zsejke More" <gerdazsejke.more at arm.com> wrote:
>Hi,
>
>This patch series adds SVE intrinsic optimisations of interp functions.
>
>Many thanks,
>Gerda
>
>Gerda Zsejke More (4):
> AArch64: Add SVE implementation of HBD interp_horiz_pp
> AArch64: Add SVE implementation of HBD interp_horiz_ps
> AArch64: Add SVE implementation of HBD interp_vert_ss
> AArch64: Add SVE implementation of HBD interp_vert_pp
>
> source/common/CMakeLists.txt | 2 +-
> source/common/aarch64/asm-primitives.cpp | 2 +
> source/common/aarch64/filter-prim-sve.cpp | 1022 +++++++++++++++++++++
> source/common/aarch64/filter-prim-sve.h | 37 +
> source/common/aarch64/neon-sve-bridge.h | 12 +
> 5 files changed, 1074 insertions(+), 1 deletion(-)
> create mode 100644 source/common/aarch64/filter-prim-sve.cpp
> create mode 100644 source/common/aarch64/filter-prim-sve.h
>
>--
>2.39.5 (Apple Git-154)
>