[x265] [PATCH 00/12] AArch64: Optimise low bitdepth ipfilter primitives
Karam Singh
karam.singh at multicorewareinc.com
Fri Sep 6 06:56:06 UTC 2024
All the patches of this series have been pushed to the master branch.
*__________________________*
*Karam Singh*
*Ph.D. IIT Guwahati*
Senior Software (Video Coding) Engineer
Mobile: +91 8011279030
Block 9A, 6th floor, DLF Cyber City
Manapakkam, Chennai 600 089
On Mon, Sep 2, 2024 at 9:20 AM chen <chenm003 at 163.com> wrote:
> Hi Hari,
>
>
> Thanks for the patches. I have no comments on this initial version.
>
> In a future version, we may improve it algorithmically.
>
> For example, width=12 may be split into 2 of width=8, which gets cache benefits.
>
>
> Regards,
> Chen
>
>
> At 2024-08-31 03:18:48, "Hari Limaye" <hari.limaye at arm.com> wrote:
> >This patch series optimises the existing Neon intrinsics implementations of the ipfilter primitives, and removes the assembly implementations in favour of these new implementations.
> >
> >Relative performance observed for the new Neon intrinsics implementations, compared to the existing assembly implementations, is in the respective commit messages.
> >
> >Many thanks,
> >Hari
> >
> >Hari Limaye (12):
> > Test: Remove check for unused coeffIdx in ipfilter tests
> > Move ipfilter primitives into X265_NS
> > AArch64: Move ipfilter primitives into X265_NS
> > AArch64: Support all block sizes in p2s Neon
> > AArch64: Optimise low bitdepth interp_horiz_pp_neon
> > AArch64: Optimise low bitdepth interp_horiz_ps_neon
> > AArch64: Optimise low bitdepth interp_vert_ss_neon
> > AArch64: Optimise low bitdepth interp_vert_pp_neon
> > AArch64: Optimise low bitdepth interp_vert_ps_neon
> > AArch64: Optimise low bitdepth interp_vert_sp_neon
> > AArch64: Define all low bitdepth Neon ipfilter primitives
> > AArch64: Remove Assembly ipfilter primitives
> >
> > source/common/CMakeLists.txt             |    4 +-
> > source/common/aarch64/asm-primitives.cpp |  186 --
> > source/common/aarch64/filter-prim.cpp    | 2877 ++++++++++++++++++----
> > source/common/aarch64/fun-decls.h        |   15 -
> > source/common/aarch64/ipfilter-common.S  | 1436 -----------
> > source/common/aarch64/ipfilter-sve2.S    | 1282 ----------
> > source/common/aarch64/ipfilter.S         | 1054 --------
> > source/common/aarch64/mem-neon.h         |  193 ++
> > source/common/ipfilter.cpp               |    8 +-
> > source/test/ipfilterharness.cpp          |   24 +-
> > 10 files changed, 2580 insertions(+), 4499 deletions(-)
> > delete mode 100644 source/common/aarch64/ipfilter-common.S
> > delete mode 100644 source/common/aarch64/ipfilter-sve2.S
> > delete mode 100644 source/common/aarch64/ipfilter.S
> >
> >--
> >2.42.1
> >
> >_______________________________________________
> >x265-devel mailing list
> >x265-devel at videolan.org
> >https://mailman.videolan.org/listinfo/x265-devel
>