[x265] [PATCH] RISCV64: add copy_cnt assembly optimization

chen chenm003 at 163.com
Sun Jul 6 19:27:40 UTC 2025


Hi Changsheng,




Thanks for the patches.




However, I don't think the RISC-V V extension is stable enough yet:

v1.0 was frozen in September 2021

v1.1 went to public review in May 2023

no further updates as of July 2025



Also, most instructions have no behavior description.

For example, vredsum.vs in the patch:

vredsum.vs  vd, vs2, vs1, vm   # vd[0] =  sum( vs1[0] , vs2[*] )




My guess is that it means
vd[0] =  vs1[0] + sum(vs2[*])
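
To make that reading concrete, here is a minimal scalar sketch in C of the guessed behavior. The function name `vredsum_vs_model` and its parameters are hypothetical and are not taken from the patch or from any library. It assumes 32-bit elements, no mask, wrap-around addition, and vl > 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the guessed vredsum.vs behavior:
 *   vd[0] = vs1[0] + sum(vs2[0 .. vl-1])
 * Assumptions: unmasked, 32-bit elements, vl > 0, addition wraps
 * modulo 2^32 (unsigned arithmetic keeps that well defined in C). */
static uint32_t vredsum_vs_model(const uint32_t *vs2, uint32_t vs1_elem0, size_t vl)
{
    uint32_t acc = vs1_elem0;          /* start from the scalar in vs1[0] */
    for (size_t i = 0; i < vl; i++)
        acc += vs2[i];                 /* add every active element of vs2 */
    return acc;                        /* the value that would land in vd[0] */
}
```

With vs2 = {1, 2, 3, 4} and vs1[0] = 10, this model returns 20.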




Another example is vlse8.v.

My guess is that it is equivalent to x86 PSHUFB or ARM VTBL.




The examples above are only my guesses. I have not been able to confirm my understanding over the past couple of years, and there are too many similar problems in the RISC-V V extension.

So I suggest not integrating or implementing RISC-V patches until the specification becomes stable enough.




Regards,

Chen

At 2025-07-06 10:08:25, wu.changsheng at sanechips.com.cn wrote:

From 7562e3a834a6a5ea76ab1b97acf915e095646cd5 Mon Sep 17 00:00:00 2001


From: Changsheng Wu <wu.changsheng at sanechips.com.cn>

Date: Sat, 5 Jul 2025 23:09:14 +0800

Subject: [PATCH] RISCV64: add copy_cnt assembly optimization




TestBench test result:

      primitive |      speedup |     optimized     |  C reference
  copy_cnt[4x4] |        1.34x |          123.12   |      165.06
  copy_cnt[8x8] |        2.64x |          214.07   |      564.26
copy_cnt[16x16] |        3.96x |          563.83   |      2232.00
copy_cnt[32x32] |        7.44x |          2144.80  |      15954.42

