[x265] [PATCH] RISCV64: add copy_cnt assembly optimization
chen
chenm003 at 163.com
Sun Jul 6 19:27:40 UTC 2025
Hi Changsheng,
Thanks for the patches.
However, I don't think the RISC-V V extension is stable enough yet.
v1.0 frozen in September 2021
v1.1 public review in May 2023
no further updates as of July 2025
Also, most instructions have no behavior description.
For example, vredsum.vs in the patch:
vredsum.vs vd, vs2, vs1, vm # vd[0] = sum( vs1[0] , vs2[*] )
I can only guess that it means
vd[0] = vs1[0] + sum(vs2[*])
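To make that reading concrete, here is a minimal scalar sketch in C of the guessed behavior, assuming 16-bit elements, wrap-around arithmetic, and a mask that simply skips inactive elements. The function name and parameters are hypothetical, chosen only for illustration; this is not x265 code and not text from the specification.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical scalar model of the guessed vredsum.vs behavior (SEW = 16).
 * vs2   : source vector elements
 * vs1_0 : element 0 of vs1, used as the initial value of the sum
 * mask  : NULL for an unmasked operation, otherwise mask[i] != 0
 *         marks element i as active (assumption of this sketch)
 * vl    : number of elements to process
 * Returns the value that would be written to vd[0]. */
static uint16_t vredsum_vs_model(const uint16_t *vs2, uint16_t vs1_0,
                                 const uint8_t *mask, size_t vl)
{
    uint16_t acc = vs1_0;                    /* start from vs1[0] */
    for (size_t i = 0; i < vl; i++) {
        if (mask == NULL || mask[i])
            acc = (uint16_t)(acc + vs2[i]);  /* wraps modulo 2^16 */
    }
    return acc;
}
```

With vs2 = {1, 2, 3, 4} and vs1[0] = 10, this model gives 20 when unmasked, which is the "vs1[0] + sum(vs2[*])" interpretation.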
Another example is vlse8.v;
my guess is that it is equivalent to x86 PSHUFB or ARM VTBL.
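For comparison, the V specification lists vlse8.v among the strided loads, which would make it a read from memory with a constant byte stride rather than a register shuffle. A minimal scalar sketch of that reading in C follows, assuming an unmasked operation; the function name and parameters are hypothetical and only illustrate the interpretation.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical scalar model of an unmasked "vlse8.v vd, (rs1), rs2".
 * vd  : destination elements (8-bit each)
 * rs1 : base address
 * rs2 : byte stride between consecutive elements (signed)
 * vl  : number of elements to load
 * Element i of vd is the byte at address rs1 + i * rs2. */
static void vlse8_v_model(uint8_t *vd, const uint8_t *rs1,
                          ptrdiff_t rs2, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        vd[i] = rs1[(ptrdiff_t)i * rs2];
}
```

Under this reading, a stride of 2 over the bytes {10, 11, 12, 13, 14, 15, 16, 17} loads {10, 12, 14, 16}, so the indices come from a fixed stride and not from a per-element index vector as in PSHUFB or VTBL.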
The examples above are only my guesses. I have not been able to confirm my understanding over the past couple of years, and there are too many similar problems in the RISC-V V extension.
So I suggest we do not integrate or implement RISC-V patches until the specification becomes stable enough.
Regards,
Chen
At 2025-07-06 10:08:25, wu.changsheng at sanechips.com.cn wrote:
From 7562e3a834a6a5ea76ab1b97acf915e095646cd5 Mon Sep 17 00:00:00 2001
From: Changsheng Wu <wu.changsheng at sanechips.com.cn>
Date: Sat, 5 Jul 2025 23:09:14 +0800
Subject: [PATCH] RISCV64: add copy_cnt assembly optimization
TestBench test result:
copy_cnt[4x4] | 1.34x | 123.12 | 165.06
copy_cnt[8x8] | 2.64x | 214.07 | 564.26
copy_cnt[16x16] | 3.96x | 563.83 | 2232.00
copy_cnt[32x32] | 7.44x | 2144.80 | 15954.42
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.videolan.org/pipermail/x265-devel/attachments/20250707/85b28356/attachment.htm>