[x265] [PATCH] asm: chroma_vpp[32x32] for colorspace i420 in avx2: improve 3881c->3648c
chen
chenm003 at 163.com
Tue Dec 9 20:32:40 CET 2014
The code is correct, but I suggest you change the algorithm and write a new version.
-------------------------------------------------------------------------------
| Port   | 0 - DV   | 1    | 2 - D     | 3 - D     | 4    | 5     | 6   | 7   |
-------------------------------------------------------------------------------
| Cycles | 96.0 0.0 | 50.0 | 35.5 25.0 | 47.6 25.0 | 43.0 | 151.0 | 9.0 | 9.9 |
-------------------------------------------------------------------------------
The bottleneck is on Port 5 (151.0 cycles), and the load across the ports is unbalanced.
Your algorithm processes 16 columns by 32 rows on each pass; I suggest processing 32 columns by 4 rows instead.
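
To make the two traversal orders concrete, here is a rough scalar C sketch. It is not the AVX2 code from the patch: the function names are made up for illustration, the half-pel coefficients {-4, 36, 36, -4} are just one example filter, and the source pointer is assumed to have one valid row above and two below the block. Both functions produce the same output; only the visiting order differs. With 8-bit pixels a 32-column row is 32 bytes, the width of one ymm register, which is presumably why the row-band order suits AVX2 better.

```c
#include <stddef.h>
#include <stdint.h>

enum { BLOCK_W = 32, BLOCK_H = 32, TAPS = 4 };

/* Example coefficients only (they sum to 64); assumed for illustration. */
static const int16_t kCoeff[TAPS] = { -4, 36, 36, -4 };

/* One output pixel: 4-tap vertical filter over rows y-1 .. y+2,
 * rounded, scaled by 1/64 and clipped to 0..255. */
static uint8_t filter_pixel(const uint8_t *src, ptrdiff_t stride)
{
    int sum = 32; /* rounding offset for the shift by 6 */
    for (int t = 0; t < TAPS; t++)
        sum += kCoeff[t] * src[(t - 1) * stride];
    if (sum < 0)
        return 0; /* negative sums clip to 0, so they are never shifted */
    sum >>= 6;
    return (uint8_t)(sum > 255 ? 255 : sum);
}

/* Order described for the current patch: two strips of 16 columns,
 * each walked over all 32 rows. Hypothetical name. */
void vpp_32x32_col_strips(const uint8_t *src, ptrdiff_t src_stride,
                          uint8_t *dst, ptrdiff_t dst_stride)
{
    for (int x0 = 0; x0 < BLOCK_W; x0 += 16)
        for (int y = 0; y < BLOCK_H; y++)
            for (int x = x0; x < x0 + 16; x++)
                dst[y * dst_stride + x] =
                    filter_pixel(src + y * src_stride + x, src_stride);
}

/* Suggested order: bands of 4 rows, each covering all 32 columns.
 * A band of 4 output rows reads only 7 source rows. Hypothetical name. */
void vpp_32x32_row_bands(const uint8_t *src, ptrdiff_t src_stride,
                         uint8_t *dst, ptrdiff_t dst_stride)
{
    for (int y0 = 0; y0 < BLOCK_H; y0 += 4)
        for (int y = y0; y < y0 + 4; y++)
            for (int x = 0; x < BLOCK_W; x++)
                dst[y * dst_stride + x] =
                    filter_pixel(src + y * src_stride + x, src_stride);
}
```

The scalar version says nothing about port pressure by itself; it only pins down the blocking so that an asm rewrite can be checked against it.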
At 2014-12-09 17:11:09,"Divya Manivannan" <divya at multicorewareinc.com> wrote:
># HG changeset patch
># User Divya Manivannan <divya at multicorewareinc.com>
># Date 1418116177 -19800
># Tue Dec 09 14:39:37 2014 +0530
># Node ID 2e6c4518f7083d79202a28f739650278e5c0d88d
># Parent 88498ec9b10ba25a01c983a3f67c17bf470349fa
>asm: chroma_vpp[32x32] for colorspace i420 in avx2: improve 3881c->3648c