[x265] [PATCH] asm: chroma_vpp[32x32] for colorspace i420 in avx2: improve 3881c->3648c

chen chenm003 at 163.com
Tue Dec 9 20:32:40 CET 2014


The code is correct, but I suggest you change the algorithm and write a new version.
--------------------------------------------------------------------------------------------------
|  Port  |   0   -  DV   |   1   |   2   -   D   |   3   -   D   |   4   |   5   |   6   |   7   |
--------------------------------------------------------------------------------------------------
| Cycles | 96.0     0.0  | 50.0  | 35.5    25.0  | 47.6    25.0  | 43.0  | 151.0 |  9.0  |  9.9  |
--------------------------------------------------------------------------------------------------
 
The bottleneck is on Port 5, and the load across the ports is unbalanced.
Your algorithm processes 16 cols and 32 rows on each pass; I suggest processing 4 rows and 32 cols instead.
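
To make the suggested loop shape concrete, here is a rough scalar sketch in C. It is not the patch's code and not the AVX2 version. The function name, the argument layout, and the rounding (add 32, shift right by 6, clip to 8 bits) are assumptions for illustration only. The idea it shows: 32 cols of 8-bit pixels fit in one ymm register, and 4 output rows of a 4-tap vertical filter need only 7 input rows, so each loaded row can be reused by up to 4 outputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a filtered value to the 8-bit pixel range. */
static uint8_t clip_pixel(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: 4-tap vertical filter over a block 32 cols wide,
 * producing 4 output rows per outer iteration.
 * src points at the first output row; rows -1 .. height+2 relative to
 * src are read, so the caller must provide that margin.
 * Assumed rounding: (sum + 32) >> 6, i.e. the taps sum to 64. */
static void chroma_vpp_32xN_rows4(const uint8_t *src, ptrdiff_t srcStride,
                                  uint8_t *dst, ptrdiff_t dstStride,
                                  int height, const int16_t coeff[4])
{
    assert(height % 4 == 0);

    for (int y = 0; y < height; y += 4) {
        /* In a SIMD version this inner loop would be one 32-byte vector. */
        for (int x = 0; x < 32; x++) {
            int r[7];

            /* Load 7 input rows once; they cover 4 output rows. */
            for (int k = 0; k < 7; k++)
                r[k] = src[(y + k - 1) * srcStride + x];

            for (int j = 0; j < 4; j++) {
                int sum = coeff[0] * r[j]     + coeff[1] * r[j + 1]
                        + coeff[2] * r[j + 2] + coeff[3] * r[j + 3];
                int v = sum + 32;

                /* Negative values clip to 0, so shift only non-negative ones. */
                dst[(y + j) * dstStride + x] = clip_pixel(v < 0 ? 0 : v >> 6);
            }
        }
    }
}
```

For a 32x32 block the outer loop runs 8 times and reads 11 rows fewer per column than loading 4 rows for every output row would. Whether this actually relieves Port 5 depends on how the row interleaving is done in the asm, so it needs to be measured.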

At 2014-12-09 17:11:09,"Divya Manivannan" <divya at multicorewareinc.com> wrote:
># HG changeset patch
># User Divya Manivannan <divya at multicorewareinc.com>
># Date 1418116177 -19800
>#      Tue Dec 09 14:39:37 2014 +0530
># Node ID 2e6c4518f7083d79202a28f739650278e5c0d88d
># Parent  88498ec9b10ba25a01c983a3f67c17bf470349fa
>asm: chroma_vpp[32x32] for colorspace i420 in avx2: improve 3881c->3648c