<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><div>The code is correct, but I suggest you modify the algorithm and write a new version.</div><div>--------------------------------------------------------------------------------------------------<br>| Port | 0 - DV | 1 | 2 - D | 3 - D | 4 | 5 | 6 | 7 |<br>--------------------------------------------------------------------------------------------------<br>| Cycles | 96.0 0.0 | 50.0 | 35.5 25.0 | 47.6 25.0 | 43.0 | 151.0 | 9.0 | 9.9 |<br>--------------------------------------------------------------------------------------------------</div><div> </div><div>The bottleneck is on Port 5 (151.0 cycles), and the load across the ports is unbalanced.</div><div>Your algorithm processes 16 columns and 32 rows at a time; I suggest processing 4 rows and 32 columns instead.</div><pre><br>At 2014-12-09 17:11:09,"Divya Manivannan" <divya@multicorewareinc.com> wrote:
># HG changeset patch
># User Divya Manivannan <divya@multicorewareinc.com>
># Date 1418116177 -19800
># Tue Dec 09 14:39:37 2014 +0530
># Node ID 2e6c4518f7083d79202a28f739650278e5c0d88d
># Parent 88498ec9b10ba25a01c983a3f67c17bf470349fa
>asm: chroma_vpp[32x32] for colorspace i420 in avx2: improve 3881c->3648c
</pre></div>