[x265] Fwd: [PATCH] asm: intra_pred_ang16_2

chen chenm003 at 163.com
Wed Mar 11 17:53:05 CET 2015


 
At 2015-03-11 12:57:34,"Praveen Tiwari" <praveen at multicorewareinc.com> wrote:



---------- Forwarded message ----------
From: chen<chenm003 at 163.com>
Date: Wed, Mar 11, 2015 at 6:32 AM
Subject: Re: [x265] [PATCH] asm: intra_pred_ang16_2
To: Development for x265 <x265-devel at videolan.org>



>>same speed as the old version


This AVX2 version of the asm code eliminates the following instructions at the cost of one vextracti128 instruction, compared to the SSSE3 version. It may not have a visible impact in the TestBench, but it seems worth pushing.
    add             r2, 34
    cmp             r3m, byte 34
    cmove           r2, r4
[MC] The instructions above are there so that modes 2 and 34 can share code; your new code uses separate functions. Also, vextracti128 uses Port 5, which is a common bottleneck.
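
For readers less familiar with the asm, here is a minimal C sketch of what the three quoted instructions compute. The names select_ref, ref, alt and dir_mode are hypothetical: alt stands for whatever r4 held before this fragment (the quote does not show it), and r3m is assumed to be the direction-mode argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C equivalent of the shared mode 2 / mode 34 pointer selection:
 *     add   r2, 34
 *     cmp   r3m, byte 34
 *     cmove r2, r4
 * ref plays the role of r2, alt the role of r4, dir_mode the role of r3m. */
const uint8_t *select_ref(const uint8_t *ref, const uint8_t *alt, int dir_mode)
{
    const uint8_t *p = ref + 34;   /* add r2, 34: advance by 34 bytes */
    if (dir_mode == 34)            /* cmp r3m, byte 34 */
        p = alt;                   /* cmove r2, r4: taken only when equal */
    return p;
}
```

With one entry point shared by both modes, this selection runs on every call; splitting into separate per-mode functions removes it, which is the trade-off under discussion.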
 
    movu            m1, [r2 + 16]
