[x265] Fwd: [PATCH] asm: intra_pred_ang16_2
chen
chenm003 at 163.com
Wed Mar 11 17:53:05 CET 2015
At 2015-03-11 12:57:34,"Praveen Tiwari" <praveen at multicorewareinc.com> wrote:
---------- Forwarded message ----------
From: chen<chenm003 at 163.com>
Date: Wed, Mar 11, 2015 at 6:32 AM
Subject: Re: [x265] [PATCH] asm: intra_pred_ang16_2
To: Development for x265 <x265-devel at videolan.org>
>>same speed as the old version
This AVX2 version of the asm code eliminates the following instructions at the cost of one vextracti128 instruction, compared to the SSSE3 version. The impact may not be visible in the testbench, but it seems worth pushing.
add r2, 34
cmp r3m, byte 34
cmove r2, r4
[MC] The instructions above let modes 2 and 34 share one function. Your new code uses separate functions, and vextracti128 runs on Port 5, which is a common bottleneck.
movu m1, [r2 + 16]
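
For readers less used to the asm, the three instructions discussed above can be modelled in C as a branchless pointer select. This is only a rough sketch under stated assumptions: the function name select_ref and its parameter names are hypothetical, pixels are assumed to be 8-bit so that the offset 34 is a byte offset, and the value held in r4 is not shown in this thread, so it is passed in as an opaque pointer.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical C model of the shared mode-2 / mode-34 entry sequence.
 *   base : reference pixel pointer on entry (r2)
 *   alt  : pointer already held in r4 (how it is computed is not shown
 *          in the thread, so it is treated as an input here)
 *   mode : intra prediction mode (r3m)
 */
static const uint8_t *select_ref(const uint8_t *base, const uint8_t *alt, int mode)
{
    const uint8_t *ref = base + 34;  /* add   r2, 34       */
    if (mode == 34)                  /* cmp   r3m, byte 34 */
        ref = alt;                   /* cmove r2, r4       */
    return ref;                      /* later loads use [r2], [r2 + 16], ... */
}
```

With one function serving both modes, this select runs on every call. Splitting into one function per mode removes it, since each function can hard-code its own start pointer, which is the trade-off being weighed against the extra vextracti128.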