[x265] Fwd: [PATCH] asm code for ipfilterH_pp, 4 tap filter

Sat Sep 28 17:04:43 CEST 2013

---------- Forwarded message ----------
From: Jason Garrett-Glaser <jason at x264.com>
Date: Sat, Sep 28, 2013 at 4:30 PM
Subject: Re: [x265] [PATCH] asm code for ipfilterH_pp, 4 tap filter
To: Development for x265 <x265-devel at videolan.org>

On Fri, Sep 27, 2013 at 11:42 PM, Praveen Tiwari
<praveen at multicorewareinc.com> wrote:
> Suppose that during execution the width is less than 8, say 5. Then we
> want to run only the code section that handles the remaining width
> (_end_col:), not the whole code (the multiple-of-8 part plus the
> remaining-width part). Otherwise the block would be computed twice in
> this case, corrupting some (8 - widthleft) old dst[] values that are used
> with the 'pblendvb' instruction. This is why we have put in a check. If
> width is always >= 8 you are right, we don't need the check.

>>Wait, so you're using pblendvb to avoid corrupting pixels to the right
>>of the block being stored?

Yes, you could say that, but the whole scenario is this: our code also
supports odd-size blocks, so we don't know in advance how many elements we
need to store to dst[] after the last multiple of 8. Suppose the width is
13; then we need to store 13 - 8 = 5 elements to dst[]. If it is 18, then
18 - 2*8 = 2 elements need to be stored to dst[]. All of this is managed at
run time using the tab_left mask table, the "pblendvb" instruction, and the
old dst[] values. So "pblendvb" is used to store the right number of
elements without corrupting pixels to the right of the block being stored.
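
To make the tail handling concrete, here is a rough scalar C model of the
idea, not the actual asm from the patch. The names blend8, tail_mask and
store_row are made up for illustration, and the layout of tail_mask is only
an assumption about what a table like tab_left could look like. blend8
mimics the per-byte behaviour of pblendvb: a lane is taken from the new
data only when the top bit of its mask byte is set.

```c
#include <stdint.h>

/* Scalar model of pblendvb on 8 byte lanes (illustrative only):
 * where the mask byte has its top bit set, take the new value,
 * otherwise keep the old dst[] value. */
static void blend8(uint8_t *dst, const uint8_t *fresh, const uint8_t *mask)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (mask[i] & 0x80) ? fresh[i] : dst[i];
}

/* Hypothetical mask table: row n selects the first n lanes. */
static const uint8_t tail_mask[8][8] = {
    {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
    {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00},
    {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00},
    {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00},
};

/* Store `width` new pixels into dst[] and leave dst[width..] untouched.
 * Both buffers are assumed to be readable up to the next multiple of 8. */
static void store_row(uint8_t *dst, const uint8_t *fresh, int width)
{
    int x = 0;

    /* Full groups of 8 pixels are stored unconditionally. */
    for (; x + 8 <= width; x += 8)
        for (int i = 0; i < 8; i++)
            dst[x + i] = fresh[x + i];

    /* Remaining pixels: width 13 leaves 5, width 18 leaves 2. */
    int left = width - x;
    if (left > 0)
        blend8(dst + x, fresh + x, tail_mask[left]);
}
```

With width 5 the full-group loop does not run at all and only the blended
tail store happens, which is the case the width check is meant to cover.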

>That really doesn't seem necessary; x264's MC functions just wrote
>past the end and this was never a problem, because blocks to the right
>hadn't been encoded yet anyways.

I hope x264's MC functions fit the above case.

>>Does HEVC really have width-5 blocks?  I thought the widths were 4, 8,
>>and so forth; did they add odd-size blocks?  What is the exact,
>>complete list of widths we need to support?

I discussed the width sizes with Min. He suggested that the new
interpolation algorithm (sorry, I don't know much about this algorithm) may
also use odd block sizes, so we need to support them too, but I think it
currently uses multiples of 4.

Regards,
Praveen