[x265] SSE4 Angular Mode 26 Intra function

chen chenm003 at 163.com
Sun Dec 22 16:21:39 CET 2013


Hi Matt,

 
Thanks for your explanation. I understand your idea now: it is a fast algorithm for the intra mode decision.

But I don't know why you use the intra_pred_allangs function to calculate
chroma.
The function generates all 33 angular luma modes, while chroma has only
5 modes, so I don't use it for chroma.
In your case, I think intra_pred[] is a better choice.

Thanks!
Min

At 2013-12-22 22:28:50,"Matt Johnson" <johnso87 at illinois.edu> wrote:
>Hi Min,
> Unfortunately the code where I'm seeing the failure is part of a whole 
>separate program I wrote that makes use of x265's intra prediction and 
>SATD functions, so I don't have a minimal test case.  Essentially I am 
>decomposing the frame into a regular grid of 4x4, .. 32x32 PUs, and 
>computing SATD values for all 35 intra prediction modes, for all blocks, 
>for all block sizes, for all 3 channels, using the original frame pixels 
>as the reference rather than the reconstructed pixels as the decoder 
>would see them.  It's intended to be a way to quickly find the most 
>profitable intra prediction modes for high-bitrate scenarios where 
>reconstructed pixels approximately equal the original pixels; by using 
>the original pixels as reference, I can do prediction for the entire 
>frame at once, so it is profitable to offload the computation to a GPU.
> In any case, I do call the intra_pred_allangs function pointer of my 
>primitives struct with bFilter==1 for luma blocks under 32x32 and 
>bFilter==0 otherwise.  As an example of when the filtering clause gives 
>an incorrect result, consider a 4x4 chroma block.
>
>Above neighbor array = [0x7B 0x7B 0x7B 0x7B 0x7B ...] (first element is 
>the [-1][-1] index in the image, relative to the top-left corner of the 
>block in question)
>Left neighbor array = [0x7B 0x7B 0x7A 0x7B 0x7C ...] (first element is 
>also [-1][-1] index in the image)
>
>For the vertical prediction mode on 4x4 chroma, you just copy the above 
>neighbors downward, so for example, the element in the second row and 
>the first column ([1][0] in row-major syntax, [0][1] in the standard) is 
>0x7B according to the standard.  With the filtering clause, that same 
>element is calculated as 0x7B+((0x7A - 0x7B) >> 1) = 0x7A.
>
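The arithmetic in this example can be checked with a small sketch. This is illustrative Python only, not x265 code: the names predict_vertical, clip8, and edge_filter are hypothetical, and only the five neighbor values quoted above are used (the elided ones are not needed for a 4x4 block in vertical mode).

```python
def clip8(v):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, v))


def predict_vertical(above, left, size, edge_filter):
    """Vertical (mode 26) prediction for a size x size block.

    above[0] and left[0] both hold the [-1][-1] corner sample, matching
    the neighbor arrays quoted in the message. Rows are returned in
    row-major order, so pred[1][0] is the second row, first column.
    """
    # Plain vertical prediction: copy the above neighbors down every row.
    pred = [[above[1 + x] for x in range(size)] for _ in range(size)]
    if edge_filter:
        corner = above[0]
        for y in range(size):
            # Filtering clause: adjust the first column by half the
            # difference between the left neighbor and the corner sample.
            pred[y][0] = clip8(above[1] + ((left[1 + y] - corner) >> 1))
    return pred


above = [0x7B, 0x7B, 0x7B, 0x7B, 0x7B]
left = [0x7B, 0x7B, 0x7A, 0x7B, 0x7C]

unfiltered = predict_vertical(above, left, 4, edge_filter=False)
filtered = predict_vertical(above, left, 4, edge_filter=True)

print(hex(unfiltered[1][0]))  # 0x7b, the value expected for chroma
print(hex(filtered[1][0]))    # 0x7a, the value seen with the filter applied
```

With the filter off, every row is a copy of the above neighbors. With it on, only the first column changes, and for these inputs only the second row differs.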
>Whether this needs to be fixed probably depends on whether x265 ever 
>intends on doing intra prediction for chroma.  If not, then I should 
>just pass "--cpuid 1" to use the serial C versions.
>
>Thanks!
>-Matt
>
>On 12/22/2013 12:07 AM, chen wrote:
>> Hello Matt,
>> Could you tell me how to reproduce the hash mistake?
>> I checked the code again; intra_pred_allangs() is called only in the luma path.
>>
>> Thanks,
>>
>> Min
>>
>> At 2013-12-22 08:35:09,"Matt Johnson" <johnso87 at illinois.edu> wrote:
>>>Hi all,
>>> I don't know x86 assembly well enough to easily diagnose the problem
>>>myself, but I'm running into a problem with intra prediction in the
>>>horizontal (mode 10) and vertical (mode 26) modes, where the SSE4 result
>>>(--cpuid 255) mismatches the C result (--cpuid 1) and indeed any cpuid
>>>value earlier than SSE4.
>>> The problem seems to be with the filtering clause (Equation 8-54 in the
>>>standard for mode 26, 8-62 for mode 10), which applies to 4x4, 8x8, and
>>>16x16 luma blocks.  I'm seeing the problem with 4x4 chroma blocks; it
>>>looks like the C version respects the bLuma flag to all_angs_pred_c()
>>>(which propagates to the bFilter argument to intra_pred_ang_c()), so the
>>>filtering clause is not invoked for 4x4 chroma blocks and the normal
>>>equations involving ref[], iIdx, and iFact come into play.  It looks
>>>like the SSE4 version doesn't implement that flag the same way; the
>>>predicted pixels I'm getting back are consistent with the use of the
>>>filtering clause.
>>>
>>>Thanks,
>>>Matt
>>>_______________________________________________
>>>x265-devel mailing list
>>>x265-devel at videolan.org
>>>https://mailman.videolan.org/listinfo/x265-devel