[x265] SSE4 Angular Mode 26 Intra function
Matt Johnson
johnso87 at illinois.edu
Sun Dec 22 01:35:09 CET 2013
Hi all,
I don't know x86 assembly well enough to diagnose this myself, but I'm
running into a mismatch in intra prediction for the horizontal (mode 10)
and vertical (mode 26) modes: the SSE4 result (--cpuid 255) differs from
the C result (--cpuid 1), and from the result at any cpuid value earlier
than SSE4.
The problem seems to be with the filtering clause (Equation 8-54 in the
standard for mode 26, 8-62 for mode 10), which applies only to 4x4, 8x8,
and 16x16 luma blocks. I'm seeing the mismatch with 4x4 chroma blocks.
The C version appears to respect the bLuma flag passed to
all_angs_pred_c() (which propagates to the bFilter argument of
intra_pred_ang_c()), so the filtering clause is not invoked for 4x4
chroma blocks and the normal equations involving ref[], iIdx, and iFact
apply. The SSE4 version doesn't seem to handle that flag the same way:
the predicted pixels I'm getting back are consistent with the filtering
clause having been applied.
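To make the expected behavior concrete, here is a minimal sketch of vertical
(mode 26) prediction with the filtering clause gated by a flag. It is not the
x265 code: pred_vertical_sketch(), clip_pixel(), and the parameter names are
made up for illustration, and 8-bit samples are assumed. As I understand it,
the caller would pass b_filter = 0 for chroma blocks, so column 0 should then
be a plain copy of the sample above it.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to the 8-bit sample range (assumes 8-bit video). */
static uint8_t clip_pixel(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical sketch of mode 26 (pure vertical) prediction.
 * above[x]  : reconstructed samples in the row above the block
 * left[y]   : reconstructed samples in the column left of the block
 * top_left  : the corner sample above-left of the block
 * b_filter  : nonzero only for luma blocks smaller than 32x32 */
void pred_vertical_sketch(uint8_t *dst, int stride,
                          const uint8_t *above, const uint8_t *left,
                          uint8_t top_left, int size, int b_filter)
{
    assert(size > 0 && stride >= size);

    /* Every row is a copy of the row above the block. */
    for (int y = 0; y < size; y++)
        for (int x = 0; x < size; x++)
            dst[y * stride + x] = above[x];

    /* Filtering clause: adjust the first column using the left
     * neighbors. Note: >> on a negative int is implementation-defined
     * in C; an arithmetic shift is assumed here. */
    if (b_filter)
        for (int y = 0; y < size; y++)
            dst[y * stride] =
                clip_pixel(above[0] + ((left[y] - top_left) >> 1));
}
```

With b_filter = 0 the first column equals above[0] in every row; with
b_filter = 1 it varies with left[y]. The second pattern is what I'm seeing
from the SSE4 path for 4x4 chroma blocks.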
Thanks,
Matt