[x265] SSE4 Angular Mode 26 Intra function

Matt Johnson johnso87 at illinois.edu
Sun Dec 22 01:35:09 CET 2013


Hi all,
	I don't know x86 assembly well enough to easily diagnose this myself, 
but I'm running into a problem with intra prediction in the horizontal 
(mode 10) and vertical (mode 26) modes: the SSE4 result (--cpuid 255) 
mismatches the C result (--cpuid 1), and indeed the result for any cpuid 
value earlier than SSE4.
	The problem seems to be with the filtering clause (Equation 8-54 in the 
standard for mode 26, 8-62 for mode 10), which applies only to 4x4, 8x8, 
and 16x16 luma blocks.  I'm seeing the mismatch with 4x4 chroma blocks.  
It looks like the C version respects the bLuma flag passed to 
all_angs_pred_c() (which propagates to the bFilter argument of 
intra_pred_ang_c()), so the filtering clause is not invoked for 4x4 
chroma blocks and the normal equations involving ref[], iIdx, and iFact 
come into play.  The SSE4 version doesn't seem to handle that flag the 
same way: the predicted pixels I'm getting back are consistent with the 
filtering clause having been applied.
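
A minimal sketch of the difference for mode 26, assuming the filtering 
clause has the form pred[0][y] = Clip(p[0][-1] + ((p[-1][y] - p[-1][-1]) >> 1)) 
and 8-bit samples.  predictVertical() and its parameters are hypothetical 
names chosen for illustration; this is not the x265 code, only a way to 
see which output pixels the bFilter flag should affect.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reference for pure vertical prediction (mode 26).
//   above[x] corresponds to p[x][-1]  (row above the block)
//   left[y]  corresponds to p[-1][y]  (column left of the block)
//   corner   corresponds to p[-1][-1]
// Returns a size*size block in row-major order (pred[y * size + x]).
std::vector<uint8_t> predictVertical(const uint8_t* above, const uint8_t* left,
                                     uint8_t corner, int size, bool bFilter)
{
    assert(size > 0);
    std::vector<uint8_t> pred(static_cast<size_t>(size) * size);

    // Unfiltered case: every row is a copy of the row above the block.
    for (int y = 0; y < size; y++)
        for (int x = 0; x < size; x++)
            pred[y * size + x] = above[x];

    // Filtering clause: only the first column is adjusted, and only when
    // the caller asks for it (luma blocks smaller than 32x32).
    if (bFilter)
    {
        for (int y = 0; y < size; y++)
        {
            // Assumes >> on a negative int is an arithmetic shift, which
            // holds on the usual x86 compilers.
            int v = above[0] + ((left[y] - corner) >> 1);
            pred[y * size] = static_cast<uint8_t>(std::min(255, std::max(0, v)));
        }
    }
    return pred;
}
```

With bFilter false (the chroma case described above), the first column 
equals above[0] in every row; with bFilter true it varies with left[y].  
Mode 10 would be the same sketch with the roles of rows and columns 
swapped.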

Thanks,
Matt

