[x264-devel] Deblocking filter

Jean-Michel Hautbois jean-michel.hautbois at crf.canon.fr
Mon Jul 16 09:32:20 CEST 2007


Thanks a lot for this answer!
I will go through the history :).

Guy Bonneau wrote:
> Optimization is not a one-time process. Improvements were made step
> by step, sometimes over many years of development. Looking at the
> history of the deblocking assembler file will help you see how the
> optimization evolved.
>
> A few weeks ago I went through the same process of understanding the
> deblocking algorithm of x264 for academic purposes. The assembler code
> was written to take advantage of MMX byte processing to speed up the
> algorithm's execution.
>
> Here is a walk through the binary arithmetic behind the byte processing
> for p0' and q0' when bS is less than 4.
>
> Let us start from:
>
> (((q0-p0)<<2) + (p1-q1) + 4) >> 3      (1)
>
> Since ((q0-p0)<<2) and 4 are both multiples of 4, the 2 least
> significant bits of (p1-q1) cannot affect the result. Thus they can be
> dropped, and we can rewrite the equation as:
>
> (((q0-p0)) + ((p1-q1) >> 2) + 1) >> 1    (2)
>
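
A quick way to convince oneself that (1) and (2) agree is to compare them over every possible pair of differences. This is only a minimal sketch: the function names are made up for illustration, and it assumes ">>" is an arithmetic (floor) shift, which is how Python treats negative integers.

```python
def delta_eq1(d, e):
    """Equation (1), with d = q0 - p0 and e = p1 - q1."""
    return ((d << 2) + e + 4) >> 3


def delta_eq2(d, e):
    """Equation (2): the two low bits of e are dropped before the add."""
    return (d + (e >> 2) + 1) >> 1


# For 8-bit pixels both differences lie in [-255, 255], so the check
# below covers every input the two expressions can receive.
mismatches = sum(
    1
    for d in range(-255, 256)
    for e in range(-255, 256)
    if delta_eq1(d, e) != delta_eq2(d, e)
)
print("mismatches:", mismatches)
```

Working on the differences rather than on the four pixel values keeps the sweep small (511 x 511 cases) while still being exhaustive.
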
> If a and b are unsigned 8-bit values, we have the identity
>
> (a-b) = a+(~b)+1 - 256   (since ~b = 255 - b for an unsigned byte)
>
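
The identity can be checked for all byte pairs. A small sketch, assuming ~b means the 8-bit complement; the helper not8 is a name chosen here, and is needed because Python's own ~ works on unbounded integers.

```python
def not8(b):
    """8-bit complement of an unsigned byte: 255 - b."""
    return b ^ 0xFF


def sub_via_complement(a, b):
    """a - b rebuilt from an addition with the complemented byte."""
    return a + not8(b) + 1 - 256


identity_holds = all(
    sub_via_complement(a, b) == a - b
    for a in range(256)
    for b in range(256)
)
print("identity holds for all bytes:", identity_holds)
```
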
> Thus we can rewrite (2) as:
>
> (((q0+~p0 + 1 - 256)) + ((p1+~q1 + 1 - 256) >> 2) + 1) >> 1
>
> To bring in PAVGB, which computes (a+b+1)>>1 on unsigned bytes, we can
> rearrange step by step:
>     
> ((q0+~p0 + 1 - 256) + ((p1+~q1 + 1) >> 2) - 64 + 1) >> 1
> ((q0+~p0 + 1) + (PAVGB(p1,~q1) >> 1) - 256 - 64 + 1) >> 1
> ((q0+~p0 + 1) + ((PAVGB(p1,~q1) + 4) >> 1) - 256 - 64 - (4>>1) + 1) >> 1
> ((q0+~p0 + 1) + ((PAVGB(p1,~q1) + 3 + 1) >> 1) - 256 - 64 - (4>>1) + 1) >> 1
> ((q0+~p0 + 1) + ((PAVGB(p1,~q1) + 3 + 1) >> 1) - 256 - 64 - 2 + 1) >> 1
> ((q0+~p0 + 1) + PAVGB(PAVGB(p1,~q1), 3) - 256 - 64 - 2 + 1) >> 1
> (((q0+~p0 + 1) + PAVGB(PAVGB(p1,~q1), 3) + 1) >> 1) - 128 - 33
> ((q0+~p0 + 1) >> 1) + ((PAVGB(PAVGB(p1,~q1), 3) + 1) >> 1) - 161
> PAVGB(q0,~p0) + ((PAVGB(PAVGB(p1,~q1), 3) + 1) >> 1) - 161
>
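
The p1/q1 part of the chain can be checked in isolation. The sketch below models one byte lane of PAVGB as (a+b+1)>>1 and confirms that the nested average carries ((p1-q1)>>2) plus the constants 64 and 2 that are subtracted again later. The names pavgb and not8 are only illustrative helpers, not anything from the x264 sources.

```python
def not8(b):
    """8-bit complement, i.e. ~b on an unsigned byte."""
    return b ^ 0xFF


def pavgb(a, b):
    """One byte lane of PAVGB: rounded average of two unsigned bytes."""
    assert 0 <= a <= 255 and 0 <= b <= 255
    return (a + b + 1) >> 1


# PAVGB(x, 3) is (x + 3 + 1) >> 1, that is (x + 4) >> 1, for every byte x.
nested_ok = all(pavgb(x, 3) == (x + 4) >> 1 for x in range(256))

# PAVGB(PAVGB(p1,~q1), 3) equals ((p1-q1) >> 2) + 64 + 2 for all bytes,
# which is why the chain later subtracts 64 and 2.
term_ok = all(
    pavgb(pavgb(p1, not8(q1)), 3) == ((p1 - q1) >> 2) + 64 + 2
    for p1 in range(256)
    for q1 in range(256)
)
print(nested_ok, term_ok)
```

The assert inside pavgb also confirms that every intermediate value of this term stays within a byte.
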
> At this point we have a problem. The expression PAVGB(q0,~p0) drops
> the least significant bit of (q0+~p0+1), which should have been added
> to the second part of the equation, (PAVGB(PAVGB(p1,~q1), 3) + 1),
> before the shift. This causes an error of 1 in the result for some
> inputs. To solve this problem we need to add the value of the least
> significant bit of (q0+~p0+1) to the second part of the equation. Let
> us name this value avglsb. We then have:
>
> PAVGB(q0,~p0) + ((avglsb + PAVGB(PAVGB(p1,~q1), 3) + 1) >> 1) - 161
>
> where avglsb is ((q0^p0) & 0x1)
>
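
To see the 1-bit problem and the avglsb fix concretely, the two forms can be compared against equation (1). This is a rough sketch with made-up function names; it sweeps a strided grid of pixel values rather than all 2^32 combinations.

```python
def not8(b):
    """8-bit complement, i.e. ~b on an unsigned byte."""
    return b ^ 0xFF


def pavgb(a, b):
    """One byte lane of PAVGB: (a + b + 1) >> 1 on unsigned bytes."""
    return (a + b + 1) >> 1


def reference(p1, p0, q0, q1):
    """Equation (1)."""
    return (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3


def without_lsb(p1, p0, q0, q1):
    """Split form that loses the low bit of (q0+~p0+1)."""
    inner = pavgb(pavgb(p1, not8(q1)), 3)
    return pavgb(q0, not8(p0)) + ((inner + 1) >> 1) - 161


def with_lsb(p1, p0, q0, q1):
    """Split form with avglsb added back before the shift."""
    avglsb = (q0 ^ p0) & 1
    inner = pavgb(pavgb(p1, not8(q1)), 3)
    return pavgb(q0, not8(p0)) + ((avglsb + inner + 1) >> 1) - 161


# avglsb really is the low bit of (q0 + ~p0 + 1), for all byte pairs.
lsb_ok = all(
    ((q0 + not8(p0) + 1) & 1) == ((q0 ^ p0) & 1)
    for p0 in range(256)
    for q0 in range(256)
)

# Collect the distinct errors of both forms over a strided grid.
grid = range(0, 256, 15)
errors_without, errors_with = set(), set()
for p1 in grid:
    for p0 in grid:
        for q0 in grid:
            for q1 in grid:
                ref = reference(p1, p0, q0, q1)
                errors_without.add(without_lsb(p1, p0, q0, q1) - ref)
                errors_with.add(with_lsb(p1, p0, q0, q1) - ref)

print(lsb_ok, sorted(errors_without), sorted(errors_with))
```

For example, with p1 = p0 = q1 = 100 and q0 = 101, equation (1) gives 1, the form without avglsb gives 0, and the form with avglsb gives 1 again.
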
> Then
>
> PAVGB(q0,~p0) + ((((q0^p0) & 0x1) + PAVGB(PAVGB(p1,~q1), 3) + 1) >> 1) - 161
> PAVGB(q0,~p0) + PAVGB(((q0^p0) & 0x1), PAVGB(PAVGB(p1,~q1), 3)) - 161
>
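
Since the point of the rewrite is byte processing, here is a sketch that applies the final expression to eight byte lanes at once, the way an MMX register would hold them, and compares each lane with equation (1). The lane layout, the sample values and all names are assumptions for illustration; this is plain Python, not a model of the actual x264 assembler.

```python
def not8(b):
    """8-bit complement, i.e. ~b on an unsigned byte."""
    return b ^ 0xFF


def pavgb(a, b):
    """One byte lane of PAVGB: (a + b + 1) >> 1 on unsigned bytes."""
    assert 0 <= a <= 255 and 0 <= b <= 255
    return (a + b + 1) >> 1


def reference_delta(p1, p0, q0, q1):
    """Equation (1) for a single pixel position."""
    return (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3


def delta_lanes(p1, p0, q0, q1):
    """Final PAVGB expression applied lane by lane to four byte rows."""
    out = []
    for a, b, c, d in zip(p1, p0, q0, q1):
        avglsb = (c ^ b) & 1
        inner = pavgb(pavgb(a, not8(d)), 3)
        # Only this last subtraction leaves the unsigned byte range.
        out.append(pavgb(c, not8(b)) + pavgb(avglsb, inner) - 161)
    return out


# Hypothetical sample rows, 8 pixels each, including the extreme cases.
p1 = [60, 10, 255, 0, 100, 7, 200, 128]
p0 = [64, 12, 255, 0, 100, 9, 190, 127]
q0 = [80, 15, 0, 255, 101, 8, 180, 129]
q1 = [84, 20, 0, 255, 100, 3, 170, 130]

lanes = delta_lanes(p1, p0, q0, q1)
expected = [reference_delta(a, b, c, d) for a, b, c, d in zip(p1, p0, q0, q1)]
print(lanes == expected)
```

Every PAVGB input and output stays within 0..255 (the assert in pavgb would fire otherwise); only the final "- 161" produces a signed value.
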
> This is what the assembler code implements to compute p0' and q0',
> together with the clipping code that is also needed. Keep in mind that
> the optimized code was written to use byte processing, as Loren said.
>
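
For reference, the clipping mentioned above looks roughly like this in scalar form, as far as I understand the H.264 filter for bS < 4: the delta is clipped to [-tc, tc], then added to p0 and subtracted from q0, and both results are clipped to the 8-bit range. This is a sketch under those assumptions; tc is taken as a given parameter here (its derivation is not covered), and the function names are invented.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def filter_p0_q0(p1, p0, q0, q1, tc):
    """Return (p0', q0') for one pixel position, assuming 8-bit samples."""
    delta = clip3(-tc, tc, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3)
    return clip3(0, 255, p0 + delta), clip3(0, 255, q0 - delta)


# Raw delta is 5 here; with tc = 2 it is limited to 2.
print(filter_p0_q0(60, 64, 80, 84, tc=2))
# With a larger tc the full delta of 5 is applied.
print(filter_p0_q0(60, 64, 80, 84, tc=10))
```
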
> Hope this helps.
>
> BTW, the deblocking optimization of x264 is probably one of the most
> beautiful pieces of optimized code I have ever seen. Great work!
>
> Guy Bonneau
>
>
>
>   
>> -----Original Message-----
>> From: x264-devel-bounces at videolan.org [mailto:x264-devel-bounces at videolan.org]
>> On Behalf Of Jean-Michel HAUTBOIS
>> Sent: Friday, July 13, 2007 3:52 AM
>> To: x264-devel at videolan.org
>> Subject: [x264-devel] Deblocking filter
>>
>> Hi everyone!
>> I am currently looking at the deblocking filter algorithm in H.264, and
>> am trying to understand your implementation. You have written the filter
>> entirely in assembly language, but how did you go about optimizing it?
>> Did you use any papers?
>> If so, could you please give me the references you used?
>>
>> Thanks in advance for your advice.
>> Best regards.
>> _______________________________________________
>> x264-devel mailing list
>> x264-devel at videolan.org
>> http://mailman.videolan.org/listinfo/x264-devel
>>     

