[x264-devel] Re: [PATCH] Altivec optimizations for quant4x4, quant4x4dc, quant8x8, sub8x8_dct8

Guillaume POIRIER gpoirier at mplayerhq.hu
Thu Aug 24 11:16:02 CEST 2006


Hi,

Guillaume POIRIER wrote:
> Hi all,
> Please find attached the Altivec-optimized versions of quant
> 4x4 (+dc) and quant 8x8. All now pass regression tests, so they are all
> activated.
> I've also found that even though my 4x4 DC also passes regression
> tests, it doesn't seem to be called very often.
> That is likely because I had to restrict its use to the cases where
> maxQdc < (1<<15).
> That can hopefully be improved if I manage to understand why I can't use
> my 4x4dc with maxQdc < (1<<16) or more.

I still haven't found what the problem is. In practice, it doesn't
matter too much, because with the default flat matrices the optimized
version is used.
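
The restriction on maxQdc amounts to a dispatch on the largest DC
quantization factor. Here is a minimal sketch of such a guard in C. The
names (quant_path, max_quant_factor, pick_quant_path) are hypothetical
and not taken from the patch, and the sketch says nothing about why the
(1<<16) bound fails:

```c
#include <stddef.h>

/* Hypothetical labels for the two code paths; not x264 identifiers. */
enum quant_path { QUANT_PATH_C, QUANT_PATH_ALTIVEC };

/* Return the largest entry of a table of quantization factors. */
static int max_quant_factor(const int *mf, size_t n)
{
    int max = 0;
    for (size_t i = 0; i < n; i++)
        if (mf[i] > max)
            max = mf[i];
    return max;
}

/* Pick the optimized routine only when every factor is below 1<<15,
 * mirroring the restriction described above; otherwise fall back to C. */
static enum quant_path pick_quant_path(int max_qdc)
{
    return max_qdc < (1 << 15) ? QUANT_PATH_ALTIVEC : QUANT_PATH_C;
}
```

With small factors the guard always selects the optimized path, which
would match the observation about flat matrices.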


> I've also cleaned up the patch to put the unions as typedefs, and also 
> fixed some indentation problems.
> 
> Sorry in advance if I can't be responsive to address any comment made to 
> the code, as I don't have any Internet access where the PPC machine is :-(.

The attached patch now also features sub8x8_dct8 in Altivec, in addition
to the previously optimized routines. This new routine still needs a bit
of cleanup, and the code needs to be factored into a macro, but it works.

I've introduced another transpose8x8 routine, shamelessly taken from
FFmpeg: the one in ppccommon.h didn't do what I wanted, but maybe that's
just because I didn't know how to use it.
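
For checking a vectorized transpose against known-good output, a scalar
reference can help. This is only a sketch, assuming 16-bit coefficients
stored row-major in a flat array of 64 entries. The name
transpose8x8_ref is hypothetical, and this is neither the FFmpeg routine
nor the one in ppccommon.h:

```c
#include <stdint.h>

/* Scalar reference: element (row, col) of src becomes element
 * (col, row) of dst. Both blocks are 8x8, row-major, 64 entries.
 * dst and src must not overlap. */
static void transpose8x8_ref(int16_t *dst, const int16_t *src)
{
    for (int row = 0; row < 8; row++)
        for (int col = 0; col < 8; col++)
            dst[col * 8 + row] = src[row * 8 + col];
}
```

Running both versions on the same input block and comparing the 64
outputs is enough to tell whether a vector routine really transposes.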

Last but not least, when I tested that my patch was applying cleanly to 
svn, I had to remove this hunk:

Index: common/macroblock.c
===================================================================
--- common/macroblock.c (revision 540)
+++ common/macroblock.c (working copy)
@@ -26,7 +26,7 @@

  #include "common.h"

-static const int dequant_mf[6][4][4] =
+static const int dequant_mf[6][4][4] __attribute__((__aligned__(16))) =
  {
      { {10, 13, 10, 13}, {13, 16, 13, 16}, {10, 13, 10, 13}, {13, 16, 13, 16} },
      { {11, 14, 11, 14}, {14, 18, 14, 18}, {11, 14, 11, 14}, {14, 18, 14, 18} },


x264 in svn no longer has this line as of r552:
https://trac.videolan.org/x264/changeset/552

I don't know what happened to this variable, but I need to ensure
that it's aligned (as my code assumes it is).
The diff indicates that the declaration of dequant_mf[6][4][4] was simply
removed, and I can't find where it's declared now. grep shows that it's
still used, so I guess there's a declaration somewhere; it's just that
_I_ can't locate it.
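
Wherever the table ends up, the alignment assumption can be checked at
run time. Here is a minimal sketch, assuming GCC or Clang (the aligned
attribute is a compiler extension). The names demo_mf and is_aligned16
are hypothetical stand-ins, not x264 identifiers:

```c
#include <stdint.h>

/* Hypothetical stand-in for one 4x4 slice of the table, forced onto a
 * 16-byte boundary with the same attribute as in the hunk above. */
static const int demo_mf[4][4] __attribute__((__aligned__(16))) =
{
    {10, 13, 10, 13}, {13, 16, 13, 16}, {10, 13, 10, 13}, {13, 16, 13, 16}
};

/* Nonzero when p sits on a 16-byte boundary, as aligned vector loads
 * expect. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}
```

Asserting is_aligned16() on the real table before the vector code runs
would turn a silent misaligned load into an immediate failure.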
-------------- next part --------------
A non-text attachment was scrubbed...
Name: altivec_quant+sub_dct8-cleaned-up.diff
Type: text/x-patch
Size: 24155 bytes
Desc: not available
Url : http://mailman.videolan.org/pipermail/x264-devel/attachments/20060824/3211bbaa/attachment.bin 

