[x264-devel] Re: [PATCH] Altivec optimizations for quant4x4, quant4x4dc, quant8x8, sub8x8_dct8
Guillaume POIRIER
gpoirier at mplayerhq.hu
Thu Aug 24 11:16:02 CEST 2006
Hi,
Guillaume POIRIER wrote:
> Hi all,
> Please find attached the AltiVec-optimized versions of quant
> 4x4 (+dc) and quant 8x8. All now pass the regression tests, so they are all
> activated.
> I've also found that even though my 4x4 DC version also passes the
> regression tests, it doesn't seem to be called very often.
> This is probably because I had to restrict its use to the cases where
> maxQdc < (1<<15).
> That can hopefully be improved if I manage to understand why I can't use
> my 4x4dc with maxQdc < (1<<16) or more.
I still haven't found what the problem is. In practice, it doesn't
matter too much, because with the default flat matrices the optimized
version is used.
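
For readers following along, here is a minimal scalar sketch of the kind of guard described above. The helper names (fits_signed16, as_signed16) are hypothetical and do not come from the patch. The sketch only illustrates that a multiplier of 1<<15 or more no longer fits in a signed 16-bit element. That is one possible explanation for the limit, not a confirmed one.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical guard: true when max_q fits in a signed 16-bit element,
 * i.e. when max_q < (1<<15). */
static int fits_signed16(int max_q)
{
    return max_q < (1 << 15);
}

/* Reinterpret the low 16 bits of v as a signed 16-bit value, the way a
 * vector unit operating on signed short elements would see them. */
static int16_t as_signed16(unsigned v)
{
    uint16_t u = (uint16_t)v;
    int16_t s;
    memcpy(&s, &u, sizeof s);   /* bit-for-bit copy, no value conversion */
    return s;
}
```

With these helpers, 32767 passes the guard and keeps its value, while 32768 fails the guard and reads back as a negative number once it is treated as a signed 16-bit element.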
> I've also cleaned up the patch to put the unions as typedefs, and also
> fixed some indentation problems.
>
> Sorry in advance if I can't be responsive to address any comment made to
> the code, as I don't have any Internet access where the PPC machine is :-(.
The attached patch now also features sub8x8_dct8 in AltiVec, in addition
to the previously optimized routines. This new routine still needs a bit
of cleanup, and the code needs to be factored into a macro, but it works.
I've introduced another transpose8x8 routine, shamelessly taken from
FFmpeg: the one in ppccommon.h didn't do what I wanted, but maybe that's
just because I didn't know how to use it.
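
As a point of reference, this is what an 8x8 transpose computes, written as plain scalar C. It is only a sketch for checking a vector version against, not the FFmpeg or ppccommon.h code. The name transpose8x8_ref and the flat row-major layout are assumptions made for the example.

```c
#include <stdint.h>

/* Scalar reference transpose of an 8x8 block of 16-bit coefficients.
 * Both blocks are stored row-major in 64 contiguous elements, and
 * dst must not overlap src. */
static void transpose8x8_ref(int16_t *dst, const int16_t *src)
{
    int i, j;
    for (i = 0; i < 8; i++)
        for (j = 0; j < 8; j++)
            dst[j * 8 + i] = src[i * 8 + j];   /* row i, col j -> row j, col i */
}
```

A vector routine can be validated by running both on the same input block and comparing all 64 outputs.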
Last but not least, when I tested that my patch was applying cleanly to
svn, I had to remove this hunk:
Index: common/macroblock.c
===================================================================
--- common/macroblock.c (revision 540)
+++ common/macroblock.c (working copy)
@@ -26,7 +26,7 @@
#include "common.h"
-static const int dequant_mf[6][4][4] =
+static const int dequant_mf[6][4][4] __attribute__((__aligned__(16))) =
{
{ {10, 13, 10, 13}, {13, 16, 13, 16}, {10, 13, 10, 13}, {13, 16, 13, 16} },
{ {11, 14, 11, 14}, {14, 18, 14, 18}, {11, 14, 11, 14}, {14, 18, 14, 18} },
However, x264 in svn no longer has this line as of r552:
https://trac.videolan.org/x264/changeset/552
I don't know what happened to this variable, but I need to ensure
that it's aligned (as my code assumes it is).
The diff indicates that the declaration of dequant_mf[6][4][4] was just
removed, and I can't find where it's declared, though grep does show
that it's still used, so I guess there's a declaration somewhere, it's
just that _I_ can't locate it.
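
To make the alignment requirement concrete, here is a small sketch using the same GCC attribute as in the hunk above. The table name example_mf and the helper is_aligned16 are hypothetical. Wherever the real dequant_mf declaration now lives, a check of this kind could confirm the assumption. AltiVec's aligned vector load ignores the low four address bits, so a misaligned table would be read from the wrong address rather than cause a fault.

```c
#include <stdint.h>

/* Hypothetical table, aligned the same way as in the hunk above
 * (GCC attribute syntax). Values are the first two rows of that hunk. */
static const int example_mf[2][4][4] __attribute__((__aligned__(16))) =
{
    { {10, 13, 10, 13}, {13, 16, 13, 16}, {10, 13, 10, 13}, {13, 16, 13, 16} },
    { {11, 14, 11, 14}, {14, 18, 14, 18}, {11, 14, 11, 14}, {14, 18, 14, 18} },
};

/* True when p sits on a 16-byte boundary, which aligned vector loads assume. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}
```

Since each row of four ints is 16 bytes, aligning the start of the table also aligns every row, which is what a routine loading one row per vector would rely on.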
-------------- next part --------------
A non-text attachment was scrubbed...
Name: altivec_quant+sub_dct8-cleaned-up.diff
Type: text/x-patch
Size: 24155 bytes
Desc: not available
Url : http://mailman.videolan.org/pipermail/x264-devel/attachments/20060824/3211bbaa/attachment.bin