[x264-devel] Re: [PATCH] Altivec optimizations for quant4x4, quant4x4dc, quant8x8, sub8x8_dct8, sub16x16_dct8, pixel_sa8d_8x8, pixel_sa8d_16x16, *idct8*
Loren Merritt
lorenm at u.washington.edu
Mon Sep 25 00:46:32 CEST 2006
On Sun, 24 Sep 2006, Guillaume POIRIER wrote:
> On 9/18/06, Loren Merritt <lorenm at u.washington.edu> wrote:
>
>> pixel_sa8d_8x8_core_altivec could use a VEC_DIFF with one of the pointers
>> 8byte aligned.
>
> So far I've been able to use VEC_DIFF_H_8BYTE_ALIGNED with the
> following pattern:
>
> + VEC_DIFF_H_8BYTE_ALIGNED( pix1, i_pix1, pix2, i_pix2, 8, diff0v );
> + VEC_DIFF_H( pix1, i_pix1, pix2, i_pix2, 8, diff1v );
> + VEC_DIFF_H_8BYTE_ALIGNED( pix1, i_pix1, pix2, i_pix2, 8, diff2v );
> + VEC_DIFF_H( pix1, i_pix1, pix2, i_pix2, 8, diff3v );
> +
> + VEC_DIFF_H_8BYTE_ALIGNED( pix1, i_pix1, pix2, i_pix2, 8, diff4v );
> + VEC_DIFF_H( pix1, i_pix1, pix2, i_pix2, 8, diff5v );
> + VEC_DIFF_H_8BYTE_ALIGNED( pix1, i_pix1, pix2, i_pix2, 8, diff6v );
> + VEC_DIFF_H( pix1, i_pix1, pix2, i_pix2, 8, diff7v );
>
> I have not looked into this problem much, but as far as I've seen,
> every other call to VEC_DIFF* is made with a different alignment of
> pix1 and pix2;
> i.e. each call to VEC_DIFF_H_8BYTE_ALIGNED is made with both pix1 and
> pix2 8-byte or 16-byte aligned, whereas the calls to VEC_DIFF above
> are made with pix1 and pix2 aligned differently (i.e. one is 8-byte
> aligned and the other is 16-byte aligned).
Weird. That would indicate that the stride is only a multiple of 8, which
does happen for pix2 during slicetype and chroma_me, but only for sad and
satd, not sa8d.
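To make the stride argument concrete, here is a minimal sketch of the arithmetic. The helper name row_offset16 is hypothetical and not part of x264; it only computes where each row of a block starts within a 16-byte line, assuming a non-negative stride.

```c
#include <stdint.h>

/* Hypothetical helper, not part of x264: offset (0..15) of row `row`
 * within a 16-byte line, given the offset of row 0 and the stride in
 * bytes. Assumes stride and row are non-negative. */
static unsigned row_offset16( unsigned base_offset, unsigned stride, unsigned row )
{
    return ( base_offset + stride * row ) & 15u;
}
```

With a stride that is a multiple of 8 but not of 16 (24, say) and a 16-byte aligned first row, the offsets go 0, 8, 0, 8, ..., which matches the alternating pattern reported above. With a stride that is a multiple of 16, every row keeps the offset of row 0.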
> I'll see what I can do, but I imagine it's possible to make do without
> using VEC_DIFF (which doesn't care about alignment at all).
sad, satd, and sa8d can all be optimized for:
pix1 is aligned to whatever the block size is.
pix2 is unaligned.
stride1 is a multiple of 16.
stride2 is a multiple of 8, and I could easily make it 16.
Additionally, in the current usage of sa8d, pix2 is also aligned to the
block size. But don't count on that remaining so.
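The assumptions listed above could be written down as a debug-time check, roughly as follows. This is only a sketch: pixel_cmp_args_ok and is_aligned are hypothetical names, not existing x264 functions, and block_width is assumed to be a power of two.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if p is aligned to n bytes
 * (n must be a power of two). */
static int is_aligned( const void *p, uintptr_t n )
{
    return ( (uintptr_t)p & ( n - 1 ) ) == 0;
}

/* Hypothetical check of the assumptions stated above:
 *   pix1 is aligned to the block width,
 *   pix2 may be unaligned (so it is deliberately not tested),
 *   stride1 is a multiple of 16,
 *   stride2 is a multiple of 8. */
static int pixel_cmp_args_ok( const uint8_t *pix1, int stride1,
                              const uint8_t *pix2, int stride2,
                              int block_width )
{
    (void)pix2;
    return is_aligned( pix1, (uintptr_t)block_width )
        && stride1 % 16 == 0
        && stride2 % 8 == 0;
}
```

A vectorized routine could assert this on entry in debug builds. Note that it intentionally says nothing about pix2's alignment, since that is the part not to count on.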
> Now I have a question regarding a bug I've found in the Altivec quant code.
> I've noticed on some encodes I've done with that patch, I'm getting
> some isolated green or blue blocks that sometimes create green drags
> on first pass, and on the final encode, I'm just getting blocks that
> "pop in and pop out" (as in: the motion compensation doesn't turn them
> into green drags).
>
> It _appears_ that the more high quality options (RD, trellis) I
> activate, the fewer artifacts I get. I imagine this means that the
> different codepaths taken with high quality options may not trigger
> the bug as often, or may compensate for it.
>
> What's funny is that the bug is not reproducible, as in: if I take a
> sample and encode it once, I'll get some green/blue blocks, say at
> frames 5 and 7... and if I re-encode with the same source and the same
> options, I won't get the blocks at the same frames or at the same
> locations in the frame.
The only causes of nondeterminism in single-threaded programs are
uninitialized memory and deliberate randomness (e.g. time()).
So try valgrind.
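As a complement to valgrind (not a substitute), one crude way to catch a routine whose result depends on stale memory is to run it twice on the same input with the output buffer pre-filled with different garbage, then compare. The sketch below assumes that approach; block_fn, same_output_under_poison, and the two toy routines are all hypothetical and unrelated to the actual quant code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical signature for a routine that turns n coefficients
 * from src into n coefficients in dst. */
typedef void (*block_fn)( int16_t *dst, const int16_t *src, int n );

/* Run fn twice on the same input, with dst pre-filled with two different
 * byte patterns. Returns 1 if both runs produce the same n outputs,
 * 0 otherwise (or if n is out of range). */
static int same_output_under_poison( block_fn fn, const int16_t *src, int n )
{
    int16_t out_a[64], out_b[64];
    if( n < 0 || n > 64 )
        return 0;
    memset( out_a, 0x00, sizeof(out_a) );
    memset( out_b, 0xFF, sizeof(out_b) );
    fn( out_a, src, n );
    fn( out_b, src, n );
    return memcmp( out_a, out_b, (size_t)n * sizeof(int16_t) ) == 0;
}

/* Toy routine that writes every output element. */
static void halve_all( int16_t *dst, const int16_t *src, int n )
{
    for( int i = 0; i < n; i++ )
        dst[i] = (int16_t)( src[i] / 2 );
}

/* Toy routine with a deliberate bug: the last element is never
 * written, so whatever was already in dst leaks into the result. */
static void halve_skip_last( int16_t *dst, const int16_t *src, int n )
{
    for( int i = 0; i < n - 1; i++ )
        dst[i] = (int16_t)( src[i] / 2 );
}
```

This only catches outputs left unwritten or routines that read their own output buffer. Reads of uninitialized inputs or scratch space still need valgrind or a similar tool.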
--Loren Merritt
--
This is the x264-devel mailing-list
To unsubscribe, go to: http://developers.videolan.org/lists.html