[vlc-devel] [PATCH 1/1] picture: increase alignment for AVX2 on x86 to 32 byte
Janne Grunau
janne-vlc at jannau.net
Sun Nov 27 22:21:50 CET 2016
On 2016-11-27 18:22:17 +0200, Rémi Denis-Courmont wrote:
> On Sunday, 27 November 2016, 17:07:47, Janne Grunau wrote:
> > On 2016-11-27 17:54:09 +0200, Rémi Denis-Courmont wrote:
> > > On Sunday, 27 November 2016, 16:29:15, Janne Grunau wrote:
> > > > Required for direct rendering with AVX2 enabled libavcodec and AVX2
> > > > optimizations for the blend deinterlacer.
> > >
> > > The value in picture_Setup() does not actually apply in most scenarios,
> > > only when the picture is allocated from the heap.
> >
> > I need to make sure that all pictures are aligned properly.
> >
> > > libavcodec already has the opportunity to align picture buffer sizes
> > > correctly through avcodec_align_dimensions2(), which works a lot
> > > better in practice.
> >
> > At least the change for the memalign is required. Aligning the pitch to
> > libavcodec's requirements doesn't help at all if the base pointer is
> > only aligned to 16 bytes to begin with.
>
> IMO, the base pointer should be on a page boundary anyway. But if you want to
> align them to 32 bytes, there is no need for all that complicated ifdefery.
I don't see the need to force more alignment than necessary. I wouldn't
call the host_cpu based setting of the required alignment complicated.
Since the pitch has the same alignment requirement, over-alignment
wastes memory. For a yuv420 frame, for example, going from 16 to 32 byte
alignment costs up to 2 * Height * 16 bytes if the width is not a
multiple of 32.
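A rough sketch of that arithmetic, assuming 8-bit planar yuv420 with
even width and height. The helper names are made up for illustration
and are not VLC functions:

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Bytes needed for an 8-bit planar yuv420 frame when the pitch of every
 * plane is rounded up to `align`. Assumes even width and height. */
static size_t yuv420_frame_bytes(size_t width, size_t height, size_t align)
{
    size_t luma   = align_up(width, align) * height;
    size_t chroma = align_up(width / 2, align) * (height / 2);
    return luma + 2 * chroma; /* one luma plane, two chroma planes */
}

/* Extra bytes spent on pitch padding with 32 byte instead of 16 byte
 * alignment. */
static size_t extra_bytes_32_vs_16(size_t width, size_t height)
{
    return yuv420_frame_bytes(width, height, 32)
         - yuv420_frame_bytes(width, height, 16);
}
```

With a 720x576 frame both the luma pitch (720) and the chroma pitch (360)
are padded by an extra 16 bytes per line, so the difference is
2 * 576 * 16 bytes. With 1920x1080 there is no difference, since both
pitches are already multiples of 32.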
> Still, that patch does not address the main case, which is that the
> picture is allocated by a video output rather than the generic heap
> allocator.
I can't be bothered to check and fix every video output. Nothing bad
will happen if the plane pointers and pitches are not properly aligned.
It just makes things slower. The AVX2 merge/blend deinterlacer, for
example, is a factor of ~1.7 slower if the memory is not 32 byte aligned.
This patch fixes all video outputs I care about. I will amend the patch
with the obvious fix for evas video out which also requested 16 byte
alignment with memalign.
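For illustration only, a minimal sketch of a 32 byte aligned plane
allocation and the matching pointer check. It uses C11 aligned_alloc()
rather than whatever allocator VLC actually wraps, and alloc_plane() and
is_aligned() are hypothetical names, not part of the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Return non-zero if p is aligned to `align` bytes (a power of two). */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Allocate one plane of `lines` rows of `pitch` bytes whose base pointer
 * is aligned to `align` bytes. aligned_alloc() wants the size to be a
 * multiple of the alignment, so the size is rounded up first.
 * The caller releases the buffer with free(). */
static void *alloc_plane(size_t pitch, size_t lines, size_t align)
{
    size_t size = pitch * lines;
    size = (size + align - 1) & ~(align - 1);
    return aligned_alloc(align, size);
}
```

A pointer that is only 16 byte aligned, such as address 48, fails
is_aligned(p, 32), which is the case the slower unaligned path has to
handle.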
Janne