[vlc-devel] [PATCH] arm_neon: Add an optimized routine for NV12/21 to I420

Martin Storsjö martin at martin.st
Mon Sep 30 17:05:23 CEST 2013


On Mon, 30 Sep 2013, Jean-Baptiste Kempf wrote:

> On 30 Sep, Rémi Denis-Courmont wrote :
>>> +VIDEO_FILTER_WRAPPER (NV12_I420)
>>
>> Regarding the luminance plane, in my experience, memcpy() optimizations beat
>> the crap out of a simplistic NEON load/store loop. memcpy() would basically
>> halve the complexity of the assembler code and improve data locality.
>
> Why do we even need to copy the luminance plane? Can't we just swap the
> pointers? </curious>

You can't assume that plane pointers from different pictures can be mixed
and matched. For instance, in the android vout (which is of interest wrt
this patch), the vout internally has only a single pointer, and the chroma
planes follow the luma one. So if you just swap the pointers, the vout gets
a vlc_picture back with a swapped y plane pointer, but the actual vout
buffer where the luma was supposed to be written is untouched.
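
To illustrate the point above, here is a minimal sketch in C. All the
names in it (sketch_picture_t, sketch_vout_t, convert_by_swap and so on)
are made up for the illustration; they are not the VLC or android APIs,
and the buffer layout is only an assumption modelled on the description
above (one base pointer, chroma directly after luma).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { WIDTH = 4, HEIGHT = 2, LUMA_SIZE = WIDTH * HEIGHT };

/* Hypothetical picture: one pointer per plane. */
typedef struct {
    uint8_t *y;   /* luma plane */
    uint8_t *uv;  /* chroma data */
} sketch_picture_t;

/* Hypothetical vout: it only knows one contiguous buffer and assumes
 * that the chroma data follows the luma data inside it. */
typedef struct {
    uint8_t buffer[LUMA_SIZE + LUMA_SIZE / 2];
} sketch_vout_t;

/* Hand out a picture whose plane pointers point into the vout buffer. */
static sketch_picture_t vout_get_picture(sketch_vout_t *vout)
{
    sketch_picture_t pic = { vout->buffer, vout->buffer + LUMA_SIZE };
    return pic;
}

/* "Conversion" by pointer swap: only the picture struct changes. */
static void convert_by_swap(sketch_picture_t *dst, const sketch_picture_t *src)
{
    dst->y = src->y;
}

/* Conversion by copy: the luma bytes land in the vout-owned buffer. */
static void convert_by_copy(sketch_picture_t *dst, const sketch_picture_t *src)
{
    memcpy(dst->y, src->y, LUMA_SIZE);
}

/* What the vout would display: it reads its own buffer, not dst->y. */
static uint8_t vout_first_luma(const sketch_vout_t *vout)
{
    return vout->buffer[0];
}

/* Runs both variants against a zeroed vout buffer. */
static void demo(void)
{
    uint8_t src_luma[LUMA_SIZE];
    memset(src_luma, 0x80, sizeof(src_luma));
    sketch_picture_t src = { src_luma, NULL };

    sketch_vout_t vout;
    memset(&vout, 0, sizeof(vout));

    sketch_picture_t dst = vout_get_picture(&vout);
    convert_by_swap(&dst, &src);
    assert(dst.y == src_luma);             /* the struct was changed...   */
    assert(vout_first_luma(&vout) == 0);   /* ...but the buffer was not   */

    dst = vout_get_picture(&vout);
    convert_by_copy(&dst, &src);
    assert(vout_first_luma(&vout) == 0x80); /* the copy reached the vout  */
}
```

Calling demo() from a main() exercises both paths: after the swap the
vout buffer still holds its old contents, and only the copy makes the
luma visible to the vout.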

// Martin

