[vlc-devel] [PATCH] arm_neon: Add an optimized routine for NV12/21 to I420
    Martin Storsjö 
    martin at martin.st
       
    Mon Sep 30 17:05:23 CEST 2013
    
    
  
On Mon, 30 Sep 2013, Jean-Baptiste Kempf wrote:
> On 30 Sep, Rémi Denis-Courmont wrote :
>>> +VIDEO_FILTER_WRAPPER (NV12_I420)
>>
>> Regarding the luminance plane, in my experience, memcpy() optimizations beat
>> the crap out of a simplistic NEON load/store loop. memcpy() would basically
>> halve the complexity of the assembler code and improve data locality.
>
> Why do we even need to copy the luminance plane? Can't we just swap the
> pointers? </curious>
You can't assume that plane pointers from different pictures can be mixed 
and matched. For instance, the android vout (which is of interest with 
regard to this patch) internally has only a single pointer, and the chroma 
planes follow directly after the luma plane. So if you just swap the 
pointers, the vout gets a vlc_picture back with a swapped Y plane pointer, 
but the actual vout buffer where the luma was supposed to be written is 
left untouched.
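
A minimal sketch of the problem, for illustration only. The types and
names here (toy_picture, toy_vout, and the helper functions) are
hypothetical and are not the VLC or Android API; they only model a vout
that owns one contiguous buffer with the chroma planes placed right after
the luma plane, as described above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed toy dimensions: 4x2 luma, 2x1 for each chroma plane. */
enum { W = 4, H = 2, Y_SIZE = W * H, C_SIZE = (W / 2) * (H / 2) };

/* Hypothetical picture: just three plane pointers. */
typedef struct {
    uint8_t *y, *u, *v;
} toy_picture;

/* Hypothetical vout: one contiguous buffer, chroma follows luma. */
typedef struct {
    uint8_t buffer[Y_SIZE + 2 * C_SIZE];
} toy_vout;

/* The picture handed out by the vout only points into its own buffer. */
static toy_picture vout_get_picture(toy_vout *vout)
{
    assert(vout != NULL);
    toy_picture pic = {
        vout->buffer,                   /* Y */
        vout->buffer + Y_SIZE,          /* U */
        vout->buffer + Y_SIZE + C_SIZE, /* V */
    };
    return pic;
}

/* The vout displays from its own buffer; it never looks at pic.y. */
static int vout_first_luma(const toy_vout *vout)
{
    return vout->buffer[0];
}

/* Swapping the pointer: the vout buffer never receives the luma. */
int luma_seen_after_swap(void)
{
    uint8_t src_luma[Y_SIZE];
    toy_vout vout;
    memset(src_luma, 200, sizeof src_luma);
    memset(vout.buffer, 0, sizeof vout.buffer);

    toy_picture dst = vout_get_picture(&vout);
    dst.y = src_luma;          /* only the local pointer changes */
    (void)dst;
    return vout_first_luma(&vout);
}

/* Copying the plane: the luma actually lands in the vout buffer. */
int luma_seen_after_copy(void)
{
    uint8_t src_luma[Y_SIZE];
    toy_vout vout;
    memset(src_luma, 200, sizeof src_luma);
    memset(vout.buffer, 0, sizeof vout.buffer);

    toy_picture dst = vout_get_picture(&vout);
    memcpy(dst.y, src_luma, Y_SIZE);
    return vout_first_luma(&vout);
}
```

In this toy model luma_seen_after_swap() still sees the zero-filled
buffer, while luma_seen_after_copy() sees the source luma value, which is
why the filter has to copy the Y plane rather than hand over a pointer.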
// Martin
    
    