[vlc-devel] [PATCH] Do adjust filter in SSE2 and SSE4.1

Martin Briza Gamajun at seznam.cz
Mon Jul 18 19:06:13 CEST 2011


Hi,

> On Fri, Jul 15, 2011 at 09:08:18PM +0200, gamajun at seznam.cz wrote :
>> +    *p_out++ = clip_uint8_vlc( (( ((i_u * i_cos + i_v * i_sin - i_x) >> 8) \
>> +                           * i_sat) >> 8) + 128); \
> Not sure you really need this many parentheses...

This is just a recycled macro that was already in place before I started
working on it, but I think it's OK.
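
For reference, the arithmetic of that macro can be written as a plain C
function with fewer parentheses. This is only a sketch: clip_u8 and adjust_u
are hypothetical names, and it assumes i_cos, i_sin and i_sat are fixed-point
values where 256 means 1.0, and that i_x is the offset that recentres the
chroma around zero.

```c
#include <stdint.h>

/* Clamp an int to 0..255 (hypothetical stand-in for clip_uint8_vlc). */
static inline uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* One output U sample.
 * Assumption: i_cos, i_sin and i_sat are 8.8 fixed point (256 == 1.0)
 * and i_x is the recentring offset.
 * Note: right-shifting a negative int is implementation-defined in C;
 * this sketch assumes an arithmetic shift, as mainstream compilers do. */
static inline uint8_t adjust_u(int i_u, int i_v,
                               int i_cos, int i_sin, int i_x, int i_sat)
{
    int rotated = (i_u * i_cos + i_v * i_sin - i_x) >> 8;
    return clip_u8(((rotated * i_sat) >> 8) + 128);
}
```

With i_cos = 256, i_sin = 0, i_x = 128 * 256 and i_sat = 256 the function
returns its i_u input unchanged, which is a quick sanity check.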

>> +                WRITE_UV_CLIP_PLANAR_SSE4_1();
>> +              //  WRITE_UV_CLIP_PLANAR_SSE4_1();
> Why keep the second line?

Uh, this is just a leftover from testing that I didn't notice.

>> +            p_in += p_pic->p[U_PLANE].i_pitch
>> +                  - p_pic->p[U_PLANE].i_visible_pitch;
>> +            p_in_v += p_pic->p[V_PLANE].i_pitch
>> +                    - p_pic->p[V_PLANE].i_visible_pitch;
>> +            p_out += p_outpic->p[U_PLANE].i_pitch
>> +                   - p_outpic->p[U_PLANE].i_visible_pitch;
>> +            p_out_v += p_outpic->p[V_PLANE].i_pitch
>> +                     - p_outpic->p[V_PLANE].i_visible_pitch;
> Some alignment could improve the readability.

Agreed.
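
To show what those pointer adjustments do, here is a minimal sketch of
walking a plane row by row and skipping the padding between the visible
pitch and the full pitch. struct plane and invert_plane are hypothetical,
simplified stand-ins, not the real VLC types, and the sketch assumes both
planes have the same visible size.

```c
#include <stdint.h>

/* Hypothetical, simplified plane descriptor (not VLC's plane_t). */
struct plane {
    uint8_t *pixels;
    int      pitch;          /* bytes per row in memory, padding included */
    int      visible_pitch;  /* bytes per row that hold picture data      */
    int      visible_lines;  /* number of rows that hold picture data     */
};

/* Invert every visible byte of src into dst and leave padding untouched.
 * Assumption: dst and src have the same visible size. */
static void invert_plane(struct plane *dst, const struct plane *src)
{
    const uint8_t *p_in  = src->pixels;
    uint8_t       *p_out = dst->pixels;

    for (int y = 0; y < src->visible_lines; y++) {
        for (int x = 0; x < src->visible_pitch; x++)
            *p_out++ = (uint8_t)(255 - *p_in++);

        /* Jump over the padding at the end of each row. */
        p_in  += src->pitch - src->visible_pitch;
        p_out += dst->pitch - dst->visible_pitch;
    }
}
```

Lining up the += and the - operators, as above, is the kind of alignment
that makes the four adjustments easier to compare at a glance.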

>> +#elif defined(CAN_COMPILE_SSE2)
> Maybe this should be in a different function?

A different function? Do you mean splitting static picture_t *FilterPlanar
into several functions?
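
One possible reading of that suggestion, sketched on a much simpler
operation than the adjust filter: each variant lives in its own function
and a small wrapper picks one at compile time. All names here are
hypothetical, and the __SSE2__ check is the compiler's predefined macro,
used as a stand-in for CAN_COMPILE_SSE2.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Plain C version: add an offset to each byte, saturating at 255. */
static void add_sat_c(uint8_t *dst, const uint8_t *src, size_t n, uint8_t off)
{
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)src[i] + off;
        dst[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}

#if defined(__SSE2__)
/* SSE2 version: 16 bytes per iteration, C fallback for the tail. */
static void add_sat_sse2(uint8_t *dst, const uint8_t *src, size_t n, uint8_t off)
{
    const __m128i v_off = _mm_set1_epi8((char)off);
    size_t i = 0;

    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(v, v_off));
    }
    add_sat_c(dst + i, src + i, n - i, off);
}
#endif

/* Wrapper: the only place that knows which variant was compiled in. */
static void add_sat(uint8_t *dst, const uint8_t *src, size_t n, uint8_t off)
{
#if defined(__SSE2__)
    add_sat_sse2(dst, src, n, off);
#else
    add_sat_c(dst, src, n, off);
#endif
}
```

The caller only ever sees add_sat, so the preprocessor conditionals stay
out of the main filter body.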

> No opinion about the ASM. Speedups numbers?

The speedup is about 50% on my system, but it depends heavily on the size
of the processed image (larger images benefit more).

I'll wait for your reviews, and once you've told me everything you think
about it, I'll send another patch that fixes what you noticed.

Regards,

Martin Briza


