[vlc-devel] [PATCH] Do adjust filter in SSE2 and SSE4.1
Jean-Baptiste Kempf
jb at videolan.org
Mon Jul 18 18:31:29 CEST 2011
Hello Martin,
First, thanks a lot for the work.
On Fri, Jul 15, 2011 at 09:08:18PM +0200, gamajun at seznam.cz wrote :
> + *p_out++ = clip_uint8_vlc( (( ((i_u * i_cos + i_v * i_sin - i_x) >> 8) \
> + * i_sat) >> 8) + 128); \
Not sure you really need this many parentheses...
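To make that concrete, here is a minimal sketch of the same arithmetic with named intermediates instead of nested parentheses. The helper names clip_u8 and adjust_u are hypothetical stand-ins of mine, not part of the patch, and I am assuming i_cos, i_sin, i_x and i_sat are the same fixed-point values the macro already uses.

```c
#include <stdint.h>

/* Hypothetical stand-in for clip_uint8_vlc(): clamp to 0..255. */
static inline uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Same arithmetic as the quoted macro line for the U sample.
 * Right-shifting a negative int is implementation-defined in C;
 * like the quoted code, this assumes an arithmetic shift. */
static inline uint8_t adjust_u(int i_u, int i_v, int i_cos, int i_sin,
                               int i_x, int i_sat)
{
    int rotated = (i_u * i_cos + i_v * i_sin - i_x) >> 8;
    int scaled  = (rotated * i_sat) >> 8;
    return clip_u8(scaled + 128);
}
```

Two named steps read more easily than four levels of nesting, and the compiler should generate the same code. The V sample would follow the same pattern with i_y and the signs swapped, as in the quoted macro.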
> + WRITE_UV_CLIP_PLANAR_SSE4_1();
> + // WRITE_UV_CLIP_PLANAR_SSE4_1();
Why keep the second, commented-out line?
> + p_in += p_pic->p[U_PLANE].i_pitch
> + - p_pic->p[U_PLANE].i_visible_pitch;
> + p_in_v += p_pic->p[V_PLANE].i_pitch
> + - p_pic->p[V_PLANE].i_visible_pitch;
> + p_out += p_outpic->p[U_PLANE].i_pitch
> + - p_outpic->p[U_PLANE].i_visible_pitch;
> + p_out_v += p_outpic->p[V_PLANE].i_pitch
> + - p_outpic->p[V_PLANE].i_visible_pitch;
Aligning the operators in columns would improve readability here.
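Something along these lines is what I have in mind. This is only a sketch: plane_sketch_t, line_padding and skip_padding are hypothetical names standing in for the real plane fields, and the point is the column alignment, not the helper itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for the plane fields used in the patch. */
typedef struct {
    int i_pitch;          /* bytes per line, including padding */
    int i_visible_pitch;  /* bytes per line that are displayed */
} plane_sketch_t;

/* Bytes of padding to skip at the end of each line. */
static inline int line_padding(const plane_sketch_t *p)
{
    return p->i_pitch - p->i_visible_pitch;
}

/* Advance all four pointers past the end-of-line padding,
 * with names and operators lined up in columns. */
static void skip_padding(const uint8_t **p_in,  const uint8_t **p_in_v,
                         uint8_t **p_out,       uint8_t **p_out_v,
                         const plane_sketch_t *in_u,
                         const plane_sketch_t *in_v,
                         const plane_sketch_t *out_u,
                         const plane_sketch_t *out_v)
{
    *p_in    += line_padding(in_u);
    *p_in_v  += line_padding(in_v);
    *p_out   += line_padding(out_u);
    *p_out_v += line_padding(out_v);
}
```

Even without a helper, lining up the `+=` and the `-` of the four statements in the patch would already make the symmetry obvious.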
> +#elif defined(CAN_COMPILE_SSE2)
Maybe this should be in a different function?
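As a rough sketch of what I mean, each variant could live in its own function and be picked once at runtime. Everything here is hypothetical: the SKETCH_CPU_* bits stand in for the vlc_CPU() / CPU_CAPABILITY_* check, and the function bodies are placeholders, not the real filter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, standing in for the real CPU check. */
enum { SKETCH_CPU_SSE2 = 1 << 0, SKETCH_CPU_SSE4_1 = 1 << 1 };

typedef void (*chroma_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Placeholder body: the plain C loop would go here. */
static void chroma_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Placeholder: the SSE2 loop would go here. */
static void chroma_sse2(uint8_t *dst, const uint8_t *src, size_t n)
{
    chroma_c(dst, src, n);
}

/* Placeholder: the SSE4.1 loop would go here. */
static void chroma_sse4_1(uint8_t *dst, const uint8_t *src, size_t n)
{
    chroma_c(dst, src, n);
}

/* Pick the best available variant once, instead of nesting
 * #if / #elif blocks inside one large function. */
static chroma_fn pick_chroma_fn(unsigned cpu_caps)
{
    if (cpu_caps & SKETCH_CPU_SSE4_1)
        return chroma_sse4_1;
    if (cpu_caps & SKETCH_CPU_SSE2)
        return chroma_sse2;
    return chroma_c;
}
```

The #if defined(CAN_COMPILE_...) guards would then wrap whole functions rather than fragments of one, which should be easier to read and to benchmark separately.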
> +#if defined(CAN_COMPILE_SSE4_1)
> + if ( vlc_CPU() & CPU_CAPABILITY_SSE4_1 && i_sat > 256 )
> + {
> +#define WRITE_UV_CLIP() \
> + i_u = *p_in; p_in += 4; i_v = *p_in_v; p_in_v += 4; \
> + *p_out = clip_uint8_vlc( (( ((i_u * i_cos + i_v * i_sin - i_x) >> 8) \
> + * i_sat) >> 8) + 128); \
> + p_out += 4; \
> + *p_out_v = clip_uint8_vlc( (( ((i_v * i_cos - i_u * i_sin - i_y) >> 8) \
> + * i_sat) >> 8) + 128); \
> + p_out_v += 4
> +
> + uint8_t i_u, i_v;
> +
> + WRITE_UV_CLIP_PACKED_PREPARE;
> +
> + for( ; p_in < p_in_end ; )
> + {
> + p_line_end = p_in + i_visible_pitch - 8 * 4;
> +
> + for( ; p_in < p_line_end ; )
> + {
> + /* Do 8 pixels at a time */
> + WRITE_UV_CLIP_PACKED_SSE4_1();
> + }
> +
> + p_line_end += 8 * 4;
> +
> + for( ; p_in < p_line_end ; )
> + {
> + WRITE_UV_CLIP();
> + }
> +
> + p_in += i_pitch - i_visible_pitch;
> + p_in_v += i_pitch - i_visible_pitch;
> + p_out += i_pitch - i_visible_pitch;
> + p_out_v += i_pitch - i_visible_pitch;
Mostly same remarks as above.
No opinion about the ASM. Do you have speedup numbers?
Best Regards,
--
Jean-Baptiste Kempf
http://www.jbkempf.com/ - +33 672 704 734
Sent from my Electronic Device