[x264-devel] Add SSE support to rectangle.h for 16-byte stores

Zuxy Meng zuxy.meng at gmail.com
Wed Apr 13 05:30:18 CEST 2011


Hi,

2011/4/13 Jason Garrett-Glaser <git at videolan.org>:
> x264 | branch: master | Jason Garrett-Glaser <jason at x264.com> | Tue Mar 29 05:33:44 2011 -0700| [f422ec93254ed3f9883acac0bb3f67e3b4ea960c] | committer: Jason Garrett-Glaser
>
> Add SSE support to rectangle.h for 16-byte stores
> Uses GCC vector intrinsics; may be suboptimal on particularly old GCC versions.
>
> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=f422ec93254ed3f9883acac0bb3f67e3b4ea960c
> ---
>
>  common/common.h    |    3 ++-
>  common/rectangle.h |   10 ++++++++++
>  common/x86/util.h  |    3 +++
>  configure          |    4 +++-
>  4 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/common/common.h b/common/common.h
> index fcf0250..496542e 100644
> --- a/common/common.h
> +++ b/common/common.h
> @@ -851,11 +851,12 @@ struct x264_t
>
>  // included at the end because it needs x264_t
>  #include "macroblock.h"
> -#include "rectangle.h"
>
>  #if HAVE_MMX
>  #include "x86/util.h"
>  #endif
>
> +#include "rectangle.h"
> +
>  #endif
>
> diff --git a/common/rectangle.h b/common/rectangle.h
> index aeaa2b9..770de2c 100644
> --- a/common/rectangle.h
> +++ b/common/rectangle.h
> @@ -80,6 +80,15 @@ static ALWAYS_INLINE void x264_macroblock_cache_rect( void *dst, int w, int h, i
>     {
>         /* height 1, width 16 doesn't occur */
>         assert( h != 1 );
> +#if HAVE_VECTOREXT && defined(__SSE__)
> +        v4si v16 = {v,v,v,v};
> +
> +        M128( d+s*0+0 ) = (__m128)v16;
> +        M128( d+s*1+0 ) = (__m128)v16;
> +        if( h == 2 ) return;
> +        M128( d+s*2+0 ) = (__m128)v16;
> +        M128( d+s*3+0 ) = (__m128)v16;
> +#else

Would it be better to use Intel intrinsics, so that compilers other than
GCC can take this SSE path as well?
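
To illustrate, here is a minimal sketch of the same stores written with the
Intel intrinsics from <emmintrin.h>. It assumes SSE2 is available (the patch
itself only checks __SSE__), and fill_rect_w16 with its parameters is a
hypothetical standalone helper, not a function in x264. It uses unaligned
stores to stay safe for any pointer, whereas the M128 stores in the patch
presumably rely on 16-byte alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h> /* SSE2: __m128i, _mm_set1_epi32, _mm_storeu_si128 */

/* Hypothetical helper (not part of x264): fill a rectangle 16 bytes wide
 * and 2 or 4 rows high with the 32-bit pattern v. d is the destination,
 * s is the stride in bytes between rows. */
static inline void fill_rect_w16( uint8_t *d, int s, int h, uint32_t v )
{
    /* mirrors the "height 1, width 16 doesn't occur" assumption */
    assert( h == 2 || h == 4 );

    /* broadcast v into all four 32-bit lanes */
    const __m128i v16 = _mm_set1_epi32( (int)v );

    /* unaligned 16-byte stores; _mm_store_si128 could be used instead
     * if d and s are known to be 16-byte aligned */
    _mm_storeu_si128( (__m128i *)(d + s*0), v16 );
    _mm_storeu_si128( (__m128i *)(d + s*1), v16 );
    if( h == 2 ) return;
    _mm_storeu_si128( (__m128i *)(d + s*2), v16 );
    _mm_storeu_si128( (__m128i *)(d + s*3), v16 );
}
```

These intrinsics are provided by GCC, ICC, MSVC and Clang, so a variant along
these lines would not need to depend on HAVE_VECTOREXT.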


-- 
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6

