[x264-devel] Add SSE support to rectangle.h for 16-byte stores
Zuxy Meng
zuxy.meng at gmail.com
Wed Apr 13 05:30:18 CEST 2011
Hi,
2011/4/13 Jason Garrett-Glaser <git at videolan.org>:
> x264 | branch: master | Jason Garrett-Glaser <jason at x264.com> | Tue Mar 29 05:33:44 2011 -0700| [f422ec93254ed3f9883acac0bb3f67e3b4ea960c] | committer: Jason Garrett-Glaser
>
> Add SSE support to rectangle.h for 16-byte stores
> Uses GCC vector intrinsics; may be suboptimal on particularly old GCC versions.
>
>> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=f422ec93254ed3f9883acac0bb3f67e3b4ea960c
> ---
>
> common/common.h | 3 ++-
> common/rectangle.h | 10 ++++++++++
> common/x86/util.h | 3 +++
> configure | 4 +++-
> 4 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/common/common.h b/common/common.h
> index fcf0250..496542e 100644
> --- a/common/common.h
> +++ b/common/common.h
> @@ -851,11 +851,12 @@ struct x264_t
>
> // included at the end because it needs x264_t
> #include "macroblock.h"
> -#include "rectangle.h"
>
> #if HAVE_MMX
> #include "x86/util.h"
> #endif
>
> +#include "rectangle.h"
> +
> #endif
>
> diff --git a/common/rectangle.h b/common/rectangle.h
> index aeaa2b9..770de2c 100644
> --- a/common/rectangle.h
> +++ b/common/rectangle.h
> @@ -80,6 +80,15 @@ static ALWAYS_INLINE void x264_macroblock_cache_rect( void *dst, int w, int h, i
> {
> /* height 1, width 16 doesn't occur */
> assert( h != 1 );
> +#if HAVE_VECTOREXT && defined(__SSE__)
> + v4si v16 = {v,v,v,v};
> +
> + M128( d+s*0+0 ) = (__m128)v16;
> + M128( d+s*1+0 ) = (__m128)v16;
> + if( h == 2 ) return;
> + M128( d+s*2+0 ) = (__m128)v16;
> + M128( d+s*3+0 ) = (__m128)v16;
> +#else
Would it be better to use Intel intrinsics here instead of GCC vector
extensions, so that compilers other than GCC can take this path as well?
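
For illustration, here is a rough, untested-in-x264 sketch of what that could look like. It assumes SSE2 (<emmintrin.h>) rather than plain SSE, so the 32-bit value can be broadcast without going through a float. The helper name fill_rect16_sketch, its parameters, and the SKETCH_HAVE_SSE2 macro are made up for this example and are not part of x264.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: SSE2 is detected via compiler-defined macros
 * (__SSE2__ for GCC/Clang/ICC, _M_X64 or _M_IX86_FP for MSVC). */
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h> /* SSE2 intrinsics */
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

/* Hypothetical helper, not x264 code: fills h rows (2 or 4) of 16 bytes
 * each with the 32-bit pattern v. On the SSE2 path, dst must be 16-byte
 * aligned and stride must be a multiple of 16. */
static void fill_rect16_sketch( uint8_t *dst, int stride, int h, uint32_t v )
{
    /* as in the patch, height 1 is not handled */
    assert( h == 2 || h == 4 );
#if SKETCH_HAVE_SSE2
    /* broadcast v into all four 32-bit lanes */
    __m128i v16 = _mm_set1_epi32( (int)v );

    _mm_store_si128( (__m128i*)(dst + stride*0), v16 );
    _mm_store_si128( (__m128i*)(dst + stride*1), v16 );
    if( h == 2 ) return;
    _mm_store_si128( (__m128i*)(dst + stride*2), v16 );
    _mm_store_si128( (__m128i*)(dst + stride*3), v16 );
#else
    /* scalar fallback: four 32-bit stores per row */
    for( int y = 0; y < h; y++ )
        for( int x = 0; x < 16; x += 4 )
            memcpy( dst + y*stride + x, &v, 4 );
#endif
}
```

The intrinsics used here (_mm_set1_epi32, _mm_store_si128) are the standard Intel ones, so this form should build with ICC and MSVC as well as GCC, at the cost of requiring SSE2 instead of SSE.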
--
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6