[x264-devel] Re: [patch] SSE2 pixel routines - new patch!

Guillaume POIRIER poirierg at gmail.com
Sat Aug 13 23:31:50 CEST 2005


Hi,

On 8/13/05, Guillaume POIRIER <poirierg at gmail.com> wrote:

> On 7/27/05, Alexander Izvorski <aizvorski at gmail.com> wrote:

> > I tried to post a new SSE2 patch to the list but I think the message
> > is not getting through (possibly the attachment is too large?).  The
> > patch is here:
> >
> > http://www.geocities.com/x264hack/sse2-pixel-routines-v4.diff.txt
> >
> > and the message I had sent originally is below.  Sorry if it ends up
> > posting twice.
> 
> Here is the minibench of the AMD-64 v4 version (AMD64 3400+ s754 512k L2).
> We can see that SSE2 doesn't seem to be the AMD64's cup of tea. :-(

Ahem! PowerNow!(tm) skewed the previous benchmark. Here is a more accurate one.
With SSE2, SAD and SATD are sadly slower than with MMXEXT; SSD is faster.

Sorry for the trouble.

cpuspeed = 2403188000.000000
loops = 100000000
emptyloop time = 1.9700 cpl = 47.343
x264_pixel_sad_16x16_mmxext time = 4.6800 cpl = 65.126
x264_pixel_sad_16x8_mmxext time = 3.2000 cpl = 29.559
x264_pixel_sad_8x16_mmxext time = 4.1300 cpl = 51.909
x264_pixel_sad_8x8_mmxext time = 2.7000 cpl = 17.543
x264_pixel_sad_8x4_mmxext time = 2.4000 cpl = 10.334
x264_pixel_sad_4x8_mmxext time = 2.7300 cpl = 18.264
x264_pixel_sad_4x4_mmxext time = 2.4000 cpl = 10.334
x264_pixel_ssd_16x16_mmxext time = 13.140 cpl = 268.44
x264_pixel_ssd_16x8_mmxext time = 7.7900 cpl = 139.87
x264_pixel_ssd_8x16_mmxext time = 8.1800 cpl = 149.24
x264_pixel_ssd_8x8_mmxext time = 5.3800 cpl = 81.949
x264_pixel_ssd_8x4_mmxext time = 3.8700 cpl = 45.661
x264_pixel_ssd_4x8_mmxext time = 4.0800 cpl = 50.707
x264_pixel_ssd_4x4_mmxext time = 3.3400 cpl = 32.924
x264_pixel_satd_16x16_mmxext time = 30.360 cpl = 682.27
x264_pixel_satd_16x8_mmxext time = 16.250 cpl = 343.18
x264_pixel_satd_8x16_mmxext time = 16.220 cpl = 342.45
x264_pixel_satd_8x8_mmxext time = 9.0200 cpl = 169.42
x264_pixel_satd_8x4_mmxext time = 5.7500 cpl = 90.841
x264_pixel_satd_4x8_mmxext time = 5.7800 cpl = 91.561
x264_pixel_satd_4x4_mmxext time = 3.9500 cpl = 47.583
x264_pixel_sad_16x16_sse2 time = 5.1800 cpl = 77.142
x264_pixel_sad_16x8_sse2 time = 3.7500 cpl = 42.777
x264_pixel_ssd_16x16_sse2 time = 12.260 cpl = 247.29
x264_pixel_ssd_16x8_sse2 time = 7.3600 cpl = 129.53
x264_pixel_satd_16x16_sse2 time = 32.770 cpl = 740.18
x264_pixel_satd_16x8_sse2 time = 17.480 cpl = 372.73
x264_pixel_satd_8x16_sse2 time = 17.510 cpl = 373.46
x264_pixel_satd_8x8_sse2 time = 10.170 cpl = 197.06
x264_pixel_satd_8x4_sse2 time = 6.3900 cpl = 106.22
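
For anyone comparing the numbers: the cpl column appears to be cycles per
call with the empty-loop time subtracted, i.e.
(time - emptyloop time) * cpuspeed / loops. That formula is inferred from the
figures above, not taken from the benchmark source. A minimal Python sketch
(helper names are hypothetical) that reproduces the column and gives the
SSE2/MMXEXT ratio for the sizes that have both versions:

```python
# Sketch: re-derive the "cpl" column from the "time" column.
# The formula is an assumption inferred from the output above.
CPU_HZ = 2403188000.0   # "cpuspeed" from the output above
LOOPS = 100_000_000     # "loops" from the output above
EMPTY_TIME = 1.97       # "emptyloop time", in seconds


def cycles_per_loop(seconds, baseline=EMPTY_TIME):
    """Cycles per call after subtracting the baseline (empty-loop) time."""
    return (seconds - baseline) * CPU_HZ / LOOPS


# (mmxext time, sse2 time) in seconds, copied from the output above
times = {
    "sad_16x16": (4.68, 5.18),
    "sad_16x8": (3.20, 3.75),
    "ssd_16x16": (13.14, 12.26),
    "ssd_16x8": (7.79, 7.36),
    "satd_16x16": (30.36, 32.77),
    "satd_16x8": (16.25, 17.48),
    "satd_8x16": (16.22, 17.51),
    "satd_8x8": (9.02, 10.17),
    "satd_8x4": (5.75, 6.39),
}


def sse2_vs_mmxext(name):
    """Ratio of SSE2 cpl to MMXEXT cpl; above 1.0 means SSE2 is slower."""
    mmx, sse2 = times[name]
    return cycles_per_loop(sse2) / cycles_per_loop(mmx)


for name in times:
    mmx, sse2 = times[name]
    print(f"{name:11s} mmxext={cycles_per_loop(mmx):7.2f} "
          f"sse2={cycles_per_loop(sse2):7.2f} "
          f"ratio={sse2_vs_mmxext(name):.3f}")
```

A ratio above 1.0 means the SSE2 routine took more cycles than the MMXEXT one
on this machine; only the two SSD sizes come out below 1.0.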


-- 
A legend is an old man with a cane known for
what he used to do. I'm still doing it.
  -- Miles Davis