[x264-devel] Re: [patch] SSE2 pixel routines - new patch!
Guillaume POIRIER
poirierg at gmail.com
Sat Aug 13 23:31:50 CEST 2005
Hi,
On 8/13/05, Guillaume POIRIER <poirierg at gmail.com> wrote:
> On 7/27/05, Alexander Izvorski <aizvorski at gmail.com> wrote:
> > I tried to post a new SSE2 patch to the list but I think the message
> > is not getting through (possibly the attachment is too large?). The
> > patch is here:
> >
> > http://www.geocities.com/x264hack/sse2-pixel-routines-v4.diff.txt
> >
> > and the message I had sent originally is below. Sorry if it ends up
> > posting twice.
>
> Here is the minibench of the AMD-64 v4 version (AMD64 3400+ s754 512k L2).
> We can see that SSE2 doesn't seem to be AMD64's cup of tea. :-(
Ahem! PowerNow!(tm) frequency scaling screwed up the previous benchmark. Here is a more accurate one.
With SSE2, SATD and SAD are sadly slower than with MMXEXT; SSD is faster.
Sorry for the trouble.
cpuspeed = 2403188000.000000
loops = 100000000
emptyloop time = 1.9700 cpl = 47.343
x264_pixel_sad_16x16_mmxext time = 4.6800 cpl = 65.126
x264_pixel_sad_16x8_mmxext time = 3.2000 cpl = 29.559
x264_pixel_sad_8x16_mmxext time = 4.1300 cpl = 51.909
x264_pixel_sad_8x8_mmxext time = 2.7000 cpl = 17.543
x264_pixel_sad_8x4_mmxext time = 2.4000 cpl = 10.334
x264_pixel_sad_4x8_mmxext time = 2.7300 cpl = 18.264
x264_pixel_sad_4x4_mmxext time = 2.4000 cpl = 10.334
x264_pixel_ssd_16x16_mmxext time = 13.140 cpl = 268.44
x264_pixel_ssd_16x8_mmxext time = 7.7900 cpl = 139.87
x264_pixel_ssd_8x16_mmxext time = 8.1800 cpl = 149.24
x264_pixel_ssd_8x8_mmxext time = 5.3800 cpl = 81.949
x264_pixel_ssd_8x4_mmxext time = 3.8700 cpl = 45.661
x264_pixel_ssd_4x8_mmxext time = 4.0800 cpl = 50.707
x264_pixel_ssd_4x4_mmxext time = 3.3400 cpl = 32.924
x264_pixel_satd_16x16_mmxext time = 30.360 cpl = 682.27
x264_pixel_satd_16x8_mmxext time = 16.250 cpl = 343.18
x264_pixel_satd_8x16_mmxext time = 16.220 cpl = 342.45
x264_pixel_satd_8x8_mmxext time = 9.0200 cpl = 169.42
x264_pixel_satd_8x4_mmxext time = 5.7500 cpl = 90.841
x264_pixel_satd_4x8_mmxext time = 5.7800 cpl = 91.561
x264_pixel_satd_4x4_mmxext time = 3.9500 cpl = 47.583
x264_pixel_sad_16x16_sse2 time = 5.1800 cpl = 77.142
x264_pixel_sad_16x8_sse2 time = 3.7500 cpl = 42.777
x264_pixel_ssd_16x16_sse2 time = 12.260 cpl = 247.29
x264_pixel_ssd_16x8_sse2 time = 7.3600 cpl = 129.53
x264_pixel_satd_16x16_sse2 time = 32.770 cpl = 740.18
x264_pixel_satd_16x8_sse2 time = 17.480 cpl = 372.73
x264_pixel_satd_8x16_sse2 time = 17.510 cpl = 373.46
x264_pixel_satd_8x8_sse2 time = 10.170 cpl = 197.06
x264_pixel_satd_8x4_sse2 time = 6.3900 cpl = 106.22
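
For reference, the cpl column above is consistent with (time - emptyloop time) * cpuspeed / loops, i.e. cycles per call net of loop overhead. That formula is inferred from the numbers, not taken from the benchmark source. Here is a minimal Python sketch that reproduces it and compares the block sizes present in both runs. The function and variable names are made up for illustration; the times are copied from the output above.

```python
# Constants copied from the benchmark output.
CPU_HZ = 2403188000.0   # "cpuspeed"
LOOPS = 100_000_000     # "loops"
EMPTY_LOOP_S = 1.97     # "emptyloop time", in seconds (unit assumed)


def cycles_per_loop(time_s, overhead_s=EMPTY_LOOP_S):
    """Cycles per call, net of loop overhead (formula inferred from the data)."""
    return (time_s - overhead_s) * CPU_HZ / LOOPS


# (mmxext time, sse2 time) for the sizes that appear in both runs.
TIMES = {
    "sad_16x16": (4.68, 5.18),
    "sad_16x8": (3.20, 3.75),
    "ssd_16x16": (13.14, 12.26),
    "ssd_16x8": (7.79, 7.36),
    "satd_16x16": (30.36, 32.77),
    "satd_16x8": (16.25, 17.48),
    "satd_8x16": (16.22, 17.51),
    "satd_8x8": (9.02, 10.17),
    "satd_8x4": (5.75, 6.39),
}


def sse2_vs_mmxext(times=TIMES):
    """Return {name: sse2_cpl / mmxext_cpl}; a ratio above 1 means SSE2 is slower."""
    return {
        name: cycles_per_loop(sse2) / cycles_per_loop(mmx)
        for name, (mmx, sse2) in times.items()
    }


if __name__ == "__main__":
    for name, ratio in sse2_vs_mmxext().items():
        verdict = "slower" if ratio > 1 else "faster"
        print(f"{name:12s} sse2/mmxext = {ratio:.3f} ({verdict})")
```

Ratios above 1 mean the SSE2 routine needs more cycles than the MMXEXT one for the same block size.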
--
A legend is an old man with a cane known for
what he used to do. I'm still doing it.
-- Miles Davis