[x264-devel] Re: [patch] SSE2 pixel routines - new patch!
Guillaume POIRIER
poirierg at gmail.com
Sat Aug 13 23:17:20 CEST 2005
Hi,
On 7/27/05, Alexander Izvorski <aizvorski at gmail.com> wrote:
> Hello,
>
> I tried to post a new SSE2 patch to the list but I think the message
> is not getting through (possibly the attachment is too large?). The
> patch is here:
>
> http://www.geocities.com/x264hack/sse2-pixel-routines-v4.diff.txt
>
> and the message I had sent originally is below. Sorry if it ends up
> posting twice.
Here are the minibench results for the v4 patch on AMD64 (AMD64 3400+, s754, 512k L2).
SSE2 doesn't seem to be AMD64's cup of tea: SAD and SATD are slower than
the MMXEXT versions, and only SSD comes out slightly ahead. :-(
cpuspeed = 1001328000.000000
loops = 100000000
emptyloop                     time = 2.5800  cpl = 25.834
x264_pixel_sad_16x16_mmxext   time = 4.6800  cpl = 21.028
x264_pixel_sad_16x8_mmxext    time = 3.2200  cpl = 6.4085
x264_pixel_sad_8x16_mmxext    time = 3.9600  cpl = 13.818
x264_pixel_sad_8x8_mmxext     time = 2.7000  cpl = 1.2016
x264_pixel_sad_8x4_mmxext     time = 2.4000  cpl = -1.8024
x264_pixel_sad_4x8_mmxext     time = 2.7400  cpl = 1.6021
x264_pixel_sad_4x4_mmxext     time = 2.4000  cpl = -1.8024
x264_pixel_ssd_16x16_mmxext   time = 13.060  cpl = 104.94
x264_pixel_ssd_16x8_mmxext    time = 7.8000  cpl = 52.269
x264_pixel_ssd_8x16_mmxext    time = 8.1900  cpl = 56.175
x264_pixel_ssd_8x8_mmxext     time = 5.3700  cpl = 27.937
x264_pixel_ssd_8x4_mmxext     time = 3.8700  cpl = 12.917
x264_pixel_ssd_4x8_mmxext     time = 4.0900  cpl = 15.120
x264_pixel_ssd_4x4_mmxext     time = 3.3300  cpl = 7.5100
x264_pixel_satd_16x16_mmxext  time = 30.360  cpl = 278.17
x264_pixel_satd_16x8_mmxext   time = 16.260  cpl = 136.98
x264_pixel_satd_8x16_mmxext   time = 16.210  cpl = 136.48
x264_pixel_satd_8x8_mmxext    time = 9.0100  cpl = 64.385
x264_pixel_satd_8x4_mmxext    time = 5.7700  cpl = 31.942
x264_pixel_satd_4x8_mmxext    time = 5.7700  cpl = 31.942
x264_pixel_satd_4x4_mmxext    time = 3.9600  cpl = 13.818
x264_pixel_sad_16x16_sse2     time = 5.1800  cpl = 26.035
x264_pixel_sad_16x8_sse2      time = 3.7400  cpl = 11.615
x264_pixel_ssd_16x16_sse2     time = 12.260  cpl = 96.929
x264_pixel_ssd_16x8_sse2      time = 7.3700  cpl = 47.964
x264_pixel_satd_16x16_sse2    time = 32.760  cpl = 302.20
x264_pixel_satd_16x8_sse2     time = 17.480  cpl = 149.20
x264_pixel_satd_8x16_sse2     time = 17.500  cpl = 149.40
x264_pixel_satd_8x8_sse2      time = 10.170  cpl = 76.001
x264_pixel_satd_8x4_sse2      time = 6.4000  cpl = 38.251
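
For anyone reading the numbers: the cpl column appears to be the time net of the empty loop, scaled by cpuspeed / loops (the emptyloop line itself is not net of anything). The sketch below is my reconstruction from the printed values, not the minibench source, and the function names are made up for illustration. It also shows why the smallest SAD sizes come out negative: they ran slightly faster than the empty loop, so the overhead correction overshoots.

```python
# Constants copied from the benchmark output above.
CPU_HZ = 1001328000.0    # "cpuspeed" line
LOOPS = 100_000_000      # "loops" line
EMPTY_LOOP_S = 2.58      # "emptyloop time" line

def cycles_per_loop(time_s, baseline_s=EMPTY_LOOP_S):
    """Assumed formula: cycles per call after subtracting loop overhead."""
    return (time_s - baseline_s) * CPU_HZ / LOOPS

def relative_cost(mmxext_time_s, sse2_time_s):
    """Hypothetical helper: SSE2 net cycles divided by MMXEXT net cycles.

    A value above 1 means the SSE2 routine is slower.
    """
    return cycles_per_loop(sse2_time_s) / cycles_per_loop(mmxext_time_s)

if __name__ == "__main__":
    # (mmxext time, sse2 time) for the 16x16 routines, taken from the output.
    pairs = {
        "sad_16x16": (4.68, 5.18),
        "ssd_16x16": (13.06, 12.26),
        "satd_16x16": (30.36, 32.76),
    }
    for name, (mmx, sse2) in pairs.items():
        print(f"{name}: sse2/mmxext = {relative_cost(mmx, sse2):.3f}")
```

With that formula, anything within a few cycles of zero (8x8 and smaller SAD) is probably below what this benchmark can resolve, so the larger block sizes are the ones worth comparing.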
--
A legend is an old man with a cane known for
what he used to do. I'm still doing it.
-- Miles Davis