[x264-devel] [PATCH] sad mmx/sse2/sse3 unification updated

Axel Zeuner Axel.Zeuner at gmx.de
Sat Sep 6 09:54:39 CEST 2008


Hello,

the attachment contains an updated sad mmx/sse2/sse3 unification patch against 
the latest git.
Results on K10 and K8 are shown in the table below.

Regards,
Axel

delta = (old - new) / old * 100, i.e. percent speedup (positive) or slowdown (negative)

                        k10                     k8      
                        old     new     delta   old     new     delta
sad_4x4_c               595     594     0.2     618     618     0.0
sad_4x4_mmx             122     114     6.6     146     139     4.8
sad_4x8_c               1170    1168    0.2     1216    1215    0.1
sad_4x8_mmx             196     178     9.2     222     219     1.4
sad_8x4_c               1104    1101    0.3     1151    1151    0.0
sad_8x4_mmx             130     115     11.5    167     166     0.6
sad_8x4_mmx_c64         151     136     9.9     177     176     0.6
sad_8x8_c               2324    2302    0.9     2300    2299    0.0
sad_8x8_mmx             208     185     11.1    253     255     -0.8
sad_8x8_mmx_c64         233     222     4.7     271     271     0.0
sad_8x16_c              4681    4694    -0.3    4684    4684    0.0
sad_8x16_mmx            369     320     13.3    434     429     1.2
sad_8x16_mmx_c64        413     378     8.5     467     461     1.3
sad_16x8_c              4246    4249    -0.1    4220    4219    0.0
sad_16x8_mmx            320     308     3.8     413     413     0.0
sad_16x8_sse2           238     227     4.6     503     471     6.4
sad_16x8_sse3_c64       237     229     3.4     503     471     6.4
sad_16x16_c             8534    8548    -0.2    8548    8548    0.0
sad_16x16_mmx           606     591     2.5     729     725     0.5
sad_16x16_sse2          412     389     5.6     833     823     1.2
sad_16x16_sse3_c64      412     389     5.6     833     823     1.2
sad_aligned_4x4_c       570     569     0.2     616     616     0.0
sad_aligned_4x4_mmx     114     105     7.9     130     120     7.7
sad_aligned_4x8_c       1168    1167    0.1     1224    1224    0.0
sad_aligned_4x8_mmx     179     161     10.1    189     185     2.1
sad_aligned_8x4_c       1096    1094    0.2     1155    1155    0.0
sad_aligned_8x4_mmx     117     104     11.1    134     131     2.2
sad_aligned_8x8_c       2327    2304    1.0     2317    2317    0.0
sad_aligned_8x8_mmx     196     161     17.9    201     202     -0.5
sad_aligned_8x16_c      4714    4730    -0.3    4691    4683    0.2
sad_aligned_8x16_mmx    360     305     15.3    363     318     12.4
sad_aligned_16x8_c      4215    4245    -0.7    4235    4235    0.0
sad_aligned_16x8_mmx    309     293     5.2     327     326     0.3
sad_aligned_16x8_sse2   209     189     9.6     360     332     7.8
sad_aligned_16x16_c     8544    8523    0.2     8630    8555    0.9
sad_aligned_16x16_mmx   596     559     6.2     618     579     6.3
sad_aligned_16x16_sse2  336     332     1.2     593     581     2.0
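
For anyone who wants to recheck the delta columns, the formula given above the
table can be sketched in C as follows. The function name delta_percent is only
an illustration and is not part of the patch or of x264.

```c
/* Percent change between two timings, as defined above the table:
 * positive when the new value is smaller than the old one (speedup),
 * negative when it is larger (slowdown).
 * delta_percent is a hypothetical helper name, not an x264 function. */
static double delta_percent(double old_val, double new_val)
{
    return (old_val - new_val) / old_val * 100.0;
}
```

For example, delta_percent(196, 161) is about 17.86, which rounds to the 17.9
listed for sad_aligned_8x8_mmx on k10, and delta_percent(253, 255) is about
-0.79, matching the -0.8 listed for sad_8x8_mmx on k8.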
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sad-mmx-sse2-sse3-unification.patch
Type: text/x-diff
Size: 8817 bytes
Desc: not available
Url : http://mailman.videolan.org/pipermail/x264-devel/attachments/20080906/ecca43ac/attachment.patch 

