[x264-devel] Patch: Additional SPARC VIS SAD (8x16, 16x8, 16x16)

Phil Jensen philj at csufresno.edu
Thu Jul 21 09:34:36 CEST 2005


Hi all,

I have completed additional SAD implementations (8x16, 16x8 and 16x16) 
using SPARC VIS.  Overall speedup is roughly 83% over straight C (about 
5.8 fps to 10.5 fps in the runs below).  I'm doing development and 
testing on a Sun Fire V220 with two 1.5 GHz UltraSPARC-III CPUs.
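
For anyone not familiar with what is being replaced, a plain C SAD (sum of absolute differences) looks roughly like the sketch below. This is only an illustration: the names sad_block, sad_8x16, sad_16x8 and sad_16x16 are hypothetical and are not the identifiers used in x264 or in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference helper (not from x264 or the patch):
 * sum of absolute differences over a width x height pixel block.
 * stride_a / stride_b are the distances in bytes between rows. */
static int sad_block(const uint8_t *a, int stride_a,
                     const uint8_t *b, int stride_b,
                     int width, int height)
{
    assert(width > 0 && height > 0);

    int sum = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            sum += abs(a[x] - b[x]);   /* uint8_t promotes to int */
        a += stride_a;
        b += stride_b;
    }
    return sum;
}

/* Fixed-size wrappers for the three block sizes named above. */
static int sad_8x16(const uint8_t *a, int sa, const uint8_t *b, int sb)
{
    return sad_block(a, sa, b, sb, 8, 16);
}

static int sad_16x8(const uint8_t *a, int sa, const uint8_t *b, int sb)
{
    return sad_block(a, sa, b, sb, 16, 8);
}

static int sad_16x16(const uint8_t *a, int sa, const uint8_t *b, int sb)
{
    return sad_block(a, sa, b, sb, 16, 16);
}
```

The inner loop over x is the part a SIMD version collapses: VIS provides a pdist instruction that accumulates the absolute differences of eight bytes at once, so an 8-wide row needs one such operation and a 16-wide row two.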

I've hand-unrolled each of the loops.  Sun's assembler does not appear 
to have macro functionality built in, and I didn't want to establish an 
external dependency on m4.  Please let me know if you run into any 
trouble with the patch.
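
To give an idea of what hand-unrolling means here, the following is a C sketch of the same idea, not the assembly from the patch. The function names are hypothetical. Each row is written out in full and two rows are handled per loop iteration, which is the kind of repetition a macro facility would otherwise generate.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: one 8-pixel row with no inner loop. */
static int sad_row8_unrolled(const uint8_t *a, const uint8_t *b)
{
    return abs(a[0] - b[0]) + abs(a[1] - b[1])
         + abs(a[2] - b[2]) + abs(a[3] - b[3])
         + abs(a[4] - b[4]) + abs(a[5] - b[5])
         + abs(a[6] - b[6]) + abs(a[7] - b[7]);
}

/* Hypothetical 8x16 SAD: the row loop is unrolled by two, so the
 * loop overhead is paid 8 times instead of 16. */
static int sad_8x16_unrolled(const uint8_t *a, int stride_a,
                             const uint8_t *b, int stride_b)
{
    assert(stride_a >= 8 && stride_b >= 8);

    int sum = 0;
    for (int y = 0; y < 16; y += 2) {
        sum += sad_row8_unrolled(a, b);
        sum += sad_row8_unrolled(a + stride_a, b + stride_b);
        a += 2 * stride_a;
        b += 2 * stride_b;
    }
    return sum;
}
```

In assembly the unrolling would typically go all the way, with no loop left at all; the C version keeps a short loop only to stay readable.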

Intrepid, C (x264 pre-r276):
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 
kb/s:147.3 encoded 150 frames, 5.76 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 
kb/s:147.3 encoded 150 frames, 5.75 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 
kb/s:147.3 encoded 150 frames, 5.80 fps, 147.50 kb/s

Intrepid, SAD optimized (8x8, 8x16, 16x8 & 16x16):
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 
kb/s:147.3 encoded 150 frames, 10.59 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 
kb/s:147.3 encoded 150 frames, 10.60 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 
kb/s:147.3 encoded 150 frames, 10.47 fps, 147.50 kb/s
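
The arithmetic behind the speedup can be checked with a few lines of C. This is just a sketch: the fps values are copied from the six runs above, and the helper names mean_fps and vis_speedup are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* fps figures copied from the runs listed above. */
static const double fps_c[]   = { 5.76, 5.75, 5.80 };    /* straight C    */
static const double fps_vis[] = { 10.59, 10.60, 10.47 }; /* SAD optimized */

/* Hypothetical helper: arithmetic mean of n samples. */
static double mean_fps(const double *v, size_t n)
{
    assert(n > 0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Ratio of the averaged optimized fps to the averaged C fps. */
static double vis_speedup(void)
{
    return mean_fps(fps_vis, 3) / mean_fps(fps_c, 3);
}
```

With these inputs the averages are about 5.77 fps and 10.55 fps, so vis_speedup() comes out at roughly 1.83, i.e. the optimized build runs about 83% faster on this clip.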

http://zimmer.csufresno.edu/~philj/x264.sadnew.vis.patch
(patch is against r277)

Take care,
Phil

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: x264.sadnew.vis.patch
Url: http://mailman.videolan.org/pipermail/x264-devel/attachments/20050721/7f75c8cf/attachment.txt 

