[x264-devel] Patch: Additional SPARC VIS SAD (8x16, 16x8, 16x16)
Phil Jensen
philj at csufresno.edu
Thu Jul 21 09:34:36 CEST 2005
Hi all,
I have completed additional SAD implementations (8x16, 16x8 and 16x16)
using SPARC VIS. Overall speedup is roughly 90% over straight C. I'm
doing development and testing on a Sun Fire V220 with two 1.5 GHz
UltraSPARC-III CPUs.
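
For readers unfamiliar with the operation, SAD (sum of absolute differences) over a block can be sketched in plain C as below. This is only an illustrative reference and not code from the patch or from x264; the name sad_ref and its parameters are assumptions made for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference routine (not taken from the patch): sums the
 * absolute differences between two blocks of 8-bit pixels.
 * stride_a / stride_b are the distances in bytes between rows. */
static int sad_ref(const uint8_t *a, int stride_a,
                   const uint8_t *b, int stride_b,
                   int width, int height)
{
    int sum = 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            sum += abs(a[x] - b[x]);   /* operands promote to int first */
        a += stride_a;                 /* advance both blocks one row */
        b += stride_b;
    }
    return sum;
}
```

Calling it with (width, height) of (8, 16), (16, 8) or (16, 16) covers the three block sizes mentioned above.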
I've hand-unrolled each of the loops. Sun's assembler does not appear
to have macro functionality built in, and I didn't want to establish an
external dependency on m4. Please let me know if you run into any
trouble with the patch.
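
To show what hand-unrolling means here, a rough C analogue follows. The patch itself is SPARC assembly, so this is only a sketch of the idea; the function names are hypothetical and do not appear in the patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical illustration: the SAD of one 8-pixel row with the inner
 * loop written out by hand, so no loop counter or branch is needed. */
static int sad_row8_unrolled(const uint8_t *a, const uint8_t *b)
{
    return abs(a[0] - b[0]) + abs(a[1] - b[1])
         + abs(a[2] - b[2]) + abs(a[3] - b[3])
         + abs(a[4] - b[4]) + abs(a[5] - b[5])
         + abs(a[6] - b[6]) + abs(a[7] - b[7]);
}

/* 8x16 block: 16 rows of 8 pixels. The row loop is kept here for
 * brevity; a fully unrolled version would repeat the body 16 times. */
static int sad_8x16_c_unrolled(const uint8_t *a, int stride_a,
                               const uint8_t *b, int stride_b)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        sum += sad_row8_unrolled(a, b);
        a += stride_a;
        b += stride_b;
    }
    return sum;
}
```

Without a macro facility, each repeated body has to be typed out, which is the trade-off described above.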
Intrepid, C (x264 pre-r276):
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 kb/s:147.3
encoded 150 frames, 5.76 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 kb/s:147.3
encoded 150 frames, 5.75 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 kb/s:147.3
encoded 150 frames, 5.80 fps, 147.50 kb/s
Intrepid, SAD optimized (8x8, 8x16, 16x8 & 16x16):
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 kb/s:147.3
encoded 150 frames, 10.59 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 kb/s:147.3
encoded 150 frames, 10.60 fps, 147.50 kb/s
x264 [info]: PSNR Mean Y:42.11 U:47.52 V:46.06 Avg:43.18 Global:43.13 kb/s:147.3
encoded 150 frames, 10.47 fps, 147.50 kb/s
http://zimmer.csufresno.edu/~philj/x264.sadnew.vis.patch
(patch is against r277)
Take care,
Phil
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: x264.sadnew.vis.patch
Url: http://mailman.videolan.org/pipermail/x264-devel/attachments/20050721/7f75c8cf/attachment.txt