Hello everyone!<br><br><div>I'm very sorry for the late response to this old mail. It didn't go unnoticed; I just had to find enough time to review your patch carefully.</div><div><br><div class="gmail_quote">On Sun, May 31, 2009 at 12:41 AM, David Wolstencroft <span dir="ltr">&lt;<a href="mailto:wolstencroft@alum.rpi.edu">wolstencroft@alum.rpi.edu</a>&gt;</span> wrote:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Please find the attached patch for frame_init_lowres_core_altivec<br></blockquote><div><br></div><div>Yay! It is indeed currently the most CPU-intensive routine that has no AltiVec version.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
I would appreciate any review you guys can give,</blockquote><div><br></div><div>At first I thought I could shrink the line count of the code that processes the "end" part with some macros, but that would only make the core less readable.</div>
<div><br></div><div>Well, really, I don't see anything that I could do to improve your code...</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
as well as anyone who can test the compile on a linux distribution.<br></blockquote><div><br></div><div>Works on Gentoo with GCC 4.3 and 4.1.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Speed gains as follows:<br>
<br>
ticks:<br>
lowres_init_c: 8374<br>
lowres_init_altivec: 375<br>
<br>
Below with a patched checkasm with CHUD for cycle accurate counts:<br>
lowres_init_c: 286981<br>
lowres_init_altivec: 12822<br>
<br>
Using the cycle accurate counts, the altivec version shows a 22.38x speed increase over plain C<br><font color="#888888">
</font></blockquote><div><br></div><div>Here are the figures on G5:</div><div>lowres_init_c: 3310</div><div>lowres_init_altivec: 132</div><div> </div></div>That's a 25x speed increase over sequential code.</div><div>
<br></div><div>I'll apply your code shortly.<br clear="all"><br></div><div>Thanks a million, and sorry for the very late reply.</div><div><br></div><div>Guillaume<br>-- <br>Only a very small fraction of our DNA does anything; the rest is all<br>
comments and ifdefs.<br><br><a href="http://www.brainyquote.com/quotes/authors/g/george_carlin.html" target="_blank">George Carlin</a> - "Electricity is really just organized lightning."
</div>