[x264-devel] [PATCH] zigzag SSE2

Holger Lubitz Holger.Lubitz at Informatik.Uni-Oldenburg.DE
Sun May 4 20:42:31 CEST 2008


> I want to see the latencies - it shows me when the results are ready, 
> otherwise only the decoding speed is measured if the code generates less 
> uOps/mOps than entries in the reorder buffers/schedulers exist.

Granted. But the unsynchronized figure is still the more interesting one,
unless you really use the results immediately afterwards (and in that case
one should try to rewrite the code to avoid the writes, as they are likely
to be unnecessary). The cycle counts I quoted for my code were measured
with unsynchronized rdtsc using pengvado's bench.h include.
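
To make the two kinds of measurement concrete, here is a minimal sketch. It
is not pengvado's bench.h; the names tsc_unsync, tsc_sync, min_ticks,
fake_now and no_work are made up for this example. The x86 part assumes
GCC or Clang on x86-64, and tsc_sync assumes a CPU that supports rdtscp.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>  /* __rdtsc, __rdtscp, _mm_lfence (GCC/Clang) */

/* Unsynchronized read: the counter may be sampled before earlier
 * instructions have produced their results, so short code that fits
 * in the reorder buffer is timed roughly at decode/issue speed. */
static inline uint64_t tsc_unsync(void)
{
    return __rdtsc();
}

/* Synchronized read: rdtscp waits for earlier instructions to execute,
 * so result latency is included. The lfence is meant to keep later
 * instructions from starting before the read; how strictly it does so
 * can depend on the CPU. Only call this on CPUs that report rdtscp. */
static inline uint64_t tsc_sync(void)
{
    unsigned aux;                 /* receives the TSC_AUX value, unused */
    uint64_t t = __rdtscp(&aux);
    _mm_lfence();
    return t;
}
#endif

typedef uint64_t (*tick_fn)(void);
typedef void (*work_fn)(void);

/* Time work() `runs` times with the given counter and keep the smallest
 * difference, which filters out interrupts and cache misses.
 * Returns UINT64_MAX if runs <= 0. */
static inline uint64_t min_ticks(tick_fn now, work_fn work, int runs)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = now();
        work();
        uint64_t t1 = now();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}

/* Portable stand-ins for dry runs: a counter that advances by one per
 * call, and a function that does nothing. */
static inline uint64_t fake_now(void)
{
    static uint64_t t;
    return t++;
}

static inline void no_work(void)
{
}
```

On x86-64, calling min_ticks(tsc_unsync, f, 1000) and
min_ticks(tsc_sync, f, 1000) for the same function f gives the two figures
under discussion. Both include the overhead of the counter reads
themselves, which one would normally measure with an empty function and
subtract.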

> But you are right, one should measure both times.
> rdtscp does not exist on all athlons (before F-stepping?, I know it is not 
> available on E6 stepping).

Possibly. Mine is an X2 3800+ EE ("family 15, model 67, stepping 2") and it
supports rdtscp.
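
For anyone who wants to check their own CPU, a rough sketch follows. The
helper names (cpu_sig, decode_sig, edx_has_rdtscp, cpu_has_rdtscp) are made
up for this example. It relies on two documented CPUID facts: leaf 1 returns
the family/model/stepping signature in EAX, and leaf 0x80000001 reports
rdtscp in EDX bit 27. The live query assumes GCC or Clang with <cpuid.h>.

```c
#include <stdint.h>
#include <stdbool.h>

struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode the EAX value returned by CPUID leaf 1 into the displayed
 * family/model/stepping, using the same convention as Linux's
 * /proc/cpuinfo: the extended family is added when the base family is
 * 15, and the extended model supplies the high nibble of the model
 * when the family is 6 or higher. */
static inline struct cpu_sig decode_sig(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model  = (eax >> 4) & 0xf;
    unsigned ext_model   = (eax >> 16) & 0xf;
    unsigned ext_family  = (eax >> 20) & 0xff;
    struct cpu_sig s;

    s.stepping = eax & 0xf;
    s.family   = base_family;
    if (base_family == 0xf)
        s.family += ext_family;
    s.model = base_model;
    if (s.family >= 6)
        s.model += ext_model << 4;
    return s;
}

/* rdtscp support is EDX bit 27 of CPUID leaf 0x80000001. */
static inline bool edx_has_rdtscp(uint32_t edx)
{
    return ((edx >> 27) & 1u) != 0;
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* __get_cpuid (GCC/Clang) */

/* Ask the running CPU. Returns false if the extended leaf is missing. */
static inline bool cpu_has_rdtscp(void)
{
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return false;
    return edx_has_rdtscp(edx);
}
#endif
```

A signature of 0x00040F32 decodes to family 15, model 67, stepping 2, the
values quoted above.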

Holger


