[x264-devel] checkasm: Serialize read_time() calls on x86

Henrik Gramner git at videolan.org
Thu Nov 13 13:52:04 CET 2014


x264 | branch: master | Henrik Gramner <henrik at gramner.com> | Wed Oct  8 22:25:35 2014 +0200| [4576cfd8c391b27748d6f97f5b621cec4ed8047c] | committer: Fiona

checkasm: Serialize read_time() calls on x86

Improves the accuracy of benchmarks, especially in short functions.

To quote the Intel 64 and IA-32 Architectures Software Developer's Manual:
"The RDTSC instruction is not a serializing instruction. It does not necessarily
wait until all previous instructions have been executed before reading the counter.
Similarly, subsequent instructions may begin execution before the read operation
is performed. If software requires RDTSC to be executed only after all previous
instructions have completed locally, it can either use RDTSCP (if the processor
supports that instruction) or execute the sequence LFENCE;RDTSC."

RDTSCP would accomplish the same task, but on Intel CPUs it's only available since Nehalem.

LFENCE is an SSE2 instruction, so this change makes SSE2 a requirement to run
checkasm on x86.

> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=4576cfd8c391b27748d6f97f5b621cec4ed8047c
---

 tools/checkasm.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/checkasm.c b/tools/checkasm.c
index f4a8547..a0e8810 100644
--- a/tools/checkasm.c
+++ b/tools/checkasm.c
@@ -90,7 +90,9 @@ static inline uint32_t read_time(void)
 {
     uint32_t a = 0;
 #if HAVE_X86_INLINE_ASM
-    asm volatile( "rdtsc" : "=a"(a) :: "edx", "memory" );
+    asm volatile( "lfence \n"
+                  "rdtsc  \n"
+                  : "=a"(a) :: "edx", "memory" );
 #elif ARCH_PPC
     asm volatile( "mftb %0" : "=r"(a) :: "memory" );
 #elif ARCH_ARM     // ARMv7 only
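
For readers who want to try the effect outside of checkasm, here is a minimal
standalone sketch of the same LFENCE;RDTSC sequence. It is not part of the
patch: the names read_time_serialized(), best_cycles() and dummy_work() are
hypothetical, and the compiler check assumes GCC or Clang style inline asm on
x86. On other targets the helper simply returns 0.

```c
#include <stdint.h>

/* Hypothetical helper, not x264 code: returns the low 32 bits of the TSC.
 * LFENCE waits for earlier instructions to complete locally before RDTSC
 * reads the counter. LFENCE requires SSE2. */
static inline uint32_t read_time_serialized(void)
{
    uint32_t a = 0;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile( "lfence \n"
                      "rdtsc  \n"
                      : "=a"(a) :: "edx", "memory" );
#endif
    return a;
}

/* Small workload so that there is something to time. */
static volatile uint32_t sink;
static void dummy_work(void)
{
    for (uint32_t i = 0; i < 16; i++)
        sink += i;
}

/* Times fn() `runs` times and returns the smallest tick delta observed.
 * Unsigned subtraction keeps the delta correct if the low 32 bits wrap.
 * Returns UINT32_MAX when runs <= 0. */
static uint32_t best_cycles(void (*fn)(void), int runs)
{
    uint32_t best = UINT32_MAX;
    for (int i = 0; i < runs; i++) {
        uint32_t t0 = read_time_serialized();
        fn();
        uint32_t t1 = read_time_serialized();
        uint32_t d = t1 - t0;
        if (d < best)
            best = d;
    }
    return best;
}
```

Calling best_cycles(dummy_work, 1000) gives a rough lower bound in TSC ticks
for one call, including the overhead of the two timer reads. Taking the
minimum over many runs is just one simple way to reduce noise and is an
assumption of this sketch, not a description of how checkasm aggregates
its measurements.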


