How can I count how many clock cycles it takes for the rdtsc instruction to execute?

Question

I know that the 64-bit result (an unsigned long long) is returned in edx:eax, but how can I find out how many clock cycles it takes to execute a single rdtsc instruction?

EDIT: Does something like this work?

.globl rdtsc
rdtsc:
    rdtsc
    movl %eax, %ecx
    movl %edx, %ebx
    rdtsc
    subl %ecx, %eax
    subl %ebx, %edx
    ret

If this is a problem for you, then you aren't benchmarking your code properly. You need to run enough iterations so that the overhead of rdtsc() is negligible. — Mysticial
The overhead of rdtsc has already been measured. See instlatx64.atw.hu — harold

NPE · Accepted Answer · 2012-11-07T01:57:49

You could execute rdtsc repeatedly and look at the difference between consecutive return values. Bear in mind that things like context switches will cause massive spikes in some of those differences.
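As a rough sketch of that idea in C, the function below reads the counter twice in a row many times and keeps the smallest difference, so that spikes from context switches are discarded. It assumes GCC or Clang on x86, where `__rdtsc()` comes from `<x86intrin.h>`; the name `min_rdtsc_delta` and its `iterations` parameter are made up for illustration. Note that on most modern CPUs the counter ticks at a constant reference rate, so the result is in TSC ticks rather than exact core clock cycles.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc(); assumes GCC/Clang on x86 */

/* Hypothetical helper: smallest difference between two back-to-back
 * rdtsc reads, taken over `iterations` attempts. Keeping the minimum
 * filters out samples inflated by context switches and similar noise.
 * Returns UINT64_MAX if iterations <= 0. */
uint64_t min_rdtsc_delta(int iterations)
{
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < iterations; i++) {
        uint64_t t0 = __rdtsc();   /* first read  */
        uint64_t t1 = __rdtsc();   /* second read, immediately after */
        uint64_t delta = t1 - t0;  /* ticks spent between the two reads */

        if (delta < best)
            best = delta;
    }
    return best;
}
```

Calling something like `min_rdtsc_delta(1000000)` and printing the result gives an estimate of the back-to-back overhead on the machine it runs on; the value varies between CPU models and can be much larger inside a virtual machine.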

See "rdtsc, too many cycles" for a discussion.
