I'm working on an assignment that measures the performance of various x86-64 instructions (AT&T syntax).
The one I'm confused about is the unconditional jmp. This is how I've implemented it:
    .global uncond
    uncond:
    .rept 10000
        jmp . + 2
    .endr
        mov $10000, %rax
        ret
It's fairly simple. The code defines a function called "uncond" that uses the .rept directive to emit 10000 copies of the jmp instruction at assembly time, then sets the return value to the number of jmp instructions executed.
"." in AT&T syntax (GNU as) means the current address, i.e. the address of the jmp itself. I add 2 bytes to account for the length of the jmp instruction, so jmp . + 2 should simply jump to the next instruction.
Code that I haven't shown calculates the number of cycles it takes to execute the 10000 instructions.
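To double-check the . + 2 arithmetic, here is a small C sketch of how I understand the short jmp encoding: opcode 0xEB followed by a one-byte displacement measured from the end of the jmp. The function name short_jmp_rel8 and the constant SHORT_JMP_LEN are just names I made up for illustration, and the sketch assumes the assembler picks the 2-byte short form.

```c
#include <assert.h>
#include <stdint.h>

/* Length in bytes of a short jmp: opcode 0xEB plus one rel8 byte. */
enum { SHORT_JMP_LEN = 2 };

/*
 * Hypothetical helper: compute the rel8 byte of a short jmp.
 * The displacement is relative to the address of the NEXT instruction
 * (jmp_addr + 2), not to the address of the jmp itself.
 */
int8_t short_jmp_rel8(int64_t jmp_addr, int64_t target_addr)
{
    int64_t disp = target_addr - (jmp_addr + SHORT_JMP_LEN);

    /* A short jmp can only reach targets within a signed 8-bit range. */
    assert(disp >= -128 && disp <= 127);
    return (int8_t)disp;
}
```

For a jmp at any address a, short_jmp_rel8(a, a + 2) is 0, so each jmp . + 2 should assemble to the two bytes eb 00 and land on the instruction right after it.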
My results say jmp is pretty slow (about 10 cycles per jmp instruction), but from what I understand about pipelining, unconditional jumps should be very fast, since there are no branch mispredictions.
Am I missing something? Is my code wrong?