How to calculate theoretical FP32 instructions per cycle (IPC) on an NVIDIA GPU

1 vote

I'm having a hard time understanding why the theoretical instructions per cycle (IPC) for a Fermi-architecture NVIDIA GPU is 2, as stated on page 9 of http://on-demand.gputechconf.com/gtc-express/2011/presentations/Inst_limited_kernels_Oct2011.pdf.

According to section 5.4.1 of the programming guide (http://docs.nvidia.com/cuda/cuda-c-programming-guide/#arithmetic-instructions), the throughput for 32-bit floating-point arithmetic is 32 FP32 instructions per SM per clock cycle.

How do the two quantities relate?

Tags: cuda, gpu, gpgpu, nvidia

1 Answer

2 votes

Answer provided here on the NVIDIA developer forums:

https://devtalk.nvidia.com/default/topic/722525/cuda-programming-and-performance/how-to-calculate-theoretical-fp32-instructions-per-cycle-ipc-on-nvidia-gpu/
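
For readers comparing the two figures, here is a minimal sketch of the unit conversion that is likely involved. It is not taken from the linked answer. It assumes that the programming guide's figure counts per-thread operations per SM per clock, that the IPC in the presentation counts warp-level instructions, and that one warp instruction covers 32 threads. The function name `thread_ops_to_warp_ipc` is hypothetical and used only for illustration.

```python
# Sketch only: converts a per-thread throughput figure into warp-level IPC.
# Assumption: IPC in the presentation is counted in warp instructions per SM.

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def thread_ops_to_warp_ipc(thread_ops_per_clock, warp_size=WARP_SIZE):
    """Convert thread-level operations/SM/clock to warp instructions/SM/clock."""
    return thread_ops_per_clock / warp_size


# Figure quoted in the question from the programming guide.
fp32_thread_ops_per_clock = 32
fp32_warp_ipc = thread_ops_to_warp_ipc(fp32_thread_ops_per_clock)
print(fp32_warp_ipc)  # prints 1.0
```

Under these assumptions, 32 FP32 thread operations per clock correspond to one warp instruction per clock, so the two figures are not in the same units. This conversion alone does not account for the remaining factor of 2; that may depend on details such as how many instructions an SM can issue per cycle or which clock each figure refers to, which the linked answer is better placed to explain.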