2
votes

I am a little confused by the concept of the memory bandwidth of a GPU.

According to the Tesla M2090 GPU specs, the peak memory bandwidth is 177.6 GB/s.

So when people refer to bandwidth, does it refer to

  • the speed of one-way traffic, i.e. the number of bytes per second that can be read from the device memory, or

  • the speed of two-way traffic, i.e. the number of bytes per second that can be read from and written to the device memory?

Wherever I read this term, I don't see this clarification being made.

2
You might also differentiate between a continuous flow of data (in either direction; usually referred to as bandwidth) and instances of there-and-back communication (lookup, usually referred to as latency). For NVIDIA GPUs the global memory latency is usually around 400 cycles (on an L1 and L2 miss). – P Marecki

2 Answers

2
votes

The bandwidth is the amount of data that can be read or written in a given period of time.

The same bus is used for both reads and writes. In a given clock cycle, the bus can be used for either a read or a write, but not both.
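As a sanity check, the quoted 177.6 GB/s figure can be reproduced from the card's published memory specs. This is a minimal sketch assuming the M2090's spec-sheet numbers: a 1.85 GHz memory clock, GDDR5's double data rate, and a 384-bit memory interface.

```python
# Peak bandwidth = memory clock * transfers per clock * bus width in bytes.
# Assumed figures (from the Tesla M2090 spec sheet): 1.85 GHz memory clock,
# double data rate (GDDR5 transfers on both clock edges), 384-bit bus.
mem_clock_hz = 1.85e9      # memory clock in Hz
ddr_factor = 2             # two transfers per clock cycle
bus_width_bytes = 384 / 8  # 384-bit bus = 48 bytes per transfer

peak_gb_s = mem_clock_hz * ddr_factor * bus_width_bytes / 1e9
print(peak_gb_s)  # 177.6 — matches the quoted spec
```

Note that this is a single combined figure: reads and writes share the same 48-byte-wide path, so there is no separate "read bandwidth" and "write bandwidth" to add together.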

1
votes

There is only one set of wires on the bus, so data can't be written and read at the same time. In theory the bandwidth is therefore the same however you measure it: total read+write == total read == total write.

But in practice, transfers are much more efficient if you are writing large contiguous blocks of data to the device; this is the most common usage and is what the system is optimised for.
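When people measure achieved (as opposed to peak) bandwidth, the usual convention is to count bytes read plus bytes written and divide by elapsed time, since both directions draw on the same bus. A minimal sketch of that accounting, using an ordinary in-memory copy as the timed operation (this measures host RAM, not GPU memory; it only illustrates the counting convention):

```python
import time

# A copy of n bytes reads n bytes and writes n bytes, so the effective
# bandwidth counts 2 * n bytes over the elapsed time.
n_bytes = 64 * 1024 * 1024   # 64 MiB buffer (arbitrary size)
src = bytearray(n_bytes)

t0 = time.perf_counter()
dst = bytes(src)             # reads n_bytes from src, writes n_bytes to dst
t1 = time.perf_counter()

effective_gb_s = (n_bytes + n_bytes) / (t1 - t0) / 1e9
print(f"effective bandwidth: {effective_gb_s:.1f} GB/s")
```

Measured this way, a large contiguous copy will come much closer to the theoretical peak than many small scattered transfers, which is the point made above.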

Edit: the internal memory bandwidth of a graphics card (i.e. the memory path between the various components on the card) is much higher than the bandwidth to/from the host computer.

It's also much more complex: there are different types of memory connected to different processors in different ways, and the manufacturer will pick whichever numbers sound the highest. This figure is really only meaningful for comparing very similar models of card from the same GPU family.