3
votes

I have a question about the performance of NVIDIA GPUs. I have an implementation that interpolates between two arrays. Binding the textures to pitched linear memory is faster than using CUDA arrays. So far I have only tried it on one GPU. Is this the case on every GPU, or can there be differences? I am using a laptop GPU. Are desktop GPUs much faster? At the moment I only get a speedup of 2-3x.

It might seem like a stupid question, but I would be thankful for an answer from somebody who has worked with textures on many GPUs. It surprises me that using CUDA arrays (which should have some cache optimization...) is slower.

I'm working on an NVIDIA Quadro 2000M, and I'm comparing the GPU implementation against a CPU implementation running on an i7-2860QM @ 2.50 GHz. Is this a fair race?

1
And here, vice versa, CUDA arrays are faster than pitched memory: devtalk.nvidia.com/default/topic/504608/… - user1545642
Just to make sure: are you talking about global memory access versus textures, or about textures in linear memory versus textures in CUDA arrays? If the latter, my answer below doesn't apply. - tera
I'm talking about textures in linear memory vs. textures in CUDA arrays. - Silve2611

1 Answer

1
votes

GPUs with compute capability 2.0 or higher cache global memory as well as textures, so the main advantage that textures had in the CC 1.x era is no more.

Quite to the contrary, a little-mentioned fact about textures is that they can increase register pressure, because multiple arguments and return values have to be stored in registers in a hardwired layout. Furthermore, the cache for global memory is larger than that for texture memory. So it is not unexpected that reading memory through textures can be slower than direct access to global memory.
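To make the "direct access to global memory" variant concrete, here is a minimal sketch of an interpolation kernel that reads both input arrays straight from global memory, with no texture involved. The names `lerpGlobal` and `runLerp`, the test data and the blend weight `t` are all made up for illustration and are not taken from the question. The sketch also assumes a plain element-wise linear blend, which may differ from the interpolation the asker actually performs.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: element-wise linear blend of a and b,
// reading both inputs directly from global memory (no textures).
__global__ void lerpGlobal(const float* __restrict__ a,
                           const float* __restrict__ b,
                           float* out, int n, float t)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = (1.0f - t) * a[i] + t * b[i];
}

// Hypothetical host wrapper: runs the kernel on made-up data and returns
// the maximum absolute deviation from a CPU reference, or -1 on a CUDA error.
float runLerp(int n, float t)
{
    std::vector<float> ha(n), hb(n), hout(n);
    for (int i = 0; i < n; ++i) { ha[i] = float(i); hb[i] = 2.0f * float(i); }

    float *da = nullptr, *db = nullptr, *dout = nullptr;
    size_t bytes = size_t(n) * sizeof(float);

    cudaError_t err = cudaMalloc(&da, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&db, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&dout, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
        lerpGlobal<<<blocks, threads>>>(da, db, dout, n, t);
        err = cudaGetLastError();                  // catches launch failures
    }
    // This blocking copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(hout.data(), dout, bytes, cudaMemcpyDeviceToHost);

    cudaFree(da); cudaFree(db); cudaFree(dout);    // freeing a null pointer is a no-op

    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return -1.0f;
    }

    float maxErr = 0.0f;
    for (int i = 0; i < n; ++i) {
        float ref = (1.0f - t) * ha[i] + t * hb[i];
        maxErr = std::fmax(maxErr, std::fabs(hout[i] - ref));
    }
    return maxErr;
}

int main()
{
    float e = runLerp(1 << 20, 0.25f);
    std::printf("max abs error vs CPU reference: %g\n", e);
    return e < 0.0f ? 1 : 0;
}
```

Timing this kernel against the texture-based versions on the same device (for example with CUDA events) would show whether the texture path helps at all for a given access pattern. The result may differ slightly from the CPU reference because the GPU compiler is free to fuse the multiply and add.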

This characteristic should be the same for mobile and desktop GPUs, even though high-end desktop GPUs can be about 2x to 5x faster than mobile devices.