I need some help understanding the concept of cores on a GPU vs. cores in a CPU for the purpose of doing parallel calculations.

When it comes to cores in a CPU, it seems pretty simple. I have a super intensive "for" loop that iterates four times. I have four cores in my Intel i5 2.26GHz CPU. I give one iteration to each core. Each of the four iterations is independent of the others. Boom - I now have four threads created and 100% CPU usage (instead of 25% CPU usage with only one core). My "for" loop now runs almost four times faster than it would have if I had not parallelized it. By the way, for the "for" loop, I was using the auto-parallelization available in Microsoft Visual Studio 2012, as in this online example: (http://msdn.microsoft.com/en-us/library/hh872235.aspx).

In contrast, I don't even know the number of cores in my laptop's GPU (Intel Graphics Media Accelerator HD, or Intel HD Graphics, with 1696MB shared memory) that I can use for parallel calculations. I don't even know a valid way of comparing the GPU to the CPU. When I see "12@500MHz" next to my graphics card description, I wonder if that means the graphics card has 12 cores for parallelization that can work kinda like the 4 cores in a CPU, except that the GPU cores run at 500MHz [slow] instead of 2.26GHz [fast]? Is there a GPU usage figure comparable to the CPU usage in Windows Task Manager? I'm an utter novice trying to use the C++ library in Visual Studio 2012, if that makes any difference. When I write the actual GPU software, the parallelization code looks like this: (http://msdn.microsoft.com/en-us/library/hh265137.aspx).

So, would you please fill some of the gaps or mistakes in my knowledge or help me compare the two? I don't need a super complicated answer, something as simple as "You can't compare a CPU core with a GPU core because of blankity blank" or "a GPU core isn't really a core like a CPU core is" would be very much appreciated.

If you're going to make the effort to downvote my question, at least leave a sentence explaining why you think it's a bad one. This question is rather open ended and you are free to address it from a variety of angles. - user2287171
Also, I am aware that my graphics card that came standard with the laptop is a piece of crap. It is not a "discrete graphics card" capable of working with <amp.h>. For the sake of this question, please pretend that it is a "legit" card that functions well for doing program computations. - user2287171

1 Answer

First, the OS puts more than one core to work only if your code asks for it by creating multiple threads. Try using OpenMP or Win32 threads to achieve parallelism on your i5.
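As a minimal sketch of asking for more cores explicitly, the following runs each independent loop iteration on its own thread. It uses the standard `std::thread` rather than OpenMP or Win32 threads, purely so that it builds without extra compiler flags; `heavy_work` and `run_in_parallel` are hypothetical names invented for this illustration, and the loop body is a stand-in for whatever the real iteration computes.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for one expensive, independent loop iteration.
long long heavy_work(int iteration) {
    long long sum = 0;
    for (long long k = 0; k < 1000000; ++k) {
        sum += (k % 7) * (iteration + 1);
    }
    return sum;
}

// Runs `iterations` independent loop bodies, one std::thread per iteration.
std::vector<long long> run_in_parallel(int iterations) {
    assert(iterations >= 0);
    std::vector<long long> results(iterations);
    std::vector<std::thread> workers;
    workers.reserve(iterations);
    for (int i = 0; i < iterations; ++i) {
        // Each thread writes only to its own slot, so no locking is needed.
        workers.emplace_back([&results, i] { results[i] = heavy_work(i); });
    }
    for (std::thread& worker : workers) {
        worker.join();  // wait for every iteration to finish
    }
    return results;
}
```

Calling `run_in_parallel(4)` on a four-core machine gives the OS four runnable threads, which it is then free to place on four different cores. Whether that yields close to a 4x speedup depends on the work being truly independent, as in the question.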

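Second, a CPU is clocked higher than a GPU. If a GPU ran at CPU clock speeds, it would get hot enough to cook on. A GPU makes up for its lower clock with many more cores than a CPU has. Also keep in mind that a thread and a core are not the same thing: a core is a hardware execution unit, while a thread is a stream of instructions that gets scheduled onto a core.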
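To make the thread-versus-core distinction concrete on the CPU side, here is a small sketch. `std::thread::hardware_concurrency()` is a real standard library call that reports how many hardware threads the machine can run at once (the standard allows it to return 0 when the value is unknown, and with hyper-threading it can exceed the number of physical cores). The helper `choose_worker_count` is a hypothetical name for this illustration; it simply caps the number of software threads you request at what the hardware reports.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hardware threads reported by the runtime; may be 0 if it cannot tell.
unsigned reported_hardware_threads() {
    return std::thread::hardware_concurrency();
}

// Hypothetical helper: decide how many worker threads to create.
// You may create more threads than cores, but the extra ones just take
// turns on the same hardware, so this caps the count at the hardware value.
unsigned choose_worker_count(unsigned requested, unsigned hardware) {
    if (requested == 0) {
        return 1;  // always run at least one worker
    }
    if (hardware == 0) {
        return requested;  // hardware count unknown: trust the caller
    }
    return std::min(requested, hardware);
}
```

For example, `choose_worker_count(8, 4)` returns 4: eight threads can exist on a four-core CPU, but only four of them can execute at any instant.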

Third, I recommend reading the specifications and reference manuals for your CPU and GPU. Also, don't forget PCI-e: data has to cross that bus between main memory and a discrete GPU, and that transfer is often the bottleneck in a parallel programming implementation.

I hope this clears up your doubts. If you have any more questions, feel free to ask.