3
votes

I am writing a parallel merge sort program. I use fork() to perform the parallel processing. I tried running 2 parallel processes, 4 processes, 8 processes and so on. Then I found that the one running with 2 processes required the least time to finish, i.e the highest performance. I think it's reasonable as my cpu is core 2 duo. For 4,8,16,32 processes, it seems to have a steady declining of performance, but after that the performance fluctuate (doesn't seem to have a pattern). Can someone explain that?

Plus according to the pattern, I have a feeling that when the number of processes used in the program is equal to the number of core that my cpu has, the my program could have the highest performance. But I am 100% sure. Can someone verify me? Or tell me what actually affect the performance of a parallel program.

Thanks in advance!!

2

2 Answers

1
votes

With 2 cores any number of processes greater than 2 will have to share the processor time. You will incur overhead from process switching and you will never have more than two processes executing at one time. It is better to have just two processes run uninterrupted on your two cores.

As to why you were seeing a fluctuation in performance once you hit a large number of processes I'd have to make a guess that your OS is spending more time task switching between the processes than actually performing work doing the sort. The time it takes to switch tasks is an artifact of your OS's scheduler, amount of memory being used by individual tasks, caching, potential use of swap space, etc...

If you want to maximize the performance of parallel processes, the number of processes running concurrently should equal the number of processors times the number of cores on each processor. In your case, two. Any less then you have cores sitting idle not doing anything, any more, you have processes sitting idle waiting for time on a processor core.

1
votes

3 processes should never be faster than 2 processes on a Core 2 Duo.

Also, forking only makes sense if you're doing CPU-expensive tasks:

Forking to print the message Hello world! twice is nonsense. The forking itself will consume more CPU-time than it could possibly save.

Forking to sort an array with 1,000,000 elements (if you use the proper sorting algorithm) will cut execution time roughly in half.