In this example code I do a summation from i=0 to i=n-1 and then add the result to itself k times, where k is the number of threads. I purposely did it without critical (surrounding the printf and ans += ans) to cause race conditions. However, to my surprise, no race condition happened:
int summation_with_operation_after_it_wrong1(int n, int k) {
    int ans = 0;
    #pragma omp parallel firstprivate(n) num_threads(k)
    {
        int i; /* Private */
        #pragma omp for schedule(dynamic) reduction(+:ans)
        for (i = 0; i < n; i++) {
            ans += i;
        }
        printf("Thread %d ans=%d\n", omp_get_thread_num(), ans);
        ans += ans;
    }
    return ans;
}
Using n=10 and k=4, the output is (always the same, except for thread order):
Thread 1 ans=45
Thread 3 ans=45
Thread 0 ans=45
Thread 2 ans=45
720
However, I did notice something odd about it: ans was always 45, instead of
Thread 3 ans=45
Thread 0 ans=90
Thread 2 ans=180
Thread 1 ans=360
720
when using critical. So I moved the printf to after the ans += ans to see what it was doing, and, to my surprise, the predicted race conditions started to occur all the time!
Thread 3 ans=90
Thread 1 ans=135
Thread 2 ans=90
Thread 0 ans=135
135
So... how did the printf prevent the race conditions? And how did that sum end up being 720? I'm completely lost here.
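For comparison, here is a minimal sketch of what the critical variant mentioned above might look like. It is not the exact code that produced the output shown earlier: the function name and the team_size out-parameter are made up for illustration, and the _OPENMP guards are only there so the sketch still compiles when OpenMP is disabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical variant of the function above: same reduction, but the printf
   and the read-modify-write of the shared `ans` sit inside a critical section.
   `team_size` is an illustrative out-parameter reporting how many threads
   actually ran; it is not part of the original code. */
int summation_with_critical(int n, int k, int *team_size) {
    int ans = 0;
    int threads = 1;
    #pragma omp parallel num_threads(k)
    {
        int i;   /* private loop counter */
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        #pragma omp for schedule(dynamic) reduction(+:ans)
        for (i = 0; i < n; i++) {
            ans += i;
        }
        /* The implicit barrier at the end of the loop means the shared `ans`
           now holds the complete sum for every thread. */
        #pragma omp critical
        {
#ifdef _OPENMP
            threads = omp_get_num_threads();
#endif
            printf("Thread %d ans=%d\n", tid, ans);
            ans += ans;   /* one thread at a time, so each doubling is seen by the next */
        }
    }
    *team_size = threads;
    return ans;
}
```

Because num_threads(k) is only a request, the result depends on how many threads the runtime actually grants: with n=10 it should be 45 doubled once per thread, which is 720 when all four threads run.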
printf() has some 'thread safety' built into it, which forces some serialization. I've not checked, but there are functions like putc_unlocked(), but printf() is written in terms of putc() for character output. So, simply using printf() may impose some sequencing, leaving the program better behaved than you'd expect. – Jonathan Leffler
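If the doublings really do end up running one after another, as this comment suggests, the 720 is just arithmetic: 45 doubled four times. The sequential sketch below models that case only. The function name is hypothetical, and it says nothing about whether the original code is safe: the unguarded ans += ans is still a data race, even when the timing happens to hide it.

```c
/* Hypothetical sequential model: the value the question's function would
   return if each of the k threads performed its `ans += ans` strictly one
   after another, with no overlap. */
int serialized_doubling_model(int n, int k) {
    int ans = 0;
    for (int i = 0; i < n; i++) {
        ans += i;            /* same sum as the reduction: 0 + 1 + ... + (n-1) */
    }
    for (int t = 0; t < k; t++) {
        ans += ans;          /* one doubling per thread */
    }
    return ans;
}
```

For n=10 and k=4 this gives 45 * 2^4 = 720, matching the final line of the first output.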