Microsoft's documentation of Parallel.For contains the following method:
static void MultiplyMatricesParallel(double[,] matA, double[,] matB, double[,] result)
{
    int matACols = matA.GetLength(1);
    int matBCols = matB.GetLength(1);
    int matARows = matA.GetLength(0);

    // A basic matrix multiplication.
    // Parallelize the outer loop to partition the source array by rows.
    Parallel.For(0, matARows, i =>
    {
        for (int j = 0; j < matBCols; j++)
        {
            double temp = 0;
            for (int k = 0; k < matACols; k++)
            {
                temp += matA[i, k] * matB[k, j];
            }
            result[i, j] = temp;
        }
    }); // Parallel.For
}
In this method, potentially multiple threads read values from matA and matB, which were both created and initialized on the calling thread, and potentially multiple threads write values to result, which is later read by the calling thread. Within the lambda passed to Parallel.For, there is no explicit locking around the array reads and writes. Because this example comes from Microsoft, I assume it's thread-safe, but I'm trying to understand what's going on behind the scenes to make it thread-safe.
To the best of my understanding from what I've read and other questions I've asked on SO (for example this one), several memory barriers are needed to make this all work. Those are (see the sketch after this list):

- a memory barrier on the calling thread after creating and initializing matA and matB,
- a memory barrier on each non-calling thread before reading values from matA and matB,
- a memory barrier on each non-calling thread after writing values to result, and
- a memory barrier on the calling thread before reading values from result.
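To make that mental model concrete, here is a sketch of where I imagine those four barriers would sit if they had to be inserted by hand with Thread.MemoryBarrier(). This is purely illustrative: MultiplyWithExplicitBarriers is a made-up name, and I'm not claiming Parallel.For actually does (or needs to do) anything like this. It assumes using System.Threading; and using System.Threading.Tasks;.

// Illustrative sketch only: where the four barriers from the list above would go.
static void MultiplyWithExplicitBarriers(double[,] matA, double[,] matB, double[,] result)
{
    // (1) On the calling thread, after the caller has created and
    //     initialized matA and matB.
    Thread.MemoryBarrier();

    Parallel.For(0, matA.GetLength(0), i =>
    {
        // (2) On each worker thread, before reading from matA and matB.
        Thread.MemoryBarrier();

        for (int j = 0; j < matB.GetLength(1); j++)
        {
            double temp = 0;
            for (int k = 0; k < matA.GetLength(1); k++)
            {
                temp += matA[i, k] * matB[k, j];
            }
            result[i, j] = temp;
        }

        // (3) On each worker thread, after writing its rows of result.
        Thread.MemoryBarrier();
    });

    // (4) On the calling thread, before it goes on to read result.
    Thread.MemoryBarrier();
}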
Have I understood this correctly? If so, does Parallel.For do all of that somehow? I went digging in the reference source but had trouble following the code; I didn't see any lock blocks or MemoryBarrier calls.