6
votes

Microsoft's documentation of Parallel.For contains the following method:

static void MultiplyMatricesParallel(double[,] matA, double[,] matB, double[,] result)
{
    int matACols = matA.GetLength(1);
    int matBCols = matB.GetLength(1);
    int matARows = matA.GetLength(0);

    // A basic matrix multiplication.
    // Parallelize the outer loop to partition the source array by rows.
    Parallel.For(0, matARows, i =>
    {
        for (int j = 0; j < matBCols; j++)
        {
            double temp = 0;
            for (int k = 0; k < matACols; k++)
            {
                temp += matA[i, k] * matB[k, j];
            }
            result[i, j] = temp;
        }
    }); // Parallel.For
}

In this method, potentially multiple threads read values from matA and matB, which were both created and initialized on the calling thread, and potentially multiple threads write values to result, which is later read by the calling thread. Within the lambda passed to Parallel.For, there is no explicit locking around the array reads and writes. Because this example comes from Microsoft, I assume it's thread-safe, but I'm trying to understand what's going on behind the scenes to make it thread-safe.

To the best of my understanding from what I've read and other questions I've asked on SO (for example this one), several memory barriers are needed to make this all work. Those are:

  1. a memory barrier on the calling thread after creating and initializing matA and matB,
  2. a memory barrier on each non-calling thread before reading values from matA and matB,
  3. a memory barrier on each non-calling thread after writing values to result, and
  4. a memory barrier on the calling thread before reading values from result.
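If it helps to visualize, here is a minimal sketch (my own illustration, not anything Parallel.For requires you to write) of where those four conceptual barriers would sit if you hand-rolled the threading with explicit Thread.MemoryBarrier calls. In practice Thread.Start and Thread.Join already imply these fences:

```csharp
using System;
using System.Threading;

class BarrierSketch
{
    static double[] data = new double[4];
    static double[] result = new double[4];

    public static void Main()
    {
        for (int i = 0; i < data.Length; i++) data[i] = i + 1;
        Thread.MemoryBarrier();                 // 1: publish data before handing off

        var workers = new Thread[data.Length];
        for (int i = 0; i < workers.Length; i++)
        {
            int row = i;                        // capture a distinct index per thread
            workers[i] = new Thread(() =>
            {
                Thread.MemoryBarrier();         // 2: acquire a fresh view of data
                result[row] = data[row] * 2;
                Thread.MemoryBarrier();         // 3: publish the write to result
            });
            workers[i].Start();                 // Start itself implies a fence
        }
        foreach (var t in workers) t.Join();    // Join itself implies a fence
        Thread.MemoryBarrier();                 // 4: acquire a fresh view of result

        Console.WriteLine(string.Join(",", result)); // prints "2,4,6,8"
    }
}
```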

Have I understood this correctly?

If so, does Parallel.For do all of that somehow? I went digging in the reference source but had trouble following the code. I didn't see any lock blocks or MemoryBarrier calls.

I expect that Parallel.For() has a memory barrier at the start and end of it. Have you looked at the source code for Parallel.For()? – Ian Ringrose

@IanRingrose, yes, as mentioned and linked to in my question. I didn't find any MemoryBarrier calls or lock blocks. – adv12

But what about in the methods it calls? I expect it sits on top of the task system, and that somewhere in the task system there are memory barriers where tasks start and end. – Ian Ringrose

Yes, I expect so. Haven't fully traced the code yet. – adv12

@HansPassant I do trust that MS gets it right, but I've written threaded code that is wrong, and understanding exactly what is required to get it right helps me when I have to do it myself, and more importantly helps me know exactly what I can count on MS to do for me vs. what locking, etc. I still need to implement when using Microsoft's patterns and libraries. – adv12

4 Answers

6
votes

Since the arrays are already created, reading from or writing to them will never cause a resize. Also, the code itself guarantees that no two iterations ever write to the same position in the array.

The bottom line is that each thread can always calculate exactly which positions it reads and writes, and those writes never overlap. Hence, it is thread-safe.
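As a toy illustration of that disjoint-partitioning argument (a hypothetical example, not from the original post): each iteration i "owns" row i of the output, so no element is ever written by two threads:

```csharp
using System;
using System.Threading.Tasks;

class DisjointWrites
{
    public static void Main()
    {
        // Each iteration i writes only row i of result; no element is
        // ever touched by two iterations, so no locking is needed.
        var result = new int[4, 3];
        Parallel.For(0, 4, i =>
        {
            for (int j = 0; j < 3; j++)
                result[i, j] = i * 10 + j;
        });
        Console.WriteLine(result[2, 1]); // prints "21"
    }
}
```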

6
votes

The memory barriers you are looking for are inside the task scheduler.

Parallel.For breaks the work up into tasks, and a work-stealing scheduler executes those tasks. The minimal memory barriers required for a work-stealing scheduler are:

  1. A "release" fence after a task is created.
  2. An "acquire" fence when a task is stolen.
  3. A "release" fence when a stolen task completes.
  4. An "acquire" fence for the thread that is waiting on the task.

Look here to see where 1 is implied by the atomic ("Interlocked") operations used to enqueue a task. Look here to see where 2 is implied by atomic operations, volatile reads, and/or locks when a task is stolen.

I have not been able to track down where 3 and 4 are. 3 and 4 are likely implied by some kind of atomic join counter.
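The release/acquire pairing described above can be sketched with a toy handoff (my own illustration, not the actual TPL scheduler code). A volatile write has release semantics and a volatile read has acquire semantics in .NET, which is the same guarantee the scheduler's Interlocked operations rely on:

```csharp
using System;
using System.Threading;

class HandoffSketch
{
    static int payload;
    static volatile bool ready;   // volatile write = release, volatile read = acquire

    public static void Main()
    {
        var worker = new Thread(() =>
        {
            while (!ready) Thread.Yield(); // acquire: once true, payload is visible
            Console.WriteLine(payload);    // prints "42"
        });
        worker.Start();

        payload = 42;   // ordinary write...
        ready = true;   // ...published by the release that the volatile write provides
        worker.Join();
    }
}
```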

3
votes

Inside the threads (actually: tasks), access to matA and matB is read-only, and result is write-only.

Parallel reading is inherently thread-safe; the writing is thread-safe because the i variable is unique to each task.

There is no need for memory barriers in this piece of code (other than before/after the entire Parallel.For, but those can be assumed).

From your numbered items:
1) and 4) are implied by the Parallel.For();
2) and 3) are simply not needed.

2
votes

I think you're really taken with the idea of memory barriers, but I can't quite understand your concerns. Let's take a look at the code you've investigated:

  1. Three arrays are created and filled with values on the main thread. This is like assigning a value to a variable and then calling a method: the CLR ensures that the method sees fresh values for its arguments. A possible issue could arise if the initialization were done in the background and/or concurrently by some other thread. In that case you'd be right, and you would need some synchronization construct here: a memory barrier, a lock statement, or another technique.

  2. The code for parallel execution takes all values from 0 to matARows and creates a task for each of them. You need to understand two different ways to parallelize code: by operations and by data. Here we have multiple rows with the same lambda operation applied to each. The temp variable is not shared between iterations, so its assignments are thread-safe and no memory barrier is needed, as there are no stale values to observe. Again, as in the first point, if some other thread updated the initial matrices, you would need synchronization constructs here.

  3. Parallel.For ensures that all the tasks are done (ran to completion, were cancelled, or faulted) before it proceeds to the next statement, so the code after the loop executes as in a normal method. Why don't you need a barrier here? Because all the write operations are done on different rows and there is no intersection between them; this is data parallelism. However, as in the other cases, if another thread needed a new value from some loop iteration, you would still need synchronization. So this code is thread-safe because it is geometrically parallel over its data and doesn't create race conditions.
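That completion guarantee is easy to observe (a hypothetical demo of my own, not from the answer): reads performed after Parallel.For returns always see every iteration's writes:

```csharp
using System;
using System.Threading.Tasks;

class CompletionDemo
{
    public static void Main()
    {
        var done = new bool[100];
        // Parallel.For blocks until every iteration has finished,
        // so reads after it are safe without extra synchronization.
        Parallel.For(0, done.Length, i => done[i] = true);
        Console.WriteLine(Array.TrueForAll(done, d => d)); // prints "True"
    }
}
```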

This example is very simple, and a real algorithm in general needs more complicated logic. You can investigate various methods for proving code to be thread-safe without using synchronization, making the code lock-free.