Passing a large 2D array in MPI (C++)

Question

I have a task to speed up a program using MPI. Assume the input is a large 2D array (1000x1000 or bigger). I have a working sequential program that divides the 2D array into chunks (for example 10x10) and calculates a result, which is a double, for each chunk. (So we have a function whose argument is a 10x10 2D array and whose result is a double.)

My first idea to speed up:

Create a 1D array of size N*N (for example 10x10 = 100) and send it to another process:

double* buffer = new double[dataPortionSize];
//copy some data to buffer
MPI_Send(buffer, dataPortionSize, MPI_DOUBLE, currentProcess, 1, MPI_COMM_WORLD);

Receive it in the other process, calculate the result, and send the result back:

double* buf = new double[dataPortionSize];
MPI_Recv(buf, dataPortionSize, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, status);
double result = function->calc(buf);
MPI_Send(&result, 1, MPI_DOUBLE, 0, 3, MPI_COMM_WORLD);

This program was much slower than the sequential version. It looks like MPI needs a lot of time to pass an array to another process.
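One possible explanation, offered as an assumption since the post has no timings: a 1000x1000 input split into 10x10 chunks gives 10,000 chunks, and sending each chunk as its own message of 100 doubles means 10,000 small sends plus 10,000 one-double replies. If calc is cheap, the per-message overhead may well dominate. A minimal sketch of the alternative is to give each worker one contiguous range of chunk indices, so that it can receive all its chunks in a single message and return all its results in a single message (for example with MPI_Scatterv and MPI_Gatherv). The helper name chunkRange is hypothetical and not part of the original program.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper: split chunk indices 0..totalChunks-1 into `workers`
// near-equal contiguous ranges and return the half-open range [first, last)
// that belongs to worker `w`. The first (totalChunks % workers) workers
// each get one extra chunk.
std::pair<int, int> chunkRange(int totalChunks, int workers, int w) {
    assert(totalChunks >= 0 && workers > 0 && w >= 0 && w < workers);
    const int base  = totalChunks / workers;   // chunks every worker gets
    const int extra = totalChunks % workers;   // leftover chunks to spread
    const int first = w * base + std::min(w, extra);
    const int count = base + (w < extra ? 1 : 0);
    return {first, first + count};
}
```

With 10,000 chunks and 3 workers this yields the ranges [0, 3334), [3334, 6667) and [6667, 10000), so each worker would exchange two large messages with rank 0 instead of thousands of small ones.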

My second idea:

Pass the whole 2D input array to all processes:

// data is a protected field in the base class; it is injected at runtime
MPI_Send(&(data[0][0]), dataSize * dataSize, MPI_DOUBLE, currentProcess, 1, MPI_COMM_WORLD);

And receive the data like this:

double **arrayAlloc( int size ) {
    // array of row pointers
    double **result = new double*[ size ];
    // one separate allocation per row
    for ( int i = 0; i < size; i++ )
        result[ i ] = new double[ size ];
    return result;
}

double **data = arrayAlloc(dataSize);
MPI_Recv(&data[0][0], dataSize * dataSize, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, status);

Unfortunately, I got a bunch of errors during execution:

The crashes are fairly random. Twice the program even finished successfully.

My third idea:

Pass a memory address to all processes. However, I found this:

MPI processes cannot read each others' memory, and virtual addressing makes one process' pointer completely meaningless to another.

Does anyone have an idea how to speed this up? I understand that the key to speed is passing the array(s) to the processes efficiently, but I don't know how to do that.

Regarding the crash, what is data? How is it declared/defined and initialized? — Some programmer dude
I am using a function to allocate memory when receiving: ``` double **arrayAlloc( int size ) { double **result; result = new double*[ size ]; for ( int i = 0; i < size; i++ ) result[ i ] = new double[ size ]; return result; } ``` and then I call MPI_Recv. I am not allocating or defining data before sending; it gets injected, so you can assume that data is well defined. I forgot to add: the crashes are pretty random. Twice the program ended successfully. — Raspberry
The problem is that you don't actually have a "2D" array; what you have is an array of pointers, each pointing to a separately allocated row. The data is not contiguous as it would be with a proper "2D" array. — Some programmer dude
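To illustrate the contiguity point, here is a minimal sketch of an allocator that keeps the data[i][j] syntax but stores all elements in one block, so that &data[0][0] really is the start of size * size consecutive doubles and can be handed to MPI_Send or MPI_Recv with that count. The names arrayAllocContiguous and arrayFree are hypothetical. The sketch also assumes the sender's data is laid out the same way, which the post does not show.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical replacement for arrayAlloc: one block of size * size doubles
// (row-major, zero-initialized) plus an array of row pointers into it.
double** arrayAllocContiguous(int size) {
    assert(size > 0);
    const std::size_t n = static_cast<std::size_t>(size);
    double** rows = new double*[n];
    rows[0] = new double[n * n]();        // the only allocation holding data
    for (std::size_t i = 1; i < n; ++i)
        rows[i] = rows[0] + i * n;        // row i starts i * size doubles in
    return rows;
}

// Matching deallocation: free the data block first, then the row pointers.
void arrayFree(double** rows) {
    delete[] rows[0];
    delete[] rows;
}

// With this layout a call such as
//   MPI_Recv(&data[0][0], dataSize * dataSize, MPI_DOUBLE, ...);
// writes only into memory owned by the single block.
```

With the original per-row allocation, the same MPI_Recv call writes dataSize * dataSize doubles starting at the first row, which only has room for dataSize of them. That overrun would be consistent with the random crashes described in the question.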