(1). I am wondering how I can speed up the time-consuming computation in the loop of my code below using MPI?
int main(int argc, char ** argv)
{
// some operations
f(size);
// some operations
return 0;
}
void f(int size)
{
// some operations
int i;
double * array = new double [size];
for (i = 0; i < size; i++) // how can I use MPI to speed up this loop to compute all elements in the array?
{
array[i] = complicated_computation(); // time comsuming computation
}
// some operations using all elements in array
delete [] array;
}
As shown in the code, I want to do some operations before and after the part to be paralleled with MPI, but I don't know how to specify where the parallel part begins and ends.
(2) My current code is using OpenMP to speed up the comutation.
void f(int size)
{
// some operations
int i;
double * array = new double [size];
omp_set_num_threads(_nb_threads);
#pragma omp parallel shared(array) private(i)
{
#pragma omp for schedule(dynamic) nowait
for (i = 0; i < size; i++) // how can I use MPI to speed up this loop to compute all elements in the array?
{
array[i] = complicated_computation(); // time comsuming computation
}
}
// some operations using all elements in array
}
I wonder if I change to use MPI, is it possible to have the code written both for OpenMP and MPI? If it is possible, how to write the code and how to compile and run the code?
(3) Our cluster has three versions of MPI: mvapich-1.0.1, mvapich2-1.0.3, openmpi-1.2.6. Are their usage same? Especially in my case. Which one is best for me to use?
Thanks and regards!
UPDATE:
I like to explain a bit more about my question about how to specify the start and end of the parallel part. In the following toy code, I want to limit the parallel part within function f():
#include "mpi.h"
#include <stdio.h>
#include <string.h>
void f();
int main(int argc, char **argv)
{
printf("%s\n", "Start running!");
f();
printf("%s\n", "End running!");
return 0;
}
void f()
{
char idstr[32]; char buff[128];
int numprocs; int myid; int i;
MPI_Status stat;
printf("Entering function f().\n");
MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
if(myid == 0)
{
printf("WE have %d processors\n", numprocs);
for(i=1;i<numprocs;i++)
{
sprintf(buff, "Hello %d", i);
MPI_Send(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD); }
for(i=1;i<numprocs;i++)
{
MPI_Recv(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);
printf("%s\n", buff);
}
}
else
{
MPI_Recv(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);
sprintf(idstr, " Processor %d ", myid);
strcat(buff, idstr);
strcat(buff, "reporting for duty\n");
MPI_Send(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
}
MPI_Finalize();
printf("Leaving function f().\n");
}
However, the running output is not expected. The printf parts before and after the parallel part have been executed by every process, not just the main process:
$ mpirun -np 3 ex2
Start running!
Entering function f().
Start running!
Entering function f().
Start running!
Entering function f().
WE have 3 processors
Hello 1 Processor 1 reporting for duty
Hello 2 Processor 2 reporting for duty
Leaving function f().
End running!
Leaving function f().
End running!
Leaving function f().
End running!
So it seems to me the parallel part is not limited between MPI_Init() and MPI_Finalize().
Besides this one, I am still hoping someone could answer my other questions. Thanks!