
I am trying to modify my program so the code looks better. Right now I am using MPI_Send and MPI_Recv, but I am trying to make it work with MPI_Scatter. I have an array called All_vals and I am trying to send Sub_arr_len equal parts to every slave. So after I first find the number of values I should send to every slave, I have to send and receive that count, and then send and receive the values themselves. I am trying to change those send/recv calls to MPI_Scatter, and to handle the case where the number of values can't be divided into equal parts, e.g. when I have 20 numbers but 3 processes. Then each slave should have 20/3 = 6 values, while the master should get the rest: 20 - 2*6 = 8.
Here's the part I am trying to edit:

int master(int argc, char **argv){
...
    for (i = 0; i < ntasks; i++) {
        Sub_arr_len[i] = Num_vals / ntasks;   // number of values to send to every process
        printf("\n\nIn range %d there are %d\n\n", i, Sub_arr_len[i]);
    }

    for (i = 1; i < ntasks; i++) {
        Sub_len = Sub_arr_len[i];
        MPI_Isend(&Sub_len, 1, MPI_INT, i, i, MPI_COMM_WORLD, &request);
        MPI_Wait(&request, &status);
        Sub_arr_start += Sub_arr_len[i-1];    // offset of this slave's chunk in All_vals
        printf("\r\nSub_arr_start = %d ", Sub_arr_start);
        MPI_Isend(&All_vals[Sub_arr_start], Sub_len, MPI_INT, i, i, MPI_COMM_WORLD, &request);
        MPI_Wait(&request, &status);
    }
...
}


int slave(){
    MPI_Irecv(&Sub_arr_len, 1, MPI_INT, 0, myrank, MPI_COMM_WORLD, &request);
    MPI_Wait(&request, &status);
    printf("\r\nSLAVE: Sub_arr_len = %d\n\n", Sub_arr_len);
    All_vals = (int*) malloc(Sub_arr_len * sizeof(int));   // each slave only needs its own chunk

    MPI_Irecv(&All_vals[0], Sub_arr_len, MPI_INT, 0, myrank, MPI_COMM_WORLD, &request);
    MPI_Wait(&request, &status);
}

I am trying to make the scatter work, but I am doing something wrong, so I would love it if someone could help me build it.


1 Answer


With MPI_Scatter, each rank involved must receive the same number of elements, including the root. Generally the root process is a normal participant in the scatter operation and "receives" its own chunk. This means you also need to specify a receive buffer for the root process. You basically have the following options:

  1. Use MPI_IN_PLACE as recvbuf on the root; that way the root will not send anything to itself. Combine that with scattering a tail of the original send buffer from the root, such that this tail has a number of elements divisible by the number of processes. E.g. for 20 elements, scatter &All_vals[2], which has a total of 18 elements, 6 for each rank (including the root again). The root can then use All_vals[0-7]. See the first sketch after this list.
  2. Pad the send buffer with a few unused elements so that its length becomes divisible by the number of processes.
  3. Use MPI_Scatterv to send unequal numbers of elements to each rank. The second sketch, at the end of this answer, shows how to properly set up the send counts and displacements.

The last option has some distinct advantages: it creates the least load imbalance among processes and allows all processes to use the same code. If applicable, the second option is very simple and still gives a good load balance.