
I am trying to implement a master-slave pattern in which the master has an array (acting as a job queue) and sends data to slave processes. Based on the data obtained from the master, each slave computes a result and returns the answer to the master. On receiving a result, the master finds out which slave rank the message came from and then sends the next job to that slave.

This is the code skeleton which I have implemented:

        if (my_rank != 0) 
        {
            MPI_Recv(&seed, 1, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, &status);

                    //.. some processing 

            MPI_Send(&message, 100, MPI_FLOAT, 0, my_rank, MPI_COMM_WORLD);
        } 
        else 
        {
            for (i = 1; i < p; i++) {
                MPI_Send(&A[i], 1, MPI_FLOAT, i, tag, MPI_COMM_WORLD);
            }

            for (i = p; i <= S; i++) {
                MPI_Recv(&buf, 100, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                        MPI_COMM_WORLD, &status);
                //.. processing to find out free slave rank from which above msg was received (y)
                MPI_Send(&A[i], 1, MPI_FLOAT, y, tag, MPI_COMM_WORLD);
            }

            for (i = 1; i < p; i++) {
                MPI_Recv(&buf, 100, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG,MPI_COMM_WORLD, &status);

                // .. more processing 
            }

        }

If I use 4 processors (1 master and 3 slaves), the program sends and receives messages for the first 3 jobs in the job queue, but after that it hangs. What could be the problem?

It sounds like one of the processes is dying before it sends a response. Figure out which process is not sending a response to the main process. Some debugging code would be helpful here. - tdbeckett
This is very incomplete. - Vladimir F
^ This is the only MPI code where I am doing send and receive. Other things seem normal to me. - Kany

1 Answer


If this is the totality of your MPI-based code, then it looks like you are missing a while loop around the client code: each slave receives one job, sends one result, and then exits, so the master hangs waiting for results that never come. I've done this before, and I typically break it up into a taskMaster and peons.

in the taskMaster:

 for (int i = 0; i < commSize; ++i){
    if (i == commRank){ // commRank doesn't have to be 0
        continue;
    }

    if (taskNum < taskCount){
        // tasks is vector<Task>, where I have created a Task
        // class and send it as a stream of bytes
        toSend = tasks.at(taskNum);
        jobList.at(i) = taskNum;  // so we know which rank has which task
        taskNum += 1;
        activePeons += 1;
    } else{
        // stopTask is a flag value to stop receiving peon
        toSend = stopTask;
        allTasksDistributed = true;
    }

    // send the task, with the size of the task as the tag
    taskSize = sizeof(toSend);
    MPI_Send(&toSend, taskSize, MPI_CHAR, i, taskSize, MPI_COMM_WORLD);
}   

MPI_Status status;

while (activePeons > 0){ 
    // get the results from a peon (but figure out who it is coming from and what the size is)
    MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
    MPI_Recv(   &toSend,                    // receive the incoming task (with result data)
                status.MPI_TAG,             // Tag holds number of bytes
                MPI_CHAR,                   // type, may need to be more robust later
                status.MPI_SOURCE,          // source of send
                MPI_ANY_TAG,                // tag
                MPI_COMM_WORLD,             // COMM
                &status);                   // status

    // put the result from that task into the results vector
    results[jobList[status.MPI_SOURCE]] = toSend.getResult();

    // if there are more tasks to send, distribute the next one
    if (taskNum < taskCount ){
        toSend = tasks.at(taskNum);
        jobList[status.MPI_SOURCE] = taskNum;
        taskNum += 1;
    } else{ // otherwise send the stop task and decrement activePeons
        toSend = stopTask;
        activePeons -= 1;
    }

    // send the task, with the size of the task as the tag
    taskSize = sizeof(toSend);
    MPI_Send(&toSend, taskSize, MPI_CHAR, status.MPI_SOURCE, taskSize, MPI_COMM_WORLD);
}

in the peon function:

while (running){
    MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);    // tag holds data size

    incoming = (Task *) malloc(status.MPI_TAG);

    MPI_Recv(   incoming,           // memory location of input
                status.MPI_TAG,     // tag holds data size
                MPI_CHAR,           // type of data
                status.MPI_SOURCE,  // source is from distributor
                MPI_ANY_TAG,        // tag
                MPI_COMM_WORLD,     // comm
                &status);           // status

    task = Task(*incoming);

    if (task.getFlags() == STOP_FLAG){
        running = false;
        continue;
    }

    task.run();   // my task class has a "run" method
    MPI_Send(   &task,                  // send the finished task back
                status.MPI_TAG,         // size in = size out
                MPI_CHAR,               // data type
                status.MPI_SOURCE,      // destination
                status.MPI_TAG,         // tag again carries the byte count
                MPI_COMM_WORLD);        // comm

    free(incoming);
}
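As an aside: encoding the byte count in the tag works, but MPI also provides `MPI_Get_count` for exactly this purpose, which frees the tag for other uses. A sketch of the probe/receive step using it (variable names here are made up for illustration):

```c
MPI_Status status;
int nbytes;

// Block until some message is pending, without actually receiving it yet.
MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);

// Ask MPI how many MPI_CHAR elements the pending message contains.
MPI_Get_count(&status, MPI_CHAR, &nbytes);

char *incoming = malloc(nbytes);
MPI_Recv(incoming, nbytes, MPI_CHAR, status.MPI_SOURCE, status.MPI_TAG,
         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
```

This way the receiver never has to trust the sender to put the size anywhere special; the size travels with the message envelope itself.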

There are some bool and int values that have to be assigned (and as I said, I have a Task class), but this gives the basic structure for what I think you want to do.
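For reference, here is a minimal self-contained C sketch of the same structure applied to the float-based skeleton in the question. It is an illustration, not your exact code: the `STOP_TAG` sentinel, the job count `S`, and the doubled-seed "work" are all invented stand-ins. Compile with `mpicc` and launch with `mpirun`:

```c
#include <mpi.h>

#define S        10     /* total number of jobs (assumed) */
#define WORK_TAG 1
#define STOP_TAG 999    /* sentinel tag telling a slave to quit */

int main(int argc, char **argv)
{
    int rank, p;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    if (rank == 0) {                        /* master */
        float A[S], result, dummy = 0.0f;
        MPI_Status status;
        int i, next = 0;

        for (i = 0; i < S; i++) A[i] = (float)i;

        /* Prime every slave with one job. */
        for (i = 1; i < p && next < S; i++, next++)
            MPI_Send(&A[next], 1, MPI_FLOAT, i, WORK_TAG, MPI_COMM_WORLD);

        /* If there are more slaves than jobs, stop the extras now. */
        for (; i < p; i++)
            MPI_Send(&dummy, 1, MPI_FLOAT, i, STOP_TAG, MPI_COMM_WORLD);

        /* For every job handed out, collect one result; hand the next
           job (or a stop message) to whichever slave just finished.   */
        for (i = 0; i < S; i++) {
            MPI_Recv(&result, 1, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            if (next < S)
                MPI_Send(&A[next++], 1, MPI_FLOAT, status.MPI_SOURCE,
                         WORK_TAG, MPI_COMM_WORLD);
            else
                MPI_Send(&dummy, 1, MPI_FLOAT, status.MPI_SOURCE,
                         STOP_TAG, MPI_COMM_WORLD);
        }
    } else {                                /* slave: loop until STOP_TAG */
        float seed, answer;
        MPI_Status status;

        while (1) {
            MPI_Recv(&seed, 1, MPI_FLOAT, 0, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            if (status.MPI_TAG == STOP_TAG)
                break;
            answer = seed * 2.0f;           /* stand-in for real work */
            MPI_Send(&answer, 1, MPI_FLOAT, 0, rank, MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}
```

The key difference from the skeleton in the question is the slave's `while` loop: a slave keeps receiving jobs until the master explicitly tells it to stop, instead of exiting after one job.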