
I have a master application that spawns a worker, which in turn spawns two slaves. The slave application writes its output to stdout. My idea was to rebind stdout to a different stream in the worker so that I could store the slaves' output in a variable and send it to the master, which handles the output. However, the slaves' stdout is not redirected and still appears on the console, and the buffer in the worker stays empty. Am I missing something, or is this not possible the way I am doing it? If so, any recommendations on how to handle this differently are greatly appreciated. I'm using Open MPI 1.6.5 on Gentoo, and here is the source code of my applications:

master.cpp

#include <mpi.h>
#include <iostream>

using namespace std;

int main(int argc, char *argv[])
{
    char appExe[] = "worker";
    char *appArg[] = {NULL};
    int maxProcs = 1;
    int myRank; 
    MPI_Comm childComm;
    int spawnError;

    // Initialize
    MPI_Init(&argc, &argv);

    // Rank 
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    // Spawn application    
    MPI_Comm_spawn(appExe, appArg, maxProcs, MPI_INFO_NULL, myRank, MPI_COMM_SELF, &childComm, &spawnError);

    // Receive length of message from worker
    int len;
    MPI_Recv(&len, 1, MPI_INT, 0, MPI_ANY_TAG, childComm, MPI_STATUS_IGNORE);
    // Receive actual message from worker
    char *buf = new char[len];
    MPI_Recv(buf, len, MPI_CHAR, 0, MPI_ANY_TAG, childComm, MPI_STATUS_IGNORE);
    cout << "master: Got the following from worker: " << buf << endl;

    // Finalize
    MPI_Finalize();

    return 0;
}

worker.cpp

#include "mpi.h"
#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main(int argc, char *argv[])
{
    char appExe[] = "slave";
    char *appArg[] = {NULL};
    int maxProcs = 2;
    int myRank, parentRank; 
    MPI_Comm childComm, parentComm;
    int spawnError[maxProcs];

    // Initialize
    MPI_Init(&argc, &argv);

    // Rank
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    // Get parent
    MPI_Comm_get_parent(&parentComm);

    // Bind stdout to new_buffer
    stringstream new_buffer;
    streambuf *old_buffer = cout.rdbuf(new_buffer.rdbuf());  

    // Spawn application    
    MPI_Comm_spawn(appExe, appArg, maxProcs, MPI_INFO_NULL, myRank, MPI_COMM_SELF, &childComm, spawnError);

    // Enter barrier
    MPI_Barrier(childComm);

    // Reset stdout to old_buffer
    cout.rdbuf(old_buffer);

    // Make a string
    string tmp = new_buffer.str();
    // Make a character array from string
    const char* cstr = tmp.c_str();
    cout << "worker: Got the following from slaves: " << cstr << endl;

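    // Send length of message (including the terminating '\0') to master
    int len = static_cast<int>(tmp.size()) + 1;
    MPI_Send(&len, 1, MPI_INT, 0, 0, parentComm);
    // Send actual message: pass the character data, not the address of the pointer
    MPI_Send(const_cast<char*>(cstr), len, MPI_CHAR, 0, 0, parentComm);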

    // Finalize
    MPI_Finalize();

    return 0;
}

slave.cpp

#include <mpi.h>
#include <iostream>

using namespace std;

int main(int argc, char *argv[])
{
    MPI_Comm parent;

    // Initialize
    MPI_Init(&argc, &argv);

    // Get parent
    MPI_Comm_get_parent(&parent);

    // Say hello
    cout << "slave: Hi there!" << endl;

    // Enter barrier
    if (parent != MPI_COMM_NULL)
        MPI_Barrier(parent);

    // Finalize
    MPI_Finalize();

    return 0;
}

2 Answers

1
votes

Job spawning in MPI happens in the same "universe" and is usually performed by the same application launcher that launched the initial MPI job. In Open MPI that would be orterun (mpiexec and mpirun are both symlinks to orterun). I/O redirection is performed by ORTE (the Open MPI run-time environment, part of the MPI library), which sends the standard output of each MPI process to orterun. orterun then merges all the streams and either prints them on its own console or, if output redirection is in place, saves them to a file. Unless a spawned job specifically writes its output to a file, the parent has no way to intercept that output.
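
To illustrate why the buffer in the worker stays empty: swapping the buffer with `cout.rdbuf()` only changes the `std::cout` object of the process that makes the call. A minimal sketch, where `capture_cout` is a hypothetical helper and not part of MPI or Open MPI:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: runs fn() while std::cout writes into memory and
// returns whatever *this process* wrote through std::cout in the meantime.
template <typename Fn>
std::string capture_cout(Fn fn) {
    std::ostringstream captured;
    // Restore the original buffer even if fn() throws.
    struct Restore {
        std::streambuf* old;
        ~Restore() { std::cout.rdbuf(old); }
    } restore{std::cout.rdbuf(captured.rdbuf())};
    fn();
    return captured.str();
}
```

Calling `capture_cout([] { std::cout << "worker: hello"; })` returns the worker's own text. A spawned slave is a separate process with its own `std::cout`, so nothing it prints can end up in this buffer.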

Apart from having the spawned job write to a file, the only MPI-compliant way to communicate between the parent job and the spawned jobs is MPI message passing. You can implement your own C++ input and output stream classes that use MPI messages to transmit data over the intercommunicator.
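
One possible shape for such an output stream class is sketched below. `ForwardingBuf` and `demo_forwarding` are hypothetical names, and the transport is abstracted as a callback so the sketch runs without MPI. In a slave, the callback would be assumed to wrap an `MPI_Send` of the chunk to rank 0 of the communicator returned by `MPI_Comm_get_parent`, with a matching receive loop in the worker.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stream buffer: collects characters and hands every flushed
// chunk to a send callback (assumed to wrap MPI_Send in real use).
class ForwardingBuf : public std::streambuf {
public:
    using SendFn = std::function<void(const std::string&)>;

    explicit ForwardingBuf(SendFn send) : send_(std::move(send)) {
        assert(send_ && "a send callback is required");
    }

protected:
    // Called for single characters, because no put area is set up.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            pending_.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }

    // Bulk path for multi-character writes.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        pending_.append(s, static_cast<std::size_t>(n));
        return n;
    }

    // Called on std::flush and std::endl: ship the pending text.
    int sync() override {
        if (!pending_.empty()) {
            send_(pending_);
            pending_.clear();
        }
        return 0;
    }

private:
    SendFn send_;
    std::string pending_;
};

// Demo with a stand-in transport that only records what would be sent.
std::vector<std::string> demo_forwarding() {
    std::vector<std::string> sent;
    ForwardingBuf buf([&sent](const std::string& chunk) { sent.push_back(chunk); });
    std::ostream out(&buf);
    out << "slave: Hi there!" << std::endl;  // endl flushes: first chunk
    out << "value=" << 42 << std::flush;     // explicit flush: second chunk
    return sent;
}
```

Sending one message per flush keeps each line intact on the receiving side. Whether to send per flush or to batch the output until the end of the job is a design choice left open here.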

0
votes

In general, using stdout/stderr for important output in a distributed application isn't the right way to go. It's difficult to enforce a useful ordering, so lines from different processes sometimes get jumbled together. It's usually far more effective to read and write data through files that can be moved around via NFS or some script. Then you know that the ordering is correct.
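
A minimal sketch of the file-based approach, assuming each process writes to its own file named after its rank. `rank_log_name`, `append_line`, and the naming scheme are hypothetical, and the rank would come from `MPI_Comm_rank` in the real program:

```cpp
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical naming scheme: "<prefix>.<zero-padded rank>.log".
std::string rank_log_name(const std::string& prefix, int rank) {
    std::ostringstream name;
    name << prefix << '.' << std::setw(4) << std::setfill('0') << rank << ".log";
    return name.str();
}

// Appends one line to the given file; returns false if the write failed.
bool append_line(const std::string& path, const std::string& line) {
    std::ofstream out(path.c_str(), std::ios::app);
    if (!out) return false;
    out << line << '\n';
    return static_cast<bool>(out.flush());
}
```

With one file per rank, lines within a file keep the order in which that process wrote them. The parent or a script can then collect the files once the spawned job has finished.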