True non-blocking two-way communication between parent and external child process

Question

I have read around 50 posts and tutorials on this topic, copied, written, and tested around 20 alternatives, and done all the research I can think of. Still, I have not seen a working solution for the following problem:

Parent process A wants to pass data to an external process B, let process B modify the data and pass it back to parent process A, then continue with parent process A. Process B is part of an external program suite that I have no influence over, and that is normally run like this on the UNIX command line:

< input_data program_B1 | program_B2 | program_B3 > output_data

...where

input_data, output_data: Some data that is processed in programs B1-B3

program_B1,B2,B3: Programs that read data from stdin (fread) and output to stdout (fwrite) and apply some processing to the data.

So, in sequence:

(1) Parent process A passes data to child process B

(2) Child process B reads data and modifies it

(3) Child process B passes data back to parent process A

(4) Parent process A reads data and continues (for example passing it further on to a process B2..).

(5) Parent process A passes another data set to child process B etc.

The problem is, whatever I do, the program almost always ends up hanging on a read/fread (or write/fwrite?) to or from a pipe.

One important thing to note is that the parent process cannot simply close the pipes after passing data on to the child process, because it works in a loop and wants to pass another set of data to the child process once it has finished processing the first set.

Here is a working set of parent/child programs (compile with g++ pipe_parent.cc -o pipe_parent, g++ pipe_child.cc -o pipe_child) illustrating the problem with unnamed pipes. I have also tried named pipes, but not as extensively. Each execution can have a slightly different outcome. If the sleep statement is omitted in the parent, or the fflush() statement is omitted in the child, the pipes will almost surely block. If the amount of data to be passed on is increased, it will always block independent of the sleep or fflush.

Parent program A:

#include <cstring>
#include <cstdio>
#include <cstdlib>

extern "C" {
  #include <unistd.h>
  #include <fcntl.h>
}

using namespace std;

/*
 * Parent-child inter-communication
 * Child is external process
 */

int main() {
  int fd[2];
  if( pipe(fd) == -1 ) {
    fprintf(stderr,"Unable to create pipe\n");
    exit(-1);
  }
  int fd_parentWrite = fd[1];
  int fd_childRead   = fd[0];
  if( pipe(fd) == -1 ) {
    fprintf(stderr,"Unable to create pipe\n");
    exit(-1);
  }
  int fd_childWrite = fd[1];
  int fd_parentRead = fd[0];

  pid_t pid = fork();
  if( pid == -1 ) {
    fprintf(stderr,"Unable to fork new process\n");
    exit(-1);
  }

  if( pid == 0 ) { // Child process
    dup2( fd_childRead,  fileno(stdin)  );  // Redirect standard input(0) to child 'read pipe'
    dup2( fd_childWrite, fileno(stdout) );  // Redirect standard output(1) to child 'write pipe'

    close(fd_parentRead);
    close(fd_parentWrite);
    close(fd_childRead);
    close(fd_childWrite);
    // execl replaces child process with an external one
    int ret = execl("/disk/sources/pipe_test/pipe_child","pipe_child",NULL);
    fprintf(stderr,"External process failed, return code: %d...\n", ret);
    exit(-1);
    // Child process is done. Will not continue from here on
  }
  else { // Parent process
    // Nothing to set up
  }

  // ...more code...

  if( pid > 0 ) { // Parent process (redundant if statement)
    int numElements = 10000;
    int totalSize = numElements * sizeof(float);
    float* buffer = new float[numElements];
    for( int i = 0; i < numElements; i++ ) {
      buffer[i] = (float)i;
    }

    for( int iter = 0; iter < 5; iter++ ) {
      fprintf(stderr,"--------- Iteration #%d -----------\n", iter);
      int sizeWrite = (int)write( fd_parentWrite, buffer, totalSize );
      if( sizeWrite == -1 ) {
        fprintf(stderr,"Parent process write error\n");
        exit(-1);
      }
      fprintf(stderr,"Parent #%d: Wrote %d elements. Total size: %d\n", iter, sizeWrite, totalSize);
      sleep(1);   // <--- CHANGE!
      int sizeRead = (int)read( fd_parentRead, buffer, totalSize );
      if( sizeRead <= 0 ) {
        fprintf(stderr,"Parent process read error\n");
        exit(-1);
      }
      while( sizeRead < totalSize ) {
        fprintf(stderr,"Parent #%d: Read %d elements, continue reading...\n", iter, sizeRead);
        // sizeRead counts bytes, so advance the buffer as a byte pointer, not as a float pointer
        int sizeNew = (int)read( fd_parentRead, (char*)buffer + sizeRead, totalSize-sizeRead );
        fprintf(stderr," ...newly read %d elements\n", sizeNew);
        if( sizeNew <= 0 ) { // Error, or end-of-file (0), which would otherwise loop forever
          exit(-1);
        }
        sizeRead += sizeNew;
      }
      fprintf(stderr,"Parent #%d: Read %d elements. Total size: %d\n", iter, sizeRead, totalSize);
      fprintf(stderr,"Examples :  %f  %f  %f\n", buffer[0], buffer[10], buffer[100]);
    }

    delete [] buffer;
  }

  close(fd_parentRead);
  close(fd_parentWrite);
  close(fd_childRead);
  close(fd_childWrite);

  return 0;
}

Child program B:

#include <cstdio>

using namespace std;

int main() {

  int numElements = 10000;
  int totalSize = numElements * sizeof(float);
  float* buffer = new float[numElements];

  int counter = 0;
  int sizeRead = 0;
  do {
    sizeRead = fread( buffer, 1, totalSize, stdin);
    fprintf(stderr,"Child  #%d: Read %d elements, buffer100: %f\n", counter, sizeRead, buffer[100]);
    if( sizeRead > 0 ) {
      for( int i = 0; i < numElements; i++ ) {
        buffer[i] += numElements;
      }
      int sizeWrite = fwrite( buffer, 1, totalSize, stdout);
      fflush(stdout);  // <--- CHANGE!

      fprintf(stderr,"Child  #%d: Wrote %d elements\n", counter, sizeWrite);
      counter += 1;
    }
  } while( sizeRead > 0 );

  delete [] buffer;
  return 0;
}

Is there any way to check when the pipe has enough data to be read? Or is there an alternative way to resolve the above problem, with or without pipes?

Please help!

Are you sure process B (the one you do not have control over) supports this mode of operation? Many programs are written with the assumption that they should read until stdin is closed, buffering output in the meantime (either explicitly or implicitly, by writing to a buffered FILE* and not fflush'ing it). That can easily lead to deadlocks such as the ones you see, as the program will not output its final chunk of data until it terminates or stdin is closed. — nos
@nos, I believe this should not be a problem in my case, unless fwrite to stdout gets buffered by the system. I can see program B's source code and all it does is use fread from stdin and fwrite to stdout. But it doesn't flush, and I cannot make such updates to the code. — GridKing
fwrite to stdout is usually buffered; on most systems you have to turn off buffering explicitly (or make sure you call fflush() at the proper times) — nos

DarkDust · Accepted Answer · 2011-07-04T15:50:24

Possibly the best solution when reading is to check with select whether you can read from the pipe. You can even pass a timeout. The alternative might be setting the O_NONBLOCK flag on file descriptor 0 (stdin) with fcntl, though I think the select way is better.
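
A minimal sketch of the select approach with a timeout, for illustration only. The helper name wait_readable and the millisecond timeout parameter are assumptions made here, not part of the question's code. Note that select only reports that at least one byte can be read, not that a whole data set has arrived, so a read loop is still needed afterwards.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (name and signature are assumptions).
// Returns 1 if fd has data to read within timeoutMs milliseconds,
// 0 if the timeout expired first, and -1 on error.
int wait_readable(int fd, int timeoutMs) {
  // select() can only watch descriptors below FD_SETSIZE
  assert(fd >= 0 && fd < FD_SETSIZE);
  for (;;) {
    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(fd, &readSet);

    // Rebuilt on every attempt because select() may modify the timeval
    struct timeval tv;
    tv.tv_sec  = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;

    int ret = select(fd + 1, &readSet, NULL, NULL, &tv);
    if (ret == -1 && errno == EINTR) {
      continue;  // Interrupted by a signal: try again
    }
    if (ret <= 0) {
      return ret;  // 0 = timeout, -1 = error
    }
    return 1;  // fd is readable (or at end-of-file)
  }
}
```

In the parent from the question, a call such as wait_readable(fd_parentRead, 5000) could stand in for the sleep(1) before each read, with a return value of 0 treated as "the child has not answered yet".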

As for ensuring a non-blocking write: that's a bit harder, as you don't know how much you can write before the pipe blocks. One way (which I find very ugly) would be to write only 1-byte chunks and check with select before each one whether you can write. But that would be a performance killer, so use it only if communication performance is not an issue.
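
As a possible alternative to 1-byte writes (a technique swapped in here, not the one described in the answer), the write end can be put into O_NONBLOCK mode so that write returns a short count instead of stalling, while poll interleaves writing and reading. The sketch below assumes the parent knows how many bytes to expect back, as in the question where the reply has the same size as the request. The helper name exchange and its parameters are hypothetical.

```cpp
#include <poll.h>
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical helper (name and signature are assumptions).
// Sends outLen bytes to fdWrite while collecting inLen bytes from fdRead.
// Returns true once both transfers are complete, false on error or early EOF.
bool exchange(int fdWrite, int fdRead,
              const char* out, size_t outLen, char* in, size_t inLen) {
  assert(fdWrite >= 0 && fdRead >= 0);

  // Non-blocking writes: a full pipe yields a short count or EAGAIN
  int flags = fcntl(fdWrite, F_GETFL, 0);
  if (flags == -1 || fcntl(fdWrite, F_SETFL, flags | O_NONBLOCK) == -1) {
    return false;
  }

  size_t sent = 0;
  size_t received = 0;
  while (sent < outLen || received < inLen) {
    struct pollfd fds[2];
    fds[0].fd = (sent < outLen) ? fdWrite : -1;  // poll ignores negative fds
    fds[0].events = POLLOUT;
    fds[0].revents = 0;
    fds[1].fd = (received < inLen) ? fdRead : -1;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    if (poll(fds, 2, -1) == -1) {
      if (errno == EINTR) continue;
      return false;
    }
    if ((fds[0].revents | fds[1].revents) & (POLLERR | POLLNVAL)) {
      return false;  // e.g. the other side closed its read end
    }
    if (fds[0].revents & POLLOUT) {
      ssize_t n = write(fdWrite, out + sent, outLen - sent);
      if (n > 0) sent += n;
      else if (n == -1 && errno != EAGAIN && errno != EINTR) return false;
    }
    if (fds[1].revents & (POLLIN | POLLHUP)) {
      ssize_t n = read(fdRead, in + received, inLen - received);
      if (n > 0) received += n;
      else if (n == 0) return false;  // EOF before inLen bytes arrived
      else if (errno != EAGAIN && errno != EINTR) return false;
    }
  }
  return true;
}
```

With the question's variable names, one iteration might look like exchange(fd_parentWrite, fd_parentRead, (const char*)buffer, totalSize, (char*)reply, totalSize), where reply is a second buffer of the same size. This removes the deadlock in which both processes block on a full pipe, but it cannot help if the child holds its output in a stdio buffer that it never flushes; in that case the call simply waits in poll.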
