I would normally agree with Shawn Chin about using existing libraries for the file reading; in this case I might disagree, because the file format is so simple and because it's so important for MPI to know how the data is laid out in memory. A 2d n×m array allocated as a single contiguous 1-d block of n*m values is very different from rows scattered all over memory! As always, this is C's fault for not having real multi-d arrays. On the other hand, you could check out the libnetpbm library and see how it allocates the image, or, as Shawn suggests, copy the whole thing into contiguous memory after reading it in.
Note too that this would actually be easier with the (binary) P5 format, as one could use MPI-IO to read in the data in parallel right at the beginning, rather than having one processor doing all the reading and using scatter/gather to do the data distribution. With ascii files, you never really know how long a record is going to be, which makes coordinated I/O very difficult.
Also note that this really isn't a 2d problem - you are just doing an elementwise operation on every piece of the array. So you can greatly simplify things by just treating the data as a 1d array and ignoring the geometry. This wouldn't be the case if you were (say) applying a 2d filter to the image, as there the geometry matters and you'd have to partition data accordingly; but here we don't care.
Finally, even in this simple case you have to use scatterv and gatherv, because the number of cells in the image might not divide evenly among the MPI tasks. You could simplify the logic by padding the array so that it does divide evenly, and avoid some of the extra steps here.
So if you have a read_pgm() and write_pgm() that you know return pointers into a single contiguous block of memory, you can do something like this:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/* assumed to allocate/expect the image as one contiguous block,
   with greys[i] pointing into it */
int read_pgm(const char *filename, int ***greys, int *rows, int *cols, int *maxval);
int write_pgm(const char *filename, int **greys, int rows, int cols, int maxval);

int main(int argc, char **argv) {
    int ierr;
    int rank, size;
    int **greys = NULL;
    int rows, cols, maxval;
    int ncells;
    int mystart, myend, myncells;
    const int IONODE = 0;
    int *disps = NULL, *counts = NULL, *mydata = NULL;
    int *data = NULL;      /* NULL on the non-IO ranks, which never use it */

    ierr = MPI_Init(&argc, &argv);
    if (argc != 3) {
        fprintf(stderr, "Usage: %s infile outfile\n", argv[0]);
        fprintf(stderr, "       outputs the negative of the input file.\n");
        MPI_Finalize();
        return -1;
    }

    ierr  = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    ierr |= MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (ierr) {
        fprintf(stderr, "Catastrophic MPI problem; exiting\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    if (rank == IONODE) {
        if (read_pgm(argv[1], &greys, &rows, &cols, &maxval)) {
            fprintf(stderr, "Could not read file; exiting\n");
            MPI_Abort(MPI_COMM_WORLD, 2);
        }
        ncells = rows*cols;
        disps  = (int *)malloc(size * sizeof(int));
        counts = (int *)malloc(size * sizeof(int));
        data = &(greys[0][0]);   /* we know all the data is contiguous */
    }

    /* everyone calculates their number of cells */
    ierr = MPI_Bcast(&ncells, 1, MPI_INT, IONODE, MPI_COMM_WORLD);
    myncells = ncells/size;
    mystart  = rank*myncells;
    myend    = mystart + myncells - 1;
    if (rank == size-1) myend = ncells-1;   /* last rank picks up the remainder */
    myncells = (myend - mystart) + 1;
    mydata = (int *)malloc(myncells * sizeof(int));

    /* assemble the list of counts; they might not all be equal
       if the cells don't divide evenly among the tasks */
    ierr = MPI_Gather(&myncells, 1, MPI_INT, counts, 1, MPI_INT,
                      IONODE, MPI_COMM_WORLD);
    if (rank == IONODE) {
        disps[0] = 0;
        for (int i=1; i<size; i++)
            disps[i] = disps[i-1] + counts[i-1];
    }

    /* scatter the data */
    ierr = MPI_Scatterv(data, counts, disps, MPI_INT, mydata, myncells,
                        MPI_INT, IONODE, MPI_COMM_WORLD);

    /* everyone has to know maxval */
    ierr = MPI_Bcast(&maxval, 1, MPI_INT, IONODE, MPI_COMM_WORLD);

    /* negate our local chunk */
    for (int i=0; i<myncells; i++)
        mydata[i] = maxval - mydata[i];

    /* gather the data back */
    ierr = MPI_Gatherv(mydata, myncells, MPI_INT, data, counts, disps,
                       MPI_INT, IONODE, MPI_COMM_WORLD);

    if (rank == IONODE)
        write_pgm(argv[2], greys, rows, cols, maxval);

    free(mydata);
    if (rank == IONODE) {
        free(counts);
        free(disps);
        free(&(greys[0][0]));
        free(greys);
    }

    MPI_Finalize();
    return 0;
}