You were really quite close. I believe I was able to correct the program and the corrected text is below.
Note: I started writing this explanation before I actually modified the program, so replace "you must do" with "I did"
I was able to download and build your program. I was able to work around the missing Function.h
, based on what I already had by changing Function::value
into just Function_value
[a global function].
The one thing you do have to do is have rank 0 get the random seed value. It has to bcast this first, so the client ranks can get it so their calls to shuffle
will produce the same results. So, I created a global ranseed
, set from a bcast in main
, that shuffle
can see it.
I changed the test*
routines to always compute x0
, y0
, a
, and c
, regardless of rank. Otherwise, the compiler was complaining about possibly uninitialized values for those pointers. The computation is small. No need to broadcast these values. If this precalculation were intensive [which it isn't], then bcast-ing the arrays would be the way to go.
I think you already realized this in the MPI version of calcError
because you were commenting out the bcasts for these arrays.
AFAICT, your MPI calcError
looks good except for two things:
(1) The MPI_Reduce
call should not be inside the if (myid_c >= 0)
block. I think it needs to be at the bottom after this block (i.e. even rank 0 needs to call it).
(2) In the old place for the MPI_Reduce
[inside the if
], I think you need an error_p = error;
after the double level for loops because error_p
is the value that the clients send [and error
is what the root receives]
I added a -R
command line option to allow manual setting the of the random seed value. I also added a -M
option to switch between the MPI and non-MPI versions of calcError
I've run the program using the mpi mode: mpirun -np 8 <program> -M
and all tests seemed to work. I did ctrl-c
on testL
after a while, so no guarantees on that.
Here's a single [archive] file, similar to your paste, the file separater line is % <filename>
It has all my outlined corrections [please pardon the gratuitous style cleanup].
Warning: Be careful when extracting Function.h
as I had to create a skeleton version, so if your version has a class definition, it will get lost.
% Error.h
#ifndef _mpisize_Error_h_
#define _mpisize_Error_h_
class Error {
private:
double *v;
double *x0;
double *y0;
double *a;
double *c;
int length;
double length_2;
double mult;
int size;
double error;
public:
void setValues(double *v, int length, double mult);
void setCoefficients(double *x0,double *y0, double *a,double *c, int size);
void calcError(int mpiflg);
void calcErrorStd();
void calcErrorMpi();
double getError();
};
#endif
% Function.h
#ifndef _mpisize_Function_h_
#define _mpisize_Function_h_
#include <math.h>
double
Function_value(double x, double y, double x0, double y0, double a, double c);
#endif
% mpisize.h
#ifndef _mpisize_mpisize_h_
#define _mpisize_mpisize_h_
#include <math.h>
#include "Error.h"
#include "Function.h"
#ifdef _MPISIZE_GLO_
#define EXTRN_MPISIZE
#else
#define EXTRN_MPISIZE extern
#endif
EXTRN_MPISIZE int opt_debug;
EXTRN_MPISIZE int opt_send;
EXTRN_MPISIZE int glob_rank;
EXTRN_MPISIZE double tvzero;
double
tvgetf(void);
int
dbgif(int lvl);
void
_dbgprt(const char *fmt,...);
#define dbgprt(_lvl,_fmt...) \
do { \
if (dbgif(_lvl)) \
_dbgprt(_fmt); \
} while (0)
#endif
% ErrMpi.cpp
#include <iostream>
#include <math.h>
#include <mpi.h>
#include "mpisize.h"
void
Error::calcErrorMpi()
{
int mystart, myend;
int myid, myid_c;
int numproc, numproc_c;
myid = glob_rank;
MPI_Comm_size(MPI_COMM_WORLD, &numproc);
myid_c = myid - 1;
numproc_c = numproc - 1;
#if 0
MPI_Bcast(&length, 1, MPI_INT, 0, MPI_COMM_WORLD);
#endif
MPI_Bcast(&error, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(&size, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&mult, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
if (opt_send) {
MPI_Bcast(x0, size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(y0, size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(a, size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(c, size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
}
double error_p = 0;
error = 0;
if (myid_c >= 0) {
mystart = (length / numproc_c) * myid_c;
if (length % numproc_c > myid_c) {
mystart += myid_c;
myend = mystart + (length / numproc_c) + 1;
}
else {
mystart += length % numproc_c;
myend = mystart + (length / numproc_c);
}
dbgprt(2,"calcErrorMpi: STARTUP myid_c=%d numproc_c=%d length=%d length_2=%g mystart=%d myend=%d\n",
myid_c,numproc_c,length,length_2,mystart,myend);
double dv;
double vtmp;
for (int i = mystart; i < myend; i++) {
for (int j = 0; j < length; j++) {
vtmp = 0.0;
for (int k = 0; k < size; k++)
vtmp += Function_value((i - length_2) * mult,
(j - length_2) * mult,
x0[k], y0[k], a[k], c[k]);
dv = v[i * length + j] - vtmp;
error += dv * dv;
}
}
error_p = error;
}
dbgprt(2,"calcErrorMpi: RED/BEF error_p=%g error=%g\n",error_p,error);
MPI_Reduce(&error_p, &error, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
dbgprt(2,"calcErrorMpi: RED/AFT error_p=%g error=%g\n",error_p,error);
}
% Error.cpp
#include <iostream>
#include <math.h>
#include "Error.h"
#include "Function.h"
using namespace std;
void
Error::setValues(double *v, int length, double mult)
{
this->v = v;
this->length = length;
this->mult = mult;
length_2 = length * 0.5;
cout << "test" << endl;
}
void
Error::setCoefficients(double *x0, double *y0, double *a, double *c, int size)
{
this->x0 = x0;
this->y0 = y0;
this->a = a;
this->c = c;
this->size = size;
}
void
Error::calcErrorStd()
{
double dv;
double vtmp;
error = 0;
for (int i = 0; i < length; i++) {
for (int j = 0; j < length; j++) {
vtmp = 0.0;
for (int k = 0; k < size; k++)
vtmp += Function_value((i - length_2) * mult,
(j - length_2) * mult,
x0[k], y0[k], a[k], c[k]);
dv = v[i * length + j] - vtmp;
error += dv * dv;
}
}
}
void
Error::calcError(int mpiflg)
{
if (mpiflg)
calcErrorMpi();
else
calcErrorStd();
}
double
Error::getError()
{
return sqrt(error);
}
% Function.cpp
#include "Function.h"
double
Function_value(double x, double y, double x0, double y0, double a, double c )
{
return a * exp(-((x - x0) * (x - x0) + (y - y0) * (y - y0)) * c);
}
% mpisize.cpp
#define _MPISIZE_GLO_
#include "mpisize.h"
#include "Error.h"
#include "Function.h"
#include <iostream>
#include <mpi.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <sys/time.h>
#include <math.h>
using namespace std;
const int LEN = 1250;
const int LEN_2 = LEN / 2;
const int SIZE = LEN * LEN;
const int SHUFFLE_TIMES = 2 * SIZE;
const double MULT = 0.1;
const int COMPLEXITY = 8;
int ranseed;
int opt_mpi;
double
generate(double x, double y, double *x0, double *y0, double *a, double *c, int indexS, int indexE)
{
double v = 0.0;
for (int i = indexS; i < indexE; i++) {
v += Function_value(x, y, x0[i], y0[i], a[i], c[i]);
}
return v;
}
void
generate(double *v, double *x0, double *y0, double *a, double *c)
{
for (int i = 0; i < LEN; i++)
for (int j = 0; j < LEN; j++)
v[i * LEN + j] = generate((i - LEN_2) * MULT, (j - LEN_2) * MULT,
x0, y0, a, c, 0, COMPLEXITY);
}
void
shuffle(double *v)
{
int i;
int j;
double vtmp;
srandom(ranseed);
for (int k = 0; k < SHUFFLE_TIMES; k++) {
i = random() % SIZE;
j = random() % SIZE;
vtmp = v[i];
v[i] = v[j];
v[i] = vtmp;
}
}
void
test(const char *testname,Error *err, double errorExpected)
{
err->calcError(opt_mpi);
double error;
if (glob_rank == 0) {
error = err->getError();
cout << endl << "Test " << testname << " blad " << error << endl;
if (fabs(error - errorExpected) > (0.001 + 0.001 * errorExpected)) {
cerr << "Blad powinno byc " << errorExpected << endl;
MPI_Finalize();
exit(0);
}
else {
cout << " - - - - - - OK" << endl;
}
}
}
void
test1(Error *err)
{
double x0[] = { -3, -2, -2, -1, 1, 2, 2, 3 };
double y0[] = { -3, -3, 3, -1, 1, -3, 3, 3 };
double a[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
double c[] = { 0.1, 0.05, 0.02, 0.01, 0.01, 0.01, 0.02, 0.05 };
err->setCoefficients(x0, y0, a, c, 8);
test("test1", err, 0);
}
void
test2(Error *err)
{
double x0[] = { -3, -1, -2, -1, 1, 2, 2, 3, 1, 2, 3, 4, 5, 6, 7 };
double y0[] = { -3, -3, 3, -1, 1, -3, 1, 3, 1, 2, 3, 4, 1, 2, 3 };
double a[] = { 1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 1, 2, 3 };
double c[] = { 0.1, 0.05, 0.02, 0.01, 0.02, 0.01, 0.02, 0.05, 1, 1, 1, 1, 1, 2, 3 };
err->setCoefficients(x0, y0, a, c, 15);
test("test2", err, 357.729);
}
void
test3(Error *err)
{
double x0[] = { -3, -1, -2, -1, 1, 2, 2, 3, -3, -1, -2, -1, 1, 2, 2, 3, 1, 2, 3, 4, 5, 6, 7, -3, -1, -2, -1, 1, 2, 2, 3, 1, 2, 3, 4, 5, 6, 7 };
double y0[] = { -3, -3, 3, -1, 1, -3, 1, 3, -3, -1, -2, -1, 1, 2, 2, 3, 1, 2, 3, 4, 5, 6, 7, -3, -1, -2, -1, 1, 2, 2, 3, 1, 2, 3, 4, 5, 6, 7 };
double *a = new double[38];
double *c = new double[38];
for (int i = 0; i < 38; i++) {
a[i] = 1 + i / 38.0;
c[i] = 2 + i / 38.0;
}
err->setCoefficients(x0, y0, a, c, 38);
test("test3", err, 2975.86);
}
void
test4(Error *err)
{
double *x0 = new double[150];
double *y0 = new double[150];
double *a = new double[150];
double *c = new double[150];
for (int i = 0; i < 150; i++) {
x0[i] = 5 - i * 0.2;
a[i] = 1 + i / 8.0;
y0[i] = 2 - i * 0.22;
c[i] = 2 + i / 38.0;
}
err->setCoefficients(x0, y0, a, c, 150);
test("test4", err, 3303.04);
}
void
testL(Error *err)
{
double *x0;
double *y0;
double *a;
double *c;
if (glob_rank == 0)
cout << "Test pozwalajacy na oszacowanie przyspieszenia" << endl;
x0 = new double[111];
y0 = new double[111];
a = new double[111];
c = new double[111];
for (int i = 0; i < 111; i++) {
x0[i] = 5 - i * 0.2;
a[i] = 1 + i / 1.1;
y0[i] = 2 - i * 0.4;
c[i] = 2 + i / 38.0;
}
double toterror = 0;
double error;
for (int i = 0; i < 20; i++) {
a[i] = i;
err->setCoefficients(x0, y0, a, c, 111 - i * 3);
err->calcError(opt_mpi);
error = err->getError();
dbgprt(2,"testL: POST error=%g\n",error);
if (glob_rank == 0) {
toterror += error;
}
}
if (glob_rank == 0)
cout << "Uwaga: ta wartosc nie moze zalezec od liczby uzytych procesow = " << toterror << endl;
}
int
main(int ac, char **av)
{
char *cp;
MPI_Init(&ac, &av);
MPI_Comm_rank(MPI_COMM_WORLD, &glob_rank);
#if 1
ranseed = 123767832;
#else
ranseed = 0;
#endif
--ac;
++av;
for (; ac > 0; --ac, ++av) {
if (glob_rank != 0)
break;
cp = *av;
if (*cp != '-')
break;
switch (cp[1]) {
case 'd':
opt_debug = 1;
break;
case 'M':
opt_mpi = 1;
break;
case 'R':
ranseed = atoi(cp + 2);
break;
case 'S':
opt_send = 1;
break;
}
}
tvzero = tvgetf();
if (ranseed == 0)
ranseed = time(NULL);
if (glob_rank == 0) {
cout << "Random: " << ranseed << std::endl;
cout << "Mpi: " << opt_mpi << std::endl;
cout << "Send: " << opt_send << std::endl;
cout << "Debug: " << opt_debug << std::endl;
}
MPI_Bcast(&tvzero, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast(&ranseed, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&opt_mpi, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&opt_send, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&opt_debug, 1, MPI_INT, 0, MPI_COMM_WORLD);
Error *err = new Error();
double *v;
v = new double[SIZE];
double x0[] = { -3, -2, -2, -1, 1, 2, 2, 3 };
double y0[] = { -3, -3, 3, -1, 1, -3, 3, 3 };
double a[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
double c[] = { 0.1, 0.05, 0.02, 0.01, 0.01, 0.01, 0.02, 0.05 };
generate(v, x0, y0, a, c);
shuffle(v);
err->setValues(v, LEN, MULT);
test1(err);
test2(err);
test3(err);
test4(err);
test1(err);
testL(err);
MPI_Finalize();
return 0;
}
double
tvgetf(void)
{
struct timespec ts;
double sec;
clock_gettime(CLOCK_REALTIME,&ts);
sec = ts.tv_nsec;
sec /= 1e9;
sec += ts.tv_sec;
return sec;
}
int
dbgif(int lvl)
{
int goflg = 0;
do {
if (! opt_debug)
break;
if (glob_rank == 0) {
goflg = 1;
break;
}
if (lvl > opt_debug)
goflg = 1;
} while (0);
return goflg;
}
void
_dbgprt(const char *fmt,...)
{
va_list ap;
double tvnow;
tvnow = tvgetf();
tvnow -= tvzero;
printf("%.9f/%d ",tvnow,glob_rank);
va_start(ap,fmt);
vprintf(fmt,ap);
va_end(ap);
}
UPDATE:
I've updated my full code example above to add tracing. It seems to work in all cases.
However, to answer your followup questions ...
what if I cannot modify the main class?
I presume you mean "main file" which includes the test*
functions? No. You must modify the test*
functions because they are the source of the bugs you're having.
If I had to only modify Error, would it work if I moved out the MPI_reduce out of those loops, and added error = error_p
at the end of the if statement as you suggested?
Mostly, but you want error_p = error
instead. Reread my comment about this in my original post and reread the documentation for MPI_Reduce
. First arg is what client sends to root and second arg is how root gets the value.
Interesting enough I'm also not able to broadcast length_2.
That's because you were using MPI_INT
instead of MPI_DOUBLE
Then I guess I'd have to add all those Bcasts, but even if I broadcast the size value I dont know how to set the second parameter of the Bcast function for the rest of the arrays. Everything I try ends with segmentation fault.
You do not need to broadcast the x0
et. al. arrays if you always set them in test*, regardless of rank. Remember that I mentioned that the calculation was small enough to be able to be repeated in all ranks
Not setting them in all ranks is the bug.
As expected, no matter what I try, I'm not able to broadcast any array to the rest of the processes.
The broadcast part is working fine (i.e. rank 0 is sending the data correctly). But, the non-root ranks have no place to store it and you're getting segfaults.
Here is your test4
:
void
test4(Error * err, int rank)
{
if (rank == 0) {
double *x0 = new double[150];
double *y0 = new double[150];
double *a = new double[150];
double *c = new double[150];
for (int i = 0; i < 150; i++) {
x0[i] = 5 - i * 0.2;
a[i] = 1 + i / 8.0;
y0[i] = 2 - i * 0.22;
c[i] = 2 + i / 38.0;
}
err->setCoefficients(x0, y0, a, c, 150);
}
test(err, 3303.04, rank);
}
Here is the modified version:
void
test4(Error * err, int rank)
{
double *x0 = new double[150];
double *y0 = new double[150];
double *a = new double[150];
double *c = new double[150];
for (int i = 0; i < 150; i++) {
x0[i] = 5 - i * 0.2;
a[i] = 1 + i / 8.0;
y0[i] = 2 - i * 0.22;
c[i] = 2 + i / 38.0;
}
err->setCoefficients(x0, y0, a, c, 150);
test(err, 3303.04, rank);
}
In my version, setCoefficients
is always being called. So, the non-root rank Error
class will always have valid [non-null] pointers for x0
, etc.
In your version, setCoefficients
is only called for the root rank. So, the non-root ranks will always have null pointers for x0
, etc. Thus, you'll get a segfault.
Note that if you had compiled with -Wall
, some of the test*
functions would have been flagged with warnings about possibly uninitialized values. Fixing those would probably have put you on to the problem earlier.
At a minimum, even if you wish to bcast x0
et. al., you must still ensure the child ranks have valid/sufficient space allocated.
To achieve this, here's a modified test4
:
void
test4(Error * err, int rank)
{
if (rank == 0) {
double *x0 = new double[150];
double *y0 = new double[150];
double *a = new double[150];
double *c = new double[150];
for (int i = 0; i < 150; i++) {
x0[i] = 5 - i * 0.2;
a[i] = 1 + i / 8.0;
y0[i] = 2 - i * 0.22;
c[i] = 2 + i / 38.0;
}
err->setCoefficients(x0, y0, a, c, 150);
}
else
err->setBuffers(150);
test(err, 3303.04, rank);
}
And, you'll need something like:
void
Error::setBuffers(int size)
{
this->x0 = realloc(this->x0,sizeof(double) * size);
this->y0 = realloc(this->y0,sizeof(double) * size);
this->a = realloc(this->a,sizeof(double) * size);
this->c = realloc(this->c,sizeof(double) * size);
this->size = size;
}
So, after doing all that, you end up with a more complicated solution and you still have to modify test*
.
Of course, in the MPI version of calcError
, you'd want to do:
MPI_Bcast(x0, size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Bcast
and the "client side" code that receives them. And enough code regarding the generation and usage of the size. To prevent "going round-n-round" on posting more here [based on feedback], add a link to [e.g. pastebin] for full code. When you get enough feedback to solve the problem, you can edit the fragments to create a whole question here [and optionally remove the external link]. You can post all here. Limit is 30,000 chars – Craig EsteyFunction.h
is missing. I'll try to analyze things without it, but having it would help. – Craig Estey