I'm new to OpenMP. Any suggestions on how to parallelize the following code with OpenMP?
I want to speed up the code, so I tried adding #pragma omp for to the two loops marked "//here" below (the "sum -= a[i][k]*a[k][j]" statements), since hotspot analysis shows these two loops take a large share of the run time. However, the results come out wrong, apparently because of a race condition. Any suggestions?
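One possible direction, as a rough sketch and not a verified fix: in the listing below, `sum`, `k` and `dum` are declared at function scope, so they are shared between threads unless made private, which would explain a race. The first `//here` loop also has a true dependence across `i` (each `a[i][j]` reads `a[k][j]` for `k < i`, written earlier in the same loop), so it is probably not safe to split over `i`. The second loop looks independent across `i`, apart from the `big`/`imax` search. The helper name `update_column_find_pivot` and its signature are made up for illustration; it keeps the 1-based indexing of the listing and assumes `vv` has already been filled in. Note also that `#pragma omp for` only shares work when it sits inside a `parallel` region.

```c
#include <math.h>

/* Hypothetical helper (not part of the original ludcmp): runs the second
   "//here" loop for one column j and returns the pivot row imax.
   Arrays are 1-based, as in the listing. */
int update_column_find_pivot(float **a, const float *vv, int n, int j)
{
    float big = 0.0f;
    int imax = j;

    #pragma omp parallel
    {
        float local_big = 0.0f;   /* best pivot candidate seen by this thread */
        int local_imax = j;

        #pragma omp for schedule(static) nowait
        for (int i = j; i <= n; i++) {
            float sum = a[i][j];            /* declared in the loop, so private */
            for (int k = 1; k < j; k++)
                sum -= a[i][k] * a[k][j];   /* reads row i and rows k < j only */
            a[i][j] = sum;                  /* each row i is written by one thread */

            float dum = vv[i] * fabsf(sum);
            if (dum >= local_big) {
                local_big = dum;
                local_imax = i;
            }
        }

        /* Merge the per-thread candidates. On a tie the larger row index
           wins, which matches the ">=" test in the serial loop. */
        #pragma omp critical
        {
            if (local_big > big || (local_big == big && local_imax > imax)) {
                big = local_big;
                imax = local_imax;
            }
        }
    }
    return imax;
}
```

With GCC this needs `-fopenmp`; without it the pragmas are ignored and the function runs serially. Putting `reduction(+:sum)` on the inner `k` loops would also remove the race on `sum`, but those loops are short, so the threading overhead may outweigh the gain, and the changed summation order can alter the rounding slightly.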
void ludcmp(float **a, int n, int *indx, float *d)
{
    int i,imax,j,k;
    float big,dum,sum,temp;
    float *vv;
    vv=vector(1,n);
    *d=1.0;
    for (j=1;j<=n;j++) {
        for (i=1;i<j;i++) {
            sum=a[i][j];
            for (k=1;k<i;k++) sum -= a[i][k]*a[k][j]; //here
            a[i][j]=sum;
        }
        big=0.0;
        for (i=j;i<=n;i++) {
            sum=a[i][j];
            for (k=1;k<j;k++)
                sum -= a[i][k]*a[k][j]; //here
            a[i][j]=sum;
            if ( (dum=vv[i]*fabs(sum)) >= big) {
                big=dum;
                imax=i;
            }
        }
    }