New to OpenMP: any suggestions for parallelizing the following code?

Question

I want to speed up this code with OpenMP. Hotspot analysis shows that the two loops containing "sum -= a[i][k]*a[k][j]" take a large portion of the run time, so I tried adding #pragma omp for to both of them. The results are now wrong, apparently because of race conditions. Any suggestions?

void ludcmp(float **a, int n, int *indx, float *d)
{
    int i,imax,j,k;
    float big,dum,sum,temp;
    float *vv;
    vv=vector(1,n);
    *d=1.0;

    for (j=1;j<=n;j++) {

        for (i=1;i<j;i++) {
            sum=a[i][j];
            for (k=1;k<i;k++) sum -= a[i][k]*a[k][j];     //here
            a[i][j]=sum;
        }
        big=0.0;
        for (i=j;i<=n;i++) {
            sum=a[i][j];
            for (k=1;k<j;k++)
                sum -= a[i][k]*a[k][j];                   //here
            a[i][j]=sum;
            if ( (dum=vv[i]*fabs(sum)) >= big) {
                big=dum;
                imax=i;
            }
        }
    }

1201ProgramAlarm · Accepted Answer · 2019-11-10T04:38:22

Your variables are all declared at the top of the function, so every thread shares them. Threads overwriting each other's sum and k is what allows the race conditions, and it also leaves little or no benefit from the threading.

You should declare variables as close as possible to where you use them. In particular, sum and k are used in the innermost loops, and should be declared right there (so that every thread will have its own copy of those variables). This can be extended to i and dum as well. Also, that last if (looking for the largest value) can/should be placed in a separate loop and either run single threaded, or with proper OpenMP directives for handling big and imax.
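As a rough illustration of the advice above, here is a minimal sketch of one column step with the variables declared where they are used. The helper name crout_column and its signature are hypothetical (they are not part of the question's code), and it assumes the same 1-based indexing and an already filled vv array. One extra assumption worth stating: the first i loop is left serial, because a[i][j] reads a[k][j] for k < i, which earlier iterations of that same loop have just written. Only the second i loop has independent iterations.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper (not from the question): handles column j of the
   decomposition with 1-based indexing and returns the pivot row index. */
int crout_column(float **a, const float *vv, int n, int j)
{
    assert(n >= 1 && j >= 1 && j <= n);

    /* Rows above the diagonal. a[i][j] depends on a[k][j] for k < i,
       written by earlier iterations of this loop, so it stays serial. */
    for (int i = 1; i < j; i++) {
        float sum = a[i][j];
        for (int k = 1; k < i; k++)
            sum -= a[i][k] * a[k][j];
        a[i][j] = sum;
    }

    /* Rows on and below the diagonal. Each iteration writes only a[i][j]
       and reads rows k < j plus its own row, so iterations are independent.
       i, sum and k are declared inside the loop, so each thread gets its
       own copy. */
    #pragma omp parallel for
    for (int i = j; i <= n; i++) {
        float sum = a[i][j];
        for (int k = 1; k < j; k++)
            sum -= a[i][k] * a[k][j];
        a[i][j] = sum;
    }

    /* Pivot search moved into its own single-threaded loop, so big and
       imax are never touched by more than one thread. */
    float big = 0.0f;
    int imax = j;
    for (int i = j; i <= n; i++) {
        float dum = vv[i] * fabsf(a[i][j]);
        if (dum >= big) {
            big = dum;
            imax = i;
        }
    }
    return imax;
}
```

With GCC or Clang this needs -fopenmp to actually run in parallel; without the flag the pragma is ignored and the code runs serially with the same result. If you would rather keep the declarations at the top of the function, a private(sum, k) clause on the pragma should have the same effect for those variables.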
