I'm learning OpenMP and C and having some issues with simple programs.
I have set the following environment variables in my bashrc:
#define how many threads you want
export OMP_NUM_THREADS=4
#allow to switch number of threads
export OMP_DYNAMIC=true
#allow nested parallel regions
export OMP_NESTED=true
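To confirm that these variables actually reach the program, a small check like the one below could print what the OpenMP runtime sees at startup. This is only a sketch: `report_omp_settings` is a hypothetical helper that is not part of my program, and the `_OPENMP` guard is there so the file still builds if `-fopenmp` is missing.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: print the OpenMP settings the runtime picked up.
   Returns the max-threads value, or 1 when built without -fopenmp. */
int report_omp_settings(void)
{
#ifdef _OPENMP
    int maxt = omp_get_max_threads();   /* reflects OMP_NUM_THREADS */
    printf("dynamic: %d, max threads: %d, procs: %d\n",
           omp_get_dynamic(),           /* nonzero if OMP_DYNAMIC=true */
           maxt,
           omp_get_num_procs());        /* processors available */
    return maxt;
#else
    printf("built without OpenMP support\n");
    return 1;
#endif
}
```

Calling `report_omp_settings()` at the top of `main` shows whether the values from the bashrc are the ones in effect for that run.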
Here is the program I'm trying to run:
#include <stdio.h> /* input, output */
#include <omp.h> /* openMP library */
#include <time.h> /* measure time */
#define N 100000000 // if sourcearray not static, I'll be overflowing the stack.
// > ~10^6 elements is a lot for most systems.
void forloop(void);
int
main(void)
{
    /* worksharing: for loop */
    forloop();
    return(0);
}
/*=============================================================*/
/*=============================================================*/
void forloop(void){
    /* do a for loop sequentially and in parallel; measure each time */
    printf("=====================\n");
    printf("FOR LOOP\n");
    printf("=====================\n\n");
    long i;
    clock_t start, end;
    double cpu_time_used;
    static double sourcearray[N];
    /*============*/
    /*measure time*/
    /*============*/
    start=clock();
    for (i=0; i<N; i++){
        sourcearray[i] = ((double) (i)) * ((double) (i))/2.2034872;
    }
    end = clock();
    cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
    printf("Non-parallel needed %lf s\n", cpu_time_used);
    /*===============*/
    /*parallel region*/
    /*===============*/
    #pragma omp parallel
    /*need to specify num_threads, when OMP_DYNAMIC=true to make sure 4 are used.*/
    {
        omp_set_num_threads(4);
        double starttime_omp, endtime_omp;
        /*time measurement*/
        starttime_omp=omp_get_wtime();
        int procs, maxt, nt, id;
        procs = omp_get_num_procs(); // number of processors available to the program
        maxt = omp_get_max_threads(); // upper bound on threads for a new parallel region
        nt = omp_get_num_threads(); // threads in the current team
        id = omp_get_thread_num(); // this thread's id within the team
        printf("num threads forloop %d from id %d, procs: %d, maxthrds: %d\n", nt, id, procs, maxt);
        #pragma omp for
        for (i=0; i<N; i++){
            sourcearray[i] = ((double) (i)) * ((double) (i))/2.2034872;
        }
        endtime_omp = omp_get_wtime();
        cpu_time_used = ((endtime_omp - starttime_omp)) ;
    } /* end parallel region */
}
I compile the code with gcc -g -Wall -fopenmp -o omp_worksharing.exe omp_worksharing.c
The program compiles with a warning that I don't quite understand:
omp_worksharing.c: In function ‘forloop’:
omp_worksharing.c:78:17: warning: variable ‘sourcearray’ set but not used [-Wunused-but-set-variable]
static double sourcearray[N];
but that is not the main issue:
The issue is that the program doesn't start 4 threads. This is the output:
=====================
FOR LOOP
=====================
Non-parallel needed 0.900340 s
num threads forloop 3 from id 0, procs: 8, maxthrds: 4
num threads forloop 3 from id 1, procs: 8, maxthrds: 4
num threads forloop 3 from id 2, procs: 8, maxthrds: 4
The same happens when I use #pragma omp num_threads(4) instead of omp_set_num_threads(4);
Even weirder, when I leave out both #pragma omp num_threads(4) and omp_set_num_threads(4); 3 threads are started most of the time, but sometimes 4. I couldn't find any pattern in when or why, but from what I've read, OMP_DYNAMIC=true allows OpenMP to choose the number of threads itself.
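One way to test whether the dynamic setting is what lowers the count would be to switch it off in code and request the team size on the parallel directive itself. The following is a minimal sketch under that assumption, not a confirmed fix: `team_size_without_dynamic` is a hypothetical helper, and both settings are made before the region starts, since a team's size is fixed once the region is running.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: request `requested` threads with dynamic
   adjustment switched off and return the team size actually obtained.
   Returns 1 when built without -fopenmp (the pragmas are then ignored). */
int team_size_without_dynamic(int requested)
{
    int team = 0;
#ifdef _OPENMP
    omp_set_dynamic(0);   /* called before the region it should affect */
#endif
    #pragma omp parallel num_threads(requested)
    {
        #pragma omp atomic
        team++;           /* each thread in the team counts itself once */
    }
    return team;
}
```

Printing the result of `team_size_without_dynamic(4)` before the parallel region in `forloop` would show whether 4 threads are granted once dynamic adjustment is off.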
How come I can't specify the number of threads to be used?