1
votes

I have a function, createrWorkerPool, which spawns "n" worker threads. Each thread takes a file name as its argument to pthread_create, reads the file, modifies a shared variable under a mutex, and then waits at a barrier until all threads have finished modifying the shared variable. This happens in a loop a large number of times.

The problem I am facing: suppose there are two files, file1 and file2, and file2 is much bigger than file1. The barrier synchronization works until file1 has been fully processed. Once the thread handling file1 finishes, it no longer reaches the barrier, so the thread handling file2 is stuck at the barrier forever.

My question: is there a way to dynamically change the number of threads the barrier is waiting on when a thread exits? In the case above, if the file1 thread finishes early, it would decrement the barrier count from 2 to 1 so that the file2 thread can proceed. I looked at the man page but don't see any such function.

Example code:

pthread_mutex_t m1;
pthread_barrier_t b1;
//common function executed by each worker thread
void* doWork(void* arg) {
    const char* file = arg;
    while(1) {
        pthread_mutex_lock(&m1);
        // Modify shared variable
        pthread_mutex_unlock(&m1);
        // Wait for all threads to finish modifying shared variable
        pthread_barrier_wait(&b1);
        // Once all threads reach barrier check state of shared variable and do some task based on state
        // check for condition if true break out of loop

    }
    return 0;
}

So basically thread1, which processes file1, finishes first, and thread2 is stuck at the barrier forever.

It's not exactly clear how the program is structured. Is each file looped over the same number of times as every other file? - caf
Question edited to make the idea clearer. - john smith
This source is difficult to understand, with many details left out that are not explained in the text of your question. What is stopping both threads from reaching the pthread_barrier_wait()? The actual structure of the program is not clear. I am not sure why there is a loop in doWork(), nor why there is a pthread_barrier_wait(), as I would expect each worker thread to call doWork() when it needs to execute the task that requires the shared variable. So shouldn't doWork() do something like: check the shared variable, open the file specified, do stuff, return? - Richard Chambers
As for changing the barrier count, you would use pthread_barrier_init() to initialize the barrier, pthread_barrier_wait() to wait for all threads to arrive at the barrier, and then pthread_barrier_destroy() to destroy the barrier after use. Then just create a new one. See pthread_barrier_init(3) - Linux man page. See also docs.oracle.com/cd/E19253-01/816-5137/gfwek/index.html and for a sample see threads/pthread_barrier_demo.c. - Richard Chambers
I'm not sure about the exact purpose of the barrier here, but going by your comments, have you considered using a semaphore instead of a barrier? - Mquinteiro

1 Answer

2
votes

You can't really change the barrier count while the barrier is in use like that.
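
If a shrinking count is genuinely required, one workaround is to step outside pthread_barrier_t and build a small barrier from a mutex and a condition variable. The following is only a minimal sketch of that idea, not a POSIX facility: the names dyn_barrier, dyn_barrier_init, dyn_barrier_wait, dyn_barrier_leave, struct job, worker and run_demo are all hypothetical, and error checking is omitted.

```c
#include <pthread.h>

/* Hypothetical type (not part of POSIX): a barrier whose party count can shrink. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int parties;          /* threads still participating */
    int waiting;          /* threads currently blocked in dyn_barrier_wait */
    unsigned generation;  /* bumped every time the barrier opens */
} dyn_barrier;

static void dyn_barrier_init(dyn_barrier *b, int parties)
{
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->cond, NULL);
    b->parties = parties;
    b->waiting = 0;
    b->generation = 0;
}

/* Open the barrier for the current generation; the caller holds b->lock. */
static void dyn_barrier_release(dyn_barrier *b)
{
    b->waiting = 0;
    b->generation++;
    pthread_cond_broadcast(&b->cond);
}

static void dyn_barrier_wait(dyn_barrier *b)
{
    pthread_mutex_lock(&b->lock);
    unsigned gen = b->generation;
    if (++b->waiting >= b->parties) {
        dyn_barrier_release(b);            /* last arrival opens the barrier */
    } else {
        while (gen == b->generation)       /* loop guards against spurious wakeups */
            pthread_cond_wait(&b->cond, &b->lock);
    }
    pthread_mutex_unlock(&b->lock);
}

/* Called once by a thread that will never wait on the barrier again. */
static void dyn_barrier_leave(dyn_barrier *b)
{
    pthread_mutex_lock(&b->lock);
    b->parties--;
    /* The remaining threads may all be waiting already; if so, let them go. */
    if (b->parties > 0 && b->waiting >= b->parties)
        dyn_barrier_release(b);
    pthread_mutex_unlock(&b->lock);
}

/* Hypothetical worker: 'rounds' stands in for the length of a file. */
struct job { dyn_barrier *b; int rounds; int completed; };

static void *worker(void *arg)
{
    struct job *j = arg;
    for (int i = 0; i < j->rounds; i++) {
        dyn_barrier_wait(j->b);
        j->completed++;
    }
    dyn_barrier_leave(j->b);               /* shrink the count on exit */
    return NULL;
}

/* Runs two workers with different round counts; returns total rounds completed. */
static int run_demo(int rounds_a, int rounds_b)
{
    dyn_barrier b;
    struct job jobs[2] = { { &b, rounds_a, 0 }, { &b, rounds_b, 0 } };
    pthread_t tid[2];

    dyn_barrier_init(&b, 2);
    for (int i = 0; i < 2; i++)
        pthread_create(&tid[i], NULL, worker, &jobs[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&b.cond);
    pthread_mutex_destroy(&b.lock);
    return jobs[0].completed + jobs[1].completed;
}
```

Calling run_demo(3, 10) lets the shorter worker leave after three rounds while the longer one carries on alone instead of blocking. The rest of this answer stays with pthread_barrier_t and a fixed count.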

Presumably the problem is that the condition tested to break out of the loop does not become true for all files at the same time, i.e., each thread might execute a different number of loop iterations.

If this is the case, one solution is to have each thread that finishes early continue to loop around, but do nothing but wait on the barrier in each loop. Then arrange for the threads to all exit together - something like this:

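void* doWork(void* arg)
{
    const char* file = arg;
    int work_done = 0;   // set once this thread has finished its own file

    while(1) {
        if (work_done)
        {
            // all_threads_done is a placeholder for a shared flag that is set
            // once every thread has finished. All threads must observe the
            // same value in the same round, or one of them will block forever.
            if (all_threads_done)
                break;

            // Nothing left to do: wait only so the barrier count is still met
            pthread_barrier_wait(&b1);
            continue;
        }

        pthread_mutex_lock(&m1);
        // Modify shared variable
        pthread_mutex_unlock(&m1);
        // Wait for all threads to finish modifying shared variable
        pthread_barrier_wait(&b1);
        // Once all threads reach barrier check state of shared variable and do some task based on state
        // finish_condition is a placeholder for this thread's own exit test
        if (finish_condition)
            work_done = 1;
    }
    return 0;
}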
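
To make the outline above concrete, here is one possible way to fill in the placeholders. It is a sketch under stated assumptions, not the only way to do it: all_threads_done is replaced by a hypothetical counter threads_done, finish_condition by a per-thread round count standing in for the file length, and a second barrier wait per round is added so that every thread reads the same value of threads_done before deciding to exit. NUM_THREADS, shared_total, struct job and run_pool are likewise made up for illustration. It assumes a platform that provides POSIX barriers (for example Linux).

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

#define NUM_THREADS 2              /* assumption: fixed pool size */

static pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t b1;
static long shared_total;          /* stand-in for the shared variable */
static int threads_done;           /* workers that have run out of input */

/* Hypothetical per-thread input: rounds_of_work stands in for file length. */
struct job { int rounds_of_work; int rounds_at_barrier; };

static void *doWork(void *arg)
{
    struct job *job = arg;
    int remaining = job->rounds_of_work;
    int work_done = 0;

    for (;;) {
        if (!work_done) {
            pthread_mutex_lock(&m1);
            shared_total += 1;     /* modify shared variable */
            pthread_mutex_unlock(&m1);
        }

        /* Barrier 1: every thread arrives, whether or not it still has work. */
        pthread_barrier_wait(&b1);
        job->rounds_at_barrier++;

        /* Finish condition for this thread: its input is exhausted. */
        if (!work_done && --remaining <= 0) {
            work_done = 1;
            pthread_mutex_lock(&m1);
            threads_done++;
            pthread_mutex_unlock(&m1);
        }

        /* Barrier 2: threads_done is only written between barrier 1 and
           barrier 2, so after this point every thread reads the same value. */
        pthread_barrier_wait(&b1);

        if (threads_done == NUM_THREADS)
            break;                 /* all threads exit in the same round */
    }
    return NULL;
}

/* Runs the pool over the given jobs; returns the final shared total, or -1. */
static long run_pool(struct job *jobs)
{
    pthread_t tid[NUM_THREADS];

    shared_total = 0;
    threads_done = 0;
    if (pthread_barrier_init(&b1, NULL, NUM_THREADS) != 0)
        return -1;
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], NULL, doWork, &jobs[i]);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    pthread_barrier_destroy(&b1);
    return shared_total;
}
```

With jobs of 3 and 10 rounds, the shorter worker stops modifying the shared variable after its third round but keeps arriving at the barrier until the longer worker is done, and both threads then leave the loop together.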