pthread_cond_signal() not giving enough time for the signaled thread to run

Question

The thread calling pthread_cond_signal is re-grabbing the mutex before the signaled thread can be released.

The code below shows a simple example of the issue at hand. The main thread will hold the lock, create the worker thread, and then enter a loop that prints data as it comes in. It is signaled to run via a conditional variable.

The worker thread will enter a loop that generates data, grabs the lock, writes the data, and then signals the main thread.

#include <stdio.h>
#include <pthread.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int data = 0;

void *thread(void *arg) {
    int length = *(int *) arg;
    for (int i = 0; i < length; i++) {
        // Do some work
        pthread_mutex_lock(&lock);
        data = i;
        fprintf(stdout, "Writing\n");
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }
    pthread_exit(0);
}


int main(int argc, const char *argv[]) {
    pthread_t worker;
    int length = 4;

    pthread_mutex_lock(&lock);
    pthread_create(&worker, 0, thread, &length);

    for (int i = 0; i < length; i++) {
        fprintf(stdout, "Waiting\n");
        pthread_cond_wait(&cond, &lock);
        fprintf(stdout, "read data: %d\n", data);
    }
    pthread_mutex_unlock(&lock);

    pthread_join(worker, NULL);
    return 0;
}

This will give the following output:

Waiting
Writing
Writing
Writing
Writing
read data: 3
Waiting

Expected behavior: Main thread holds the mutex and only releases it once its waiting. The worker thread will then write its data and signal the main thread. The main thread will immediately lock the mutex on signal, then read the data and go back to waiting, releasing the mutex. Meanwhile the worker thread will do its work and be waiting until the main thread is waiting again to write its data and signal it.

Instead, it appears that the worker thread is getting the mutex immediately after calling signal, rarely letting the main thread get access. If I put a sleep in the worker thread in place of the // Do some work then it will give the expected output.

@immibis Since the main thread is waiting on cond, I expected it would immediately stop being blocked once it was signaled and have priority claim of the mutex. Instead the worker thread is able to get to the mutex before the main thread can. — Cyrusc
@Cyrusc Evidently it takes longer to unblock the main thread than it does for the worker thread to acquire the mutex again. When things are multithreaded you don't get to choose which order they happen in! — user253751

caf caf · Accepted Answer · 2019-04-10T01:28:01

Signalling the condition variable does not give any kind of priority on locking the mutex to a thread that was waiting on that condition variable. All it means is that at least one thread waiting on the condition variable will start trying to acquire the mutex so it can return from the pthread_cond_wait(). The signalling thread will keep executing and can easily re-acquire the mutex first, as you've seen.

You should never have a condition variable without an actual condition over some shared state that you're waiting for - returning from a pthread_cond_wait() doesn't mean a thread should definitely proceed, it means that it should check if the condition it was waiting for is true. That's why they're called condition variables.

In this case, the state your writing thread wants to wait for is "the main thread has consumed the last data I wrote.". However, your reading (main) thread also needs to wait on a condition - "the writing thread has written some new data". You can achieve both these conditions with a flag variable that indicates that some new, unconsumed data has been written to the data variable. The flag starts out unset, is set by the writing thread when it updates data, and is unset by the main thread when it reads from data. The writing thread waits for the flag to be unset, and the reading thread waits for the flag to be set.

With this arrangement, you also don't need to have the mutex locked when you start the writing thread - it doesn't matter which order the threads start, because everything is consistent either way.

The updated code looks like:

#include <stdio.h>
#include <pthread.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int data = 0;
int data_available = 0;

void *thread(void *arg)
{
    int length = *(int *) arg;
    for (int i = 0; i < length; i++) {
        // Do some work
        pthread_mutex_lock(&lock);
        fprintf(stdout, "Waiting to write\n");
        while (data_available)
            pthread_cond_wait(&cond, &lock);
        fprintf(stdout, "Writing\n");
        data = i;
        data_available = 1;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }
    pthread_exit(0);
}


int main(int argc, const char *argv[])
{
    pthread_t worker;
    int length = 4;

    pthread_create(&worker, 0, thread, &length);

    for (int i = 0; i < length; i++) {
        pthread_mutex_lock(&lock);
        fprintf(stdout, "Waiting to read\n");
        while (!data_available)
            pthread_cond_wait(&cond, &lock);
        fprintf(stdout, "read data: %d\n", data);
        data_available = 0;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }

    pthread_join(worker, NULL);
    return 0;
}

Of course, the threads end up working in lockstep - but essentially you have a producer-consumer with a maximum queue length of 1, so that's expected.

pthread_cond_signal() not giving enough time for the signaled thread to run

1 Answers