0
votes

I'm getting a deadlock when trying to notify a condition_variable from a thread.

Here is my MCVE:

#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

static boost::mutex m_mutex;
static boost::condition_variable m_cond;

void threadFunc()
{
    std::cout << "LOCKING MUTEX" << std::endl;
    boost::mutex::scoped_lock lock( m_mutex );
    std::cout << "LOCKED, NOTIFYING CONDITION" << std::endl;
    m_cond.notify_all();
    std::cout << "NOTIFIED" << std::endl;
}

int main( int argc, char* argv[] )
{
    while( true )
    {
        std::cout << "TESTING!!!" << std::endl;

        boost::mutex::scoped_lock lock( m_mutex );

        boost::thread thrd( &threadFunc );

        //m_cond.wait( lock );
        while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) )
        {
            std::cout << "WAITING..." << std::endl;
        }

        static int pos = 0;
        std::cout << "DONE!!! " << pos++ << std::endl;

        thrd.join();
    }

    return 0;
}

If I use m_cond.wait( lock );, I see DONE!!! written on every attempt, no problem there.

If I use the while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) ) loop instead, I see DONE!!! written for a few attempts and then, at some point, I get a deadlock and the waiting never ends:

TESTING!!!
LOCKING MUTEX
LOCKED, NOTIFYING CONDITION
NOTIFIED
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
...

I have read other posts on Stack Overflow (like Condition variable deadlock): they mention that this can happen if notify_all is called before the condition's wait function is running, so the mutex must be used to prevent that. But I feel that's what I'm doing:

  • I lock the mutex before creating the thread
  • So the thread cannot notify before m_cond.timed_wait is reached (which unlocks the mutex)
  • Within the loop, on timeout, timed_wait relocks the mutex so the notification cannot happen; we print "WAITING..." and release the mutex only when we are ready to receive the notification again

So why is the deadlock occurring? Could the condition be notified between the moment timed_wait detects the timeout and the moment it relocks the mutex?

2
I've tried this out with the C++ standard library and it doesn't deadlock: compiler-explorer.com/z/Gh7da7 – JVApen
@JVApen: Clicked your link. It went into an infinite loop after attempt #54! I retried two times and it was OK; on the first try, it started WAITING forever after attempt #284... So this is apparently not a Boost issue. – jpo38
You don't seem to guard yourself against spurious wakeups. – Ted Lyngmo
... also, notifying while holding the lock may be a pessimization. – Ted Lyngmo
@TedLyngmo Rainer explains why the mutex needs to be held when notifying here: modernescpp.com/index.php/… - although in this case it would simply result in an extra ms wait as opposed to waiting forever. – Den-Jason

2 Answers

3
votes

The problem is that timed_wait can stop waiting before notify_all is called. When that happens, it first has to wait for the thread to release the mutex (i.e. after the thread has called notify_all) before it can return. The loop then calls timed_wait again, but the thread has already finished, so nothing will ever notify the condition and timed_wait will never succeed. There are two scenarios in which this can happen: either your thread takes more than a millisecond to start (this should be unlikely, but the scheduling vagaries of your OS mean it can happen, especially if the CPU is busy), or a spurious wakeup occurs.

Both scenarios can be guarded against by setting a flag when calling notify_all, which the waiting thread can check to make sure notify_all has been called:

#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

static boost::mutex m_mutex;
static boost::condition_variable m_cond;

void threadFunc(bool& notified)
{
    std::cout << "LOCKING MUTEX" << std::endl;
    boost::mutex::scoped_lock lock(m_mutex);
    std::cout << "LOCKED, NOTIFYING CONDITION" << std::endl;
    // Set the flag while holding the mutex so the waiter sees it even if it is not waiting right now
    notified = true;
    m_cond.notify_all();
    std::cout << "NOTIFIED" << std::endl;
}

int main(int argc, char* argv[])
{
    while (true)
    {
        std::cout << "TESTING!!!" << std::endl;

        boost::mutex::scoped_lock lock(m_mutex);

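        // Shared flag, protected by m_mutex, recording whether the thread has notified
        bool notified = false;

        boost::thread thrd(&threadFunc, boost::ref(notified));

        //m_cond.wait( lock );
        std::cout << "WAITING..." << std::endl;
        // The predicate overload checks the flag, so a notification sent while we were not waiting is still seen
        while (!m_cond.timed_wait(lock, boost::posix_time::milliseconds(1), [&] { return notified; }))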
        {
            std::cout << "WAITING..." << std::endl;
        }

        static int pos = 0;
        std::cout << "DONE!!! " << pos++ << std::endl;

        thrd.join();
    }

    return 0;
}
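
To make it clearer why the flag helps, here is a minimal sketch of the loop that a predicate wait boils down to, written with the C++ standard library rather than Boost (the comments under the question suggest the plain timed wait misbehaves the same way there). The helper names wait_for_flag, early_notification_is_seen and unset_flag_times_out are made up for this illustration and are not part of any library.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: waits until `flag` becomes true or `timeout` elapses.
// The caller must hold `lock`. Returns the final value of the flag.
bool wait_for_flag(std::condition_variable& cond,
                   std::unique_lock<std::mutex>& lock,
                   const bool& flag,
                   std::chrono::milliseconds timeout)
{
    assert(lock.owns_lock());
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    // The flag is checked BEFORE waiting, so a notification that was sent
    // while nobody was waiting is not lost. Spurious wakeups just loop again.
    while (!flag)
    {
        if (cond.wait_until(lock, deadline) == std::cv_status::timeout)
        {
            return flag; // timed out: report whatever the flag says now
        }
    }
    return true;
}

// The notification happens before the wait starts; the flag still records it.
bool early_notification_is_seen()
{
    std::mutex mutex;
    std::condition_variable cond;
    bool notified = false;
    {
        std::lock_guard<std::mutex> guard(mutex);
        notified = true;
    }
    cond.notify_all(); // nobody is waiting, so the notification itself is discarded

    std::unique_lock<std::mutex> lock(mutex);
    return wait_for_flag(cond, lock, notified, std::chrono::milliseconds(1));
}

// Without any notifier the helper gives up after the timeout and returns false.
bool unset_flag_times_out()
{
    std::mutex mutex;
    std::condition_variable cond;
    bool notified = false;
    std::unique_lock<std::mutex> lock(mutex);
    return !wait_for_flag(cond, lock, notified, std::chrono::milliseconds(5));
}
```

Calling early_notification_is_seen() returns true without blocking, because the loop never reaches wait_until once the flag is already set. A bare timed wait in the same situation has nothing to check and can only time out.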
0
votes

A wait on a condition variable has to start before the condition is signalled, otherwise the notification is lost. With your code it's possible that a spurious wakeup (or a timeout) could allow the thread to signal and complete while the main thread is not waiting.

The solution is this: don't wait solely on a condition variable. Test a shared flag, and use the condition variable only to wake up as soon as the flag is set. See Rainer's guidelines on this here: https://www.modernescpp.com/index.php/c-core-guidelines-be-aware-of-the-traps-of-condition-variables
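
As a rough sketch of that advice, the flag, the mutex and the condition variable can be bundled into one small class so the flag is always tested under the lock. This uses the C++ standard library; OneShotSignal and run_rounds are hypothetical names chosen for this example, not an existing API.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot signal: the flag carries the state, the condition
// variable is only used to wake the waiter promptly.
class OneShotSignal
{
public:
    void set()
    {
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            m_set = true; // modified under the mutex
        }
        m_cond.notify_all(); // notifying after unlocking is fine here: the flag is already set
    }

    void wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        // Returns immediately if set() already ran; ignores spurious wakeups otherwise.
        m_cond.wait(lock, [this] { return m_set; });
    }

    bool is_set()
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        return m_set;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    bool m_set = false;
};

// Repeats the "start a thread, wait for its signal, join" cycle from the question
// and returns how many rounds completed.
int run_rounds(int iterations)
{
    int completed = 0;
    for (int i = 0; i < iterations; ++i)
    {
        OneShotSignal done;
        std::thread worker([&done] { done.set(); });
        done.wait();            // cannot miss the signal, whichever thread runs first
        assert(done.is_set());  // sanity check
        worker.join();          // no lock is held here, so the worker can always finish
        ++completed;
    }
    return completed;
}
```

One design choice to note: wait() releases its lock before returning, so the join cannot block on a mutex that the waiting thread still holds. Whether to notify before or after unlocking is debated in the comments on the question; this sketch notifies after unlocking, which is safe here only because the flag is written under the mutex and the signal object outlives the worker thread.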

Also see this thread about using the boost condition variable with a predicate: boost::condition_variable - using wait_for with predicate

and How do I use a boost condition variable to wait for a thread to complete processing?