5 votes

I stumbled across the following Code Review Stack Exchange post and decided to read it for practice. In the code, there is the following:

Note: I am not looking for a code review; this is just a copy-paste of the code from the link, so you can focus on the issue at hand without the other code interfering. I am not interested in implementing a 'smart pointer', just in understanding the memory model:

// Copied from the link provided (all inside a class)

unsigned int count;
mutex m_Mutx;

void deref()
{
    m_Mutx.lock();
    count--;
    m_Mutx.unlock();
    if (count == 0)
    {
        delete rawObj;
        count = 0;
    }
}

Seeing this makes me immediately think: "What if two threads enter when count == 1 and neither sees the other's update? Can both end up seeing count as zero and double delete? And is it possible for two threads to cause count to become -1, so that the deletion never happens?"

The mutex will make sure only one thread at a time enters the critical section; however, does this guarantee that all threads see the updated value? What does the C++ memory model tell me, so that I can say whether or not this is a race condition?

I looked at the cppreference pages on the memory model and on std::memory_order; however, the latter page seems to deal with a parameter for atomics. I didn't find the answer I was looking for, or maybe I misread it. Can anyone tell me whether what I said is right or wrong, and whether or not this code is safe?

For correcting the code if it is broken:

Is the correct fix to turn count into an atomic member? Or does the current code already work, i.e. after the lock on the mutex is released, do all threads see the updated value?

I'm also curious if this would be considered the correct answer:

Note: I am not looking for a code review; I am just trying to see whether this kind of solution would solve the issue with respect to the C++ memory model.

#include <atomic>
#include <mutex>

struct ClassNameHere {
    int* rawObj;
    std::atomic<unsigned int> count;
    std::mutex mutex;

    // ...

    void deref()
    {
        std::scoped_lock lock{mutex};
        count--;
        if (count == 0)
            delete rawObj;
    }
};
You are asking partially for a code review, you know a better place for that. Further, your ClassNameHere type contains a bunch of public members. Also, don't put tags into the title, that's what tags are for. Lastly, you are asking multiple (related) questions here. The general rule of thumb is to only ask one. – Ulrich Eckhardt
@UlrichEckhardt I don't want a code review at all. This is temporary code designed to compile and quickly illustrate an issue. I am not sure if being inside a struct/class means anything different than a variable in the global namespace, so I just wrapped it in a struct. All I want to know is if the C++ memory model confirms or denies that there are problems here. The code I wrote is a jumping off point and to help illustrate the issue at hand. I encourage anyone else reading this to not do a code review but to focus on the memory model. I will edit my post to make this clearer. – Water

4 Answers

4 votes

"what if two threads enter when count == 1" -- if that happens, something else is fishy. The idea behind smart pointers is that the refcount is bound to an object's lifetime (scope). The decrement happens when the object (via stack unrolling) is destroyed. If two threads trigger that, the refcount can not possibly be just 1 unless another bug is present.

However, what could happen is that two threads enter this code when count == 2. In that case, the decrement operation is protected by the mutex, so count can never reach negative values. Again, this assumes non-buggy code elsewhere. Since all this does is delete the object (and then redundantly set count to zero), nothing bad can happen there.

What can happen, though, is a double delete. If two threads at count == 2 decrement the count, they could both see count == 0 afterwards. As a simple fix, determine inside the mutex whether to delete the object: store that decision in a local variable and act on it after releasing the mutex.

Concerning your third question: turning count into an atomic is not going to fix things magically. Also, the point behind atomics is that you don't need a mutex, because locking a mutex is an expensive operation. With atomics, you can combine operations like the decrement and the check for zero, which is similar to the fix proposed above. Atomics are typically slower than "normal" integers, but they are still faster than a mutex.
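
A minimal sketch of that combined decrement-and-check (not from the answer above; the member names follow the question's struct, and the acq_rel ordering is an assumed typical choice):

#include <atomic>

struct ClassNameHere {
    int* rawObj = nullptr;
    std::atomic<unsigned int> count{1};

    void deref()
    {
        // fetch_sub returns the value held *before* the decrement, so the
        // thread that observes 1 here is the last owner and may safely delete.
        if (count.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete rawObj;
    }
};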

2 votes

In both cases there's a data race. Thread 1 decrements the counter to 1, and just before the if statement a thread switch occurs. Thread 2 decrements the counter to 0 and then deletes the object. Thread 1 resumes, sees that count is 0, and deletes the object again.

Move the unlock() to the end of the function. Or, better, use std::lock_guard to do the locking; its destructor will unlock the mutex even when the delete call throws an exception.
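
A minimal sketch of the std::lock_guard variant (my wording, assuming the question's members m_Mutx, count and rawObj):

void deref()
{
    std::lock_guard<std::mutex> lock(m_Mutx); // unlocked automatically at the end of the function
    count--;
    if (count == 0)
    {
        delete rawObj;
    }
}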

1 vote

Your lock prevents the count-- operation from getting into a mess when performed concurrently in different threads. It does not guarantee, however, that the values of count are synchronized, so repeated reads outside a single critical section bear the risk of a data race.

You could rewrite it as follows:

void deref()
{
    bool isLast;
    m_Mutx.lock();
    --count;
    isLast = (count == 0);
    m_Mutx.unlock();
    if (isLast) {
        delete rawObj;
    }
}

This way, the lock makes sure that access to count is synchronized and always yields a valid state. That valid state is carried over to the non-critical section through a local variable (without a race condition), and the critical section can be kept rather short.

A simpler version would be to synchronize the complete function body; this might become a disadvantage if you want to do more elaborate things than just delete rawObj:

void deref()
{
    std::lock_guard<std::mutex> lock(m_Mutx);
    if (! --count) {
        delete rawObj;
    }
}

BTW: std::atomic alone will not solve this issue, as it synchronizes just each single access, but not a "transaction". Therefore, your scoped_lock is necessary, and - as it then spans the complete function - the std::atomic becomes superfluous.
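
A short illustration of that point (hypothetical code, not from the question): with an atomic counter but no lock, the decrement and the zero check are still two separate accesses, so the broken interleaving remains possible:

#include <atomic>

int* rawObj;
std::atomic<unsigned int> count;

void deref()
{
    count--;            // atomic on its own, but...
    if (count == 0)     // ...another thread may decrement between these two lines,
        delete rawObj;  // so two threads can both read 0 and double-delete
}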

1 vote

If two threads potentially* enter deref() concurrently, then, regardless of the previous or previously expected value of count, a data race occurs, and your entire program, even the parts that you would expect to be chronologically prior, has undefined behavior as stated in the C++ standard in [intro.multithread/20] (N4659):

Two actions are potentially concurrent if

(20.1) they are performed by different threads, or

(20.2) they are unsequenced, at least one is performed by a signal handler, and they are not both performed by the same signal handler invocation.

The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.

The potentially concurrent actions in this case, of course, are the read of count outside of the locked section, and the write of count within it.

*) That is, if current inputs allow it.

UPDATE 1: The section you reference, describing atomic memory order, explains how atomic operations synchronize with each other and with other synchronization primitives (such as mutexes and memory barriers). In other words, it describes how atomics can be used for synchronization so that some operations aren't data races. It does not apply here. The standard takes a conservative approach here: Unless other parts of the standard explicitly make clear that two conflicting accesses are not concurrent, you have a data race, and hence UB (where conflicting means same memory location, and at least one of them isn't read-only).
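
For illustration only (not part of the original answer), this is the kind of synchronization that page describes: a release store pairs with an acquire load, establishing a happens-before relation, so the accesses to data below are not a data race:

#include <atomic>
#include <thread>

int data = 0;
std::atomic<bool> ready{false};

void producer()
{
    data = 42;                                     // plain, non-atomic write
    ready.store(true, std::memory_order_release);  // publish: release store
}

void consumer()
{
    while (!ready.load(std::memory_order_acquire)) // acquire load pairs with the release store
        ;                                          // spin until the flag is set
    // The write to data happens-before this read, so reading data here is safe.
}

int main()
{
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
}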