3
votes

I was wondering about the equivalent of Java's "volatile", and found this answer.

An equivalent to Java volatile in Python

Which (basically) says that everything is effectively volatile in Python, at least in CPython, because of the GIL. That makes sense: everything is locked by the GIL, so there are no memory barriers to worry about, etc. But I would be happier if this were documented and guaranteed by a specification, rather than being a consequence of how CPython happens to be implemented at the moment.

Because, say I want one thread to post data and others to read it, so I can choose something like this:

import threading


class XFaster:
    def __init__(self):
        self._x = 0

    def set_x(self, x):
        self._x = x

    def get_x(self):
        return self._x


class XSafer:
    def __init__(self):
        self._x = 0
        self._lock = threading.Lock()

    def set_x(self, x):
        with self._lock:
            self._x = x

    def get_x(self):
        with self._lock:
            return self._x

I'd rather go with XFaster or even not use a getter and setter at all. But I also want to do things reliably and "correctly". Is there some official documentation that says this is OK? What about say putting a value in a dict or appending to a list?
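
For what it's worth, the dis module shows the kind of evidence the GIL argument rests on (a sketch, not a specification; opcode names vary across CPython versions): a plain dict store compiles to a single STORE_SUBSCR instruction, and a list append to a single call instruction.

```python
import dis

def store(d, k, v):
    d[k] = v  # compiles to a single STORE_SUBSCR instruction

def append(lst, v):
    lst.append(v)  # one call instruction (its opcode name varies by version)

store_ops = [ins.opname for ins in dis.get_instructions(store)]
append_ops = [ins.opname for ins in dis.get_instructions(append)]
print(store_ops)
print(append_ops)
```

Under the GIL each single bytecode instruction runs to completion, which is exactly why such operations don't corrupt the dict or list — but it is an implementation detail, not a documented guarantee.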

In other words, is there a systematic, documented way of determining what I can do without a threading.Lock (without digging through dis or anything like that)? And also preferably in a way that won't break with a future python release.

On edit: I appreciate the informed discussion in comments. But what I would really want is some specification that guarantees the following:

If I execute something like this:

# in the beginning
x.a == foo
# then two threads start

# thread 1:
x.a = bar

# thread 2
do_something_with(x.a)

I want to be sure that:

  • when thread 2 reads x.a it reads either foo or bar
  • if the read in thread 2 occurs physically later than the assignment in thread 1, then it actually reads bar

Here are some things I want not to happen:

  • the threads get scheduled on different processors, and the assignment x.a = bar from thread 1 isn't visible to thread 2
  • x.__dict__ is in the middle of being re-hashed and so thread 2 reads garbage
  • etc
1
For objects, the Python documentation should describe which objects are and aren't thread safe. I don't believe lists are, for example, but queues are. Generally, accessing is thread safe, but setting is not. If you need to set something, grab the lock before setting, then release. This is what actually makes setting safe. – M Z
@PeterCordes – quite the opposite is true. CPython's bytecode interpreter is single threaded, controlled by the Global Interpreter Lock. Any given bytecode operation is run to completion while holding the lock and is thus atomic with respect to any other bytecode operation. The exception is when the operation results in a call to an extension function (C most likely) that specifically releases the lock. – tdelaney
@PeterCordes It doesn't quite answer the question. First, it depends on what the definition of "a single statement" is (e.g. is a,b=c,d a single statement?). But also, just because they don't happen at once doesn't automatically imply that the memory that thread A sees will reflect everything that happened in thread B, say if the OS schedules the threads on different processors. It should, and almost certainly does work that way, but it would be nice if there were some documentation on python.org that said this explicitly. – mrip
What do you mean by “physically later”? Normally we don’t appeal to things like globally-accessible clocks when talking about multithreaded algorithms. – Davis Herring
@MisterMiyagi I am asking where I can find official documentation that tells me when I need to use an explicit lock and when I don't. As an example, in the classes I defined above, can I call set_x from one thread and get_x from other threads using XFaster, or do I need to use XSafer to make sure that I don't run into any problems? And I would like to read about this on python.org, not on some blog post. – mrip

1 Answer

1
votes

TL;DR: CPython guarantees that its own data structures are thread-safe against corruption. This does not mean that custom data structures or code built on top of them are race-free.


The intention of the GIL is to protect CPython's data structures against corruption. One can rely on the internal state being thread-safe.

global interpreter lock (Python documentation – Glossary)

The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. [...]

This also implies correct visibility of changes across threads.

However, this does not mean that any isolated statement or expression is atomic: almost any statement or expression can compile to more than one bytecode instruction. The GIL explicitly does not provide atomicity in these cases.

Specifically, a statement such as x.a = bar may execute arbitrarily many bytecode instructions, for instance by invoking a setter via object.__setattr__ or the descriptor protocol. Even in the simplest case it executes at least three bytecode instructions: the lookup of bar, the lookup of x, and the attribute store.
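
That instruction count can be checked with dis (a sketch; exact opcode names vary across CPython versions):

```python
import dis

def assign(x, bar):
    x.a = bar  # load bar, load x, then STORE_ATTR

ops = [ins.opname for ins in dis.get_instructions(assign)]
print(ops)  # the single assignment spans several instructions
```

The GIL may be released between any two of these instructions, which is why the statement as a whole is not atomic.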

As such, Python guarantees visibility/consistency, but provides no guarantees against race conditions. If an object is mutated concurrently, access to it must be synchronised for correctness.
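
As a concrete sketch of that last point (names here are illustrative): a read-modify-write such as += spans several bytecode instructions, so concurrent mutation needs a lock even under the GIL. Holding a threading.Lock around the whole sequence makes it effectively atomic.

```python
import threading

class Counter:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self._value += 1` is a read, an add, and a write;
        # the lock makes the whole sequence atomic
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value

counter = Counter()

def worker():
    for _ in range(10_000):
        counter.increment()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value())  # 40000 with the lock; without it, updates could be lost
```

This is the XSafer pattern from the question: the lock buys nothing for a single plain attribute load or store, but it is required as soon as an operation reads and then writes shared state.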