12 votes

I know that when reading from a memory location that is written to by several threads or processes, the volatile keyword should be used for that location, as in the cases below. But I want to know more about what restrictions it actually places on the compiler: what rules does the compiler have to follow when dealing with such a case, and is there any exceptional case where, despite simultaneous access to a memory location, the programmer can ignore the volatile keyword?

volatile SomeType * ptr = someAddress;

void someFunc(volatile const SomeType & input) {
    // function body
}
Note that in portable C++ volatile cannot be used as a poor man's thread synchronization (although compilers might extend its meaning thus). Writing to a volatile object in one thread does not necessarily mean another thread will see the updated value. (It might only have been written to one CPU's cache, but not through the cache into whatever memory the CPUs share.) For that you need memory barriers. – sbi

@sbi: While your comment is true at face value, I don't think there is any way for a conforming compiler to leave the value in the CPU cache and not flush it to memory. After all, that is the actual meaning of volatile: writes need to make it to main memory. The reason it cannot be used for synchronization is that the guarantees don't ensure atomicity or prevent reordering with respect to non-volatile variables. – David Rodríguez - dribeas

@dribeas: The C++98 "abstract machine" has no concept of CPU cache (or registers, for that matter), so no, there is no requirement to flush volatile writes to main memory. – zwol

@David: "writes need to make it to main memory" – I'm way out of my depth here, but from what I know volatile is often used for addresses that do not even correspond to memory, so I think this must be wrong. But, yes, atomicity and write ordering are problems I forgot about. – sbi

@sbi, @Zack: right, I typed faster than I could think: the abstract machine does not have the concept of a CPU cache. I mixed two concepts: the C++ memory model determines that the value has to be written out to memory, while the hardware architecture is what ensures that the view of memory is consistent across processors, even if that imposes some cost in performance. The language says it is "written to memory"; the hardware ensures that the memory as seen by the different processors is consistent. – David Rodríguez - dribeas

5 Answers

19 votes

What you know is false. Volatile is not used to synchronize memory access between threads, apply any kind of memory fences, or anything of the sort. Operations on volatile memory are not atomic, and they are not guaranteed to be in any particular order. volatile is one of the most misunderstood facilities in the entire language. "Volatile is almost useless for multi-threaded programming."

What volatile is used for is interfacing with memory-mapped hardware, signal handlers, and setjmp/longjmp (local variables modified between a setjmp and the corresponding longjmp must be volatile to keep their values).

It can also be used in a way similar to how const is used, and this is how Alexandrescu uses it in this article. But make no mistake: volatile doesn't make your code magically thread safe. Used in this specific way, it is simply a tool that can help the compiler tell you where you might have messed up. It is still up to you to fix your mistakes, and volatile plays no role in fixing those mistakes.

EDIT: I'll try to elaborate a little bit on what I just said.

Suppose you have a class that has a pointer to something that cannot change. You might naturally make the pointer const:

class MyGizmo
{ 
public:
  const Foo* foo_;
};

What does const really do for you here? It doesn't do anything to the memory. It's not like the write-protect tab on an old floppy disc. The memory itself is still writable. You just can't write to it through the foo_ pointer. So const is really just a way to give the compiler another way to let you know when you might be messing up. If you were to write this code:

gizmo.foo_->bar_ = 42;

...the compiler won't allow it, because it's marked const. Obviously you can get around this by using const_cast to cast away the const-ness, but if you need to be convinced this is a bad idea then there is no help for you. :)

Alexandrescu's use of volatile is exactly the same. It doesn't do anything to make the memory "thread safe" in any way whatsoever. What it does is give the compiler another way to let you know when you may have screwed up. You mark things that you have made truly "thread safe" (through the use of actual synchronization objects, like mutexes or semaphores) as volatile. Then the compiler won't let you use them in a non-volatile context. It raises a compiler error that you then have to think about and fix. You could again get around it by casting away the volatile-ness using const_cast, but this is just as Evil as casting away const-ness.
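Here is a minimal sketch of that idea. The class, the mutex and the names are invented for illustration and are not taken from Alexandrescu's article; the point is only that the compiler rejects the unlocked call:

#include <mutex>

class Counter
{
public:
    void increment() { ++count_; }        // not volatile-qualified
    int  value() const { return count_; }
private:
    int count_ = 0;
};

std::mutex       counterMutex;
volatile Counter sharedCounter;           // marked volatile because it is shared between threads

void bump()
{
    // sharedCounter.increment();         // compile error: cannot call a non-volatile
    //                                    // member function on a volatile object
    std::lock_guard<std::mutex> lock(counterMutex);
    const_cast<Counter&>(sharedCounter).increment();  // volatile cast away only while the lock is held
}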

My advice to you is to completely abandon volatile as a tool for writing multithreaded applications (edit:) until you really know what you're doing and why. It has some benefit, but not in the way that most people think, and if you use it incorrectly, you could write dangerously unsafe applications.

10 votes

It's not as well defined as you probably want it to be. Most of the relevant standardese from C++98 is in section 1.9, "Program Execution":

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.

Accessing an object designated by a volatile lvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression might produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

Once the execution of a function begins, no expressions from the calling function are evaluated until execution of the called function has completed.

When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects with type other than volatile sig_atomic_t are unspecified, and the value of any object not of volatile sig_atomic_t that is modified by the handler becomes undefined.

An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).

The least requirements on a conforming implementation are:

  • At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred.

  • At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

  • The input and output dynamics of interactive devices shall take place in such a fashion that prompting messages actually appear prior to a program waiting for input. What constitutes an interactive device is implementation-defined.

So what that boils down to is:

  • The compiler cannot optimize away reads or writes to volatile objects. For simple cases like the one casablanca mentioned, that works the way you might think. However, in cases like

    volatile int a;
    int b;
    b = a = 42;
    

    people can and do argue about whether the compiler has to generate code as if the last line had read

    a = 42; b = a;
    

    or if it can, as it normally would (in the absence of volatile), generate

    a = 42; b = 42;
    

    (C++0x may have addressed this point, I haven't read the whole thing.)

  • The compiler may not reorder operations on two different volatile objects that occur in separate statements (every semicolon is a sequence point), but it is totally allowed to rearrange accesses to non-volatile objects relative to volatile ones. This is one of the many reasons why you should not try to write your own spinlocks, and is the primary reason why John Dibling is warning you not to treat volatile as a panacea for multithreaded programming. (There is a sketch of this hazard after this list.)

  • Speaking of threads, you will have noticed the complete absence of any mention of threads in the standards text. That is because C++98 has no concept of threads. (C++0x does, and may well specify their interaction with volatile, but I wouldn't be assuming anyone implements those rules yet if I were you.) Therefore, there is no guarantee that accesses to volatile objects from one thread are visible to another thread. This is the other major reason volatile is not especially useful for multithreaded programming.

  • There is no guarantee that volatile objects are accessed in one piece, or that modifications to volatile objects avoid touching other things right next to them in memory. This is not explicit in what I quoted but is implied by the stuff about volatile sig_atomic_t -- the sig_atomic_t part would be unnecessary otherwise. This makes volatile substantially less useful for access to I/O devices than it was probably intended to be, and compilers marketed for embedded programming often offer stronger guarantees, but it's not something you can count on.

  • Lots of people try to make specific accesses to objects have volatile semantics, e.g. doing

    T x;
    *(volatile T *)&x = foo();
    

    This is legit (because it says "object designated by a volatile lvalue" and not "object with a volatile type") but has to be done with great care, because remember what I said about the compiler being totally allowed to reorder non-volatile accesses relative to volatile ones? That goes even if it's the same object (as far as I know anyway).

  • If you are worried about reordering of accesses to more than one volatile value, you need to understand the sequence point rules, which are long and complicated and I'm not going to quote them here because this answer is already too long, but here's a good explanation which is only a little simplified. If you find yourself needing to worry about the differences in the sequence point rules between C and C++ you have already screwed up somewhere (for instance, as a rule of thumb, never overload &&).
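One last illustration: here is a hedged sketch (the variable names are made up) of the kind of "publish a flag" pattern the reordering bullet above warns about. It looks plausible, but nothing here orders the non-volatile payload access relative to the volatile flag, and nothing guarantees inter-thread visibility:

volatile bool ready = false;   // intended as a "data is published" flag
int payload = 0;               // ordinary, non-volatile data

void producer()
{
    payload = 42;              // ordinary write: may legally be moved after the volatile write below
    ready = true;              // volatile write
}

void consumer()
{
    while (!ready) { }         // volatile read: the loop itself is not optimized away...
    int observed = payload;    // ...but this ordinary read may still see the old value of payload
    (void)observed;
}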

7 votes

Declaring a variable as volatile means the compiler can't make assumptions about its value that it otherwise could, and hence it is prevented from applying various optimizations. Essentially it forces the compiler to re-read the value from memory on each access, even if the normal flow of code doesn't change the value. For example:

int *i = ...;
cout << *i; // line A
// ... (some code that doesn't use i)
cout << *i; // line B

In this case, the compiler would normally assume that since the value at *i wasn't modified in between, it's okay to retain the value from line A (say, in a register) and print the same value at line B. However, if you declare i as a pointer to volatile (volatile int *i), you're telling the compiler that some external source could have modified the value at *i between lines A and B, so the compiler must re-fetch the current value from memory.
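For completeness, here is a small sketch of the same example with the pointee actually declared volatile; the variable names are invented. Because *i is volatile, the compiler must perform a real load at both line A and line B instead of reusing a cached value:

#include <iostream>

volatile int externallyModified = 0;        // stands in for memory some other agent may change

int main()
{
    volatile int *i = &externallyModified;  // pointer to volatile int
    std::cout << *i << '\n';                // line A: load from memory
    // ... some code that doesn't use i ...
    std::cout << *i << '\n';                // line B: must load from memory again
}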

6 votes

A particular and very common optimization that is ruled out by volatile is to cache a value from memory into a register, and use the register for repeated access (because this is much faster than going back to memory every time).

Instead the compiler must fetch the value from memory every time (taking a hint from Zach, I should say that "every time" is bounded by sequence points).

Nor can a sequence of writes make use of a register and only write the final value back later on: every write must be pushed out to memory.

Why is this useful? On some architectures certain I/O devices map their inputs or outputs to a memory location (e.g. a byte written to that location actually goes out on the serial line). If the compiler redirects some of those writes to a register that is only flushed occasionally, then most of the bytes won't go onto the serial line. Not good. Using volatile prevents this situation.
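As a hedged sketch of what that looks like in code (the address is invented; on real hardware it would come from the datasheet):

#include <cstdint>

volatile std::uint8_t *const UART_TX =
    reinterpret_cast<volatile std::uint8_t *>(0x40001000);

void send(const char *msg)
{
    while (*msg) {
        // Each store must reach the device. Without volatile, the compiler could
        // keep the byte in a register and emit only the final store.
        *UART_TX = static_cast<std::uint8_t>(*msg++);
    }
}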

1 vote

The compiler is not allowed to optimize away reads of a volatile object in a loop, which it would otherwise normally do (e.g. hoisting a strlen() call out of the loop).

It's commonly used in embedded programming when reading a hardware register at a fixed address whose value may change unexpectedly (in contrast with "normal" memory, which doesn't change unless written to by the program itself).

That is its main purpose.

It could also be used to try to make sure one thread sees a change in a value written by another, but it in no way guarantees atomicity when reading or writing said object.
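For illustration, here is a hedged sketch of the polling loop described above; the address and bit mask are invented, and real values would come from the hardware documentation:

#include <cstdint>

volatile std::uint32_t *const STATUS_REG =
    reinterpret_cast<volatile std::uint32_t *>(0x40002000);

void waitUntilReady()
{
    // Because the pointee is volatile, the compiler must re-read the register on
    // every iteration instead of hoisting the load out of the (otherwise empty) loop.
    while ((*STATUS_REG & 0x1u) == 0) {
        // spin
    }
}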