First off, the whole motivation for your (quite ingenious I might say) usage of "placement new" as a means of implementing the assignment operator, operator=()
, as instigated by this question (std::vector of objects and const-correctness), is now nullified. As of C++11, that question's code now has no errors. See my answer here.
Secondly, C++11's emplace()
functions now do pretty much exactly what your usage of placement new was doing, except that they are all virtually guaranteed by the compilers themselves now to be well-defined behavior, per the C++ standard.
Third, when the accepted answer states:
because this
is not guaranteed to refer to the new object
I wonder if this is because the value contained in the this
variable might be changed by the placement new copy-construction operation, NOT because anything using that instance of the class might retain a cached value of it, with the old instance data, rather than read a new value of the object instance from memory. If the former, it seems to me you could ensure this
is correct inside the assignment operator function by using a temporary copy of the this
pointer, like this:
// Custom-defined assignment operator
A& operator=(const A& right)
{
if (this == &right) return *this;
// manually call the destructor of the old left-side object
// (`this`) in the assignment operation to clean it up
this->~A();
// Now back up `this` in case it gets corrupted inside this function call
// only during the placement new copy-construction operation which
// overwrites this objct:
void * thisBak = this;
// use "placement new" syntax to copy-construct a new `A`
// object from `right` into left (at address `this`)
new (this) A(right);
// Note: we cannot write to or re-assign `this`.
// See here: https://stackoverflow.com/a/18227566/4561887
// Return using our backup copy of `this` now
return *thisBak;
}
But, if it has to do with an object being cached and not re-read each time it is used, I wonder if volatile
would solve this! ie: use volatile const int c;
as the class member instead of const int c;
.
Fourth, in the rest of my answer I focus on the usage of volatile
, as applied to the class members, to see if this might solve the 2nd of these two potential undefined behavior cases:
The potential UB in your own solution:
// Custom-defined assignment operator
A& operator=(const A& right)
{
if (this == &right) return *this;
// manually call the destructor of the old left-side object
// (`this`) in the assignment operation to clean it up
this->~A();
// use "placement new" syntax to copy-construct a new `A`
// object from `right` into left (at address `this`)
new (this) A(right);
return *this;
}
The potential UB you mention may exist in the other solution.
// (your words, not mine): "very very bad, IMHO, it is
// undefined behavior"
*const_cast<int*> (&c)= assign.c;
Although I think perhaps adding volatile
might fix both cases above, my focus in the rest of this answer is on the 2nd case just above.
tldr;
It seems to me this (the 2nd case just above, in particular) becomes valid and well-defined behavior by the standard if you add volatile
and make the class member variable volatile const int c;
instead of just const int c;
. I can't say this is a great idea, but I think casting away const
and writing to c
then becomes well-defined behavior and perfectly valid. Otherwise, the behavior is undefined only because reads of c
may be cached and/or optimized out since it is only const
, and not also volatile
.
Read below for more details and justification, including a look at some examples and a little assembly.
const member and assignment operator. How to avoid the undefined behavior?
Writing to const
members is only undefined behavior...
...because the compiler may optimize out further reads to the variable, since it's const
. In other words, even though you've correctly updated the value contained at a given address in memory, the compiler may tell the code to just regurgitate whatever was last in the register holding the value it first read, rather than going back to the memory address and actually checking for a new value each time you read from that variable.
So this:
// class member variable:
const int c;
// anywhere
*const_cast<int*>(&c) = assign.c;
probably is undefined behavior. It may work in some cases but not others, on some compilers but not others, or in some versions of compilers, but not others. We can't rely on it to have predictable behavior because the language does not specify what should happen each and every time we set a variable as const
and then write to and read from it.
This program, for instance (see here: https://godbolt.org/z/EfPPba):
#include <cstdio>
int main() {
const int i = 5;
*(int*)(&i) = 8;
printf("%i\n", i);
return 0;
}
prints 5
(although we wanted it to print 8
) and produces this assembly in main
. (Note that I'm no assembly expert). I've marked the printf
lines. You can see that even though 8
is written to that location (mov DWORD PTR [rax], 8
), the printf
lines do NOT read out that new value. They read out the previously-stored 5
because they don't expect it to have changed, even though it did. The behavior is undefined, so the read is omitted in this case.
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 5
lea rax, [rbp-4]
mov DWORD PTR [rax], 8
// printf lines
mov esi, 5
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
leave
ret
Writing to volatile const
variables, however, is not undefined behavior...
...because volatile
tells the compiler it better read the contents at the actual memory location on every read to that variable, since it might change at any time!
You might think: "Does this even make sense?" (having a volatile const
variable. I mean: "what might change a const
variable to make us need to mark it volatile
!?) The answer is: "well, yes! It does make sense!" On microcontrollers and other low-level memory-mapped embedded devices, some registers, which could change at any moment by the underlying hardware, are read-only. To mark them read-only in C or C++ we make them const
, but to ensure the compiler knows it better actually read the memory at their address location every single time we read the variable, rather than relying on optimizations which retain previously-cached values, we also mark them as volatile
. So, to mark address 0xF000
as a read-only 8-bit register named REG1
, we'd define it like this in a header file somewhere:
// define a read-only 8-bit register
#define REG1 (*(volatile const uint8_t*)(0xF000))
Now, we can read to it at our whim, and each and every time we ask the code to read the variable, it will. This is well-defined behavior. Now, we can do something like this, and this code will NOT get optimized out, because the compiler knows that this register value actually could change at any given time, since it's volatile
:
while (REG1 == 0x12)
{
// busy wait until REG1 gets changed to a new value
}
And, to mark REG2
as an 8-bit read/write register, of course, we'd just remove const
. In both cases, however, volatile
is required, as the values could change at any given time by the hardware, so the compiler better not make any assumptions about these variables or try to cache their values and rely on cached readings.
// define a read/write 8-bit register
#define REG2 (*(volatile uint8_t*)(0xF001))
Therefore, the following is not undefined behavior! This is very well-defined behavior as far as I can tell:
// class member variable:
volatile const int c;
// anywhere
*const_cast<int*>(&c) = assign.c;
Even though the variable is const
, we can cast away const
and write to it, and the compiler will respect that and actually write to it. And, now that the variable is also marked as volatile
, the compiler will read it every single time, and respect that too, the same as reading REG1
or REG2
above.
This program, therefore, now that we added volatile
(see it here: https://godbolt.org/z/6K8dcG):
#include <cstdio>
int main() {
volatile const int i = 5;
*(int*)(&i) = 8;
printf("%i\n", i);
return 0;
}
prints 8
, which is now correct, and produces this assembly in main
. Again, I've marked the printf
lines. Notice the new and different lines I've marked too! These are the only changes to the assembly output! Every other line is exactly identical. The new line, marked below, goes out and actually reads the new value of the variable and stores it into register eax
. Next, in preparation for printing, instead of moving a hard-coded 5
into register esi
, as was done before, it moves the contents of register eax
, which is just read, and which now contains an 8
, into register esi
. Solved! Adding volatile
fixed it!
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 5
lea rax, [rbp-4]
mov DWORD PTR [rax], 8
// printf lines
mov eax, DWORD PTR [rbp-4] // NEW!
mov esi, eax // DIFFERENT! Was `mov esi, 5`
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
leave
ret
Here's a bigger demo (run it online: https://onlinegdb.com/HyU6fyCNv). You can see that we can write to a variable by casting it to a non-const reference OR a non-const pointer.
In all cases (casting to both non-const references or non-const pointers in order to modify the const value), we can use C++-style casts, OR C-style casts.
In the simple example above, I verified that in all four cases (even using a C-style cast to cast to a reference: (int&)(i) = 8;
, oddly enough, since C doesn't have references :)) the assembly output was the same.
#include <stdio.h>
int main()
{
printf("Hello World\n");
// This does NOT work!
const int i1 = 5;
printf("%d\n", i1);
*const_cast<int*>(&i1) = 6;
printf("%d\n\n", i1); // output is 5, when we want it to be 6!
// BUT, if you make the `const` variable also `volatile`, then it *does* work! (just like we do
// for writing to microcontroller registers--making them `volatile` too). The compiler is making
// assumptions about that memory address when we make it just `const`, but once you make it
// `volatile const`, those assumptions go away and it has to actually read that memory address
// each time you ask it for the value of `i`, since `volatile` tells it that the value at that
// address could change at any time, thereby making this work.
// Reference casting: WORKS! (since the `const` variable is now `volatile` too)
volatile const int i2 = 5;
printf("%d\n", i2);
const_cast<int&>(i2) = 7;
// So, the output of this is 7:
printf("%d\n\n", i2);
// C-style reference cast (oddly enough, since C doesn't have references :))
volatile const int i3 = 5;
printf("%d\n", i3);
(int&)(i3) = 8;
printf("%d\n\n", i3);
// It works just fine with pointer casting too instead of reference casting, ex:
volatile const int i4 = 5;
printf("%d\n", i4);
*(const_cast<int*>(&i4)) = 9;
printf("%d\n\n", i4);
// or C-style:
volatile const int i5 = 5;
printf("%d\n", i5);
*(int*)(&i5) = 10;
printf("%d\n\n", i5);
return 0;
}
Sample output:
Hello World
5
5
5
7
5
8
5
9
5
10
Notes:
- I've also noticed that the above works when modifying
const
class members even when they are NOT volatile
. See my "std_optional_copy_test" program! Ex: https://onlinegdb.com/HkyNyTt4D. This, however, is probably undefined behavior. To make it well-defined, make the member variable volatile const
instead of just const
.
- The reason you don't have to cast from
volatile const int
to volatile int
(ie: why just to int
reference or int
pointer) works just fine, is because volatile
affects the reading of the variable, NOT the writing of the variable. So, so long as we read the variable through a volatile variable means, which we do, our reads are guaranteed not to be optimized out. That's what gives us the well-defined behavior. The writes always worked--even when the variable wasn't volatile
.
Refences:
- [my own answer] What uses are there for "placement new"?
- x86 Assembly Guide
- Change 'this' pointer of an object to point different object
- Compiler Explorer outputs, with assembly, from godbolt.org:
- Here: https://godbolt.org/z/EfPPba
- And here: https://godbolt.org/z/6K8dcG
- [my answer] Register-level GPIO access on STM32 microcontrollers: Programing STM32 like STM8(register level GPIO )