I'm writing a compiler that, within a {} scope, largely matches C99's semantics. While trying to reverse-engineer how gcc handles certain cases of 'undefined behaviour', specifically chained pre- and post-increments of variables, I noticed that it gets hopelessly confused when these are combined with compound assignments (e.g., "*=") and array access. Reduced to the simplest case of apparent utter confusion, gcc 4.6.3 (with and without the option -std=c99) evaluates:
a[0] = 2;
a[0] *= a[0]++;
to
a[0] = 3.
Am I misremembering the standard? Is any use of a pre- or post-increment already undefined, not only chained use in a compound expression?
Also, even if the behaviour is 'undefined', the above seems like a particularly poor way of calculating the result. I could only see how you'd justify a result of 5 (= 2*2 + 1, which is what I would have implemented: post-increment after the assignment statement) or 6 (= 3 * 2: use the variable, then immediately post-increment it, and process in the order of parsing, since the parser is almost certain to evaluate the "*=" after evaluating the RHS expression). Any insight into this, from a C or C++ angle?
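To make the two candidate results concrete, here is a minimal sketch that spells each reading out as a sequence of well-defined statements on a plain int. The function names are made up for illustration, and neither function is a claim about what any compiler actually does with the original expression.

```c
/* Reading 1: multiply by the old value, then apply the post-increment
   after the assignment has completed. */
static int increment_after_assignment(int x)
{
    int old = x;   /* value yielded by x++ */
    x = x * old;   /* the "*=" part */
    x = x + 1;     /* delayed side effect of x++ */
    return x;
}

/* Reading 2: evaluate x++ completely (value and side effect) first,
   then perform the "*=" on the already incremented variable. */
static int increment_before_multiply(int x)
{
    int old = x;   /* value yielded by x++ */
    x = x + 1;     /* side effect applied immediately */
    x = x * old;   /* the "*=" part sees the incremented x */
    return x;
}
```

Starting from 2, the first function returns 5 and the second returns 6, matching the two values mentioned above.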
I noticed this when trying to combine arrays with integer expression bounds with pre- and post-increments, and realize this is really hard; but still, the above seems a little bit like a cop-out considering the flagship status of gcc.
This is under Ubuntu 12.04.
Edit: I should have added that gcc's behavior can be reverse-engineered if the variable is not an array element - at least all examples I tried work as follows: (1) evaluate all compound expression pre-increments; (2) evaluate the expression; (3) evaluate all compound expression post-increments. So it probably has to do with the 'really hard' above as well.
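As a rough illustration of the three steps I observed for scalars, here is how the hypothetical statement r = ++x + x++; would be rewritten under that model. The struct and function names are invented for this sketch, and the model only describes the examples I tried; it is not something the standard or gcc guarantees.

```c
/* Holds the final value of x and the value assigned to r. */
struct model_result {
    int x;
    int r;
};

/* Hand-written model of `r = ++x + x++;` following the three observed steps. */
static struct model_result three_step_model(int x)
{
    struct model_result out;
    x = x + 1;       /* step 1: apply all pre-increments */
    out.r = x + x;   /* step 2: evaluate the expression with the current x */
    x = x + 1;       /* step 3: apply all post-increments */
    out.x = x;
    return out;
}
```

With x starting at 2, this model gives r = 6 and leaves x at 4.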
Note: clang produces the philosophically reasonable value of 6. I ran more elaborate cases with clang, and am reasonably certain that it treats the array access and scalar case the same, and operates as in what I described above as the second philosophically reasonable way.
*= and ()++ are not ordered. Apparently in this case gcc chooses to do them left-to-right. – rici
__tmp = a[0] + 1; a[0] *= __tmp; a[0] = __tmp; which is just a valid ordering of instructions. – David Rodríguez - dribeas
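For reference, the statement sequence from the comment above can be replayed as ordinary, well-defined C. This sketch assumes the last statement in the comment is meant as an assignment (a single =), since only that reading ends with the observed value of 3; the function name is made up for illustration.

```c
/* Replays the ordering from the comment on the first element of an array.
   Assumption: the final statement is an assignment, not a comparison. */
static int replay_comment_sequence(int a[])
{
    int tmp = a[0] + 1;   /* incremented value computed up front */
    a[0] *= tmp;          /* product is stored, but only transiently */
    a[0] = tmp;           /* the increment's store lands last and wins */
    return a[0];
}
```

With a[0] initially 2, the product 6 is overwritten by the final store, leaving 3.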