I believe I found a bug in GCC while implementing O'Neill's PCG PRNG. (Initial code on Godbolt's Compiler Explorer)
After multiplying oldstate by MULTIPLIER (result stored in rdi), GCC doesn't add that result to INCREMENT. Instead it movabs's INCREMENT into rdx, which then gets used as the return value of rand32_ret.state.
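For context, the state update described above is a linear congruential step. A minimal sketch of a PCG32-style generator that returns its new state in a struct might look like this. The struct layout, the function name rand32, and the two constants are assumptions for illustration, not the code behind the original link.

```c
#include <stdint.h>

/* Assumed constants: values commonly used as defaults for 64-bit PCG state. */
#define MULTIPLIER 6364136223846793005ULL
#define INCREMENT  1442695040888963407ULL

/* Hypothetical return type: 32-bit output first, 64-bit state second. */
struct rand32_ret {
    uint32_t value;
    uint64_t state;
};

struct rand32_ret rand32(uint64_t oldstate)
{
    struct rand32_ret ret;
    /* XSH-RR output permutation, computed from the old state. */
    uint32_t xorshifted = (uint32_t)(((oldstate >> 18) ^ oldstate) >> 27);
    uint32_t rot = (uint32_t)(oldstate >> 59);
    ret.value = (xorshifted >> rot) | (xorshifted << ((32 - rot) & 31));
    /* LCG step: this is the multiply-then-add that the question is about. */
    ret.state = oldstate * MULTIPLIER + INCREMENT;
    return ret;
}
```

The important part for this question is only the last assignment: the new state must depend on oldstate, so a compiler that returns the bare INCREMENT has dropped the multiply result.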
A minimal reproducible example (Compiler Explorer):
#include <stdint.h>

struct retstruct {
    uint32_t a;
    uint64_t b;
};

struct retstruct fn(uint64_t input)
{
    struct retstruct ret;
    ret.a = 0;
    ret.b = input * 11111111111 + 111111111111;
    return ret;
}
Generated assembly (GCC 9.2, x86_64, -O3):
fn:
    movabs  rdx, 11111111111     # multiplier constant (doesn't fit in imm32)
    xor     eax, eax             # ret.a = 0
    imul    rdi, rdx
    movabs  rdx, 111111111111    # add constant; one more 1 than multiplier
    # missing   add rdx, rdi     # ret.b=... that we get with clang or older gcc
    ret
# returns RDX:RAX = constant 111111111111 : 0
# independent of input RDI, and not using the imul result it just computed
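The miscompile can also be observed at run time rather than by reading the assembly. The sketch below is one possible check, not a definitive test: the helper name fn_matches_expected is hypothetical, and the noinline attribute is a GCC/Clang extension used here on the assumption that it keeps the call, and therefore the register return path, from being optimized away.

```c
#include <stdint.h>

struct retstruct {
    uint32_t a;
    uint64_t b;
};

/* noinline (GCC/Clang extension) so the struct is really returned through
   the calling convention instead of being folded into the caller. */
__attribute__((noinline))
struct retstruct fn(uint64_t input)
{
    struct retstruct ret;
    ret.a = 0;
    ret.b = input * 11111111111 + 111111111111;
    return ret;
}

/* Hypothetical helper: returns 1 if fn(input) yields the expected struct. */
int fn_matches_expected(uint64_t input)
{
    /* Reading through a volatile blocks constant propagation of the input. */
    volatile uint64_t opaque = input;
    uint64_t x = opaque;
    /* uint64_t arithmetic wraps modulo 2^64, so this is the reference value. */
    uint64_t expected = x * 11111111111 + 111111111111;
    struct retstruct r = fn(x);
    return r.a == 0 && r.b == expected;
}
```

With correct code generation the helper returns 1 for every input. With the assembly shown above, ret.b would be 111111111111 regardless of input, so the helper would return 0 for any input other than 0.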
Interestingly, modifying the struct to have the uint64_t as the first member produces correct code, as does changing both members to be uint64_t.
x86-64 System V does return structs of up to 16 bytes in RDX:RAX, when they're trivially copyable. In this case the 2nd member is in RDX because the high half of RAX is the padding for alignment of .b when .a is a narrower type. (sizeof(struct retstruct) is 16 either way; we're not using __attribute__((packed)) so it respects alignof(uint64_t) = 8.)
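The layout claim can be checked at compile time. This is a small sketch assuming a C11 compiler and the x86-64 System V ABI; on an ABI where uint64_t has 4-byte alignment inside structs, the assertions would fail. The helper padding_after_a is hypothetical and only there to make the padding size visible.

```c
#include <stddef.h>
#include <stdint.h>

struct retstruct {
    uint32_t a;
    uint64_t b;
};

/* .a occupies bytes 0..3, bytes 4..7 are padding, .b starts at byte 8. */
_Static_assert(offsetof(struct retstruct, a) == 0, "a is the first member");
_Static_assert(offsetof(struct retstruct, b) == 8, "b is aligned to 8 bytes");
_Static_assert(sizeof(struct retstruct) == 16, "two eightbytes in total");

/* Hypothetical helper: number of padding bytes between .a and .b. */
size_t padding_after_a(void)
{
    return offsetof(struct retstruct, b) - sizeof(uint32_t);
}
```

So the first eightbyte (returned in RAX) holds .a plus 4 bytes of padding, and the second eightbyte (returned in RDX) holds .b.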
Does this code contain any undefined behaviour that would allow GCC to emit the "incorrect" assembly?
If not, this should get reported on https://gcc.gnu.org/bugzilla/