The OpenGL specification lies (or is this a bug?)... Referring to the std140 layout for uniform blocks, it states:
"The set of rules shown in Table L-1 are used by the GLSL compiler to layout members in a std140-qualified uniform block. The offsets of members in the block are accumulated based on the sizes of the previous members in the block (those declared before the variable in question), and the starting offset. The starting offset of the first member is always zero.
Scalar variable type (bool, int, uint, float) - Size of the scalar in basic machine types"
(http://www.opengl-redbook.com/appendices/AppL.pdf)
So, armed with this information, I set up a uniform block in my shader that looks something like this:
// Spotlight.
layout (std140) uniform Spotlight
{
    float Light_Intensity;
    vec4 Light_Ambient;
    vec3 Light_Position;
};
... only to discover it doesn't work with the std140 layout I subsequently set up on the CPU side. That is, the first 4 bytes are a float (the size of the machine scalar type for a GLfloat), the next 16 bytes are a vec4, and the following 12 bytes are a vec3 (with 4 bytes left over at the end to account for the rule that a vec3 is really a vec4).
When I change the CPU side to treat the float as being the same size as a vec4, i.e. 16 bytes, and compute my offsets and buffer size under that assumption, the shader works as intended.
So, either the spec is wrong or I've misunderstood the meaning of "scalar" in this context, or ATI have a driver bug. Can anyone shed any light on this mystery?