I'm having trouble passing a 3x3 matrix through a constant buffer to my shader in DirectX. This is how I define my constant buffer:

In .cpp:

struct PostProcessConvolutionCB {
    float screenWidth;
    float screenHeight;
    float sum;
    XMFLOAT3X3 kernel;
};

In .hlsl:

struct PostProcessConvolutionCB {
    float screenWidth;
    float screenHeight;
    float sum;
    float3x3 kernel;
};
ConstantBuffer<PostProcessConvolutionCB> cb : register(b0);

struct PixelShaderInput {
    float4 Position : SV_Position;
};

float4 main(PixelShaderInput IN) : SV_Target {
    return float4(cb.kernel[1][1], 0.f, 0.f, 1.f);
}

It seems that the shader reads the wrong values for some elements. To test this, I initialized the matrix in the constant buffer with XMFLOAT3X3(0.1f, 0.2f, 0.3f, 0.4f, 0.5f, 0.6f, 0.7f, 0.8f, 0.9f); and displayed the value of each element by hardcoding the matrix indices in the shader, as in the HLSL snippet above (cb.kernel[1][1]). After 9 runs I got the following results:

kernel[0][0] = 0.1
kernel[1][0] = 0.2
kernel[2][0] = 0.3

kernel[0][1] = 0.5
kernel[1][1] = 0.6
kernel[2][1] = 0.7

kernel[0][2] = 0.9
kernel[1][2] = 1.0
kernel[2][2] = 1.0

It seems that every row is aligned to 4 floats. Changing the matrix to 4x4 helps, but I assume there has to be a way to use the float3x3 type.

How do I handle this properly?

1 Answer

The issue you are hitting is that the HLSL packing rules differ from those of C++. See Microsoft Docs:

HLSL packing rules are similar to performing a #pragma pack 4 with Visual Studio, which packs data into 4-byte boundaries. Additionally, HLSL packs data so that it does not cross a 16-byte boundary.
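To make the effect of these rules concrete, here is a minimal sketch of a CPU-side struct padded by hand to match the register layout. It assumes the float3x3 starts on a fresh 16-byte register and that each of its three vectors occupies its own register; PaddedRow, PostProcessConvolutionCBPadded, and SetKernel are hypothetical names, and plain floats stand in for the DirectXMath types so the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for one 16-byte constant register (hypothetical helper,
// not a DirectXMath type): three used lanes plus one unused lane.
struct PaddedRow {
    float x, y, z;
    float pad;
};

// CPU-side mirror of the constant buffer, padded by hand.
// Assumption: the matrix begins on a new 16-byte register and each
// of its three vectors takes a full register.
struct PostProcessConvolutionCBPadded {
    float screenWidth;
    float screenHeight;
    float sum;
    float pad0;           // fills the remainder of the first register
    PaddedRow kernel[3];  // three registers, one vector each
};

static_assert(sizeof(PaddedRow) == 16, "one register is 16 bytes");
static_assert(offsetof(PostProcessConvolutionCBPadded, kernel) == 16,
              "matrix starts on the second register");
static_assert(sizeof(PostProcessConvolutionCBPadded) % 16 == 0,
              "constant buffer sizes must be multiples of 16 bytes");

// Copies nine tightly packed floats into the padded layout,
// three floats per register.
inline void SetKernel(PostProcessConvolutionCBPadded& cb, const float (&m)[9]) {
    for (int v = 0; v < 3; ++v) {
        cb.kernel[v] = PaddedRow{ m[v * 3 + 0], m[v * 3 + 1], m[v * 3 + 2], 0.0f };
    }
}
```

Whether each register is read as a row or as a column of the shader's matrix depends on the matrix packing order in HLSL, so the nine values may still need transposing before the copy.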

Also keep in mind that by default, HLSL uses 'column-major' matrices while DirectXMath uses 'row-major'. This is why you see lots of samples that transpose the matrix going from XMFLOAT?X? to an HLSL constant buffer struct. See Microsoft Docs.
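The reason for the transpose can be sketched without any DirectX types. The snippet below is only an illustration: Mat3, Transpose, and ReadColumnMajor are hypothetical names, register padding is ignored, and the column-major reader is a simplified model of how the shader indexes the stored data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major 3x3 matrix: element (r, c) lives at index r * 3 + c.
using Mat3 = std::array<float, 9>;

// Returns the transpose of a row-major 3x3 matrix.
inline Mat3 Transpose(const Mat3& m) {
    Mat3 t{};
    for (std::size_t r = 0; r < 3; ++r) {
        for (std::size_t c = 0; c < 3; ++c) {
            t[c * 3 + r] = m[r * 3 + c];
        }
    }
    return t;
}

// Simplified model of a column-major reader: consecutive groups of
// three floats are treated as columns, so element (r, c) is lane r
// of vector c.
inline float ReadColumnMajor(const Mat3& stored, std::size_t r, std::size_t c) {
    return stored[c * 3 + r];
}
```

In this model, ReadColumnMajor(m, 1, 0) on the untransposed data returns m[1] (row 0, column 1 of the original), while ReadColumnMajor(Transpose(m), r, c) returns the original element (r, c). Declaring the matrix row_major in HLSL is the other common way to get the same agreement.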

Generally, your best option is to use XMFLOAT4X4 for HLSL matrices. One way to save a little constant buffer memory (useful for skinning in particular, where you have many bones whose transforms do not include projection) is to store each bone as three four-component rows. The constant buffer struct on the C++ side looks like this:

struct SkinnedEffectConstants
{
…
    XMVECTOR bones[MaxBones][3];
};

Then, in C++, you fill it like this:

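for (size_t i = 0; i < count; i++)
{
    // Transpose so that each of the first three rows holds one column of the 4x3 bone matrix.
    XMMATRIX boneMatrix = XMMatrixTranspose(XMLoadFloat4x3(…));

    // Store only three rows; the fourth row of the transposed matrix is always (0, 0, 0, 1).
    boneConstant[i][0] = boneMatrix.r[0];
    boneConstant[i][1] = boneMatrix.r[1];
    boneConstant[i][2] = boneMatrix.r[2];
}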
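As a rough illustration of what the loop stores per bone, here is a self-contained sketch that uses plain arrays in place of the DirectXMath types. Float4x3, PackedBone, and PackBone are hypothetical names, and the sketch assumes a 4-row by 3-column affine matrix with the translation in the last row.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// 4 rows x 3 columns, row-major; row 3 holds the translation
// (a stand-in for the layout of a 4x3 affine bone matrix).
using Float4x3 = std::array<std::array<float, 3>, 4>;

// Three four-component rows: 48 bytes per bone instead of 64.
using PackedBone = std::array<std::array<float, 4>, 3>;

// Transposes the 4x3 matrix into three float4 rows. Row c of the
// result is column c of the input, with the translation component
// in the fourth lane.
inline PackedBone PackBone(const Float4x3& m) {
    PackedBone out{};
    for (std::size_t c = 0; c < 3; ++c) {
        for (std::size_t r = 0; r < 4; ++r) {
            out[c][r] = m[r][c];
        }
    }
    return out;
}
```

The fourth row of the full transposed matrix is omitted because it is constant for an affine transform, which is where the saving of one register per bone comes from.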