So, I'm writing a math library using SSE intrinsics for my OpenGL application. Right now I'm implementing some of the more important functions, like lookAt, and using the glm library to check for correctness, but for some reason my implementation of lookAt isn't producing the results it should.
Here's the source code:
inline void lookAt(__m128 position, __m128 target, __m128 up)
{
    /* Get the target vector relative to the camera position. */
    __m128 t = vec4::normalize3(_mm_sub_ps(target, position));
    __m128 u = vec4::normalize3(up);
    /* Get the right vector by crossing target and up. */
    __m128 r = vec4::normalize3(vec4::cross(t, u));
    /* Correct the up vector by crossing right and target. */
    u = vec4::cross(r, t);
    /* Negate the target vector. */
    t = _mm_sub_ps(_mm_setzero_ps(), t);
    /* Treat the right, up, and target vectors as a row-major matrix and
       transpose it. Conveniently, this also sets the w component of
       r, u, and t to 0.0f. (_MM_TRANSPOSE4_PS writes back to all four
       arguments, so the last row has to be a named variable.) */
    __m128 bottom = _mm_setr_ps(0.0f, 0.0f, 0.0f, 1.0f);
    _MM_TRANSPOSE4_PS(r, u, t, bottom);
    vec4 pos = _mm_sub_ps(_mm_setzero_ps(), position);
    pos.w = 1.0f;
    /* Multiply our matrix by the transposed vectors. */
    mat4 temp;
    temp.col0 = r;
    temp.col1 = u;
    temp.col2 = t;
    temp.col3 = _mm_setr_ps(0.0f, 0.0f, 0.0f, 1.0f);
    multiply(temp);
    translate(pos);
}
My matrices are column-major, stored internally as "__m128 col0, col1, col2, col3;".
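For context, the types involved look roughly like this, trimmed down to just what lookAt relies on (the real vec4 keeps more SSE plumbing around; this sketch only shows its shape):

#include <xmmintrin.h>

struct mat4
{
    __m128 col0, col1, col2, col3;
    /* ... constructors, operators, etc. ... */
};

struct vec4
{
    float x, y, z, w;
    vec4(__m128 v) { _mm_storeu_ps(&x, v); }  /* implicit conversion from __m128 */
    static __m128 normalize3(const __m128& vec);
    static __m128 cross(const __m128& a, const __m128& b);
};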
I wrote this after reading the man page for gluLookAt. Once I realized that the right, up, and target vectors looked an awful lot like a row-major matrix, it was simple to transpose them so I could assign them to the rotation matrix.
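Written out in plain scalars, my reading of the man page is the following (untested reference code, no SSE, column-major array like my matrices):

#include <cmath>

/* Reference-only transcription of the rotation part of gluLookAt. */
void lookAtRotation(float m[16],                    /* column-major 4x4 */
                    float ex, float ey, float ez,   /* eye / position   */
                    float cx, float cy, float cz,   /* center / target  */
                    float ux, float uy, float uz)   /* up               */
{
    /* f = normalize(center - eye) */
    float fx = cx - ex, fy = cy - ey, fz = cz - ez;
    float fl = std::sqrt(fx * fx + fy * fy + fz * fz);
    fx /= fl; fy /= fl; fz /= fl;
    /* s = normalize(f x up) -- the "right" vector */
    float sx = fy * uz - fz * uy;
    float sy = fz * ux - fx * uz;
    float sz = fx * uy - fy * ux;
    float sl = std::sqrt(sx * sx + sy * sy + sz * sz);
    sx /= sl; sy /= sl; sz /= sl;
    /* u = s x f -- the corrected "up" vector */
    float upx = sy * fz - sz * fy;
    float upy = sz * fx - sx * fz;
    float upz = sx * fy - sy * fx;
    /* Rows of the rotation are s, u, and -f; writing them into a
       column-major array is exactly the transpose step above. */
    m[0] = sx;   m[4] = sy;   m[8]  = sz;   m[12] = 0.0f;
    m[1] = upx;  m[5] = upy;  m[9]  = upz;  m[13] = 0.0f;
    m[2] = -fx;  m[6] = -fy;  m[10] = -fz;  m[14] = 0.0f;
    m[3] = 0.0f; m[7] = 0.0f; m[11] = 0.0f; m[15] = 1.0f;
    /* gluLookAt then multiplies this onto the current matrix and
       calls glTranslated(-ex, -ey, -ez). */
}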
The code for normalize3, in case it helps:
inline static __m128 normalize3(const __m128& vec)
{
    __m128 v = _mm_mul_ps(vec, vec);
    v = _mm_add_ps(
            _mm_add_ps(
                _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0)),
                _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 1, 1, 1))),
            _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 2, 2, 2)));
    return _mm_mul_ps(vec, _mm_rsqrt_ps(v));
}
It saves a couple of calls by leaving the w component out of the length calculation.
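Since _mm_rsqrt_ps is only an approximation (roughly 12 bits of precision), small rounding differences are expected; if that ever matters, an exact but slower variant would look something like this (the name is just for illustration):

/* Same x/y/z dot product as above, but a real square root and divide
   instead of the rsqrt estimate. */
inline static __m128 normalize3_precise(const __m128& vec)
{
    __m128 v = _mm_mul_ps(vec, vec);
    v = _mm_add_ps(
            _mm_add_ps(
                _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0)),
                _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 1, 1, 1))),
            _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 2, 2, 2)));
    return _mm_div_ps(vec, _mm_sqrt_ps(v));
}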
What am I doing wrong?
Here's some sample output. Using position(5.0, 5.0, 0.0), target(10.0, 20.0, 55.0), and up(0.0, 1.0, 0.0), I get:
From GLM:
- [-0.9959] [ 0.0000] [ 0.0905] [ 4.9795]
- [-0.0237] [ 0.9650] [-0.2610] [-4.7065]
- [-0.0874] [-0.2621] [-0.9611] [ 1.7474]
- [ 0.0000] [ 0.0000] [ 0.0000] [ 1.0000]
From my lookAt():
- [-0.9959] [ 0.0000] [ 0.0905] [-5.0000]
- [-0.0237] [ 0.9651] [-0.2610] [-5.0000]
- [-0.0874] [-0.2621] [-0.9611] [ 0.0000]
- [ 0.0000] [ 0.0000] [ 0.0000] [ 1.0000]
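For reference, the GLM numbers above come from a call along these lines (GLM's lookAt from the matrix_transform extension; the printing loop is just for illustration):

#include <cstdio>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

int main()
{
    glm::mat4 m = glm::lookAt(glm::vec3( 5.0f,  5.0f,  0.0f),   /* position */
                              glm::vec3(10.0f, 20.0f, 55.0f),   /* target   */
                              glm::vec3( 0.0f,  1.0f,  0.0f));  /* up       */
    /* glm::mat4 is column-major, so m[col][row]; print row by row. */
    for (int row = 0; row < 4; ++row)
        std::printf("[%8.4f] [%8.4f] [%8.4f] [%8.4f]\n",
                    m[0][row], m[1][row], m[2][row], m[3][row]);
    return 0;
}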
It seems that the only real difference is in the last column (the translation), but I'm honestly not sure which of the two is correct. I'm inclined to say that GLM's is, since it was designed to be identical to the glu version.
EDIT: I just discovered something interesting. If I call "translate(pos);" before calling "multiply(temp);", my resulting matrix is exactly the same as glm's. Which is correct? According to the OpenGL man page on gluLookAt, this (and thus glm) is doing it backwards. Was I doing it right before, or is it correct now?
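Here is a quick standalone check of the two composition orders, using the rotation rows both matrices agree on and the position from above (plain floats, just to see which order produces which last column):

#include <cstdio>

int main()
{
    /* Rotation rows that both matrices agree on (taken from the output above). */
    const float r0[3]  = { -0.9959f,  0.0000f,  0.0905f };
    const float r1[3]  = { -0.0237f,  0.9650f, -0.2610f };
    const float r2[3]  = { -0.0874f, -0.2621f, -0.9611f };
    const float eye[3] = { 5.0f, 5.0f, 0.0f };

    /* R * T(-eye): the translation column becomes R * (-eye). */
    std::printf("R*T(-eye): %f %f %f\n",
                -(r0[0] * eye[0] + r0[1] * eye[1] + r0[2] * eye[2]),
                -(r1[0] * eye[0] + r1[1] * eye[1] + r1[2] * eye[2]),
                -(r2[0] * eye[0] + r2[1] * eye[1] + r2[2] * eye[2]));
    /* T(-eye) * R: the translation column stays -eye. */
    std::printf("T(-eye)*R: %f %f %f\n", -eye[0], -eye[1], -eye[2]);
    return 0;
}

The first line reproduces GLM's last column and the second reproduces mine, so the two results differ only in which side the translation gets composed on.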
Comments:
- Andon M. Coleman: Use gluLookAt(...)'s output as a reference.
- Ben Jackson: normalize3 does "normalize" the w values when I believe it should be ignoring them.
- Z boson: The _mm_rsqrt_ps(v) intrinsic is not accurate enough.