Math behind the mouse input in opengl and the yaw/pitch values

Question

Hi, I was trying out some C++ OpenGL code and made a camera. I wanted to add mouse input to look around the scene, so I added this code, following the tutorial here: https://learnopengl.com/Getting-started/Camera

However, there are some mathematical concepts I do not understand regarding the yaw and pitch values. Here is the callback function of the mouse movement.

void mouse_callback(GLFWwindow* window, double xpos, double ypos) {
    if (firstMouse) //preventing large jumps
    {
        lastX = xpos;
        lastY = ypos;
        firstMouse = false;
    }

    float xoffset = xpos - lastX;
    float yoffset = lastY - ypos;
    lastX = xpos;
    lastY = ypos;

    float sensitivity = 0.1f;
    xoffset *= sensitivity;
    yoffset *= sensitivity;

    yaw += xoffset;
    pitch += yoffset;

    if (pitch > 89.0f)
        pitch = 89.0f;
    if (pitch < -89.0f)
        pitch = -89.0f;

    glm::vec3 front;
    front.x = cos(glm::radians(yaw)) * cos(glm::radians(pitch));
    front.y = sin(glm::radians(pitch));
    front.z = cos(glm::radians(pitch)) * sin(glm::radians(yaw));

    //cameraFront is the direction vector of the camera. Where the camera is looking at
    cameraFront = glm::normalize(front);
}

Here are the global variables with their initial values used in the mouse_callback function just in case.

glm::vec3 cameraPos = glm::vec3(0.0f, 0.0f, 3.0f);
glm::vec3 cameraFront = glm::vec3(0.0f, 0.0f, -1.0f);
glm::vec3 cameraUp = glm::vec3(0.0f, 1.0f, 0.0f);
glm::mat4 view = glm::lookAt(cameraPos, cameraPos + cameraFront, cameraUp);
//lookAt function requires position, target's position, up vector respectively.

There are 2 things I don't understand. As I understand, yaw and pitch are calculated from the y axis and the x axis respectively. And by using our right hand and putting the thumb towards the + direction of the axis, the direction the other fingers are curved is the positive direction of the angle.

Now let's say that I moved the mouse to the right without changing the yoffset. Then according to the mouse_callback function the yaw value increases since xoffset is positive. Since the positive direction of the y axis points to the top of the window we're watching, the increase in yaw means that the direction vector of the camera should rotate to the left direction right? But in the program, the camera turns and shows the right part of the scene. I don't understand what's happening.

If I know what's happening here, I think I can understand why the calculation order to get yoffset is different from getting the xoffset value as well.

Any help would be greatly appreciated. Tell me if I'm mistaking any mathematical concepts or something. Thanks in advance!

derhass · Accepted Answer · 2018-07-14T16:57:06

Since the positive direction of the y axis points to the top of the window we're watching, the increase in yaw means that the direction vector of the camera should rotate to the left direction right?

No. The direction of the y axis has nothing to do with anything here. With pitch left at 0, the given formulas reduce to:

front.x = cos(glm::radians(yaw));
front.y = 0;
front.z = sin(glm::radians(yaw));

So if yaw is 0, you end up with (1,0,0) (right). If you increase it to 90 degrees, you end up with (0,0,1), which points straight to the back in a right-handed coordinate system, so you have just turned to the right.
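
To check these values numerically, here is a minimal sketch that evaluates the same formulas as mouse_callback without GLM. The Vec3 struct and the frontFromAngles helper are hypothetical names introduced only for this illustration; angles are in degrees, as in the question.

```cpp
#include <cmath>

// Hypothetical stand-in for glm::vec3, used only in this illustration.
struct Vec3 { float x, y, z; };

// Same formulas as in mouse_callback; yaw and pitch are given in degrees.
Vec3 frontFromAngles(float yawDeg, float pitchDeg) {
    const float kDegToRad = 3.14159265358979f / 180.0f;
    const float yaw = yawDeg * kDegToRad;
    const float pitch = pitchDeg * kDegToRad;
    return { std::cos(yaw) * std::cos(pitch),
             std::sin(pitch),
             std::cos(pitch) * std::sin(yaw) };
}
```

Up to floating-point rounding, frontFromAngles(-90, 0) gives (0,0,-1) (the default view direction), frontFromAngles(0, 0) gives (1,0,0), and frontFromAngles(90, 0) gives (0,0,1). Increasing yaw therefore sweeps the view direction from forward to right to back.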

You seem to relate this to the positive rotation orientation, which is always from z to x when rotating around y. But these formulas do not implement a positive rotation around the y axis by the angle yaw; they actually rotate by -yaw. Since the system is set up to yield +x at angle 0, we can consider this as a rotation of the forward direction vector v = (1,0,0) around the y axis. The classic rotation matrix for a positive rotation would yield:

      (  cos(yaw)      0     sin(yaw) )           ( cos(yaw) )
v' =  (     0          1        0     ) * v   =   (     0    )
      (  -sin(yaw)     0     cos(yaw) )           (-sin(yaw) )

However, when you rotate in the negative rotation direction, you end up with the transposed matrix, which just flips the sign of the sin terms:

      (  cos(yaw)      0    -sin(yaw) )           ( cos(yaw) )
v' =  (     0          1        0     ) * v   =   (     0    )
      (  sin(yaw)      0     cos(yaw) )           ( sin(yaw) )

So it is just

front = R_y(-yaw) * (1,0,0)^T

If you look at the complete formula with pitch and yaw, you will notice that it is equal to:

front = R_y(-yaw) * R_z(pitch) * (1,0,0)^T

which is just a compound rotation: first rotating (1,0,0) around the z axis by angle pitch in positive winding order, and then rotating the result around the y axis by angle yaw in negative order.
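
The compound rotation can be verified with a small sketch. The Vec3 struct and the rotateY, rotateZ and frontCompound helpers are hypothetical names made up for this check; the two rotation functions implement the standard positive (right-handed) rotations, with angles in radians.

```cpp
#include <cmath>

// Hypothetical stand-in for glm::vec3, used only in this illustration.
struct Vec3 { float x, y, z; };

// Positive rotation around the y axis (takes +z towards +x), angle in radians.
Vec3 rotateY(const Vec3& v, float a) {
    return {  std::cos(a) * v.x + std::sin(a) * v.z,
              v.y,
             -std::sin(a) * v.x + std::cos(a) * v.z };
}

// Positive rotation around the z axis (takes +x towards +y), angle in radians.
Vec3 rotateZ(const Vec3& v, float a) {
    return { std::cos(a) * v.x - std::sin(a) * v.y,
             std::sin(a) * v.x + std::cos(a) * v.y,
             v.z };
}

// front = R_y(-yaw) * R_z(pitch) * (1,0,0)^T
Vec3 frontCompound(float yaw, float pitch) {
    const Vec3 start = { 1.0f, 0.0f, 0.0f };
    return rotateY(rotateZ(start, pitch), -yaw);
}
```

For any yaw and pitch, the result should match (cos(yaw)cos(pitch), sin(pitch), sin(yaw)cos(pitch)) up to rounding, which is exactly the vector computed in mouse_callback.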

I also think that the author of the source code you're citing here was either a) in a hurry, or b) just a bit confused about how the math works out here. I say that for two reasons:

1. The default direction is given as (0,0,-1), but the Euler angles are set up so that pitch=0, yaw=0 results in a (1,0,0) viewing direction, with the default being yaw=-90. One could have formulated it in a cleaner and more intuitive way, so that zero angles yield the default forward-looking direction.
2. The usage of lookAt is completely unnecessary here. The ortho-normalization it does internally is just a waste of processing power (not a big one by today's standards, but nonetheless). The usage of (0,1,0) as up vector becomes very unstable near the poles, and limiting pitch to [-89,89] is just a hack to prevent that from happening. There is actually nothing wrong with having the camera look straight up or down in this navigation model (since you only move along a 2D plane, the forward direction is still well-defined by yaw alone, even when looking straight up or down). The gimbal lock induced by that situation is also not relevant, as there simply is no third rotation following.
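
Regarding the first point, one possible reformulation (a sketch, not the tutorial's code) shifts the yaw origin by 90 degrees, so that yaw=0, pitch=0 yields (0,0,-1) and a positive yaw still turns to the right. The Vec3 struct and the frontZeroIsForward helper are hypothetical names; angles are in degrees.

```cpp
#include <cmath>

// Hypothetical stand-in for glm::vec3, used only in this illustration.
struct Vec3 { float x, y, z; };

// Assumed alternative convention: zero angles look down -z.
// This equals the tutorial formula evaluated at (yaw - 90 degrees).
Vec3 frontZeroIsForward(float yawDeg, float pitchDeg) {
    const float kDegToRad = 3.14159265358979f / 180.0f;
    const float yaw = yawDeg * kDegToRad;
    const float pitch = pitchDeg * kDegToRad;
    return {  std::sin(yaw) * std::cos(pitch),
              std::sin(pitch),
             -std::cos(yaw) * std::cos(pitch) };
}
```

With this convention the initial yaw can simply be 0 instead of -90, and frontZeroIsForward(90, 0) points along +x, i.e. to the right.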

It is indeed much easier to just create the view matrix directly from the two rotation angles and camera position, and completely avoid any issues near or at the full 90 degrees.
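
A minimal sketch of that idea, keeping the tutorial's angle convention (degrees, yaw=-90 looks down -z). The Vec3 and Mat4 types and the viewFromYawPitch and transformPoint helpers are hypothetical names for this illustration. The matrix is stored row-major as m[row][col]; GLM stores matrices column-major, so it would need transposing before being handed to GLM-style code.

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-ins for glm types, used only in this illustration.
struct Vec3 { float x, y, z; };
using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major: m[row][col]

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Builds the view matrix from the camera position and the two angles.
// No cross product or normalization is needed, so pitch = +/-90 is fine.
Mat4 viewFromYawPitch(const Vec3& pos, float yawDeg, float pitchDeg) {
    const float kDegToRad = 3.14159265358979f / 180.0f;
    const float cy = std::cos(yawDeg * kDegToRad);
    const float sy = std::sin(yawDeg * kDegToRad);
    const float cp = std::cos(pitchDeg * kDegToRad);
    const float sp = std::sin(pitchDeg * kDegToRad);

    // Orthonormal camera basis, written out analytically.
    const Vec3 front = {  cy * cp, sp,  sy * cp };
    const Vec3 right = { -sy,      0.0f, cy      };  // depends on yaw only
    const Vec3 up    = { -cy * sp, cp, -sy * sp };   // equals right x front

    Mat4 m{};
    m[0] = {  right.x,  right.y,  right.z, -dot(right, pos) };
    m[1] = {  up.x,     up.y,     up.z,    -dot(up, pos)    };
    m[2] = { -front.x, -front.y, -front.z,  dot(front, pos) };
    m[3] = {  0.0f,     0.0f,     0.0f,     1.0f            };
    return m;
}

// Applies the matrix to a point (w = 1); handy for checking the result.
Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    return { m[0][0] * p.x + m[0][1] * p.y + m[0][2] * p.z + m[0][3],
             m[1][0] * p.x + m[1][1] * p.y + m[1][2] * p.z + m[1][3],
             m[2][0] * p.x + m[2][1] * p.y + m[2][2] * p.z + m[2][3] };
}
```

For pitch strictly between -90 and 90 degrees, right and up here are the same vectors lookAt derives from cameraFront and the (0,1,0) up vector, so the result should agree with the lookAt version. At exactly 90 degrees the basis stays well-defined because right depends only on yaw.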
