OpenGL - Why is my ray picking not working?

Question

I recently set up a project that uses OpenGL (via the C# wrapper library OpenTK), which should do the following:

Create a perspective projection camera that lets the user rotate, move, etc. to look at my 3D models.
Draw some 3D objects.
Use 3D ray picking via unproject to let the user pick points/models in the 3D view.

The last step (ray picking) looks OK in my 3D preview (GLControl) but returns invalid results like Vector3d (1,86460186949617; -45,4086124979203; -45,0387025610247). I have no idea why this is the case!

I am using the following code to set up my viewport:

        this.RenderingControl.MakeCurrent();

        int w = RenderingControl.Width;
        int h = RenderingControl.Height;

        // Use all of the glControl painting area
        GL.Viewport(0, 0, w, h);

        GL.MatrixMode(MatrixMode.Projection);
        GL.LoadIdentity();
        Matrix4 p = Matrix4.CreatePerspectiveFieldOfView(MathHelper.PiOver4, w / (float)h, 0.1f, 64.0f);
        GL.LoadMatrix(ref p);

I use this method for unprojecting:

    /// <summary>
    /// This method maps window (screen) coordinates back to object coordinates.
    /// </summary>
    /// <param name="screen"></param>
    /// <param name="view"></param>
    /// <param name="projection"></param>
    /// <param name="view_port"></param>
    /// <returns></returns>
    private Vector3d UnProject(Vector3d screen, Matrix4d view, Matrix4d projection, int[] view_port)
    {
        Vector4d pos = new Vector4d();

        // Map x and y from window coordinates to the range -1 to 1
        pos.X = (screen.X - (float)view_port[0]) / (float)view_port[2] * 2.0f - 1.0f;
        pos.Y = (screen.Y - (float)view_port[1]) / (float)view_port[3] * 2.0f - 1.0f;
        pos.Z = screen.Z * 2.0f - 1.0f;
        pos.W = 1.0f;

        Vector4d pos2 = Vector4d.Transform(pos, Matrix4d.Invert(Matrix4d.Mult(view, projection)));
        Vector3d pos_out = new Vector3d(pos2.X, pos2.Y, pos2.Z);

        return pos_out / pos2.W;
    }

I use this code to position my camera (including rotation) and do the ray picking:

        // Clear buffers
        GL.Clear(ClearBufferMask.ColorBufferBit | ClearBufferMask.DepthBufferBit);
        // Apply camera
        GL.MatrixMode(MatrixMode.Modelview);

        Matrix4d mv = Matrix4d.LookAt(EyePosition, Vector3d.Zero, Vector3d.UnitY);
        GL.LoadMatrix(ref mv);

        GL.Translate(0, 0, ZoomFactor);

        // Rotation animation
        if (RotationAnimationActive)
        {
            CameraRotX += 0.05f;
        }

        if (CameraRotX >= 360)
        {
            CameraRotX = 0;
        }

        GL.Rotate(CameraRotX, Vector3.UnitY);
        GL.Rotate(CameraRotY, Vector3.UnitX);

        // Apply useful rotation
        GL.Rotate(50, 90, 30, 0f);

        // Draw Axes
        drawAxes();

        // Draw vertices of my 3d objects ...

        // Picking Test
        int x = MouseX;
        int y = MouseY;

        int[] viewport = new int[4];
        Matrix4d modelviewMatrix, projectionMatrix;
        GL.GetDouble(GetPName.ModelviewMatrix, out modelviewMatrix);
        GL.GetDouble(GetPName.ProjectionMatrix, out projectionMatrix);
        GL.GetInteger(GetPName.Viewport, viewport);

        // get depth of clicked pixel
        float[] t = new float[1];
        GL.ReadPixels(x, RenderingControl.Height - y, 1, 1, OpenTK.Graphics.OpenGL.PixelFormat.DepthComponent, PixelType.Float, t);

        var res = UnProject(new Vector3d(x, viewport[3] - y, t[0]), modelviewMatrix, projectionMatrix, viewport);

        GL.Begin(BeginMode.Lines);
        GL.Color3(Color.Yellow);

        GL.Vertex3(0, 0, 0);

        GL.Vertex3(res);

        Debug.WriteLine(res.ToString());
        GL.End();

I get the following result from my ray picker:

Clicked Position = (1,86460186949617; -45,4086124979203; -45,0387025610247)

This vector is shown as the yellow line on the attached screenshot.

Why are the Y and Z positions not in the range -1/+1? Where do values like -45 come from, and why is the ray rendered correctly on the screen?

Even if you only have a tip about what could be broken, I would appreciate your reply!

Screenshot:

Rethunk · Accepted Answer · 2016-03-02T05:38:46

If you break down the transform from screen to world into individual matrices, print out the inverses of the M, V, and P matrices, and print out the intermediate result of each (matrix inverse) * (point) calculation from screen to world/model, then I think you'll see the problem. Or at least you'll see that there is a problem with using the inverse of the M-V-P matrix and then intuitively grasp the solution. Or maybe just read the list of steps below and see if that helps.
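As a rough illustration of that debugging approach, the sketch below pushes a known point forward through each stage, then reverses it one inverse at a time, printing every intermediate value. It is not code from the question: `PickingDebug`, `ProjectVerbose`, and `UnprojectVerbose` are hypothetical names, and `System.Numerics` types stand in for OpenTK's `Matrix4d`/`Vector4d` so the example runs without extra packages. It assumes the same row-vector convention the question's code uses (vector times matrix).

```csharp
using System;
using System.Numerics;

public static class PickingDebug
{
    // Forward pass: model -> world -> eye -> clip -> NDC, printing every stage.
    public static Vector3 ProjectVerbose(Vector3 p, Matrix4x4 model, Matrix4x4 view, Matrix4x4 projection)
    {
        Vector4 world = Vector4.Transform(new Vector4(p, 1f), model);
        Vector4 eye = Vector4.Transform(world, view);
        Vector4 clip = Vector4.Transform(eye, projection);
        Vector3 ndc = new Vector3(clip.X, clip.Y, clip.Z) / clip.W; // perspective divide
        Console.WriteLine($"world={world} eye={eye} clip={clip} ndc={ndc}");
        return ndc;
    }

    // Reverse pass: apply one inverse at a time and print after each multiplication.
    public static Vector3 UnprojectVerbose(Vector3 ndc, Matrix4x4 model, Matrix4x4 view, Matrix4x4 projection)
    {
        if (!Matrix4x4.Invert(projection, out Matrix4x4 invP) ||
            !Matrix4x4.Invert(view, out Matrix4x4 invV) ||
            !Matrix4x4.Invert(model, out Matrix4x4 invM))
            throw new ArgumentException("A matrix is not invertible.");

        Vector4 eye = Vector4.Transform(new Vector4(ndc, 1f), invP);
        Console.WriteLine($"after inverse projection (before divide)={eye}");
        eye /= eye.W; // divide by w to get back to eye space
        Vector4 world = Vector4.Transform(eye, invV);
        Vector4 local = Vector4.Transform(world, invM);
        Console.WriteLine($"eye={eye} world={world} model={local}");
        return new Vector3(local.X, local.Y, local.Z);
    }
}
```

If the point that comes back differs from the point that went in, the printed values should show which stage first diverges.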

Here's the approach I've used:

Convert the 2D vector for mouse position in screen/control/widget coordinates to the 4D vector (mouse.x, mouse.y, 0, 1).
Transform the 4D vector from screen coordinates to Normalized Device Coordinates (NDC) space. That is, multiply the inverse of your NDC-to-screen matrix [or equivalent equations] by (mouse.x, mouse.y, 0, 1) to yield a 4D vector in NDC coordinate space: (nx, ny, 0, 1).
In NDC coordinates, define two 4D vectors: the source (near point) of the ray as (nx, ny, -1, 1) and a far point at (nx, ny, +1, 1).
Multiply each 4D vector by the inverse of the (perspective) projection matrix.
Convert each 4D vector to a 3D vector (i.e. divide through by the fourth component, often called "w"). *
Multiply the 3D vectors by the inverse of the view matrix.
Multiply the 3D vectors by the inverse of the model matrix (which may well be the identity matrix).
Subtract the 3D vectors to yield the ray.
Normalize the ray.
Yee-haw. Go back and justify each step with math, if desired, or save that review for later [if ever] and work frantically towards catching up on creating actual 3D graphics and interaction and whatnot.
Go back and refactor, if desired.

(* The framework I use allows multiplication of a 3D vector by a 4x4 matrix because it treats the 3D vector as a 4D vector. I can make this more clear later, if necessary, but I hope the point is reasonably clear.)
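The steps listed above can be sketched roughly as follows. This is an illustrative reading of the list, not the answerer's actual code: `RayPicking`, `PerspectiveGL`, `ScreenToNdc`, and `MouseRay` are hypothetical names, and `System.Numerics` is used in place of OpenTK so the block is self-contained. It assumes mouse coordinates with the origin at the top-left (hence the Y flip), an OpenGL-style projection with NDC z in [-1, 1], and the row-vector convention (vector times matrix).

```csharp
using System;
using System.Numerics;

public static class RayPicking
{
    // OpenGL-style perspective matrix (NDC z in [-1, 1]), laid out for row vectors (v * M).
    public static Matrix4x4 PerspectiveGL(float fovY, float aspect, float near, float far)
    {
        float y = 1.0f / (float)Math.Tan(fovY * 0.5f);
        float x = y / aspect;
        float c = -(far + near) / (far - near);
        float d = -(2.0f * far * near) / (far - near);
        return new Matrix4x4(
            x, 0, 0, 0,
            0, y, 0, 0,
            0, 0, c, -1,
            0, 0, d, 0);
    }

    // Screen coordinates (origin top-left, Y down) to NDC x/y in [-1, 1].
    public static Vector2 ScreenToNdc(float mouseX, float mouseY, int width, int height)
    {
        float nx = 2.0f * mouseX / width - 1.0f;
        float ny = 1.0f - 2.0f * mouseY / height; // flip Y because screen Y grows downward
        return new Vector2(nx, ny);
    }

    // Build near and far points in NDC, take them back to model space, and form a normalized ray.
    public static void MouseRay(Vector2 ndc, Matrix4x4 model, Matrix4x4 view, Matrix4x4 projection,
                                out Vector3 origin, out Vector3 direction)
    {
        if (!Matrix4x4.Invert(projection, out Matrix4x4 invProj) ||
            !Matrix4x4.Invert(view, out Matrix4x4 invView) ||
            !Matrix4x4.Invert(model, out Matrix4x4 invModel))
            throw new ArgumentException("A matrix is not invertible.");

        Vector3 near = Unproject(new Vector4(ndc.X, ndc.Y, -1f, 1f), invProj, invView, invModel);
        Vector3 far = Unproject(new Vector4(ndc.X, ndc.Y, 1f, 1f), invProj, invView, invModel);
        origin = near;
        direction = Vector3.Normalize(far - near);
    }

    private static Vector3 Unproject(Vector4 ndcPoint, Matrix4x4 invProj, Matrix4x4 invView, Matrix4x4 invModel)
    {
        Vector4 eye = Vector4.Transform(ndcPoint, invProj);  // inverse projection
        eye /= eye.W;                                         // divide through by w
        Vector4 world = Vector4.Transform(eye, invView);      // inverse view
        Vector4 local = Vector4.Transform(world, invModel);   // inverse model (often identity)
        return new Vector3(local.X, local.Y, local.Z);
    }
}
```

With a combined modelview matrix, as read back from `GetPName.ModelviewMatrix` in the question, one option is to pass it as `view` and use the identity for `model`.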

That worked for me. This set of steps also works for Ortho projections, though with Ortho you could cheat and write simpler code since the projection matrix isn't goofy.
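One possible reading of the Ortho shortcut, sketched under assumptions: with an orthographic projection every ray points straight down the camera's -Z axis, so only the ray origin depends on the mouse position, and it can be found by linear interpolation instead of a matrix inverse. `OrthoPicking` and its parameters (`left`, `right`, `bottom`, `top`, `near`, matching the values passed when the orthographic projection was created) are hypothetical, and `System.Numerics` again stands in for OpenTK.

```csharp
using System;
using System.Numerics;

public static class OrthoPicking
{
    // ndc: mouse position already mapped to [-1, 1] in x and y.
    public static void MouseRay(Vector2 ndc, float left, float right, float bottom, float top, float near,
                                Matrix4x4 view, out Vector3 origin, out Vector3 direction)
    {
        if (!Matrix4x4.Invert(view, out Matrix4x4 invView))
            throw new ArgumentException("View matrix is not invertible.");

        // Linear map from NDC to eye space; no perspective divide is needed.
        float ex = left + (ndc.X + 1f) * 0.5f * (right - left);
        float ey = bottom + (ndc.Y + 1f) * 0.5f * (top - bottom);
        Vector3 eyeOrigin = new Vector3(ex, ey, -near);

        // Points pick up the camera translation; directions must not.
        origin = Vector3.Transform(eyeOrigin, invView);
        direction = Vector3.Normalize(Vector3.TransformNormal(new Vector3(0f, 0f, -1f), invView));
    }
}
```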

It's late as I write this and I may have misinterpreted your problem. I may have also misinterpreted your code since I use a different UI framework. But I know how aggravating ray casting for OpenGL can be, so I'm posting in the hope that at least some of what I write is useful, and that I can thereby alleviate some human misery.

Postscript. Speaking of misery: I found numerous forum posts and blog pages that address ray casting for OpenGL, but most posts start with some variant of the following: "First, you have to know X" [where X is not necessary to know]; or "Go look at the unproject function [in library X in repository Y for which you'll need client app Z ...]"; or a particular favorite of mine: "Review a textbook on linear algebra."

Having to slog through yet another description of the OpenGL rendering pipeline or the OpenGL transformation conga line when you just need to debug ray casting--a common problem--is like having to listen to a lecture on hydraulics when you discover your brake pedal isn't working.
