
My goal is to compute a ray that points into the scene, for checking mouse clicks and the like. I'm not using a conventional perspective projection / camera; instead I'm using an oblique projection matrix (like a skewed orthographic projection) for my scene, with no camera (no view matrix). All the methods I found online are rather specific to perspective projections and cameras: they use the camera position directly as the ray origin, then compute the ray direction from the mouse position and the projection/view matrices. In my case, however (thinking about the projection in a real-world context), the ray origin should be computed from the mouse position, while the ray direction should be the same for all rays and be derivable directly from the projection matrix, but I just don't know how.

This is my oblique projection matrix if that's relevant:

fn oblique_projection(cam: &Camera) -> Mat4 {

    let w = cam.screen_width;
    let h = cam.screen_height;
    let near = cam.near;
    let far = cam.far;

    // normal orthographic projection matrix:
    let (left, right, bottom, top) = (-w / 2.0, w / 2.0, -h / 2.0, h / 2.0);
    let tx = -(right + left) / (right - left);
    let ty = -(top + bottom) / (top - bottom);
    let tz = -(far + near) / (far - near);

    let m1 = mat4![
        2.0 / (right - left), 0.0, 0.0, 0.0,
        0.0, 2.0 / (top - bottom), 0.0, 0.0,
        0.0, 0.0, -2.0 / (far - near), 0.0,
        tx, ty, tz, 1.0,
    ];

    // apply a skew matrix on top of the orthographic matrix to get an oblique projection matrix
    let a = -cam.z_scale * f32::cos(cam.angle);
    let b = -cam.z_scale * f32::sin(cam.angle);

    let m2 = mat4![
        1.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0,
        a, b, 1.0, 0.0,
        0.0, 0.0, 0.0, 1.0,
    ];

    return m1 * m2;

}

(Basically a skewed orthographic projection; the skew maps a point (x, y, z) to (x + a·z, y + b·z, z), which results in something like an isometric view.)

EDIT:

I found a solution that's very specific to my setup (my oblique projection):

let a = -cam.z_scale * f32::cos(cam.angle);
let b = -cam.z_scale * f32::sin(cam.angle);

// only the skew part of the projection matrix construction above
let skewed = mat4![
    1.0, 0.0, 0.0, 0.0,
    0.0, 1.0, 0.0, 0.0,
    a, b, 1.0, 0.0,
    0.0, 0.0, 0.0, 1.0,
];

// apply the skew to a unit forward vector (w = 0 keeps it a direction)
let ray_dir = (skewed * vec4!(0.0, 0.0, -1.0, 0.0)).xyz().normalized();

// unproject the mouse position through the full oblique projection
// matrix from the previous code block (a point on the near plane)
let mouse_pos_clip_space = screen_to_clip(mouse_pos);
let clip_coord = vec4!(mouse_pos_clip_space, -1.0, 1.0);
let ray_orig = (projection_matrix.inverse() * clip_coord).xyz();

return Ray {
    origin: ray_orig,
    dir: ray_dir,
};
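The screen_to_clip helper used above isn't shown; a minimal sketch (assuming glam's Vec2 and a mouse position in pixels, with the origin at the top-left and y pointing down, as most windowing APIs report it) could be:

use glam::Vec2;

// Pixel coordinates -> clip space: x and y both map to [-1, 1],
// with y flipped because window y grows downward.
fn screen_to_clip(mouse: Vec2, screen_w: f32, screen_h: f32) -> Vec2 {
    Vec2::new(
        2.0 * mouse.x / screen_w - 1.0,
        1.0 - 2.0 * mouse.y / screen_h,
    )
}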

So the idea is to first figure out how to construct a ray for an ordinary orthographic projection, then apply the skew.

This makes me think there isn't a general algorithm fn get_ray(proj: Mat4, view: Mat4) -> (ray_origin: Vec3, ray_dir: Vec3), because the way I construct the ray is vastly different from the way it's done for a traditional perspective projection + camera scene.


2 Answers

Answer 1:

I am going to answer this in the most general way, and you can decide whether it's helpful.

In OpenGL (and Vulkan, and probably other graphics APIs), there's no "camera".

Rather, you have a rectangular space that goes (in the case of OpenGL) from -1 to 1 in the x, y, and z directions (other APIs, such as Vulkan and Direct3D, use 0 to 1 for z).

Any vertex your vertex shader places inside that volume is rasterized; any vertex outside of it is discarded. In addition, fragments that are occluded (fail the depth test) are also discarded.

Why does this matter? Any "camera" is nothing but a transformation that takes an arbitrary point X and maps it to a new point X'. In other words, the classic MVP matrix is just taking the points of a model and making them fit inside the OpenGL prism in a particular way.

So in general a camera is just a function that maps a world point to a camera point or C(X) = X'.

That means that for any camera (including nonlinear cameras), an unprojection is equivalent to the inverse function C^{-1} that satisfies C^{-1}(X') = X.

Just answer the question already!

The camera in the normalized prism has the quirk that its rays are all parallel rather than converging to a single point. So in camera space, the ray for a given pixel (x, y) is just the line through (x, y, -1) (or +1) pointing along the z axis. If you want that ray in world coordinates, just multiply it by the inverse of your vertex transformation matrix.
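A minimal sketch of that idea, using glam types (mvp here stands in for whatever matrix your vertex shader applies; a -1..1 NDC cube is assumed):

use glam::{Mat4, Vec3, Vec4};

struct Ray { origin: Vec3, dir: Vec3 }

// Unproject two NDC points that share (x, y) but differ in z, then
// span the ray between them. This works for any invertible transform,
// perspective or not.
fn ray_from_ndc(mvp: Mat4, ndc_x: f32, ndc_y: f32) -> Ray {
    let inv = mvp.inverse();
    let unproject = |p: Vec4| -> Vec3 {
        let q = inv * p;
        q.truncate() / q.w // perspective divide; w stays 1 for affine matrices
    };
    let near = unproject(Vec4::new(ndc_x, ndc_y, -1.0, 1.0));
    let far = unproject(Vec4::new(ndc_x, ndc_y, 1.0, 1.0));
    Ray { origin: near, dir: (far - near).normalize() }
}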

Hope that helps.

Answer 2:

If your projection matrix is not singular (and it isn't), what you want can easily be achieved by using the inverse matrix (classic unproject functions all do that internally).

The key point here is that, in a way, the GPU always uses a perfect orthographic projection: the rasterizer just takes the window-space x and y, and the window-space z coordinate is, at this point, only some additional data associated with the vertices (and interpolated per fragment), similar to color or texcoord attributes.

The projection matrix (in conjunction with the perspective divide) is used to transform the space such that your view volume (be it a frustum for a perspective projection, an axis-aligned box for standard ortho, or a skewed parallelepiped as in your case) is mapped into the [-1,1]^3 normalized device coordinate range. This is then further transformed to window space using the viewport and depth range settings. So the projection matrix doesn't perform a projection in the strict mathematical sense (an idempotent mapping, which fundamentally loses information).
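To illustrate that last step (a sketch, not part of the unproject path you need here): with a viewport of (x0, y0, w, h) and the default [0, 1] depth range, the NDC-to-window mapping is just:

use glam::Vec3;

// NDC -> window space, assuming glViewport(x0, y0, w, h) and the
// default glDepthRange(0.0, 1.0).
fn ndc_to_window(ndc: Vec3, x0: f32, y0: f32, w: f32, h: f32) -> Vec3 {
    Vec3::new(
        x0 + (ndc.x + 1.0) * 0.5 * w,
        y0 + (ndc.y + 1.0) * 0.5 * h,
        (ndc.z + 1.0) * 0.5,
    )
}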

A single pixel (well, strictly speaking, every point on the plane; you would typically use the pixel center as the representative for the whole pixel) in window space defines a view ray, and in window space this view ray always has the direction (0, 0, +/-1) (whether positive or negative z depends on some settings, but that is completely irrelevant here). In other words, you can take just two window-space points with the same 2D x and y position, varying only z, to calculate the view ray there.

Hence, you can simply unproject these two window-space points to world or view space, and the two unprojected points will completely define the ray there.

It appears most practical to just choose the first point on the near plane (z_win=0) and the other on the far plane (z_win=1).

[...] while the ray direction should be the same for all rays and be derivable directly from the projection matrix, but I just don't know how.

What I wrote above is for the general case of an arbitrary projection matrix. With a perspective projection, the ray direction will differ at each point. But with any purely affine projection matrix (as you are using), you can simply unproject the direction vector itself, in homogeneous coordinates: (0, 0, 1, 0). This boils down to inverse(proj) * vec4(0, 0, 1, 0), or just taking the third column of the inverse projection matrix, and will yield an eye-space vector with the w component still 0, so it stays a direction under the transformation (in contrast to the perspective case, where you end up with an actual point). You can also transform directly to world space by using inverse(proj * view). The other parts of the typical unproject function are not needed here, because the viewport transform will not change viewing directions, and the only thing the depth range could do is flip the sign of z, which you can easily compensate for manually if you are using some backwards mapping there.
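In code, that could look like the following sketch (glam again; proj_view is your combined matrix, affine as in your case):

use glam::{Mat4, Vec3, Vec4};

// For an affine (non-perspective) projection matrix, the world-space
// view direction is the inverse matrix applied to (0, 0, 1, 0),
// equivalently the third column of inverse(proj * view). The w
// component stays 0, so the result remains a direction.
fn view_dir_world(proj_view: Mat4) -> Vec3 {
    (proj_view.inverse() * Vec4::new(0.0, 0.0, 1.0, 0.0))
        .truncate()
        .normalize()
}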