
I'm learning about D3D and coordinate spaces while working with the Kinect (in C++). While I can draw skeleton positions easily using Direct2D, I'm curious how to draw these positions using Direct3D 11 and what coordinate-space transformations I would need.

A simple example: I would like to translate a cube based on the motion of the left hand. By tracking the left-hand joint, I can get skeleton-space locations. However, how would I convert these locations into something the cube's world space would understand?

I attempted a solution by doing:

  1. Convert skeleton locations to depth-image coordinates using the SDK's skeleton-to-depth conversion, giving me results in screen space.
  2. Map the screen-space point back into object space using XMVector3Unproject(...), i.e. essentially a ray-picking solution (see the sketch after this list).
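
To make step 2 concrete, here is a minimal sketch of the unprojection using DirectXMath. The helper name, the viewport origin/size, and the normalized depth value are my placeholders and would have to match your application; the screen-space input is assumed to come from step 1's skeleton-to-depth mapping.

    #include <DirectXMath.h>
    using namespace DirectX;

    // Hypothetical helper: map a screen-space point (e.g. the left-hand
    // joint after the skeleton-to-depth conversion of step 1) back into
    // the cube's object space. The viewport origin/size and the
    // normalized depth value ndcZ are assumptions for illustration.
    XMVECTOR ScreenToObjectSpace(
        float screenX, float screenY, float ndcZ,
        float viewportWidth, float viewportHeight,
        FXMMATRIX projection, CXMMATRIX view, CXMMATRIX world)
    {
        XMVECTOR screenPoint = XMVectorSet(screenX, screenY, ndcZ, 1.0f);

        // XMVector3Unproject reverses the world -> view -> projection ->
        // viewport pipeline, yielding a point in object space.
        return XMVector3Unproject(
            screenPoint,
            0.0f, 0.0f,                      // viewport top-left
            viewportWidth, viewportHeight,   // viewport size
            0.0f, 1.0f,                      // viewport MinZ / MaxZ
            projection, view, world);
    }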

While this works, is there a more efficient way that avoids mapping back into object space and lets me work directly in screen space, or at least in projection space?

If you've not already, take a look at the D3D examples in the Kinect for Windows Toolkit. There is one that deals with avateering, which might give you insight into the mapping. I've not worked through these examples myself, so I'm not sure they have exactly what you want. – Nicholas Pappas

Thanks, Evil Closet Monkey. I have taken a look at the Avateering demo, but I don't completely understand it, mostly because I have zero experience with C#. While I understand the general flow of the program (what calls what), I have not been able to identify the particular transformations from skeleton space to the model. I'll keep reading to see what I can find. Any more suggestions? – Nikky

1 Answer


I found an answer that works for me.

The raw data from the Kinect skeleton/face tracker is in the Kinect camera space. I model that space with a view matrix whose eye is at the origin (0,0,0), looking at (0,0,-1), with up as (0,1,0). The projection matrix uses a vertical FOV of 45.8 degrees, the aspect ratio of the viewport, and near/far planes matching the application (in my case 1.0f to 2000.0f, since I am working in millimetres). Any 3D point returned by the Kinect is then in this camera space. To get to world space, multiply the point by the inverse of the view matrix (and, in my case, flip the x-axis coordinate of the transformed point by multiplying it by -1). The object on screen should then follow the movements from the Kinect.
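
For concreteness, here is a minimal sketch of that setup using DirectXMath, following the values described above. The left-handed helpers (XMMatrixLookAtLH, XMMatrixPerspectiveFovLH) and the aspectRatio parameter are my assumptions; use the right-handed variants if that matches your pipeline.

    #include <DirectXMath.h>
    using namespace DirectX;

    // Kinect camera space expressed as a view matrix: eye at the origin,
    // up = (0,1,0), looking at (0,0,-1).
    XMMATRIX BuildKinectView()
    {
        return XMMatrixLookAtLH(
            XMVectorSet(0.0f, 0.0f,  0.0f, 1.0f),   // eye at origin
            XMVectorSet(0.0f, 0.0f, -1.0f, 1.0f),   // look-at target
            XMVectorSet(0.0f, 1.0f,  0.0f, 0.0f));  // up
    }

    // Projection matching the values above: fovY = 45.8 degrees, the
    // viewport's aspect ratio, and near/far of 1.0f to 2000.0f (mm).
    XMMATRIX BuildKinectProjection(float aspectRatio)
    {
        return XMMatrixPerspectiveFovLH(
            XMConvertToRadians(45.8f), aspectRatio, 1.0f, 2000.0f);
    }

    // Camera space -> world space: multiply by the inverse of the view
    // matrix, then flip x so the object mirrors the user's movement.
    XMVECTOR KinectCameraToWorld(FXMVECTOR kinectPoint)
    {
        XMMATRIX invView = XMMatrixInverse(nullptr, BuildKinectView());
        XMVECTOR world   = XMVector3TransformCoord(kinectPoint, invView);
        return XMVectorMultiply(world, XMVectorSet(-1.0f, 1.0f, 1.0f, 1.0f));
    }

Note that only the view matrix takes part in the camera-to-world change of space; the projection matrix matters only when the transformed point is actually rendered.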