5 votes

I am playing with the Kinect driver CL NUI, trying to get the relative depth of items.

The library provides a way to get an image representing the depth of objects on the screen. Here is an example:

[Kinect Depth Image]

Is there an easy way to convert from the pixel color to the image's depth? For example, the closest color could be a depth of 0, and the farthest color could be a depth of 1.

Does anyone know how to do that?

I found these calculations for converting the depth data to color; what I want is the inverse:

float RawDepthToMeters(int depthValue)
{
    if (depthValue < 2047)
    {
        return float(1.0 / (double(depthValue) * -0.0030711016 + 3.3309495161));
    }
    return 0.0f;
}
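Since the question asks for the inverse, the formula above can be solved algebraically for the raw value. A minimal sketch (the helper name `MetersToRawDepth` is hypothetical; the constants are the ones from the question):

```cpp
#include <cassert>
#include <cmath>

// Forward conversion from the question, repeated here for reference.
float RawDepthToMeters(int depthValue)
{
    if (depthValue < 2047)
    {
        return float(1.0 / (double(depthValue) * -0.0030711016 + 3.3309495161));
    }
    return 0.0f;
}

// Algebraic inverse: meters = 1 / (raw * a + b)  =>  raw = (1/meters - b) / a
int MetersToRawDepth(float meters)
{
    if (meters <= 0.0f)
    {
        return 2047; // the sentinel the forward function treats as "no reading"
    }
    const double a = -0.0030711016;
    const double b = 3.3309495161;
    return int(std::lround((1.0 / double(meters) - b) / a));
}
```

Round-tripping a raw value through `RawDepthToMeters` and back should recover it, up to rounding.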

Vec3f DepthToWorld(int x, int y, int depthValue)
{
    static const double fx_d = 1.0 / 5.9421434211923247e+02;
    static const double fy_d = 1.0 / 5.9104053696870778e+02;
    static const double cx_d = 3.3930780975300314e+02;
    static const double cy_d = 2.4273913761751615e+02;

    Vec3f result;
    const double depth = RawDepthToMeters(depthValue);
    result.x = float((x - cx_d) * depth * fx_d);
    result.y = float((y - cy_d) * depth * fy_d);
    result.z = float(depth);
    return result;
}

Vec2i WorldToColor(const Vec3f &pt)
{
    static const Matrix4 rotationMatrix(
                            Vec3f(9.9984628826577793e-01f, 1.2635359098409581e-03f, -1.7487233004436643e-02f),
                            Vec3f(-1.4779096108364480e-03f, 9.9992385683542895e-01f, -1.2251380107679535e-02f),
                            Vec3f(1.7470421412464927e-02f, 1.2275341476520762e-02f, 9.9977202419716948e-01f));
    static const Vec3f translation(1.9985242312092553e-02f, -7.4423738761617583e-04f, -1.0916736334336222e-02f);
    static const Matrix4 finalMatrix = rotationMatrix.Transpose() * Matrix4::Translation(-translation);

    static const double fx_rgb = 5.2921508098293293e+02;
    static const double fy_rgb = 5.2556393630057437e+02;
    static const double cx_rgb = 3.2894272028759258e+02;
    static const double cy_rgb = 2.6748068171871557e+02;

    const Vec3f transformedPos = finalMatrix.TransformPoint(pt);
    const float invZ = 1.0f / transformedPos.z;

    Vec2i result;
    result.x = Utility::Bound(Math::Round((transformedPos.x * fx_rgb * invZ) + cx_rgb), 0, 639);
    result.y = Utility::Bound(Math::Round((transformedPos.y * fy_rgb * invZ) + cy_rgb), 0, 479);
    return result;
}

My matrix math is weak, and I'm not sure how to reverse the calculations.


2 Answers

3 votes

I found the solution.

The depth value is stored in the CLNUIDevice.GetCameraDepthFrameRAW image. This image's color is created from the eleven-bit depth value, i.e. you have 24 bits for the RGB color (32 if ARGB). To get the depth back, you need to truncate the unnecessary bits like this:

    private int colorToDepth(Color c) {
        int colorInt = c.ToArgb();

        // Keep only the low 16 bits, which hold the 11-bit depth value.
        return (colorInt << 16) >> 16;
    }

You use 16 because the 11-bit number is zero-padded out to the 16-bit place. I.e. you have a number that looks like this: 1111 1111 0000 0000 0000 0### #### ####, where the 1s are the alpha channel, the zeros are the padding, and the #s are the actual value. Shifting left by 16 leaves you with 0000 0### #### #### 0000 0000 0000 0000, which you then push back by shifting right 16: 0000 0000 0000 0000 0000 0### #### ####.
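For reference, the same extraction outside of C#/System.Drawing might look like the sketch below. Because the 11-bit value is zero-padded, masking the low 16 bits is equivalent to the shift pair:

```cpp
#include <cassert>
#include <cstdint>

// Extract the 11-bit raw depth from a 32-bit ARGB pixel.
// (x << 16) >> 16 keeps the low 16 bits; since the padding bits
// are zero, a plain mask does the same job.
int ColorToDepth(std::uint32_t argb)
{
    return int(argb & 0xFFFFu); // equivalently: int((argb << 16) >> 16)
}
```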

0 votes

Correct me if I'm wrong, but from what I understand your code receives a depth array from the Kinect, converts it to "world" objects (with x, y, z coordinates) using DepthToWorld, then converts those to a 2D color array using WorldToColor. From that, you're trying to retrieve the distance (in meters) of each pixel relative to the Kinect. Right?

So why don't you just take the initial depth array you get from the Kinect, and call the RawDepthToMeters function on each of the pixels?
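That direct route can be sketched as follows, including the 0-to-1 rescaling the question mentions. This is a sketch under assumptions: the frame is taken to be a flat array of raw 11-bit values, and `NormalizeDepthFrame` is a hypothetical helper name:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Forward conversion from the question, repeated for self-containment.
float RawDepthToMeters(int depthValue)
{
    if (depthValue < 2047)
        return float(1.0 / (double(depthValue) * -0.0030711016 + 3.3309495161));
    return 0.0f;
}

// Convert a raw depth frame to meters, then rescale valid readings to
// [0, 1] (0 = nearest, 1 = farthest), as the question asks.
std::vector<float> NormalizeDepthFrame(const std::vector<int>& rawFrame)
{
    std::vector<float> meters(rawFrame.size());
    for (std::size_t i = 0; i < rawFrame.size(); ++i)
        meters[i] = RawDepthToMeters(rawFrame[i]);

    float nearest = 1e9f, farthest = 0.0f;
    for (float m : meters)
    {
        if (m <= 0.0f) continue; // skip invalid readings (raw >= 2047)
        nearest = std::min(nearest, m);
        farthest = std::max(farthest, m);
    }

    const float range = farthest - nearest;
    for (float& m : meters)
        m = (m > 0.0f && range > 0.0f) ? (m - nearest) / range : 0.0f;
    return meters;
}
```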

Inverting the operations your code has just performed would be a huge loss of performance...