1 vote

There have been previous questions (here, here and here) related to mine; however, my question has a different aspect that I have not seen addressed in any of them.

I have acquired a dataset for my research using a Kinect depth sensor. The dataset consists of .png images for both the depth and RGB streams at each instant. To give you a better idea, the frames are shown below:

RGB Image

Depth Image

EDIT: I am adding the edge detection output here.

Sobel edge detection output for:

  1. RGB Image Edge RGB

  2. Depth Image Edge Depth
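
For reference, edge maps like these can be produced with OpenCV's Sobel operator. This is only a sketch; the filenames are placeholders and the depth PNG is simply read back as 8-bit grayscale here:

```python
import cv2
import numpy as np

def sobel_edges(path):
    """Return an 8-bit Sobel gradient-magnitude edge map for the image at `path`."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)        # loads the depth PNG as 8-bit grayscale too
    gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)      # horizontal gradient
    gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)      # vertical gradient
    return cv2.convertScaleAbs(np.sqrt(gx**2 + gy**2))  # magnitude, scaled back to uint8

cv2.imwrite("edge_rgb.png", sobel_edges("rgb_frame.png"))      # placeholder filenames
cv2.imwrite("edge_depth.png", sobel_edges("depth_frame.png"))
```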

Now what I am trying to do is align these two frames to produce a combined RGBZ image.
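
For concreteness, once the two frames are registered pixel-for-pixel, the combined RGBZ image is just the RGB channels stacked with the depth channel. A minimal sketch, assuming OpenCV/NumPy and placeholder filenames:

```python
import cv2
import numpy as np

# Placeholder filenames; assumes the depth frame has already been registered to the RGB frame.
rgb = cv2.imread("rgb_frame.png")                            # H x W x 3, uint8
depth = cv2.imread("depth_frame.png", cv2.IMREAD_UNCHANGED)  # H x W, typically uint16 for Kinect data

# Stack into an H x W x 4 "RGBZ" array, keeping the depth values at full precision.
rgbz = np.dstack([rgb.astype(np.uint16), depth])
```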

I do not know the underlying camera characteristics or the distance between the RGB and infrared sensors.

Is there a method which can be applied to match the RGB values to the corresponding Z values?

One of the ideas I have is to use edges in both images and try to match them.

How come all the questions you mentioned are not helping you solve this problem? – Shai
All the solutions previously provided use the built-in OpenNI function 'g_depth.GetAlternativeViewPointCap().SetViewPoint(g_image)' to align the data. What I am trying to achieve here is to write a similar function, but without using the camera characteristics. – masad

2 Answers

4 votes

This is for those of you who are experiencing the same problem. I thought it might help to share what I have found out.

As noted by Bill, camera calibration is the best solution to this problem.

However, I found out that the two images can be aligned using homographies and epipolar geometry. This requires at least 8 matching feature points in both images, which are difficult to obtain when dealing with depth images.
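
As a rough illustration of this route, here is a sketch using OpenCV: ORB keypoints are matched between the two images and a homography is fitted with RANSAC. The filenames are placeholders. Note that a homography is only strictly valid for a near-planar scene, and it is the fundamental matrix (cv2.findFundamentalMat) that needs the eight correspondences mentioned above; on depth images, finding eight reliable matches is exactly the hard part.

```python
import cv2
import numpy as np

# Placeholder filenames; in practice the depth image yields very few reliable keypoints.
rgb = cv2.imread("rgb_frame.png", cv2.IMREAD_GRAYSCALE)
depth = cv2.imread("depth_frame.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(1000)
kp1, des1 = orb.detectAndCompute(rgb, None)
kp2, des2 = orb.detectAndCompute(depth, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

if len(matches) >= 8:                      # the eight-correspondence requirement mentioned above
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    # Warp the RGB frame into the depth frame so the two can be stacked into RGBZ.
    aligned_rgb = cv2.warpPerspective(rgb, H, (depth.shape[1], depth.shape[0]))
```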

There have been several attempts to calibrate these images, which can be found here and here; both require a calibration pattern. What I was trying to achieve was to align already captured depth and RGB images, which is only possible if I obtain the calibration parameters from the same Kinect sensor that I used to record.

I have found that the best way around this problem is to align the two images using the built-in library functions of OpenNI or the Kinect SDK.

1 vote

In general what you are trying to do from a pair of RGB and Depth images is non-trivial and ill-defined. As humans we recognise the arm in the RGB image, and are able to relate it to the area of the depth image closer to the camera. However, a computer has no prior knowledge about which parts of the RGB image it expects to correspond to which parts of the depth image.

The reason most algorithms for such alignment rely on camera calibration is that calibration turns this ill-posed problem into a well-posed one.
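
To make that concrete, here is a sketch of what calibration buys you: with known intrinsics for both cameras and the rotation/translation between them, every depth pixel can be back-projected to 3-D and reprojected into the RGB image, so the correspondence is computed rather than guessed. All numbers below are placeholder values, not real Kinect calibration data:

```python
import numpy as np

# Placeholder calibration values; real ones come from calibrating the specific Kinect.
K_depth = np.array([[580.0, 0, 320.0],
                    [0, 580.0, 240.0],
                    [0, 0, 1.0]])           # depth-camera intrinsics
K_rgb = np.array([[525.0, 0, 320.0],
                  [0, 525.0, 240.0],
                  [0, 0, 1.0]])             # RGB-camera intrinsics
R = np.eye(3)                               # rotation depth -> RGB (placeholder)
t = np.array([0.025, 0.0, 0.0])             # ~2.5 cm baseline (placeholder)

def depth_pixel_to_rgb(u, v, z_metres):
    """Back-project one depth pixel to 3-D and reproject it into the RGB image."""
    p3d = z_metres * np.linalg.inv(K_depth) @ np.array([u, v, 1.0])  # point in depth-camera frame
    p3d_rgb = R @ p3d + t                                            # move into the RGB-camera frame
    uvw = K_rgb @ p3d_rgb                                            # project with RGB intrinsics
    return uvw[0] / uvw[2], uvw[1] / uvw[2]                          # pixel coordinates in the RGB image
```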

However, there may still be ways to find the correspondences, particularly if you have lots of image pairs from the same Kinect, since you then only need to search for one set of transformation parameters. I don't know of any existing algorithm that does this, but as you note in your question, running edge detection on both images and then trying to align the edge maps would be a good place to start (see the sketch below).
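
One way to turn that idea into code is OpenCV's ECC image alignment, run on the two edge maps to estimate a single affine warp; once estimated from a few frame pairs, the same warp can be reused for the rest of the recording. This is only a sketch with placeholder filenames, and it assumes the misalignment is roughly affine, which is just an approximation of the true depth-dependent mapping:

```python
import cv2
import numpy as np

# Hypothetical inputs: the Sobel edge maps of the RGB and depth frames (8-bit).
edge_rgb = cv2.imread("edge_rgb.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
edge_depth = cv2.imread("edge_depth.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

warp = np.eye(2, 3, dtype=np.float32)        # initial guess: identity affine warp
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 200, 1e-6)

# ECC maximises the correlation between the two edge images over affine warps.
cc, warp = cv2.findTransformECC(edge_depth, edge_rgb, warp,
                                cv2.MOTION_AFFINE, criteria)

# Apply the estimated warp to the colour frame to bring it into the depth frame.
h, w = edge_depth.shape
aligned_rgb = cv2.warpAffine(cv2.imread("rgb_frame.png"), warp, (w, h),
                             flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
```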

Finally, note that when objects get close to the Kinect the correspondence between RGB and depth images can become poor, even after the images have been calibrated. You can see some of this effect in your images - the 'shadow' that the hand makes in your example depth image is somewhat indicative of this.