I have a research idea for robot mapping. The end goal is to use an inexpensive monocular camera ($20-50) and build a 3D occupancy grid map (there is a popular C++ library for this called OctoMap). To get there, I have laid out these steps:
1. Take an RGB image (a frame from the video) and convert it to a depth image using a convolutional neural network. This part is done.
2. Take the original RGB image and the predicted depth image and convert them to a point cloud.
3. Take the point cloud and convert it to a 3D occupancy grid map.
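For step 3, I am imagining something like the sketch below. This is just my rough plan, assuming the octomap-python bindings around OctoMap (I have not verified this part yet, and the exact API may differ):

    import numpy as np
    import octomap  # assumption: the octomap-python bindings are installed

    def pointcloud_to_octomap(points_xyz, resolution=0.05):
        # points_xyz: Nx3 numpy array of (X, Y, Z) coordinates in meters
        tree = octomap.OcTree(resolution)
        origin = np.array([0.0, 0.0, 0.0])  # sensor origin used for ray casting
        tree.insertPointCloud(points_xyz, origin)
        tree.writeBinary(b"map.bt")  # .bt file should be viewable in octovis
        return tree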
For step 2, I am a bit confused about whether I am doing it right. I have taken this open-source code:
    import argparse
    import sys
    import os
    from PIL import Image

    # Camera intrinsics (must match the camera/dataset the images come from)
    focalLength = 938.0
    centerX = 319.5
    centerY = 239.5
    scalingFactor = 5000  # depth units per meter (5000 is the TUM RGB-D convention)

    def generate_pointcloud(rgb_file, depth_file, ply_file):
        rgb = Image.open(rgb_file)
        depth = Image.open(depth_file).convert('I')  # 32-bit integer pixels

        if rgb.size != depth.size:
            raise Exception("Color and depth image do not have the same resolution.")
        if rgb.mode != "RGB":
            raise Exception("Color image is not in RGB format")
        if depth.mode != "I":
            raise Exception("Depth image is not in intensity format")

        points = []
        for v in range(rgb.size[1]):
            for u in range(rgb.size[0]):
                color = rgb.getpixel((u, v))
                Z = depth.getpixel((u, v)) / scalingFactor
                print(Z)  # debug output
                if Z == 0:
                    continue  # skip pixels with no depth
                # back-project the pixel through the pinhole camera model
                X = (u - centerX) * Z / focalLength
                Y = (v - centerY) * Z / focalLength
                points.append("%f %f %f %d %d %d 0\n" % (X, Y, Z, color[0], color[1], color[2]))
        # (the rest of the script writes the points list out as an ASCII PLY file)
As I understand it, points is the list that actually stores the point cloud, am I right?
So the big question I am asking is: given an RGB image and a depth image created by a deep learning algorithm, is it possible to convert them to a point cloud using the code above?
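For reference, here is how I understand the back-projection the loop above performs, rewritten with NumPy (my own sketch for sanity checking, not part of the original script):

    import numpy as np

    def depth_to_points(depth, focalLength=938.0, centerX=319.5,
                        centerY=239.5, scalingFactor=5000):
        # depth: HxW array of raw depth values (same units as the loop version)
        Z = depth.astype(np.float64) / scalingFactor
        u, v = np.meshgrid(np.arange(depth.shape[1]), np.arange(depth.shape[0]))
        X = (u - centerX) * Z / focalLength
        Y = (v - centerY) * Z / focalLength
        pts = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
        return pts[pts[:, 2] > 0]  # drop pixels with no depth, like the Z==0 check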