3 votes

I have been working on a project for a long time. My aim is to get a depth map from stereo camera images and filter out everything but the humans, in order to count the people in the scene.

I have been trying to calibrate my cameras continuously for 1-2 months. Nevertheless, when I draw epipolar lines on the rectified pair, the result is not good enough (I have attached my rectified pair result). I am now working with my reasonably good calibration results and trying to get a depth map from the disparity map. I recorded an image sequence (an .avi file), and when I try to get a depth map from this video, I face an unstable situation: a spot that is white in one frame can be very black in the next, so I can't count people just by thresholding the disparity. I use SGBM to compute disparity from the rectified images. I am still considered an amateur in this project, and I am open to any advice. (How can I do better calibration? Get a better disparity map? A better depth map?)

Here is the depth map: depth map

And the rectified pair with epipolar lines drawn: pair and lines

I have calibrated my camera with almost 600 pairs and improved the result. My overall mean error was 0.13 px with 35 image pairs.
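For reference, my calibration roughly follows the standard OpenCV checkerboard recipe. Below is a minimal sketch of that pipeline; the 9x6 pattern, the 25 mm square size, and the file paths are placeholders rather than my actual setup:

    import glob

    import cv2
    import numpy as np

    # checkerboard geometry (placeholder values)
    pattern = (9, 6)   # inner corners per row and column
    square = 0.025     # square size in metres

    # object points: the board's corner grid in its own plane
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

    obj_pts, left_pts, right_pts = [], [], []
    for fL, fR in zip(sorted(glob.glob('left/*.png')), sorted(glob.glob('right/*.png'))):
        gL = cv2.imread(fL, cv2.IMREAD_GRAYSCALE)
        gR = cv2.imread(fR, cv2.IMREAD_GRAYSCALE)
        okL, cL = cv2.findChessboardCorners(gL, pattern)
        okR, cR = cv2.findChessboardCorners(gR, pattern)
        if okL and okR:  # only keep pairs where the board is found in both views
            crit = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
            left_pts.append(cv2.cornerSubPix(gL, cL, (11, 11), (-1, -1), crit))
            right_pts.append(cv2.cornerSubPix(gR, cR, (11, 11), (-1, -1), crit))
            obj_pts.append(objp)

    size = gL.shape[::-1]  # (width, height)

    # calibrate each camera alone, then the pair with intrinsics fixed
    _, ML, dL, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
    _, MR, dR, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
    rms, ML, dL, MR, dR, R, T, E, F = cv2.stereoCalibrate(
        obj_pts, left_pts, right_pts, ML, dL, MR, dR, size,
        flags=cv2.CALIB_FIX_INTRINSIC)

    # rectification transforms, plus the Q matrix used later for depth
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(ML, dL, MR, dR, size, R, T)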

    stereoProcessor = cv2.StereoSGBM_create(
        minDisparity=-1,
        numDisparities=2 * 16,  # max_disp has to be divisible by 16, e.g. 192, 256
        blockSize=window_size,  # wsize: default 3; 5-7 for reduced-size images; 15 for full size (1300 px and above); 5 works nicely
        P1=8 * 3 * window_size,
        P2=32 * 3 * window_size,
        disp12MaxDiff=12,
        uniquenessRatio=1,
        speckleWindowSize=50,
        speckleRange=32,
        preFilterCap=63,
        mode=cv2.STEREO_SGBM_MODE_SGBM_3WAY,
    )

These are my block matching parameters.
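For the depth step itself, disparity converts to depth as Z = f * B / d (focal length in pixels times baseline, divided by disparity). A minimal sketch of that conversion, assuming a Q matrix from cv2.stereoRectify; the focal length and baseline below are placeholder values:

    import cv2
    import numpy as np

    # disp: the SGBM result divided by 16.0 (OpenCV returns fixed-point disparity)
    disp = disparity.astype(np.float32) / 16.0

    # option 1: reproject with the Q matrix returned by cv2.stereoRectify
    points_3d = cv2.reprojectImageTo3D(disp, Q)
    depth = points_3d[:, :, 2]  # Z, in the same units as the calibration baseline

    # option 2: apply Z = f * B / d directly (placeholder values)
    f = 700.0  # focal length in pixels, from the calibration
    B = 0.06   # baseline in metres
    valid = disp > 0
    depth_manual = np.zeros_like(disp)
    depth_manual[valid] = f * B / disp[valid]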

“I am still considered an amateur in this project.” An amateur is someone who doesn’t get paid for doing something; they do it for fun. As opposed to a professional, who does get paid; they do it for work. The concept of an amateur being bad at doing something or knowing less than a professional is just dumb. I often find amateurs are more knowledgeable than most professionals. Maybe you mean to say you are a novice (have little experience)? – Cris Luengo

Thanks for your comment. I will consider it. :) – kursat-06

3 Answers

1 vote

To improve the results of your disparity map, you can apply post-filtering; there is a tutorial here: https://docs.opencv.org/master/d3/d14/tutorial_ximgproc_disparity_filtering.html. I also used an extra speckle filter and an option to fill in missing disparities. The Python implementation is as follows:

    import math

    import cv2
    import numpy as np

    # parameters (example values; tune these for your own setup)
    window_size = 5
    max_disparity = 128             # has to be divisible by 16, e.g. 128, 192, 256
    p1 = 8 * 3 * window_size ** 2   # 8 * number_of_image_channels * SADWindowSize^2
    p2 = 32 * 3 * window_size ** 2  # 32 * number_of_image_channels * SADWindowSize^2
    disp12Maxdiff = 12
    uniquenessRatio = 1
    speckle_window = 50
    speckle_range = 32
    prefiltercap = 63
    use_wls_filter = True           # weighted least squares post-filtering
    wls_lambda = 8000.0
    wls_sigma = 1.5
    fill_missing_disparity = True

    stereoProcessor = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=max_disparity,
        blockSize=window_size,
        P1=p1,
        P2=p2,
        disp12MaxDiff=disp12Maxdiff,
        uniquenessRatio=uniquenessRatio,
        speckleWindowSize=speckle_window,
        speckleRange=speckle_range,
        preFilterCap=prefiltercap,
        # mode=cv2.STEREO_SGBM_MODE_HH,  # slower but more accurate alternative
    )

    # set up left->right and right->left matching for the WLS filtering step
    left_matcher = stereoProcessor
    right_matcher = cv2.ximgproc.createRightMatcher(left_matcher)

    # I is a frame containing both views side by side, e.g.
    # I = cv2.imread('stereo_pair.png')
    height, width, channels = I.shape
    frameL = I[:, 0:int(width / 2), :]
    frameR = I[:, int(width / 2):width, :]

    # convert to grayscale (the disparity matching works on grayscale)
    grayL = cv2.cvtColor(frameL, cv2.COLOR_BGR2GRAY)
    grayR = cv2.cvtColor(frameR, cv2.COLOR_BGR2GRAY)

    # preprocessing: raise to a power, as this subjectively appears to
    # improve the subsequent disparity calculation
    grayL = np.power(grayL, 0.75).astype('uint8')
    grayR = np.power(grayR, 0.75).astype('uint8')

    # compute the disparity image from the undistorted and rectified
    # versions (which OpenCV returns scaled by 16)
    if use_wls_filter:
        wls = cv2.ximgproc.createDisparityWLSFilter(matcher_left=left_matcher)
        wls.setLambda(wls_lambda)
        wls.setSigmaColor(wls_sigma)
        displ = left_matcher.compute(cv2.UMat(grayL), cv2.UMat(grayR))
        dispr = right_matcher.compute(cv2.UMat(grayR), cv2.UMat(grayL))
        displ = np.int16(cv2.UMat.get(displ))
        dispr = np.int16(cv2.UMat.get(dispr))
        disparity = wls.filter(displ, grayL, None, dispr)
    else:
        disparity_UMat = stereoProcessor.compute(cv2.UMat(grayL), cv2.UMat(grayR))
        disparity = cv2.UMat.get(disparity_UMat)

    # extra speckle filtering on top of SGBM's own
    speckleSize = math.floor((width * height) * 0.0005)
    maxSpeckleDiff = 8 * 16  # 128
    cv2.filterSpeckles(disparity, 0, speckleSize, maxSpeckleDiff)

    # scale the disparity to 8-bit for viewing: divide by 16 and convert to an
    # 8-bit image; the range of values should then be 0 -> max_disparity but is
    # in fact (-1 -> max_disparity - 1), so fix this with an initial threshold
    # between 0 and max_disparity (disparity == -1 means no disparity available)
    _, disparity = cv2.threshold(disparity, 0, max_disparity * 16, cv2.THRESH_TOZERO)
    disparity_scaled = (disparity / 16.).astype(np.uint8)

    # fill in missing disparity if requested: inpaint where disparity == 0,
    # masking out the left band, which has no valid matches
    if fill_missing_disparity:
        _, mask = cv2.threshold(disparity_scaled, 0, 1, cv2.THRESH_BINARY_INV)
        mask[:, 0:120] = 0
        disparity_scaled = cv2.inpaint(disparity_scaled, mask, 2, cv2.INPAINT_NS)

    # for display purposes only, re-scale the disparity to 0 -> 255
    disparity_to_display = (disparity_scaled * (256. / max_disparity)).astype(np.uint8)
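The two WLS parameters are worth tuning: lambda controls how strongly the filter smooths the disparity (the linked tutorial uses 8000.0), while sigmaColor controls how sensitive the filtering is to edges in the source image (the OpenCV documentation suggests values roughly in the 0.8 to 2.0 range).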
        
0 votes

Why do you want to use a depth map to detect humans? In my opinion, this is an object detection problem.

Anyway, given the current state of the art in depth estimation, I would recommend models based on artificial intelligence.

Models like NeRF have achieved amazing results.

  • Google provides ARCore, which produces a depth map, but based on a single camera.
  • Nvidia has this project
  • This one, based on NeRF, achieves 3D reconstruction from a video

In a few weeks I will be working on this; I want to build a model that runs in TensorFlow Lite and produces a depth map from stereo cameras.

0 votes

A few things at quick glance:

  • Your P1 and P2 parameters in StereoSGBM should use the squared block size, calculated like so:

    P1 = 8*3*blockSize**2
    P2 = 32*3*blockSize**2
    
  • StereoSGBM supports color images, so try skipping the grayscale conversion. If you do use grayscale, remove the *3 multiplier in the P1 and P2 parameters; it accounts for the number of image channels, and grayscale has only 1 (see the combined sketch after this list).

  • You are using cv2.STEREO_SGBM_MODE_SGBM_3WAY, which is faster but less accurate. For better results at the cost of speed, try cv2.STEREO_SGBM_MODE_SGBM (the default, 5 directions) or cv2.STEREO_SGBM_MODE_HH (8 directions).

  • Your images have different exposures. If possible, fix the AWB/gain of your cameras so that both views are captured consistently.
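A minimal sketch combining the points above; the parameter values are illustrative, not tuned:

    import cv2

    block_size = 5
    channels = 3  # set to 1 if you convert the inputs to grayscale first

    stereo = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=128,                 # must be divisible by 16
        blockSize=block_size,
        P1=8 * channels * block_size ** 2,  # note the squared block size
        P2=32 * channels * block_size ** 2,
        mode=cv2.STEREO_SGBM_MODE_HH,       # slower, more accurate (8 directions)
    )
    disparity = stereo.compute(imgL, imgR)  # imgL/imgR: the rectified color pair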