3 votes

I'm attempting to write an augmented reality app using SceneKit, and I need accurate 3D points from the current rendered frame, given a 2D pixel and a depth value, using SCNSceneRenderer's unprojectPoint method. That method takes an x, y, and z, where x and y are pixel coordinates and z is normally a value read from the depth buffer for that frame.
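
For context, this is roughly how I intend to feed the value into unprojectPoint once I have it (pixelX, pixelY, and depth below are placeholders for values I already have):

// Sketch only: pixelX/pixelY are the 2D pixel coordinates and depth is the
// value read from the depth buffer for that pixel.
let screenPoint = SCNVector3(Float(pixelX), Float(pixelY), depth)
let worldPoint = scnView.unprojectPoint(screenPoint)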

The SCNView's delegate has this method to render the depth frame:

// called once per frame, just before SceneKit renders the main pass
func renderer(_ renderer: SCNSceneRenderer, willRenderScene scene: SCNScene, atTime time: TimeInterval) {
    renderDepthFrame()
}

func renderDepthFrame() {

    // setup our viewport
    let viewport: CGRect = CGRect(x: 0, y: 0, width: Double(SettingsModel.model.width), height: Double(SettingsModel.model.height))

    // depth pass descriptor: a depth32Float texture we can read back later
    let renderPassDescriptor = MTLRenderPassDescriptor()

    let depthDescriptor: MTLTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: MTLPixelFormat.depth32Float, width: Int(SettingsModel.model.width), height: Int(SettingsModel.model.height), mipmapped: false)
    let depthTex = scnView!.device!.makeTexture(descriptor: depthDescriptor)
    depthTex.label = "Depth Texture"
    renderPassDescriptor.depthAttachment.texture = depthTex
    renderPassDescriptor.depthAttachment.loadAction = .clear
    renderPassDescriptor.depthAttachment.clearDepth = 1.0
    renderPassDescriptor.depthAttachment.storeAction = .store

    let commandBuffer = commandQueue.makeCommandBuffer()

    // render the same scene from the same point of view as the main pass
    scnRenderer.scene = scene
    scnRenderer.pointOfView = scnView.pointOfView!

    scnRenderer!.render(atTime: 0, viewport: viewport, commandBuffer: commandBuffer, passDescriptor: renderPassDescriptor)

    // setup our depth buffer so the CPU can access it
    let depthImageBuffer: MTLBuffer = scnView!.device!.makeBuffer(length: depthTex.width * depthTex.height * 4, options: .storageModeShared)
    depthImageBuffer.label = "Depth Buffer"

    // blit the depth texture into the shared buffer
    let blitCommandEncoder: MTLBlitCommandEncoder = commandBuffer.makeBlitCommandEncoder()
    blitCommandEncoder.copy(from: renderPassDescriptor.depthAttachment.texture!, sourceSlice: 0, sourceLevel: 0, sourceOrigin: MTLOriginMake(0, 0, 0), sourceSize: MTLSizeMake(Int(SettingsModel.model.width), Int(SettingsModel.model.height), 1), to: depthImageBuffer, destinationOffset: 0, destinationBytesPerRow: 4 * Int(SettingsModel.model.width), destinationBytesPerImage: 4 * Int(SettingsModel.model.width) * Int(SettingsModel.model.height))
    blitCommandEncoder.endEncoding()

    // once the GPU finishes, copy the depth values into a Float array
    commandBuffer.addCompletedHandler({ (buffer) -> Void in
        let rawPointer: UnsafeMutableRawPointer = UnsafeMutableRawPointer(mutating: depthImageBuffer.contents())
        let typedPointer: UnsafeMutablePointer<Float> = rawPointer.assumingMemoryBound(to: Float.self)
        self.currentMap = Array(UnsafeBufferPointer(start: typedPointer, count: Int(SettingsModel.model.width) * Int(SettingsModel.model.height)))
    })

    commandBuffer.commit()

}
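
Once the completion handler fires, I read the value for a given pixel out of currentMap like this (x and y here are the pixel coordinates I want to unproject; the row-major layout matches the destinationBytesPerRow above):

// Sketch only: currentMap is laid out row by row, so pixel (x, y) maps to
// index y * width + x.
let width = Int(SettingsModel.model.width)
let depthAtPixel = currentMap[y * width + x]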

This works: I get depth values between 0 and 1. The problem is that I can't use them with unprojectPoint, because they don't appear to be scaled the same way as in the initial pass, despite using the same SCNScene and SCNCamera.

My questions:

  1. Is there any way to get the depth values directly from SceneKit SCNView's main pass without having to do an extra pass with a separate SCNRenderer?

  2. Why don't the depth values from my pass match the values I get from doing a hit test and then unprojecting? The depth values from my pass range from 0.78 to 0.94, while the depth values from the hit test range from 0.89 to 0.97, which, curiously enough, matches the OpenGL depth values of the scene when I rendered it in Python.

My hunch is that this is a viewport difference, and that SceneKit is scaling the depth values from -1 to 1 just like OpenGL.

EDIT: And in case you're wondering, I can't use the hitTest method directly. It's too slow for what I'm trying to achieve.


2 Answers

2 votes

SceneKit uses a logarithmic, reverse-Z depth buffer by default. You can disable the reverse Z-buffer quite easily (usesReverseZ = false on the SCNCamera), but converting the log depth to a linearly distributed [0, 1] range requires access to the depth buffer plus the camera's near and far clipping values. Here is the process for taking a non-reverse-Z log depth to a linearly distributed depth in the range [0, 1]:

float delogDepth(float depth, float nearClip, float farClip) {
    // The depth buffer is in Log Format. Probably a 24bit float depth with 8 for stencil.
    // https://outerra.blogspot.com/2012/11/maximizing-depth-buffer-range-and.html
    // We need to undo the log format.
    // https://stackguides.com/questions/18182139/logarithmic-depth-buffer-linearization
    float logTuneConstant = nearClip / farClip;
    float deloggedDepth = ((pow(logTuneConstant * farClip + 1.0, depth) - 1.0) / logTuneConstant) / farClip;
    // The values are going to hover around a particular range. Linearize that distribution.
    // This part may not be necessary, depending on how you will use the depth.
    // http://glampert.com/2014/01-26/visualizing-the-depth-buffer/
    float negativeOneOneDepth = deloggedDepth * 2.0 - 1.0;
    float zeroOneDepth = ((2.0 * nearClip) / (farClip + nearClip - negativeOneOneDepth * (farClip - nearClip)));
    return zeroOneDepth;
}
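
For the CPU-side readback in the question (currentMap), a rough Swift equivalent of the same conversion might look like the sketch below. The near and far values are assumed to come from the camera driving the pass (SCNCamera's zNear/zFar), and reverse Z is assumed to be disabled as described above.

import Foundation

// Sketch only: the same math as the shader above, for linearizing a depth
// value read back on the CPU. nearClip/farClip are assumed to be the
// camera's zNear/zFar, and reverse Z is assumed to be disabled for the pass.
func delogDepth(_ depth: Float, nearClip: Float, farClip: Float) -> Float {
    // Undo the logarithmic encoding.
    let logTuneConstant = nearClip / farClip
    let delogged = (powf(logTuneConstant * farClip + 1.0, depth) - 1.0) / logTuneConstant / farClip
    // Linearize the distribution into a [0, 1] range.
    let ndc = delogged * 2.0 - 1.0
    return (2.0 * nearClip) / (farClip + nearClip - ndc * (farClip - nearClip))
}

// Hypothetical usage with the question's currentMap:
// let camera = scnView.pointOfView!.camera!
// camera.usesReverseZ = false   // assumed available on SCNCamera
// let linear = delogDepth(currentMap[index], nearClip: Float(camera.zNear), farClip: Float(camera.zFar))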

1 vote

As a workaround, I switched to OpenGL ES and read the depth buffer by adding a fragment shader (via SCNShadable) that packs the depth value into the RGBA renderbuffer.

See here for more info: http://concord-consortium.github.io/lab/experiments/webgl-gpgpu/webgl.html
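
A minimal sketch of the idea, assuming a GLSL fragment shader modifier attached via SCNShadable; the material name and the packing constants are mine (the standard encode-float-to-RGBA trick), not taken from the project above.

// Sketch only: encode gl_FragCoord.z into the RGBA color output so it
// survives a readback from a plain RGBA renderbuffer. "depthMaterial" is a
// hypothetical material used only for this depth pass.
let packDepthFragment = """
    vec4 enc = fract(vec4(1.0, 255.0, 65025.0, 16581375.0) * gl_FragCoord.z);
    enc -= enc.yzww * vec4(1.0/255.0, 1.0/255.0, 1.0/255.0, 0.0);
    _output.color = enc;
    """

let depthMaterial = SCNMaterial()
depthMaterial.shaderModifiers = [.fragment: packDepthFragment]

On the CPU side, the four channels are decoded back to a float with the matching dot product, dot(rgba, vec4(1.0, 1.0/255.0, 1.0/65025.0, 1.0/16581375.0)).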

I understand this is a valid approach, since it is often used for shadow mapping on OpenGL ES devices and in WebGL, but it feels hacky to me and I shouldn't have to do it. I would still be interested in another answer if someone can figure out Metal's viewport transformation.