You’re already most of the way there — your code places a plane atop the detected image, so clearly you have something going on there that successfully sets the center position of the plane to that of the image anchor. Perhaps your first step should be to better understand the code you have...
ARPlaneAnchor has a center (and extent) because planes can effectively grow after ARKit initially detects them. When you first get a plane anchor, its transform tells you the position and orientation of some small patch of flat horizontal (or vertical) surface. That alone is enough for you to place some virtual content in the middle of that small patch of surface.
Over time, ARKit figures out where more of the same flat surface is, so the plane anchor’s extent gets larger. But you might initially detect, say, one end of a table and then recognize more of the far end — that means the flat surface isn’t centered around the first patch detected. Rather than change the transform of the anchor, ARKit tells you the new center (which is relative to that transform).
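In SceneKit terms, that usually means adjusting your plane visualization whenever ARKit updates the anchor. A rough sketch (untested; it assumes your plane visualization is an SCNPlane node that you added as the first child of the node ARKit provides, rotated to lie flat):
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let planeAnchor = anchor as? ARPlaneAnchor,
        let planeNode = node.childNodes.first,       // assumption: your plane visualization
        let plane = planeNode.geometry as? SCNPlane
        else { return }
    // The ARKit-provided node already follows the anchor's transform;
    // center is relative to that transform, so apply it to the child node.
    planeNode.simdPosition = planeAnchor.center
    // Resize the visualization to cover the (possibly grown) extent.
    plane.width = CGFloat(planeAnchor.extent.x)
    plane.height = CGFloat(planeAnchor.extent.z)
}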
An ARImageAnchor doesn’t grow — either ARKit detects the whole image at once or it doesn’t detect the image at all. So when you detect an image, the anchor’s transform tells you the position and orientation of the center of the image. (And if you want to know the size/extent, you can get that from the physicalSize of the detected reference image, like the sample code does.)
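For example, in your ARSCNViewDelegate callback you can read both of those directly (a small sketch, assuming the anchor came from image detection):
func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    guard let imageAnchor = anchor as? ARImageAnchor else { return }
    let physicalSize = imageAnchor.referenceImage.physicalSize  // width/height in meters
    let imageTransform = imageAnchor.transform                  // center of the image, in world space
    // ... size your content with physicalSize, position it using imageTransform ...
}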
So, to place some SceneKit content at the position of an ARImageAnchor (or any other ARAnchor subclass), you can:
1. Simply add it as a child node of the SCNNode ARKit creates for you in that delegate method. If you don’t do something to change them, its position and orientation will match those of the node that owns it. (This is what the Apple sample code you’re quoting does.)
2. Place it in world space (that is, as a child of the scene’s rootNode), using the anchor’s transform to get position or orientation or both. (You can extract the translation — that is, relative position — from a transform matrix: grab the first three elements of the last column; e.g. transform.columns.3 is a float4 vector whose xyz elements are your position and whose w element is 1.)
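Here’s a rough sketch of both options in one delegate method (untested; someLabelNode() and sceneView are placeholders for however you create your content and for your ARSCNView):
func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    // Option 1: parent your content to the ARKit-provided node, which follows the anchor.
    let attachedContent = someLabelNode()        // placeholder for your content
    node.addChildNode(attachedContent)

    // Option 2: place content in world space, using the anchor's transform yourself.
    let worldContent = someLabelNode()           // placeholder for your content
    let translation = anchor.transform.columns.3          // float4: xyz = position, w = 1
    worldContent.simdPosition = float3(translation.x, translation.y, translation.z)
    sceneView.scene.rootNode.addChildNode(worldContent)   // sceneView: your ARSCNView
}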
The demo video you linked to isn’t putting things in 3D space, though — it’s putting 2D UI elements on the screen, whose positions track the 3D camera-relative movement of anchors in world space.
You can easily get that kind of effect (to a first approximation) by using ARSKView (ARKit+SpriteKit) instead of ARSCNView (ARKit+SceneKit). That lets you associate 2D sprites with 3D positions in world space, and then ARSKView automatically moves and scales them so that they appear to stay attached to those 3D positions. It’s a common 3D graphics trick called “billboarding”, where the 2D sprite is always kept upright and facing the camera, but moved around and scaled to match 3D perspective.
If that’s the effect you’re looking for, there’s an App(le sample code) for that, too. The Using Vision in Real Time with ARKit example is mostly about other topics, but it does show how to use ARSKView to display labels associated with ARAnchor positions. (And as you’ve seen above, placing content to match an anchor position is the same no matter which ARAnchor subclass you’re using.) Here’s the key bit in their code:
func view(_ view: ARSKView, didAdd node: SKNode, for anchor: ARAnchor) {
    // ... irrelevant bits omitted...
    let label = TemplateLabelNode(text: labelText)
    node.addChild(label)
}
That is, just implement the ARSKView didAdd delegate method, and add whatever SpriteKit node you want as a child of the one ARKit provides.
However, the demo video does more than just sprite billboarding: the labels it associates with paintings not only stay fixed in 2D orientation, they stay fixed in 2D size (that is, they don’t scale to simulate perspective like a billboarded sprite does). What’s more, they seem to be UIKit controls, with the full set of inherited interactive behaviors that entails, not just the kind of 2D images that are easy to do with SpriteKit.
Apple’s APIs don’t provide a direct way to do this “out of the box”, but it’s not a stretch to imagine some ways one could put API pieces together to get this kind of result. Here are a couple of avenues to explore:
If you don’t need UIKit controls, you can probably do it all in SpriteKit, using constraints to match the position of the “billboarded” nodes ARSKView provides but not their scale. That’d probably look something like this (untested, caveat emptor):
func view(_ view: ARSKView, didAdd node: SKNode, for anchor: ARAnchor) {
    let label = MyLabelNode(text: labelText) // or however you make your label
    view.scene?.addChild(label)              // SKView.scene is optional
    // constrain label to zero distance from the ARSKView-provided, anchor-following node
    let zeroDistanceToAnchor = SKConstraint.distance(SKRange(constantValue: 0), to: node)
    label.constraints = [ zeroDistanceToAnchor ]
}
If you want UIKit elements, make the ARSKView a child view in your view controller (not its root view), and make those UIKit elements other child views. Then, in your SpriteKit scene’s update method, go through your ARAnchor-following nodes, convert their positions from SpriteKit scene coordinates to UIKit view coordinates, and set the positions of your UIKit elements accordingly. (The demo appears to be using popovers, so those you wouldn’t be managing as child views... you’d probably be updating the sourceRect for each popover.) That’s a lot more involved, so the details are beyond the scope of this already long answer.
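If you do go down that road, the core of it might look something like this (untested sketch; OverlayScene, arView, overlayLabel, and anchorNode are all placeholder names for your scene subclass, your ARSKView, a UIKit view, and the anchor-following node ARKit gave you):
import ARKit
import SpriteKit
import UIKit

class OverlayScene: SKScene {
    weak var arView: ARSKView?      // the ARSKView presenting this scene
    weak var overlayLabel: UILabel? // the UIKit element to keep positioned
    weak var anchorNode: SKNode?    // the ARSKView-provided, anchor-following node (a child of this scene)

    override func update(_ currentTime: TimeInterval) {
        guard let arView = arView, let node = anchorNode, let label = overlayLabel
            else { return }
        // Convert from SpriteKit scene coordinates to ARSKView coordinates (this also flips the y-axis)...
        let pointInView = arView.convert(node.position, from: self)
        // ...then into the label's superview's coordinate space, and move the label there.
        label.center = arView.convert(pointInView, to: label.superview)
    }
}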
A final note... hopefully this long-winded answer has been helpful with the key issues of your question (understanding anchor positions and placing 3D or 2D content that follows them as the camera moves).
But to clarify and give a warning about some of the key words early in your question:
When ARKit says it doesn’t track images after detection, that means it doesn’t know when/if the image moves (relative to the world around it). ARKit reports an image’s position only once, so that position doesn’t even benefit from how ARKit continues to improve its estimates of the world around you and your position in it. For example, if an image is on a wall, the reported position/orientation of the image might not line up with a vertical plane detection result on the wall (especially over time, as the plane estimate improves).
Update: In iOS 12, you can enable "live" tracking of detected images. But there are limits on how many you can track at once, so the rest of this advice may still apply.
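If you’re targeting iOS 12 and want that behavior, the configuration looks roughly like this (the "AR Resources" group name is just an assumption about your asset catalog; sceneView stands in for your ARSCNView or ARSKView):
let configuration = ARWorldTrackingConfiguration()
// Images to detect, loaded from an asset catalog group (group name is an assumption).
if let referenceImages = ARReferenceImage.referenceImages(inGroupNamed: "AR Resources", bundle: nil) {
    configuration.detectionImages = referenceImages
}
// iOS 12: ask ARKit to keep updating the positions of up to this many detected images.
configuration.maximumNumberOfTrackedImages = 1
sceneView.session.run(configuration)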
This doesn’t mean that you can’t place content that appears to “track” that static-in-world-space position, in the sense of moving around on the screen to follow it as your camera moves.
But it does mean your user experience may suffer if you try to do things that rely on having a high-precision, real-time estimate of the image’s position. So don’t, say, try to put a virtual frame around your painting, or replace the painting with an animated version of itself. But having a text label with an arrow pointing to roughly where the image is in space is great.