7 votes

I'm trying to make my OpenCV-based fiducial marker detection more robust when the user moves the camera (phone) violently. The markers are ARTag-style, with a Hamming code embedded within a black border. Borders are detected by thresholding the image, then looking for quads among the resulting contours, then checking the interior of each quad.
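For concreteness, here's a minimal sketch of that kind of threshold → contour → quad pass (illustrative only, not my exact code; it assumes OpenCV 4's Python bindings, and the threshold parameters and area cut-off are placeholders):

```python
import cv2

def find_candidate_quads(gray):
    """Threshold, extract contours, and keep convex 4-corner polygons as
    candidate marker borders (illustrative sketch of the pipeline above)."""
    # Adaptive thresholding copes better with uneven lighting than a global threshold
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 21, 7)
    contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    quads = []
    for c in contours:
        approx = cv2.approxPolyDP(c, 0.03 * cv2.arcLength(c, True), True)
        # Keep convex quadrilaterals that are large enough to decode
        if len(approx) == 4 and cv2.isContourConvex(approx) and cv2.contourArea(approx) > 100:
            quads.append(approx.reshape(4, 2))
    return quads
```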

In general, decoding of the marker is fairly robust once the black border is recognized. I've tried the most obvious thing, which is downsampling the image twice and performing quad detection on those levels as well. This helps with camera defocus on markers very close to the camera, and with very small amounts of image blur, but it doesn't greatly help the general case of camera motion blur.
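That downsampling is essentially a small image pyramid; a sketch of the idea, reusing the `find_candidate_quads` sketch above:

```python
import cv2

def detect_on_pyramid(gray, levels=2):
    """Run quad detection on the full image plus two downsampled levels,
    then map the corners back to full resolution."""
    images = [gray]
    for _ in range(levels):
        images.append(cv2.pyrDown(images[-1]))
    all_quads = []
    for level, img in enumerate(images):
        scale = 2 ** level
        for quad in find_candidate_quads(img):   # sketch above
            all_quads.append(quad * scale)       # corner coordinates at full resolution
    return all_quads
```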

Is there available research on ways to make detection more robust? Ideas I'm wondering about include:

  1. Can you do some sort of optical flow tracking to "guess" the positions of the marker in the next frame, then some sort of corner detection in the region of those guesses, rather than treating the rectangle search as a full-frame thresholding?
  2. On PCs, is it possible to derive blur coefficients (perhaps by registration against recent video frames where the marker was detected) and deblur the image prior to processing?
  3. On smartphones, is it possible to use the gyroscope and/or accelerometers to get deblurring coefficients and pre-process the image? (I'm assuming not, simply because if it were, the market would be flooded with shake-correcting camera apps.)

Links to failed ideas would also be appreciated if it saves me trying them.


2 Answers

3 votes
  1. Yes, you can use optical flow to estimate where the marker might be and localise your search, but it's only relocalisation; your tracking will already have broken for the blurred frames (see the sketch after this list).
  2. I don't know enough about deblurring, except to say that it's very computationally intensive, so real time might be difficult.
  3. You can use the sensors to guess the sort of blur you're faced with, but I would guess deblurring is too computationally expensive for mobile devices in real time.
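A minimal sketch of the relocalisation idea in point 1, assuming pyramidal Lucas-Kanade in OpenCV's Python bindings and that you kept the corner positions from the last frame in which the marker was decoded; the window size and termination criteria are just plausible defaults:

```python
import cv2
import numpy as np

LK_PARAMS = dict(winSize=(21, 21), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

def predict_corners(prev_gray, curr_gray, prev_corners):
    """Propagate the last known marker corners into the current frame with
    pyramidal Lucas-Kanade flow; returns predicted corners or None if flow failed."""
    p0 = np.asarray(prev_corners, np.float32).reshape(-1, 1, 2)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None, **LK_PARAMS)
    if p1 is None or not status.all():
        return None              # tracking broke -- fall back to the full-frame search
    return p1.reshape(-1, 2)
```

You would then threshold only a region of interest around the predicted corners (with a margin proportional to the observed motion) rather than the whole frame, and fall back to the full-frame search whenever the flow fails.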

Then some other approaches:

There is some really smart stuff in here: http://www.robots.ox.ac.uk/~gk/publications/KleinDrummond2004IVC.pdf. They do edge detection (which could be used to find your marker borders, even though you're looking for quads right now), model the camera movement from the sensors, use those values to estimate how an edge in the direction of blur should appear given the frame rate, and then search for that. Very elegant.
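As a rough illustration of the "model the camera movement from the sensors" step: for rotation-dominated shake, the blur length in pixels is roughly angular velocity × exposure time × focal length in pixels. The function and numbers below are placeholders, not taken from the paper:

```python
import numpy as np

def blur_extent_pixels(gyro_rad_s, exposure_s, focal_px):
    """Approximate blur length/direction for rotation-dominated camera shake:
    the angle swept during the exposure, projected through the focal length.
    Ignores translation and rolling shutter."""
    wx, wy, _ = gyro_rad_s                    # angular velocity about x/y (rad/s)
    dx = wy * exposure_s * focal_px           # horizontal image motion (pixels)
    dy = wx * exposure_s * focal_px           # vertical image motion (pixels)
    return float(np.hypot(dx, dy)), float(np.degrees(np.arctan2(dy, dx)))

# e.g. a 2 rad/s shake at 1/60 s exposure and 1200 px focal length -> ~40 px of blur
print(blur_extent_pixels((0.0, 2.0, 0.0), 1.0 / 60.0, 1200.0))
```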

Similarly, here http://www.eecis.udel.edu/~jye/lab_research/11/BLUT_iccv_11.pdf they just pre-blur the tracking targets and try to match whichever blurred target is appropriate given the direction of blur. They model the blur with Gaussian filters, which are symmetric, so you need half as many pre-blurred targets as you might initially expect.
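A sketch of that pre-blurred-template idea, assuming you already have a blur-direction estimate (e.g. from the sensor-based guess sketched above); the kernel size, sigmas, and angle step are placeholders:

```python
import cv2
import numpy as np

def directional_gaussian_kernel(ksize, sigma, angle_deg):
    """A 1-D Gaussian stretched along the blur direction. The Gaussian is
    symmetric, so angle and angle+180 give the same kernel, which is why
    only half the directions are needed."""
    kernel = np.zeros((ksize, ksize), np.float32)
    kernel[ksize // 2, :] = cv2.getGaussianKernel(ksize, sigma).ravel()
    rot = cv2.getRotationMatrix2D((ksize / 2 - 0.5, ksize / 2 - 0.5), angle_deg, 1.0)
    kernel = cv2.warpAffine(kernel, rot, (ksize, ksize))
    return kernel / kernel.sum()

def blurred_template_bank(template, sigmas=(1, 2, 4), angles=range(0, 180, 30)):
    """Pre-blur the marker template for a few blur strengths and directions,
    to be matched against the live (blurred) image."""
    return {(s, a): cv2.filter2D(template, -1, directional_gaussian_kernel(31, s, a))
            for s in sigmas for a in angles}
```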

If you do try implementing any of these, I'd be really interested to hear how you get on!

0 votes

From some related work (attempting to use the sensors/gyroscope to predict the likely location of features from one frame to the next in video), I'd say that option 3 is likely to be difficult, if not impossible. At best, I think you could get an indication of the approximate direction and angle of motion, which may help you model blur using the approaches referenced by dabhaid, but I think it's unlikely you'd get sufficient precision to be much more help.