28
votes

I'm new to image processing and I'm working on detecting lines in a document image. I have read the theory behind the Hough line transform, but I can't see why I must run Canny before calling that function in OpenCV, as many tutorials say. What's the point of finding edges in this case? In practice, if I don't apply Canny or a threshold before HoughLines(), the results are very messy. I hope someone can explain the reason.

2 of the tutorials I've read:

  1. Imgproc Feature Detection
  2. Hough Line Transform
4
Please provide a minimal working example and a link to one of the "many tutorials" that you refer to. - Unapiedra
Thank you for the links! - Unapiedra

4 Answers

12
votes

First of all, to detect lines you need to work on a binary (boolean) image: every pixel is either black or white, with no grayscale.

HoughLines() needs this kind of image as input to work properly. That is why you have to use Canny or a threshold first: to convert the color image into a binary one.

Hough transformation

A line in a picture is actually an edge. The Hough transform scans the whole image and maps the Cartesian coordinates of every white pixel into polar (rho, theta) parameter space; the black pixels are left out. So you won't get clean lines if you don't detect edges first, because HoughLines() has no notion of grayscale: it treats every nonzero pixel as a point that may lie on a line.
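
To make the "only white pixels count" point concrete, here is a minimal pure-Python sketch of the voting step. It is not OpenCV's implementation; the function name `hough_votes`, the `n_theta` parameter and the 8x8 test image are made up for illustration, and rho is simply rounded to the nearest integer.

```python
import math

def hough_votes(binary, n_theta=180):
    """Accumulate Hough votes for an image given as a list of rows.

    Only nonzero pixels vote; zero (black) pixels are skipped.
    Returns a dict mapping (rho, theta_index) -> vote count, with
    rho = round(x*cos(theta) + y*sin(theta)) and theta in 1-degree steps
    when n_theta is 180.
    """
    acc = {}
    for y, row in enumerate(binary):
        for x, value in enumerate(row):
            if value == 0:
                continue  # black pixel: contributes nothing
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return acc

# 8x8 black image with a single horizontal white line at y = 3
image = [[0] * 8 for _ in range(8)]
for x in range(8):
    image[3][x] = 255

acc = hough_votes(image)
# All 8 line pixels agree on the bin rho = 3, theta = 90 degrees
print(acc[(3, 90)])
```

Each white pixel votes for every (rho, theta) line that could pass through it, and a real line shows up as a bin that collects votes from all of its pixels. Bins at neighbouring angles can tie with the true one at this coarse resolution, which is one reason real implementations pick local maxima above a threshold.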

11
votes

Short Answer

cvCanny is used to detect edges, as well as to increase contrast and remove image noise. HoughLines, which uses the Hough transform, is used to determine whether those edges are lines or not. The Hough transform requires edges to be detected well in order to be efficient and provide meaningful results.

Long Answer

The Limitations of the Hough Transform are described in more detail on Wikipedia.

The efficiency of the Hough transform relies on the bins of accumulated votes being distinct, e.g. a clear contrast between a pixel and its surrounding neighbours or, if using a mask region, between a pixel region and its surrounding regions. If all bins had similar accumulated values, nothing would stand out as a line or circle. This is why colour is reduced (colour to grayscale, grayscale to black and white): to increase contrast.
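
A rough way to see this is to count how many accumulator bins pass a vote threshold for an edge image versus an image where every pixel is nonzero (as in an unthresholded grayscale image). This is a pure-Python sketch under stated assumptions: `hough_votes`, `bins_over`, the 8x8 images and the threshold of 8 votes are all illustrative choices, not anything taken from OpenCV.

```python
import math

def hough_votes(binary, n_theta=180):
    """Every nonzero pixel votes for each (rho, theta) line through it."""
    acc = {}
    for y, row in enumerate(binary):
        for x, value in enumerate(row):
            if value == 0:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return acc

def bins_over(acc, threshold):
    """Return the (rho, theta_index) bins with at least `threshold` votes."""
    return [key for key, votes in acc.items() if votes >= threshold]

size = 8

# Edge image: black except for one horizontal line at y = 3
edge_image = [[0] * size for _ in range(size)]
for x in range(size):
    edge_image[3][x] = 255

# "Grayscale" image: a smooth ramp where every pixel is nonzero
gray_image = [[100 + x + y for x in range(size)] for y in range(size)]

edge_hits = bins_over(hough_votes(edge_image), size)
gray_hits = bins_over(hough_votes(gray_image), size)
edge_rhos = {rho for rho, _ in edge_hits}
gray_rhos = {rho for rho, _ in gray_hits}

print(len(edge_hits), sorted(edge_rhos))
print(len(gray_hits), sorted(gray_rhos))
```

For the edge image, every bin that passes the threshold has rho = 3, so they all describe the same line at slightly different angles. For the all-nonzero image, far more bins pass and they are spread over many rho values, so no single line stands out.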

The number of parameters of the Hough transform also increases the spread of votes across the bins and the complexity of the transform, which means that normally only lines or circles are reliably detected with it, as they have at most three parameters (two for a line, three for a circle).

The edges need to be detected well before running the Hough transform, otherwise its efficiency suffers further. Noisy images also don't work well with the Hough transform unless the noise is removed beforehand.

5
votes

Theoretically, you are correct. Finding edges is not absolutely required for the Hough Line algorithm to work.

The way the Hough transform works is basically this: it takes every point and considers every line that could pass through it, and whichever lines pass through the most points are kept. For this, we need points. Canny creates those points. Theoretically you could use any sort of filter (isolate all blue or purple points and connect them, whatever), but edges work well.

The Hough transform also does not weight its lines or points. To it, an image is binary: made up of either 1s or 0s, points or not points. There is no need for greyscale, and Canny conveniently returns binary images.
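
As a sketch of "any filter that produces points will do", the following pure-Python function marks pixels where the intensity jumps sharply. It is a deliberately crude stand-in for Canny (no smoothing, non-maximum suppression or hysteresis), and the name `edge_map`, the threshold of 50 and the 6x6 test image are assumptions made for this example.

```python
def edge_map(gray, threshold):
    """Return a 0/1 image marking pixels with a large intensity jump
    to the right-hand or lower neighbour."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences are taken as 0 at the right and bottom borders
            dx = abs(gray[y][x + 1] - gray[y][x]) if x + 1 < w else 0
            dy = abs(gray[y + 1][x] - gray[y][x]) if y + 1 < h else 0
            if max(dx, dy) > threshold:
                out[y][x] = 1
    return out

# Dark top half, bright bottom half: one horizontal boundary
gray = [[40] * 6 for _ in range(3)] + [[200] * 6 for _ in range(3)]
edges = edge_map(gray, threshold=50)

# The "points" the Hough transform would vote with, as (x, y) pairs
points = [(x, y) for y, row in enumerate(edges) for x, v in enumerate(row) if v]
print(points)
```

Out of 36 grayscale pixels, only the six on the boundary row survive as points, which is exactly the kind of sparse binary input the Hough transform expects.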

This is why Canny is so often used together with the Hough transform.

-2
votes

It is all about processing binary data:

complex data -> (a binary data, b binary data, c binary data, ..) (using canny(),sobel(), etc)

a binary data -> function1() (using houghlines())

b binary data -> function2()

c binary data -> function3() ..

a binary data -X-> function2() ..

complex data -X-> function1() ..

HTH