5 votes

I wrote a piece of code that needs to be optimized, and I just felt like checking with the community to see whether that code is indeed optimal. It fills the accumulator for the Hough transform. I actually copy-pasted most of the code from the OpenCV library. Thanks!


int i, j, n, index;
for (i = 0; i < numrows; i++)
{
    for (j = 0; j < numcols; j++)
    {
        if (img[i*numcols + j] == 100)   /* feature pixel */
        {
            for (n = 300; n < 600; n++)
            {
                index = cvRound(j*tabCos[n] + i*tabSin[n]) + (numrho - 1)/2;
                accum[(n+1) * (numrho+2) + index + 1]++;
            }
        }
    }
}
Do you have an actual example of the data this code is applied to? It looks like there are several possible optimizations; some are independent of the data, but others would depend on the actual distribution of data in img and on the size of the image. – kriss
An example of the data I have is at stackoverflow.com/questions/4372259/… I do realize that there are only 3 points per column (that's how I created those images), so there should be some way of speeding it up, but the time-consuming part is filling up the accumulator, not going through the image. – Denis

3 Answers

2 votes

There is a large and repetitive Hough transform in a piece of code I'm vaguely attached to. The maintainer of that part of the code has been experimenting with sparse arrays (actually a C++ std::map keyed on the cell index, if I understood his presentation right) for the accumulator, with some success.

I presume the speed-up is related to cache locality, and it certainly depends on the data being sparse.
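For illustration, here is a minimal sketch of that idea applied to the accumulator loop from the question. The function name, the argument types, and the choice of std::map are mine, not our actual code:

#include <cmath>
#include <map>

// Sparse Hough accumulator: only cells that actually receive votes get
// storage, which can help when the accumulator is sparsely hit.
// The key is the same flattened cell index the dense array would use.
typedef std::map<int, int> SparseAccum;

void fillAccumulator(const unsigned char *img, int numrows, int numcols,
                     const float *tabCos, const float *tabSin,
                     int numrho, SparseAccum &accum)
{
    for (int i = 0; i < numrows; i++)
        for (int j = 0; j < numcols; j++)
        {
            if (img[i * numcols + j] != 100)
                continue;

            for (int n = 300; n < 600; n++)
            {
                int index = (int)std::lround(j * tabCos[n] + i * tabSin[n])
                            + (numrho - 1) / 2;
                // operator[] default-initialises a missing count to 0.
                accum[(n + 1) * (numrho + 2) + index + 1]++;
            }
        }
}

A hash map (std::unordered_map) is a drop-in alternative if you don't need the cells ordered by index.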


Update: The software referenced above is intended to serve many particle physics experiments, but was originally used on a test-bed project (i.e. small scale). As we've gotten serious about doing larger projects and started running Monte Carlo simulations for them, the Hough transform has become a bit of a bottleneck again, even with the sparse matrix.

As yet we do not have a solution, but one of my colleagues found Gandalf, which includes a "fast Hough transform" that appears to evaluate the transform in a quad-tree-like way (in 2D; presumably you would use an oct-tree in 3D) to reduce the order of work. We're probably going to experiment with this.
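I haven't used Gandalf myself, so the following is only a rough sketch of the general coarse-to-fine idea (vote on a coarse (theta, rho) grid first, then refine only the cells that look promising), not Gandalf's actual API or algorithm; every name and parameter here is made up for illustration:

#include <cmath>
#include <cstddef>
#include <vector>

struct Point { int x, y; };

// Pass 1 votes into a grid that is 'factor' times coarser in both axes;
// pass 2 revisits the points but only refines angle ranges whose coarse
// cell collected at least 'coarseThresh' votes.
std::vector<int> coarseToFineHough(const std::vector<Point> &pts,
                                   int numAngle, int numRho, double rhoMax,
                                   int factor, int coarseThresh)
{
    const double PI = 3.14159265358979323846;
    const int cAngle = numAngle / factor;
    const int cRho   = numRho / factor;
    std::vector<int> coarse(cAngle * cRho, 0);
    std::vector<int> fine(numAngle * numRho, 0);

    // Pass 1: coarse voting.
    for (std::size_t k = 0; k < pts.size(); ++k)
        for (int ca = 0; ca < cAngle; ++ca)
        {
            double theta = (ca + 0.5) * factor * PI / numAngle;
            double rho = pts[k].x * std::cos(theta) + pts[k].y * std::sin(theta);
            int rc = (int)((rho + rhoMax) / (2 * rhoMax) * cRho);
            if (rc >= 0 && rc < cRho)
                coarse[ca * cRho + rc]++;
        }

    // Pass 2: fine voting, but only inside promising coarse cells.
    // (A real implementation would also look at neighbouring coarse rho
    // cells to avoid losing votes right at cell borders.)
    for (std::size_t k = 0; k < pts.size(); ++k)
        for (int ca = 0; ca < cAngle; ++ca)
        {
            double thetaC = (ca + 0.5) * factor * PI / numAngle;
            double rhoC = pts[k].x * std::cos(thetaC) + pts[k].y * std::sin(thetaC);
            int rc = (int)((rhoC + rhoMax) / (2 * rhoMax) * cRho);
            if (rc < 0 || rc >= cRho || coarse[ca * cRho + rc] < coarseThresh)
                continue;
            for (int a = ca * factor; a < (ca + 1) * factor; ++a)
            {
                double theta = a * PI / numAngle;
                double rho = pts[k].x * std::cos(theta) + pts[k].y * std::sin(theta);
                int r = (int)((rho + rhoMax) / (2 * rhoMax) * numRho);
                if (r >= 0 && r < numRho)
                    fine[a * numRho + r]++;
            }
        }
    return fine;
}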

Further update: A colleague eventually implemented a progressive probabilistic Hough transform in our code, which currently seems to be the fastest version we've got. It works best if you don't require that every point gets assigned to a line.
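Since the question already uses OpenCV, it is perhaps worth mentioning that OpenCV exposes a progressive probabilistic variant as cv::HoughLinesP. A minimal usage sketch; the file name and all parameter values below are placeholders you would tune for your data:

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    // Placeholder file name; load as 8-bit single-channel grayscale.
    // Note that HoughLinesP treats every non-zero pixel as a feature,
    // unlike the == 100 test in the question's code.
    cv::Mat img = cv::imread("points.png", cv::IMREAD_GRAYSCALE);
    if (img.empty())
        return 1;

    std::vector<cv::Vec4i> lines;        // each entry is (x1, y1, x2, y2)
    cv::HoughLinesP(img, lines,
                    1.0,                 // rho resolution in pixels
                    CV_PI / 180.0,       // theta resolution in radians
                    50,                  // accumulator threshold
                    30,                  // minimum line length
                    10);                 // maximum gap allowed within a line
    return 0;
}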

1 vote

No, it's not. Replace as many of the [] usages as you can with simple pointer arithmetic to iterate over the arrays in question, and hoist invariant expressions out into local variables (see the sketch at the end of this answer).

However, the first question is: does your profiler show that this code is a bottleneck in the context of your entire app? If not, why bother micro-optimizing it?

EDIT: loop micro-optimization. Prefer the second loop, as no array indexing is required (a scaled index computation per iteration versus a simple pointer add):

#include <stdio.h>

int main(void)
{
  int ints[100] = {0};
  int i;
  int *pi;

  /* Indexed version: each access computes the address as ints + i. */
  for (i = 0; i < 100; ++i)
  {
    printf("%d", ints[i]);
  }

  /* Pointer version: the pointer is simply incremented each iteration. */
  for (pi = ints; pi < ints + 100; ++pi)
  {
    printf("%d", *pi);
  }

  return 0;
}
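Applied to the accumulator loop from the question, both suggestions might look roughly like the sketch below. The function signature, the unsigned char pixel type, and the float type of the tables are assumptions on my part; the indexing is kept identical to the original:

#include <opencv2/opencv.hpp>   // for cvRound, as in the question

void fillAccum(const unsigned char *img, int numrows, int numcols,
               const float *tabCos, const float *tabSin,
               int numrho, int *accum)
{
    const int rhoOffset   = (numrho - 1) / 2;   // hoisted invariants
    const int accumStride = numrho + 2;

    const unsigned char *row = img;             // walk the image row by row
    for (int i = 0; i < numrows; i++, row += numcols)
    {
        for (int j = 0; j < numcols; j++)
        {
            if (row[j] != 100)
                continue;

            // Step through the accumulator rows by pointer: one add per
            // iteration instead of recomputing (n + 1) * (numrho + 2).
            int *acc = accum + 301 * accumStride + 1;
            for (int n = 300; n < 600; n++, acc += accumStride)
            {
                int index = cvRound(j * tabCos[n] + i * tabSin[n]) + rhoOffset;
                acc[index]++;
            }
        }
    }
}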
0 votes

Depending on your application, there are numerous ways to optimise the Hough transform, and fiddling with low-level code is possibly the last of them. I would start with a randomised HT or a multiresolution HT, followed by a hybrid approach. I believe it is better to optimise the algorithm first. The last step would be to use hardware optimisation like CAD memory.
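For what it's worth, here is a toy sketch of the randomised idea for line detection: repeatedly pick two points at random, compute the single (theta, rho) cell they determine, and vote for just that cell, so strong lines collect votes quickly. Every name, the binning, and the use of std::rand below are illustrative only, not a production implementation:

#include <cmath>
#include <cstdlib>
#include <map>
#include <utility>
#include <vector>

struct Point { int x, y; };

std::map<std::pair<int,int>, int>
randomizedHoughLines(const std::vector<Point> &pts, int samples,
                     double thetaStep, double rhoStep)
{
    std::map<std::pair<int,int>, int> votes;   // sparse (theta, rho) bins
    if (pts.size() < 2)
        return votes;

    for (int s = 0; s < samples; ++s)
    {
        // Crude sampling of a point pair; fine for a sketch.
        const Point &a = pts[std::rand() % pts.size()];
        const Point &b = pts[std::rand() % pts.size()];
        if (a.x == b.x && a.y == b.y)
            continue;

        // Normal form of the line through a and b: x*cos(t) + y*sin(t) = rho.
        double theta = std::atan2((double)(b.x - a.x), (double)(a.y - b.y));
        double rho   = a.x * std::cos(theta) + a.y * std::sin(theta);

        int tBin = (int)std::floor(theta / thetaStep);
        int rBin = (int)std::floor(rho / rhoStep);
        votes[std::make_pair(tBin, rBin)]++;
    }
    return votes;   // bins with many votes correspond to likely lines
}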