I've implemented the mean shift algorithm from http://courses.csail.mit.edu/6.869/handouts/PAMIMeanshift.pdf, with a fixed bandwidth for now. When I run my MATLAB code on the cameraman image with h=[8 4 10], as they suggest, I get around 4000 clusters. (I do a raster scan of the pixels; for each one I compute the mode to which it maps, and I merge regions if they are within h of each other.) The algorithm also takes around 5 minutes for the 256x256 case.
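The procedure described above can be sketched roughly as follows. This is a simplified Python illustration, not the actual MATLAB code: it assumes a grayscale image stored as a list of rows, a flat (uniform) kernel, and separate spatial and range bandwidths `hs` and `hr`. The names `find_mode` and `segment` are made up for this example.

```python
def find_mode(img, x, y, hs, hr, max_iter=50, tol=1e-3):
    """Run mean shift from pixel (x, y) in the joint (x, y, intensity) space."""
    rows, cols = len(img), len(img[0])
    cx, cy, cv = float(x), float(y), float(img[y][x])
    for _ in range(max_iter):
        sx = sy = sv = 0.0
        n = 0
        # Only pixels inside the spatial window can fall inside the kernel.
        for j in range(max(0, int(cy - hs)), min(rows, int(cy + hs) + 1)):
            for i in range(max(0, int(cx - hs)), min(cols, int(cx + hs) + 1)):
                ds = ((i - cx) ** 2 + (j - cy) ** 2) / hs ** 2
                dr = (img[j][i] - cv) ** 2 / hr ** 2
                if ds <= 1 and dr <= 1:  # flat kernel (assumption)
                    sx += i
                    sy += j
                    sv += img[j][i]
                    n += 1
        if n == 0:
            break
        nx, ny, nv = sx / n, sy / n, sv / n
        shift = max(abs(nx - cx), abs(ny - cy), abs(nv - cv))
        cx, cy, cv = nx, ny, nv
        if shift < tol:
            break
    return cx, cy, cv


def segment(img, hs, hr):
    """Raster scan: map each pixel to its mode, merging modes that lie within h."""
    rows, cols = len(img), len(img[0])
    modes = []  # one representative mode per cluster
    labels = [[-1] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            mx, my, mv = find_mode(img, x, y, hs, hr)
            for k, (px, py, pv) in enumerate(modes):
                close_in_space = (mx - px) ** 2 + (my - py) ** 2 <= hs ** 2
                close_in_range = abs(mv - pv) <= hr
                if close_in_space and close_in_range:
                    labels[y][x] = k
                    break
            else:
                # No existing mode is close enough, so start a new cluster.
                modes.append((mx, my, mv))
                labels[y][x] = len(modes) - 1
    return labels, modes


# Toy 8x8 image: dark left half, bright right half.
toy = [[10] * 4 + [200] * 4 for _ in range(8)]
toy_labels, toy_modes = segment(toy, hs=16, hr=20)
print(len(toy_modes))
```

Each pixel runs its own mean shift iteration over a (2*hs+1)-wide window, and every converged mode is compared against all modes found so far, which is one plausible reason a straightforward implementation like this is slow on a 256x256 image.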
I have tried reading and using their code, but I would need some explanations to follow it.
Are my results to be expected, or can I get this down to fewer clusters without any post-processing?