I have a large image which contains "blobs" of interest over a background. I have the position (centroid, bounding box, area) of all blobs. I want to crop a limited number of regions of a fixed size in the image that allow me to capture most of the blobs. Example below for 1, 2 or 3 cropped regions in the same image.
This example shows that cropping 1 region (in red) is relatively easy: just pick the region with as many blobs as possible. This can be solved by just trying everything or possibly by computing a density of blobs using a kernel density estimator or something like this.
But cropping 2 regions (in dashed blue) is not just cropping the next best crop after the first selected above. It is a new problem where I need to find the optimal combination of 2 crops. Trying all combinations of 2 crops (brute force) probably becomes too computationally expensive (I have many images to process and they are large).
Similarly, cropping 3 regions (in green) is a new problem, and one for which brute force is even less suited. In that particular example 2 of the 3 regions are identical to the blue case and a new one is added but this is not the general case (I just wanted to show a slightly complex scenario).
I have no idea regarding the algorithm to solve the n-crops case. I am wondering if there is a theoretical/well known solution to this problem.
In addition:
- the geometry of the problem is approximately that of the example above (maximum two crops over the height of the image, many crops over the width); in case that simplifies things
- crops should not intersect
- blobs should be as centred as possible in the crop (i.e. https://dl.dropboxusercontent.com/u/1047321/SO_crop/cutout_one_blob.png )
- crops should stay within the original image boundaries (cf. either example above)
- the area of the blob should be taken into account (I am more interested in large blobs than small ones); but that can probably be introduced in any algorithm by associating a weight with each blob to compute the score of each cropping layout.
- it is OK to leave some blobs out. Actually, I'll probably compute a cost-complexity parameter such as how many new blobs adding a crop would get and set a threshold under which I stop adding crops.
Thanks in advance for any pointer.
PS: The coding language does not really matter here since the core of the algorithm (finding the optimal position of the crops given the location/size of blobs) only needs small arrays (position/size of ~100 blobs per image) to be computed. I'll probably use Python or R.