Python Flood Fill - How to optimize

Question

def getRegionSize(cell, world):
    region = []
    frontier = [cell]
    val = world[cell[0]][cell[1]]
    while(len(frontier) > 0):
        item = frontier.pop()
        region.append(item)
        x = item[0]
        y = item[1]
        for i in [-1,1]:
            for j in [-1,1]:
                if (x+i,y+j) not in frontier and (x+i,y+j) not in region:
                    if not (x + i > width - 1 or x + i < 0 or y + j > height - 1 or y + i < 0) and world[x+i][y+j] == val:
                        frontier.append((x+i,y+j))
    return len(region)

I want my function to determine the size of an area connected to a given cell. The function takes world (2D binary matrix) and cell ((x,y) coordinates in the world) as arguments, and returns the size of the area connected to cell.

This function works like flood fill, but instead of recoloring an area, I am only interested in the size of the area being recolored. The function works fine, but it is slow. It takes a few seconds to return for areas of size greater than ~4000. Is there something I'm doing terribly wrong, or is it supposed to be slow for large areas?

EDIT: Edited for clarity.

It's not really clear from your question what your algorithm is supposed to do. Take a look at How to Ask — pvg
Some optimizations could be done through packages such as numpy and scipy. Related. — Sebastian Mendez

c2huc2hu c2huc2hu · Accepted Answer · 2017-11-17T05:23:04

In: if (x+i,y+j) not in frontier and (x+i,y+j) not in region:, you're testing if cell is in frontier or region, both of which are lists, so searching them takes O(n).

1) You don't need to check if cell is in frontier. As long as you ignore cells already in region, you can add a cell to the frontier as many times as you want.

2) Use a more efficient data structure for region, i.e. a set.

Python Flood Fill - How to optimize

1 Answers