4
votes

I'd like to ask about the following problem:

Given a directed graph (not necessarily a DAG), for every vertex v calculate the number of reachable vertices from v.

So using brute force approach (n times DFS) we can get the answer in O(n^2) time complexity. Is there a way to calculate it faster? I can definitely make a DAG from the given graph using SCC. I tried to use previously calculated values, so I could visit each vertex only once, but it does not work at all. The biggest problem lies in a graph like this:

  2 -> 1
  3 -> 2
  3 -> 1
  1 -> 4

I run a DFS from vertex 1 and return result 1. Then, using it I can immidiately calculate answer for 2 (I do not enter vertex 1 again in second DFS, I use its answer instead), what is 2. Then I get to the vertex 3, and... the algorithm would sum up the result of 1 and 2, since I can reach both of those vertices. But vertex 1 is already calculated in the result for vertex 2. In that way I get the answer equal to 4, which is not true.

2
If B is reachable from A, then all nodes reachable from B are also reachable from A. You can make use of this to avoid duplicate traversals. - meowgoesthedog
Indeed, but I'd have to store each vertex reachable, that would be still O(n^2) - Maras
Not necessarily; you could keep track of how many nodes are reachable from a certain node as you perform the first DFS. During subsequent DFS passes, stop as soon as you reach a visited node, and add on the number of nodes reachable from it. This way you traverse each node at most twice (I think - correct me if you find a contradiction). - meowgoesthedog
What if there are 2 vertices that are both already visited, but their reachable vertices are completely distinct? - Maras
That doesn't matter because you still iterate through all nodes in the graph, but the point is each DFS traversal performed from a node reduces the set of nodes which still need to traversed. - meowgoesthedog

2 Answers

4
votes

I really suspect that there isn't a known better algorithm for general graphs. All the papers I found on the subject [1] [2] describe algorithms that run in O(|V| * |E|) time.

The wikipedia page [3] says the fastest algorithms reduce the problem to matrix multiplication.

[1] http://ion.uwinnipeg.ca/~ychen2/conferencePapers/tranRelationCopy.pdf

[2] http://www.vldb.org/conf/1988/P382.PDF

[3] http://en.wikipedia.org/wiki/Transitive_closure#Algorithms

2
votes

I think the best algorithm run O(|V| * |E|).

Algorithm:

1 - Do SCC to the graph.

2 - Build reduced graph Gr.

3 - Run one dfs for each vertex v in Gr and for each visited different SCC (including the SCC of v) accumulate the amount of vertex in them (precalculated from step 1).

With this algorithm you eliminate the |V|*|V| factor from brute force O(|V| * |V| + |V| * |E|)).

This is the best and simple algorithm that I know for this problem. I have no demonstration but I'm pretty sure there's no better way.