Compute distance using DFS

Question

I was torn between these two methods:

M1:

Use adjacency list to represent graph G with vertices P and edges A
Use DFS on G storing all the distances from p in an array d;
Loop through d checking all entries. If some d[u] >6, return false otherwise true

M2:

Use adjacency list to represent graph G with vertices P and edges A
Use BFS on G storing all the distances from p in an array d;
Loop through d checking all entries. If some d[u] >6, return false otherwise true

Both these methods will produce a worst case O(|P| + |A|), therefore I think that both would be a correct answer to this question. I had chosen the DFS method, with the reasoning that with DFS you should be able to find the "outlier" of freedom degree 7 earlier than with BFS, since with BFS you would have to traverse every single Vertex until degree 7 in every case.

Apparently this is wrong according to the teacher, as using DFS, you can't compute the distances. I don't understand why you wouldn't be able to compute the distances. I could have a number n indicating the degree of freedom I am currently at. Starting from root p, the child would have n = 1. Now I store n in array d. Then I keep traversing down until no child is to be found, while incrementing n and storing the value in my array d. Then, if the back-tracking starts, the value n will be decremented until we find an unvisited child node of any of the visited nodes on the stack. If there is an unvisited child, increment once again, then increment until no more child is found, decrement until the next unvisited child from the stack is found...

I believe that would be a way to store the distances with DFS

DO NOT post images of code, data, error messages, etc. - copy or type the text into the question. How to Ask — Rob

trincot trincot · Accepted Answer · 2020-06-05T13:03:18

Both BFS and DFS can do the job: they can both limit their search to a depth of 6, and at the end of the traversal they can check whether the whole population was reached or not. But there are some important differences:

With BFS

The BFS traversal is the algorithm I would opt for. When a BFS search determines the degree of a person, it is definitive: no correction needs to be made to it.

Here is sketch of how you can do this with BFS:

visited = set()  # empty set
frontier = []  # empty array
visited.add(p)  # search starts at person p
frontier.append(p)
for degree in [1, 2, 3, 4, 5, 6]:
    nextFrontier = []  # empty array
    for person in frontier:
        for acquaintance in A[person]:
            if acquaintance not in visited:
                visited.add(acquaintance)
                nextFrontier.append(acquaintance)
    frontier = nextFrontier
    if size(visited) == size(P):  # have we reached the whole population?
        return True
# After six rounds we did not reach all people, so...
return False

This assumes that you can find the list of acquaintances for a given person via A[person]. If A is not structured like an adjacency list but as a list of pairs, then first do some preprocessing on the original A to create such an adjacency list.

With DFS

A DFS algorithm has as downside that it will not necessarily start with optimal paths, and so it will find that some persons have degree 6, while there really are shorter, uninvestigated paths that could improve on that degree. This means that a DFS algorithm may need to revisit nodes and even partial paths (edges) to register such improvements and cascade them through a visited path up to degree 6. And there might even be several improvements to be applied for the same person.

A DFS algorithm could look like this:

degreeOfPerson = dict()  # empty key/value dictionary
for person in P:
    degreeOfPerson[person] = 7  # some value greater than 6

function dfs(person, degree):
    if degree >= 7:
        return  # don't lose time for higher degrees than 6.
    for acquaintance in A[person]:
        if degree < degreeOfPerson[acquaintance]:  # improvement?
            degreeOfPerson[acquaintance] = degree
            dfs(acquaintance, degree+1)

# start DFS
degreeOfPerson[p] = 0
dfs(p, 1)

# Check if all persons got a degree of maximum 6
for person in P:
    if degreeOfPerson[person] > 6:
        return False
return True

Example

If the graph has three nodes, linked as a triangle a-b-c, with starting point a, then this would be the sequence. Indentation means (recursive) call of dfs:

degreeOfPerson[a] = 0
    a->b: degreeOfPerson[b] = 1
        b->c: degreeOfPerson[c] = 2
            c->a: # cannot improve degreeOfPerson[a]. Backtrack
            c->b: # cannot improve degreeOfPerson[b]. Backtrack
        b->a: # cannot improve degreeOfPerson[a]. Backtrack
    a->c: degreeOfPerson[c] = 1  # improvement!
        c->a: # cannot improve degreeOfPerson[a]. Backtrack
        c->b: # cannot improve degreeOfPerson[b]. Backtrack

Time Complexity

The number of times the same edge can be visited with DFS is not more than the maximum degree we are looking for -- in your case 6. If that is a constant, then it does not affect the time complexity. If however the degree to check for is an input value, then the time complexity of DFS becomes O(maxdegree * |E| + |V|).