8
votes

I have this question from Robert Sedgewick's book on algorithms.

Critical edges. An MST edge whose deletion from the graph would cause the MST weight to increase is called a critical edge. Show how to find all critical edges in a graph in time proportional to E log E. Note: This question assumes that edge weights are not necessarily distinct (otherwise all edges in the MST are critical).

Please suggest an algorithm that solves this problem.

One approach I can think of does the job in time E.V. My approach is to run the kruskal's algorithm.

But whenever we encounter an edge whose insertion in the MST creates a cycle and if that cycle already contains an edge with the same edge weight, then, the edge already inserted will not be a critical edge (otherwise all other MST edges are critical edges).

Is this algorithm correct? How can I extend this algorithm to do the job in time E log E.

2

2 Answers

7
votes

The condition you suggest for when an edge is critical is correct I think. But it's not necessary to actually find a cycle and test each of its edges.

The Kruskal algorithm adds edges in increasing weight order, so the sequence of edge additions can be broken into blocks of equal-weight edge additions. Within each equal-weight block, if there is more than one edge that joins the same two components, then all of these edges are non-critical, because any one of the other edges could be chosen instead. (I say they are all non-critical because we are not actually given a specific MST as part of the input -- if we were then this would identify a particular edge to call non-critical. The edge that Kruskal actually chooses is just an artefact of initial edge ordering or how sorting was implemented.)

But this is not quite sufficient: it might be that after adding all edges of weight 4 or less to the MST, we find that there are 3 weight-5 edges, connecting component pairs (1, 2), (2, 3) and (1, 3). Although no component pair is joined by more than 1 of these 3 edges, we only need (any) 2 of them -- using all 3 would create a cycle.

For each equal-weight block, having weight say w, what we actually need to do is (conceptually) create a new graph in which each component of the MST so far (i.e. using edges having weight < w) is a vertex, and there is an edge between 2 vertices whenever there is a weight-w edge between these components. (This may result in multi-edges.) We then run DFS on each component of this graph to find any cycles, and mark every edge belonging to such a cycle as non-critical. DFS takes O(nEdges) time, so the sum of the DFS times for each block (whose sizes sum to E) will be O(E).

Note that Kruskal's algorithm takes time O(Elog E), not O(E) as you seem to imply -- although people like Bernard Chazelle have gotten close to linear-time MST construction, TTBOMK no one has got there yet! :)

5
votes

Yes, your algorithm is correct. We can prove that by comparing the execution of Kruskal's algorithm to a similar execution where the cost of some MST edge e is changed to infinity. Until the first execution considers e, both executions are identical. After e, the first execution has one fewer connected component than the second. This condition persists until an edge e' is considered that, in the second execution, joins the components that e would have. Since edge e is the only difference between the forests constructed so far, it must belong to the cycle created by e'. After e', the executions make identical decisions, and the difference in the forests is that the first execution has e, and the second, e'.

One way to implement this algorithm is using a dynamic tree, a data structure that represents a labelled forest. One configuration of this ADT supports the following methods in logarithmic time.

  • MakeVertex() - constructs and returns a fresh vertex.
  • Link(u, c, v) - vertices u and v must not be connected. Creates an unmarked edge from vertex u to vertex v with cost c.
  • Mark(u, v) - vertices u and v must be endpoints of an edge e. Marks e.
  • Connected(u, v) - indicates whether vertices u and v are connected.
  • FindMax(u, v) - vertices u and v must be connected. Returns the endpoints of an unmarked edge on the unique path from u to v with maximum cost, together with that cost. The endpoints of this edge are given in the order that they appear on the path.

I make no claim that this is a good algorithm in practice. Dynamic trees, like Swiss Army knives, are versatile but complicated and often not the best tool for the job. I encourage you to think about how to take advantage of the fact that we can wait until all of the edges are processed to figure out what the critical edges are.