
I am working on a problem in which I am given an undirected graph G on n vertices and with m edges, such that each edge e has a weight w(e) ∈ {1, 2, 3}. The task is to design an algorithm which finds a minimum spanning tree of G in linear time (O(n + m)).

These are my thoughts so far:

  1. In the Algorithmic Graph Theory course which I am currently studying, we have covered Kruskal's and Prim's MST Algorithms. Perhaps I can modify these in some way, in order to gain linear time.
  2. Sorting of edges generally takes log-linear (O(m log m)) time; however, since all edge weights are either 1, 2 or 3, Bucket Sort can be used to sort the edges in time linear in the number of edges (O(m)).
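To illustrate point 2, here is a minimal sketch (in Python; the function name and the (u, v, w) edge representation are my own choices) of bucket-sorting edges whose weights are restricted to {1, 2, 3}, in O(m) time:

```python
def bucket_sort_edges(edges):
    """Sort (u, v, w) edge tuples with w in {1, 2, 3} in O(m) time."""
    buckets = {1: [], 2: [], 3: []}
    for u, v, w in edges:
        buckets[w].append((u, v, w))
    # Concatenating the three buckets yields the edges in non-decreasing
    # order of weight.
    return buckets[1] + buckets[2] + buckets[3]
```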

I am using the following version of Kruskal's algorithm:

Kruskal(G)
  for each vertex v ∈ V do MAKE-SET(v)
  sort all edges in non-decreasing order of weight
  for each edge (u, v) ∈ E (in non-decreasing order) do
    if FIND(u) ≠ FIND(v) then
      colour (u, v) blue
      UNION(u, v)
  od
  return the tree formed by the blue edges

Also, MAKE-SET(x), UNION(x, y) and FIND(x) are defined as follows:

MAKE-SET(x)
  Create a new tree rooted at x
  PARENT(x) := x

UNION(x, y)
  PARENT(FIND(x)) := FIND(y)

FIND(x)
  y := x
  while y ≠ PARENT(y) do
    y := PARENT(y)
  return y
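For concreteness, here is a direct transcription of the pseudocode above into runnable Python (a sketch only; the helper names and the (u, v, w) edge tuples are my own choices, and this version keeps the naive FIND without path compression):

```python
def make_set(parent, x):
    # Create a new one-node tree rooted at x.
    parent[x] = x

def find(parent, x):
    # Walk up parent pointers until reaching the root.
    y = x
    while y != parent[y]:
        y = parent[y]
    return y

def union(parent, x, y):
    # Attach the root of x's tree to the root of y's tree.
    parent[find(parent, x)] = find(parent, y)

def kruskal(n, edges):
    parent = {}
    for v in range(n):
        make_set(parent, v)
    blue = []
    # With weights in {1, 2, 3}, this comparison sort can be replaced
    # by the O(m) bucket sort from point 2 above.
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if find(parent, u) != find(parent, v):
            blue.append((u, v, w))
            union(parent, u, v)
    return blue
```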

The issue I have at the moment is that, although I can implement the first two lines of Kruskal's in linear time, I have not managed to do the same for the next four lines of the algorithm (from 'for edge u, ...' until 'UNION (u, v)').

I would appreciate hints as to how to implement the rest of the algorithm in linear time, or how to find a modification of Kruskal's (or some other minimum spanning tree algorithm) in linear time.

Thank you.


1 Answer


If you use the Disjoint Sets data structure with both path compression and union by rank, you get a data structure in which the amortized cost of each operation grows extremely slowly - it is bounded by the inverse of the Ackermann function, which remains tiny even for inputs as large as the estimated number of atoms in the universe. Effectively, each operation can be treated as constant time, and so the rest of the algorithm runs in effectively linear time as well.

From the Wikipedia article on the disjoint-set data structure:

Since α(n) is the inverse of this function, α(n) is less than 5 for all remotely practical values of n. Thus, the amortized running time per operation is effectively a small constant.
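The structure described above can be sketched as follows (a Python sketch; the class and method names are my own choices, not from the question's pseudocode):

```python
class DisjointSets:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression: after the lookup, x points directly at its root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # x and y were already in the same set
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

Replacing the question's MAKE-SET/UNION/FIND with this structure (and the comparison sort with the bucket sort from point 2) gives the effectively linear-time Kruskal's the question asks for.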