Functional Breadth First Search in Scala with the State Monad

Question

I'm trying to implement a functional Breadth First Search in Scala to compute the distances between a given node and all the other nodes in an unweighted graph. I've used a State Monad for this with the signature as :-

case class State[S,A](run:S => (A,S))

Other functions such as map, flatMap, sequence, modify etc etc are similar to what you'd find inside a standard State Monad.

Here's the code :-

case class Node(label: Int)

case class BfsState(q: Queue[Node], nodesList: List[Node], discovered: Set[Node], distanceFromSrc: Map[Node, Int]) {
  val isTerminated = q.isEmpty
}

case class Graph(adjList: Map[Node, List[Node]]) {
  def bfs(src: Node): (List[Node], Map[Node, Int]) = {
    val initialBfsState = BfsState(Queue(src), List(src), Set(src), Map(src -> 0))
    val output = bfsComp(initialBfsState)
    (output.nodesList,output.distanceFromSrc)
  }

  @tailrec
  private def bfsComp(currState:BfsState): BfsState = {
     if (currState.isTerminated) currState
     else bfsComp(searchNode.run(currState)._2)
  }

  private def searchNode: State[BfsState, Unit] = for {
    node <- State[BfsState, Node](s => {
      val (n, newQ) = s.q.dequeue
      (n, s.copy(q = newQ))
    })
    s <- get
    _ <- sequence(adjList(node).filter(!s.discovered(_)).map(n => {
      modify[BfsState](s => {
        s.copy(s.q.enqueue(n), n :: s.nodesList, s.discovered + n, s.distanceFromSrc + (n -> (s.distanceFromSrc(node) + 1)))
      })
    }))
  } yield ()
}

Please can you advice on :-

Should the State Transition on dequeue in the searchNode function be a member of BfsState itself?
How do I make this code more performant/concise/readable?

HTNW HTNW · Accepted Answer · 2017-08-17T01:23:01

First off, I suggest moving all the private defs related to bfs into bfs itself. This is the convention for methods that are solely used to implement another.

Second, I suggest simply not using State for this matter. State (like most monads) is about composition. It is useful when you have many things that all need access to the same global state. In this case, BfsState is specialized to bfs, will likely never be used anywhere else (it might be a good idea to move the class into bfs too), and the State itself is always run, so the outer world never sees it. (In many cases, this is fine, but here the scope is too small for State to be useful.) It'd be much cleaner to pull the logic of searchNode into bfsComp itself.

Third, I don't understand why you need both nodesList and discovered, when you can just call _.toList on discovered once you've done your computation. I've left it in in my reimplementation, though, in case there's more to this code that you haven't displayed.

def bfsComp(old: BfsState): BfsState = {
  if(old.q.isEmpty) old // You don't need isTerminated, I think
  else {
    val (currNode, newQ) = old.q.dequeue
    val newState = old.copy(q = newQ)
    adjList(curNode)
      .filterNot(s.discovered) // Set[T] <: T => Boolean and filterNot means you don't need to write !s.discovered(_)
      .foldLeft(newState) { case (BfsState(q, nodes, discovered, distance), adjNode) =>
        BfsState(
          q.enqueue(adjNode),
          adjNode :: nodes,
          discovered + adjNode,
          distance + (adjNode -> (distance(currNode) + 1)
        )
      }
  }
}

def bfs(src: Node): (List[Node], Map[Node, Int]) = {
  // I suggest moving BfsState and bfsComp into this method
  val output = bfsComp(BfsState(Queue(src), List(src), Set(src), Map(src -> 0)))
  (output.nodesList, output.distanceFromSrc)
  // Could get rid of nodesList and say output.discovered.toList
}

In the event that you think you do have a good reason for using State here, here are my thoughts. You use def searchNode. The point of a State is that it is pure and immutable, so it should be a val, or else you reconstruct the same State every use.

You write:

node <- State[BfsState, Node](s => {
  val (n, newQ) = s.q.dequeue
  (n, s.copy(q = newQ))
})

First off, Scala's syntax was designed so that you don't need to have both a () and {} surrounding an anonymous function:

node <- State[BfsState, Node] { s =>
  // ...
}

Second, this doesn't look quite right to me. One benefit of using for-syntax is that the anonymous functions are hidden from you and there is minimal indentation. I'd just write it out

oldState <- get
(node, newQ) = oldState.q.dequeue
newState = oldState.copy(q = newQ)

Footnote: would it make sense to make Node an inner class of Graph? Just a suggestion.

Functional Breadth First Search in Scala with the State Monad

1 Answers