2
votes

tl;dr: distances is giving me path lengths, but fail to recover what nodes are in those paths when using simple_paths.


I'd like to find all shortest, simple paths of a given length in my network. My network can be relatively large (1000 nodes, tens of thousands of edges), and since simple_paths is relatively slow and distances is quick, I thought I could first calculate distances as a filtering step.

That is, my current strategy is to

  1. calculate the length of all simple paths between each pair of vertices, i.e. dd = distances(my.net)
  2. find paths with desired length, i.e. dd[dd == desired.length]
  3. recover the nodes in this, now relatively small, list of paths.

However, I'm failing on step 3. That is, I can't recover the paths given by distances. For example, in the code below distances finds a path of length 3 between nodes D and X. When I try to use simple_paths to find out what that path actually is, I get nothing. What am I doing wrong?

require(dplyr)
require(tidyverse)
require(igraph)
set.seed(1)

# make network
fake.net = data.frame(A = sample(LETTERS,50,replace = T),
                      B = sample(LETTERS,50,replace = T),
                      stringsAsFactors = F) %>%
  dplyr::filter(!A==B) %>%
  as.matrix() %>% graph_from_edgelist()

# find one path of length 3
dd = distances(fake.net)
ia = which(dd==3)[1]
v.from = V(fake.net)[ia %% ncol(dd)]
v.to = V(fake.net)[ceiling(ia / ncol(dd))]

# what is that path?
shortest_paths(fake.net, from = v.from, to = v.to)

$vpath
$vpath[[1]]
+ 0/26 vertices, named, from ffb91bb:


$epath
NULL

$predecessors
NULL

$inbound_edges
NULL

1
This sounds like the Traveling Salesman Problem with a constraint on partitions. Learn to post library() calls to load any needed packages. That way we can firgure out whether your workspace has stuff that would otherwise create failures. I get Error: Problem with filter()` input ..1. x level sets of factors are different` from your effort to create fake.net.IRTFM
I added packages to the code exampleR Greg Stacey
Added require(dplyr), does that fix it?R Greg Stacey
Thanks. I'm using R 3.6 so I failed to see that I needed , stringsAsFactors=FALSE in the data.frame call.IRTFM
Thanks for catching that. I added stringsAsFactors=F explicitlyR Greg Stacey

1 Answers

3
votes

I guess you need to enable arr.ind in which, and try the code like below (if your graph is directed, you should use mode = "out" in distances)

dd <- distances(fake.net, mode = "out")
idx <- which(dd == 3, arr.ind = TRUE)
all_simple_paths <- apply(
  matrix(row.names(dd)[idx], ncol = 2),
  1,
  function(v) shortest_paths(fake.net, from = v[1], to = v[2])$vpath
)

and you will obtain

> head(all_simple_paths)
[[1]]
[[1]][[1]]
+ 4/26 vertices, named, from 84fcc54:
[1] G A Y D


[[2]]
[[2]][[1]]
+ 4/26 vertices, named, from 84fcc54:
[1] L A Y D


[[3]]
[[3]][[1]]
+ 4/26 vertices, named, from 84fcc54:
[1] G A F W


[[4]]
[[4]][[1]]
+ 4/26 vertices, named, from 84fcc54:
[1] U H I W


[[5]]
[[5]][[1]]
+ 4/26 vertices, named, from 84fcc54:
[1] O H I W


[[6]]
[[6]][[1]]
+ 4/26 vertices, named, from 84fcc54:
[1] L A F W