
My goal is to split up a dataframe, run igraph's graph_from_data_frame on each group and combine this back into the original dataframe in some way.

So far I've been able to get the igraph function to return a list-column of what I think are graph objects, but I can't tell because I can't 'see' inside the listed rows. Here is some reproducible code:

library(dplyr)   # %>%, group_by, do
library(igraph)  # graph_from_data_frame, degree, closeness, betweenness

set.seed(123)
Data <- data.frame(
  From = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  To = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  Time = sample(c(1, 2, 3), 100, replace = TRUE),
  ID.match = 1:100)
Data %>% View

I'd like to pull the graph measures of centrality and combine them with the ID.match variable. I then plan to regress these measures on other variables of interest already contained within my dataset. I'm using group_by on Time to create a graph for each point in time like this:

 Data %>% group_by(Time) %>% do(v=graph_from_data_frame(.))

The igraph function graph_from_data_frame creates an igraph object from which the measures of centrality can be obtained. The following code does what I want without group_by. I'd like to do the same thing with group_by:

set.seed(123)
Data <- data.frame(
  From = sample(c("Dan", "Sharon","Bob","Andrew"), 100, replace = TRUE),
  To = sample(c("Dan", "Sharon","Bob","Andrew"), 100, replace = TRUE),
  # Time=sample(c(1,2,3),100, replace = TRUE),
  ID.match=1:100)
Data %>% View

g <- graph_from_data_frame(Data)
plot(g)

The plot looks as expected (image omitted).

metrics <- data.frame(
  Degree=degree(g),
  Closeness = closeness(g),
  Betweenness = betweenness(g)
)
metrics %>% View


I would like to have a 'metrics' dataframe for each group. This question is similar to this SO question, but I can't seem to get things worked out. I've tried to use the purrr package to unlist the listed dataframe, but I think it's a bit too advanced for me. Any help would be much appreciated.


1 Answer


Your data with Time

library(igraph)  # needed for graph_from_data_frame and the centrality functions

DF <- data.frame(
  From = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  To = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  Time = sample(c(1, 2, 3), 100, replace = TRUE),
  ID.match = 1:100)

A function to build the metrics data frame from a graph

makemetrics <- function(gr) {
  data.frame(
    Degree = degree(gr),
    Closeness = closeness(gr),
    Betweenness = betweenness(gr)
  )
}

Solution: split the data by Time, then build a graph and its metrics for each piece

Dsplit <- split(DF, DF$Time)
lapply(Dsplit, function(x) makemetrics(graph_from_data_frame(x)))

Output

$`1`
       Degree Closeness Betweenness
Andrew     17 0.3333333   0.1818182
Bob         8 0.2000000   0.0000000
Sharon     11 0.2500000   2.0000000
Dan        20 0.3333333   0.8181818

$`2`
       Degree Closeness Betweenness
Andrew     17 0.2500000   0.0000000
Dan        19 0.3333333   0.0000000
Bob        17 0.3333333   0.6666667
Sharon     19 0.3333333   0.3333333

$`3`
       Degree Closeness Betweenness
Sharon     26 0.3333333         0.8
Bob        17 0.3333333         0.0
Dan        15 0.3333333         0.2
Andrew     14 0.2500000         0.0
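
As for "seeing" inside the result of your group_by/do attempt: here is a minimal sketch, assuming dplyr and igraph are loaded and using the same kind of data as above. do() puts one igraph object per Time value into the list-column v, so a single graph can be pulled out with [[ ]] and inspected like any other graph. The names graphs and g1 are just illustrative.

```r
library(dplyr)
library(igraph)

set.seed(123)
DF <- data.frame(
  From = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  To = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  Time = sample(c(1, 2, 3), 100, replace = TRUE),
  ID.match = 1:100
)

# One row per Time value; column v is a list holding one igraph object per group
graphs <- DF %>% group_by(Time) %>% do(v = graph_from_data_frame(.))

# Extract the first group's graph from the list-column and inspect it
g1 <- graphs$v[[1]]
print(g1)     # igraph summary: vertices, edges, attributes
degree(g1)    # named vector, one entry per vertex
```

The same makemetrics function can then be applied to each element, for example with lapply(graphs$v, makemetrics).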

Something extra

You can collect the results into a single data frame again using purrr::map_df. Note that the row names (the vertex names) are dropped in the combined result.

library(purrr)
ans <- lapply(Dsplit, function(x) makemetrics(graph_from_data_frame(x)))
map_df(ans, ~.x, .id = "Time")

Output

   Time Degree Closeness Betweenness
1     1     17 0.3333333   0.1818182
2     1      8 0.2000000   0.0000000
3     1     11 0.2500000   2.0000000
4     1     20 0.3333333   0.8181818
5     2     17 0.2500000   0.0000000
6     2     19 0.3333333   0.0000000
7     2     17 0.3333333   0.6666667
8     2     19 0.3333333   0.3333333
9     3     26 0.3333333   0.8000000
10    3     17 0.3333333   0.0000000
11    3     15 0.3333333   0.2000000
12    3     14 0.2500000   0.0000000
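
Since you want to line the metrics up with ID.match, here is one possible sketch rather than a definitive approach. It keeps the vertex name as an ordinary column (so it survives the row binding) and joins each person's metrics back onto the original rows by sender and Time. The helper make_metrics_named and the choice to join on From (instead of To) are my assumptions; adjust to whichever end of the edge you want to model.

```r
library(dplyr)
library(igraph)

set.seed(123)
DF <- data.frame(
  From = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  To = sample(c("Dan", "Sharon", "Bob", "Andrew"), 100, replace = TRUE),
  Time = sample(c(1, 2, 3), 100, replace = TRUE),
  ID.match = 1:100
)

# Hypothetical variant of makemetrics that stores the vertex name in a column
make_metrics_named <- function(gr) {
  data.frame(
    Name = V(gr)$name,
    Degree = degree(gr),
    Closeness = closeness(gr),
    Betweenness = betweenness(gr),
    row.names = NULL
  )
}

# One metrics table per Time value, stacked with Time as an id column
metrics_by_time <- bind_rows(
  lapply(split(DF, DF$Time), function(x) make_metrics_named(graph_from_data_frame(x))),
  .id = "Time"
) %>%
  mutate(Time = as.numeric(Time))  # .id is character; DF$Time is numeric

# Attach the sender's metrics for that Time to every original row
DF_with_metrics <- left_join(
  DF, metrics_by_time,
  by = c("From" = "Name", "Time" = "Time")
)
head(DF_with_metrics)
```

Each original row, including its ID.match, now carries the Degree, Closeness and Betweenness of its From vertex in the graph for that row's Time.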