
I am trying to convert a data frame from an online forum into a social network, but I don't know how to transform the data into the adjacency matrix or edge list needed for network analysis.

My code is as follows:

library(igraph)  
graph.data.2002 <- as.matrix(data.2002[,2:3])  
g.2002 <- graph.data.frame(graph.data.2002, directed=FALSE)  
plot(g.2002, vertex.size = 1, vertex.label=NA)  

I am using R for the analysis. The current problem is that authors are linked to each other through the ThreadID, but when I build the network, the ThreadID is included as a node. Ideally I'd like an adjacency matrix / edge list that shows a 1 if two authors have posted on the same thread.

(First time posting, so let me know if there's anything that is missing/not proper)

Currently the data is as follows:

ThreadID    AuthorID
659289  193537
432269  136196
572531  170305
230003  32359
459059  47875
635953  181593
235116  51993
So you want it as two columns - say Author1, Author2 - with each pair listed? The example you provide isn't overly informative, as each of the authors and threads is unrelated. Can you clarify what exactly you want as the output? - thelatemail
Hello and welcome to StackOverflow. Please take some time to read the help page, especially the sections named "What topics can I ask about here?" and "What types of questions should I avoid asking?". And more importantly, please read the Stack Overflow question checklist. You might also want to learn about Minimal, Complete, and Verifiable Examples. - symbolrush
Hey, sorry about the poorly worded post, the answer below solved it though. I'll be sure to make my posts more informative in the future :) - Simon Ricketts

1 Answer


You could use a dplyr inner_join of the data with itself on ThreadID to get something like an edge list (just some mild reformatting needed).

If I'm understanding correctly, test1 should only have one connection, between authors 193537 and 32359, who were both on thread 659289.

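# Toy data: only thread 659289 has two authors
test1 <- data.frame(ThreadID = c(659289, 432269, 572531, 659289),
                    AuthorID = c(193537, 136196, 170305, 32359))
# Self-join on ThreadID pairs every author with every author on the same
# thread (including itself); [,-1] then drops the ThreadID column
test2 <- dplyr::inner_join(test1, test1, by = "ThreadID")[,-1]
test3 <- apply(test2, 2, as.character) # AuthorID as character will become vertex ID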
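
If you want the two-column Author1/Author2 layout asked about in the comments, the "mild reformatting" could be sketched roughly like this. It is only a sketch in base R: merge() stands in for inner_join, and the names joined, edges, Author1 and Author2 are illustrative, not part of the code above. It assumes self-pairs and mirrored duplicates are unwanted. The check that follows still works on test3 as it is.

```r
# Same toy data as test1 above
test1 <- data.frame(ThreadID = c(659289, 432269, 572531, 659289),
                    AuthorID = c(193537, 136196, 170305, 32359))

# Base R self-join on ThreadID (stand-in for dplyr::inner_join);
# the two author columns come back as AuthorID.x and AuthorID.y
joined <- merge(test1, test1, by = "ThreadID")

# Keeping only rows with AuthorID.x < AuthorID.y drops self-pairs and
# keeps each unordered pair once; unique() removes repeats across threads
edges <- unique(joined[joined$AuthorID.x < joined$AuthorID.y,
                       c("AuthorID.x", "AuthorID.y")])
names(edges) <- c("Author1", "Author2")  # hypothetical column names
edges
```

With the toy data this should leave a single row, the pair of authors who share thread 659289.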

Check that you get what you expected:

library(network)
test.network <- network(test3, directed = FALSE)
as.sociomatrix(test.network)
as.edgelist(test.network)
plot(test.network, label = test.network%v%"vertex.names")
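
If you would rather have the adjacency matrix directly, one possible alternative (not the approach above, just a sketch under the same toy data) is to cross-tabulate threads against authors and multiply that table by its own transpose. The names incidence, shared and adj are made up for illustration.

```r
# Same toy data as test1 above
test1 <- data.frame(ThreadID = c(659289, 432269, 572531, 659289),
                    AuthorID = c(193537, 136196, 170305, 32359))

# Thread-by-author table: number of posts by each author in each thread
incidence <- unclass(table(test1$ThreadID, test1$AuthorID))

# t(incidence) %*% incidence: a cell is nonzero exactly when the two
# authors have at least one thread in common
shared <- crossprod(incidence)

# Reduce to 0/1 and clear the diagonal (no self-ties)
adj <- (shared > 0) * 1
diag(adj) <- 0
adj
```

The result is a symmetric author-by-author matrix with AuthorID values as row and column names, so ThreadID never shows up as a node. It should be usable with, for example, igraph::graph_from_adjacency_matrix(adj, mode = "undirected") if you want to stay in igraph.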