1
votes

I am having trouble plotting a bipartite network in R with the igraph package. Here is my script:

library(igraph)
library(reshape2)
setwd("....")
getwd()
library(readxl)
network=read_excel("network1.xlsx")
print(network)
subjects=as.character(unlist(network[,1]))
agents=colnames(network[-1])
print(network)
network = network[,-1]
g=graph.incidence(network, weighted = T)
V(g)$type
V(g)$name=c(subjects,agents)
V(g)$color = V(g)$type
V(g)$color=gsub("FALSE","red",V(g)$color)
V(g)$color=gsub("TRUE","lightblue",V(g)$color)
plot(g, edge.arrow.width = 0.3,
     vertex.size = 5, 
     edge.arrow.size = 0.5,
     vertex.size2 = 5,
     vertex.label.cex = 1,
     vertex.label.color="black",
     asp = 0.35, 
     margin = 0,
     edge.color="grey",
     edge.width=(E(g)$weight),
     layout=layout_as_bipartite)

The network is plotted properly, as you can see.

However, I have two issues:

(1) I don't understand the order in which the vertices are shown in the plot. They are not in the same order as in the Excel file, nor in alphabetical or numerical order; they seem to be in random order. How can I choose the order in which the vertices are placed?

(2) I don't understand why some vertices are closer together and others are farther apart. I would like all vertices to be equally spaced. How can I do that?

Thanks a lot for your invaluable help.

2
We do not have your file network1.xlsx, so we cannot run your example. To help us help you, please run your code to create the variable network (up to and including the line network = network[,-1]). Then run dput(network) and paste the result into your question so that we can work with your data. - G5W
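For readers unfamiliar with dput(): it prints R code that recreates an object, which is why pasting its output makes an example reproducible. A minimal sketch with a made-up data frame (df, txt and df2 are illustrative names, not the asker's data):

```r
# Made-up stand-in for the asker's table
df <- data.frame(id = c(2333, 2439), A = c(12, 2), B = c(0, 1))

# dput() writes R code for the object; capture it as text here
txt <- paste(capture.output(dput(df)), collapse = "\n")

# Anyone can rebuild the same object by evaluating that text
df2 <- eval(parse(text = txt))
```

Pasting the printed structure(...) call into a question has the same effect as sharing the file.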

2 Answers

1
votes

Since you do not provide your data, I will illustrate with a made-up example.

Sample graph data

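library(igraph)
set.seed(123)
# Random edge list: 18 edges between the numbers 1-8 and the letters A-F
EL = matrix(c(sample(8,18, replace=TRUE),
    sample(LETTERS[1:6], 18, replace=TRUE)), ncol=2)
# Build the graph and drop duplicate edges
g = simplify(graph_from_edgelist(EL))
# Mark the two vertex sets (FALSE/TRUE) so the graph can be laid out as bipartite
V(g)$type = bipartite_mapping(g)$type 
# Semi-transparent red for type-FALSE vertices, blue for type-TRUE vertices
VCol = c("#FF000066", "#0000FF66")[as.numeric(V(g)$type)+1]
plot(g, layout=layout_as_bipartite(g), vertex.color=VCol)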

Raw graph

As with your graph, this has two problems: the nodes are in no obvious order, and the lower row is oddly spaced. (layout_as_bipartite positions the vertices within each row to reduce edge crossings, not by name.) Let's address those problems one at a time. To do so, we need to take control of the layout instead of using any of the automated layout functions. A layout is simply a vcount(g) x 2 matrix giving the x-y coordinates of the vertices for plotting. Here, I will put one type of node in the top row by setting its y coordinate to 1 and the other type in the lower row by setting y = 0. We want to order the vertices horizontally by rank (alphabetically) within each group, so:

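# One row per vertex; column 1 = x, column 2 = y (all start at 0)
LO = matrix(0, nrow=vcount(g), ncol=2)
# Type-FALSE vertices get y = 1; type-TRUE vertices stay at y = 0
LO[!V(g)$type, 2] = 1
# x position = alphabetical rank of the name within its own row
LO[V(g)$type, 1]  = rank(V(g)$name[V(g)$type]) 
LO[!V(g)$type, 1] = rank(V(g)$name[!V(g)$type])
plot(g, layout=LO, vertex.color=VCol)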
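If alphabetical order is not what you want (for example, you want the order from the Excel file), one option is to replace rank() with match() against a vector listing the desired order. This is only a sketch with made-up names and types standing in for V(g)$name and V(g)$type; order_false, order_true and LO2 are hypothetical names:

```r
# Stand-ins for V(g)$name and V(g)$type (made-up values)
vnames <- c("2", "B", "1", "A", "3", "C")
vtype  <- c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)

# Desired left-to-right order within each row (assumed, for illustration)
order_false <- c("3", "1", "2")
order_true  <- c("C", "A", "B")

LO2 <- matrix(0, nrow = length(vnames), ncol = 2)
LO2[!vtype, 2] <- 1                                  # type-FALSE row at y = 1
# x position = index of each name in the desired order for its row
LO2[!vtype, 1] <- match(vnames[!vtype], order_false)
LO2[vtype, 1]  <- match(vnames[vtype], order_true)
```

With a real graph, LO2 would be built from V(g)$name and V(g)$type and passed as plot(g, layout=LO2). Names missing from an order vector give NA from match(), so each order vector should list every vertex in its row.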

Ordered graph, but unbalanced

Now both rows are ordered and evenly spaced, but because there are fewer vertices in the bottom row, the plot looks unattractive and unbalanced. We can fix that by stretching the bottom row. I find it easier to work out the right scale factor if the coordinates run from 0 to (number of nodes) - 1 rather than from 1 to (number of nodes) as above. Doing this, we get

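# Shift ranks to start at 0: the type-TRUE row spans 0 .. (number of TRUE vertices) - 1
LO[V(g)$type, 1]  = rank(V(g)$name[V(g)$type]) - 1
# Scale the type-FALSE row so that it spans the same width as the type-TRUE row
LO[!V(g)$type, 1] = (rank(V(g)$name[!V(g)$type]) - 1) * 
    (sum(V(g)$type) - 1)  /  (sum(!V(g)$type) - 1)
plot(g, layout=LO, vertex.color=VCol)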

Balanced, ordered graph
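The same idea can be wrapped in a small helper that rescales both rows to run from 0 to 1, so both rows always span the same width. This is only a sketch: spread_rows is a hypothetical function, not part of igraph, and it takes plain vectors of names and types. It also guards against a row with a single vertex, where the scale factor above would divide by zero.

```r
# Hypothetical helper: evenly spaced, name-ordered x positions in [0, 1]
# for each of the two vertex types; y = 1 for FALSE, y = 0 for TRUE.
spread_rows <- function(vnames, vtype) {
  lo <- matrix(0, nrow = length(vnames), ncol = 2)
  lo[!vtype, 2] <- 1
  for (grp in c(FALSE, TRUE)) {
    idx <- vtype == grp
    n <- sum(idx)
    if (n == 0) next                     # nothing to place in this row
    r <- rank(vnames[idx]) - 1           # 0, 1, ..., n - 1 in name order
    # A single vertex is centred; otherwise spread over the full width
    lo[idx, 1] <- if (n > 1) r / (n - 1) else 0.5
  }
  lo
}

# Made-up example: three type-TRUE and two type-FALSE vertices
demo <- spread_rows(c("A", "1", "B", "2", "C"),
                    c(TRUE, FALSE, TRUE, FALSE, TRUE))
```

With a graph g as above, the call would be plot(g, layout=spread_rows(V(g)$name, V(g)$type), vertex.color=VCol).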

0
votes

Thank you very much. I worked through your very helpful example, and the first step worked properly with my data: the edges keep their different widths and everything else looks as in my plot, but the vertices are now in the proper order. This is very important to me, thank you. However, I am having trouble understanding how to rescale the top and bottom rows properly with my data, because they always seem to be too close together. I probably did not completely understand the coordinates I have to work with. Here are my data.

> network=read_excel("network1.xlsx",2)
> dput(network)
structure(list(`NA` = c(2333, 2439, 2450, 2451, 2452, 2453, 2454, 
2455, 2456, 2457, 2458, 2459, 2460, 2461, 2480, 2490, 2491, 2492, 
2493, 2494, 2495), A = c(12, 2, 2, 5, 2, 0, 5, 3, 0, 0, 7, 0, 
0, 0, 6, 2, 10, 7, 1, 2, 5), B = c(0, 1, 0, 1, 0, 0, 2, 0, 0, 
0, 0, 0, 1, 0, 5, 0, 2, 0, 0, 0, 0), C = c(0, 0, 0, 0, 1, 0, 
4, 0, 0, 0, 0, 1, 0, 0, 2, 0, 4, 4, 2, 1, 0), D = c(2, 0, 0, 
0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 7, 0, 4, 0, 1, 4, 0), E = c(11, 
2, 3, 3, 3, 8, 3, 6, 4, 1, 1, 0, 12, 0, 5, 0, 4, 6, 4, 8, 9), 
    F = c(2, 0, 0, 3, 1, 0, 10, 1, 0, 0, 0, 1, 0, 0, 9, 0, 0, 
    1, 1, 3, 3), G = c(0, 3, 1, 1, 0, 0, 0, 0, 0, 3, 2, 0, 0, 
    0, 1, 0, 0, 2, 0, 1, 0), H = c(0, 0, 2, 0, 0, 0, 1, 0, 0, 
    0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 1), I = c(0, 0, 0, 0, 0, 
    0, 3, 0, 6, 3, 0, 0, 1, 0, 7, 0, 0, 4, 1, 2, 0), J = c(0, 
    0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 
    0)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-21L), .Names = c(NA, "A", "B", "C", "D", "E", "F", "G", "H", 
"I", "J"))
> print(network)
     NA  A B C D  E  F G H I J
1  2333 12 0 0 2 11  2 0 0 0 0
2  2439  2 1 0 0  2  0 3 0 0 0
3  2450  2 0 0 0  3  0 1 2 0 0
4  2451  5 1 0 0  3  3 1 0 0 0
5  2452  2 0 1 0  3  1 0 0 0 0
6  2453  0 0 0 0  8  0 0 0 0 1
7  2454  5 2 4 2  3 10 0 1 3 0
8  2455  3 0 0 0  6  1 0 0 0 0
9  2456  0 0 0 0  4  0 0 0 6 0
10 2457  0 0 0 0  1  0 3 0 3 0
11 2458  7 0 0 0  1  0 2 0 0 0
12 2459  0 0 1 0  0  1 0 0 0 0
13 2460  0 1 0 0 12  0 0 0 1 0
14 2461  0 0 0 0  0  0 0 0 0 0
15 2480  6 5 2 7  5  9 1 2 7 1
16 2490  2 0 0 0  0  0 0 0 0 0
17 2491 10 2 4 4  4  0 0 0 0 0
18 2492  7 0 4 0  6  1 2 0 4 0
19 2493  1 0 2 1  4  1 0 0 1 0
20 2494  2 0 1 4  8  3 1 0 2 0
21 2495  5 0 0 0  9  3 0 1 0 0