2
votes

Thanks for any help you can provide.

I am using the networkD3 package in R to draw a forceNetwork plot from a nodelist and a list of links (edgelist).

I have an edgelist / link list:

> edgelist

      round_prob NODEAid NODEBid
33979     0.6245    6990    6588
4899      0.9797    1042    1041
37109     0.6046    7498    7531
27771     0.7144    5906   16029
3603      0.6452     783     804
28491     0.6078    6034    5862
4518      0.6245     962    9874
19613     0.6745    4121   10285
19916     0.8721    4179    4180
8249      0.6821    1737    1733
35389     0.7150    7145   16992
32010     0.6495    6728   16921
22553     0.6959    4722    4549
14996     0.6031    3273   12929
35927     0.6245    7221    9814
15349     0.6245    3337    3233
34833     0.6109    7085    6852
39044     0.6117    7936    7977
39075     0.6844    7944   10978
11691     0.6821    2572    2587

This is a sample of a much larger edgelist, where I have selected only those links with link probability >0.6 and <1. The full edgelist was zero-indexed before the sample was taken.

I also have a nodelist that is 18000 rows long. Here is a sample of it:

> head(nodes)

  node id gr
0 1097  0  1
1 1149  1  1
2 1150  2  1
3 3395  3  1
4 3396  4  1
5 3523  5  1

I try to plot using forceNetwork:

forceNetwork(Links = edgelist, Nodes = nodes, Source = "NODEAid",
             Target = "NODEBid", Value = "round_prob", NodeID = "node",
             Group = "gr", opacity = 0.9)

This gives this plot, before zooming in:

[image: the forceNetwork plot before zooming in]

Problem: my edgelist has only 20 pairs of nodes, yet the plot shows thousands more nodes (I cannot get the exact count).

By hovering over the unconnected points, I have been able to identify that they are made up of all possible nodes that feature in the nodelist.

Basically I think that forceNetwork is plotting every possible node, even those not in the edgelist.

Why is this happening and how can I stop it from doing so?


As per the question "Going crazy with forceNetwork in R: no edges displayed", I made sure that all my data was in numeric format and zero-indexed. I still get this problem.

Note: If I run the forceNetwork example from the question "How to plot a directed Graph in R with networkD3?" and from the tutorial at https://christophergandrud.github.io/networkD3/, the output is as expected.

2 Answers

2
votes

I would think you should subset the node list such that it only includes nodes that are in the edge list.
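A minimal sketch of that subsetting, assuming the `id` column of the node list holds the zero-based index that `NODEAid` and `NODEBid` refer to. The small data frames below are made-up stand-ins for the question's data, and `used_ids`, `nodes_sub` and `links_sub` are names chosen here for illustration. Note that after dropping rows the links have to be re-indexed, because forceNetwork reads Source and Target as zero-based row positions in the Nodes data frame:

```r
# Made-up stand-ins for the question's node list and edge list
nodes <- data.frame(node = c(1097, 1149, 1150, 3395, 3396, 3523),
                    id   = 0:5,
                    gr   = 1)
edgelist <- data.frame(round_prob = c(0.62, 0.98),
                       NODEAid    = c(0, 3),
                       NODEBid    = c(2, 5))

# Keep only the nodes that appear at either end of some edge
used_ids  <- sort(unique(c(edgelist$NODEAid, edgelist$NODEBid)))
nodes_sub <- nodes[nodes$id %in% used_ids, ]

# Re-index the edges to zero-based row positions in the reduced node list
links_sub <- data.frame(source = match(edgelist$NODEAid, nodes_sub$id) - 1,
                        target = match(edgelist$NODEBid, nodes_sub$id) - 1,
                        value  = edgelist$round_prob)

# Plot only if networkD3 is installed
if (requireNamespace("networkD3", quietly = TRUE)) {
  networkD3::forceNetwork(Links = links_sub, Nodes = nodes_sub,
                          Source = "source", Target = "target",
                          Value = "value", NodeID = "node",
                          Group = "gr", opacity = 0.9)
}
```

If the ids in your edge list do not correspond to the `id` column, the `match()` step is where you would substitute the right key.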

2
votes

I would suggest either using simpleNetwork, which automatically creates the node list from the edge list you pass, or using code similar to what simpleNetwork does to create your node list first and then passing that to forceNetwork:

edgelist <- read.table(header = TRUE, text = "
round_prob NODEAid NODEBid
33979     0.6245    6990    6588
4899      0.9797    1042    1041
37109     0.6046    7498    7531
27771     0.7144    5906   16029
3603      0.6452     783     804
28491     0.6078    6034    5862
4518      0.6245     962    9874
19613     0.6745    4121   10285
19916     0.8721    4179    4180
8249      0.6821    1737    1733
35389     0.7150    7145   16992
32010     0.6495    6728   16921
22553     0.6959    4722    4549
14996     0.6031    3273   12929
35927     0.6245    7221    9814
15349     0.6245    3337    3233
34833     0.6109    7085    6852
39044     0.6117    7936    7977
39075     0.6844    7944   10978
11691     0.6821    2572    2587
")

library(networkD3)

simpleNetwork(edgelist, Source = 'NODEAid', Target = 'NODEBid')

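# Collect the endpoints of every edge
sources <- edgelist$NODEAid
targets <- edgelist$NODEBid
# Build the node list from only the ids that appear in the edge list
node_names <- factor(sort(unique(c(as.character(sources), 
                                   as.character(targets)))))
nodes <- data.frame(name = node_names, group = 1, size = 8)
# Re-index each edge to the zero-based row position of its node
links <- data.frame(source = match(sources, node_names) - 1, 
                    target = match(targets, node_names) - 1, 
                    value = edgelist$round_prob)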

forceNetwork(Links = links, Nodes = nodes, Source = "source",
             Target = "target", Value = "value", NodeID = "name",
             Group = "group", opacity = 0.9)
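To confirm that a node list built this way contains no stray nodes, a quick check along these lines may help. It is only a sketch: it rebuilds `nodes` and `links` from two rows of the edge list above so that it runs on its own, and `idx`, `in_range` and `unreferenced` are names introduced here for illustration.

```r
# Two rows taken from the edge list above, so the check is self-contained
edgelist <- data.frame(round_prob = c(0.6245, 0.9797),
                       NODEAid    = c(6990, 1042),
                       NODEBid    = c(6588, 1041))

# Same construction as in the answer
sources <- edgelist$NODEAid
targets <- edgelist$NODEBid
node_names <- factor(sort(unique(c(as.character(sources),
                                   as.character(targets)))))
nodes <- data.frame(name = node_names, group = 1, size = 8)
links <- data.frame(source = match(sources, node_names) - 1,
                    target = match(targets, node_names) - 1,
                    value  = edgelist$round_prob)

# Every link index must be a valid zero-based row position in nodes
idx      <- c(links$source, links$target)
in_range <- all(idx >= 0 & idx <= nrow(nodes) - 1)

# Zero-based positions of nodes that no link refers to;
# these are the ones that would be drawn as isolated points
unreferenced <- setdiff(seq_len(nrow(nodes)) - 1, idx)

in_range
length(unreferenced)
```

Running the same two checks against the original 18000-row node list and the 20-row edge list should show a large `unreferenced` set, which corresponds to the unconnected points in the plot.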