I have an edge list with similarity scores as a data frame in R:

example <- data.frame(Source = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4),
                  Target = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4),
                  Similarity = c(1,0,.2,0.1,.004,.1,0,0,1,2,0,.14,.006,0,1,.036))

The Source and Target columns represent IDs, so they should be treated as factors rather than as numeric values. Ignore the Similarity values; I just put in random numbers for illustration.

Now I want to convert this edge list into a matrix where the row names are the Source IDs, the column names are the Target IDs, and each cell holds the Similarity for that Source/Target pair. I will then feed the matrix into the Rtsne package for graphing.

I try to do this like so:

library(Matrix)  # provides sparseMatrix()

m1 <- as.matrix(sparseMatrix(i = example$Source,
                             j = example$Target,
                             x = example$Similarity))

That works fine, except that the rows and columns are not labeled.

 m1
     [,1] [,2] [,3]  [,4]
[1,] 1.000  0.0  0.2 0.100
[2,] 0.004  0.1  0.0 0.000
[3,] 1.000  2.0  0.0 0.140
[4,] 0.006  0.0  1.0 0.036

How should I modify the as.matrix code to keep the row/column labels? I will use them later on in the process.

3 Answers

1
votes

You can retain the dimnames directly if you use xtabs:

xtabs(Similarity ~ Source + Target, example)
#       Target
# Source     1     2     3     4
#      1 1.000 0.000 0.200 0.100
#      2 0.004 0.100 0.000 0.000
#      3 1.000 2.000 0.000 0.140
#      4 0.006 0.000 1.000 0.036
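
Note that xtabs returns an object of class "xtabs"/"table" rather than a bare matrix. If a plain numeric matrix is preferred downstream, the following is a minimal sketch of one way to strip the table class while keeping the labels. The names xt and m3 are made up for illustration, and whether Rtsne actually needs this step is an assumption, not something confirmed here.

```r
example <- data.frame(
  Source = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4),
  Target = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4),
  Similarity = c(1,0,.2,0.1,.004,.1,0,0,1,2,0,.14,.006,0,1,.036)
)

# Cross-tabulate; the result carries the "xtabs" and "table" classes
xt <- xtabs(Similarity ~ Source + Target, example)

# Rebuild as a plain matrix, reusing the dimnames from the table
m3 <- matrix(xt, nrow = nrow(xt), dimnames = dimnames(xt))

m3
```
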
1
votes

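You can set the dimnames for m1. Note that sparseMatrix places each value at the row/column position equal to its ID, so these labels line up only when unique() returns the IDs as 1, 2, ..., n in that order, as it does here: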

dimnames(m1) <- list(Source = unique(example$Source), 
                     Target = unique(example$Target))
m1
#>       Target
#> Source     1   2   3     4
#>      1 1.000 0.0 0.2 0.100
#>      2 0.004 0.1 0.0 0.000
#>      3 1.000 2.0 0.0 0.140
#>      4 0.006 0.0 1.0 0.036
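
If the IDs are not the consecutive integers 1 to n (character labels, for example), one way to keep the labels aligned is to convert each column to a factor, index by the integer codes, and pass the factor levels to the dimnames argument of sparseMatrix. This is a minimal sketch that assumes the Matrix package is installed; the names src, tgt, and m2 are made up for illustration.

```r
library(Matrix)

example <- data.frame(
  Source = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4),
  Target = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4),
  Similarity = c(1,0,.2,0.1,.004,.1,0,0,1,2,0,.14,.006,0,1,.036)
)

# Treat the IDs as factors so each ID maps to a stable integer code
src <- factor(example$Source)
tgt <- factor(example$Target)

# Index by the factor codes and label rows/columns with the factor levels
m2 <- as.matrix(sparseMatrix(
  i = as.integer(src),
  j = as.integer(tgt),
  x = example$Similarity,
  dims = c(nlevels(src), nlevels(tgt)),
  dimnames = list(levels(src), levels(tgt))
))

m2
```

Because the labels come from the same factors that produce the indices, they stay matched to the right rows and columns regardless of the order in which IDs appear in the data frame.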
0
votes

Another option is acast from reshape2:

library(reshape2)
acast(example, Source ~ Target, value.var = 'Similarity')
#      1   2   3     4
#1 1.000 0.0 0.2 0.100
#2 0.004 0.1 0.0 0.000
#3 1.000 2.0 0.0 0.140
#4 0.006 0.0 1.0 0.036

Or use tapply from base R:

tapply(example$Similarity, example[1:2], FUN = I)