Data frame: Populate matrix with corresponding values of matching cells

Question

I am trying to populate an adjacency matrix with "1"s if the values of cells in a data frame column match. The location of the "1"s is based on corresponding values in the same row.

To be more precise: Data frame pat1

ID PATID SUB
 1     2  2A
 2     2  2B 
 3     3  2C
 4     3  2D

I'd like to populate the matrix cells [2A,2B], [2B,2A], [2C,2D], and [2D,2C] in an empty matrix patmat1 with the corresponding row/col. names with a "1" since PATID[1]=PATID[2], and PATID[3]=PATID[4], respectively.

The desired output would be matrix(data = c(0,1,0,0,1,0,0,0,0,0,0,1,0,0,1,0), nrow=4, byrow=T) with colnames <- rownames <- c("2A", "2B", "2C", "2D") In this example, the patmat1 dimension would be 4 4 (2A,2B,2C,2D).

I've searched but found no approach yet.

It is not clear. Can you show the expected output? What is the dimension of patmat1? — akrun
@akrun The desired output would be matrix(data = c(0,1,1,0), nrow=2, byrow=T) with colnames <- rownames <- c("2A", "2B") In this example, the patmat1 dimension would be 2 2 (2A,2B). Thanks! — user5835099
You could try acast(pat1, ID~ID, value.var="SUB", length) from library(reshape2) — akrun
Thanks. I have tried this now, but it only changes the values of the diagonal. It should populate the [2A,2B] and [2B,2A] cells with a '1' and shouldn't changes the diagonal. — user5835099

akrun akrun · Accepted Answer · 2016-01-25T03:44:56

May be this helps

library(reshape2)
 if(length(unique(pat1$PATID))==1) pat1$SUB2 <- rev(pat1$SUB)
acast(pat1, SUB~SUB2, value.var='PATID', length)
#   2A 2B
#2A  0  1
#2B  1  0

Update

For the new dataset, we can split by 'PATID', do the acast on individual list elements and use bdiag to collapse them.

library(Matrix)
patmat1[] <- as.matrix(bdiag(lapply(lst, function(x) 
        acast(transform(x, SUB2=rev(SUB)), SUB~SUB2, 
        value.var='PATID', length))))
patmat1
#   2A 2B 2C 2D
#2A  0  1  0  0
#2B  1  0  0  0
#2C  0  0  0  1
#2D  0  0  1  0

Or we can do this by using data.table. We convert the 'data.frame' to 'data.table' (setDT(pat1)), grouped by 'PATID', create the 'SUB2' as the rev of 'SUB', then use acast from reshape2 to change from 'long' to 'wide' format, and specify the fun.aggregate as length.

library(data.table)
acast(setDT(pat1)[, SUB2:= rev(SUB), PATID], SUB~SUB2, 
                    value.var='PATID', length)
#   2A 2B 2C 2D
#2A  0  1  0  0
#2B  1  0  0  0
#2C  0  0  0  1
#2D  0  0  1  0

Data frame: Populate matrix with corresponding values of matching cells

1 Answers

Update