I have a dataframe in R containing several binary variables:

df1 <- cbind(w = c(1,0,0,0,0), x = c(1,1,1,1,0), y = c(0,0,0,1,1), z = c(0,0,0,0,1))

> df1

#     w x y z
#[1,] 1 1 0 0
#[2,] 0 1 0 0
#[3,] 0 1 0 0
#[4,] 0 1 1 0
#[5,] 0 0 1 1

There's another dataframe (df2) in which every row specifies a pair of variables from df1:

df2 <- cbind(var1 = c("w","y","y"), var2 = c("z","x","w"))
> df2

#    var1 var2
#[1,] "w"  "z" 
#[2,] "y"  "x" 
#[3,] "y"  "w" 

I want to take all the combinations found in the rows of df2 ("w_z", "y_x", and "y_w") and add them as columns to df1. The values in each new "combination" column should indicate whether that row contains a 1 in either of the two variables specified for that column. For example, the column w_z should indicate whether the row has a 1 in either "w" or "z". The resulting dataframe should look something like this:

> df1_New

#     w x y z w_z y_x y_w
#[1,] 1 1 0 0   1   1   1
#[2,] 0 1 0 0   0   1   0
#[3,] 0 1 0 0   0   1   0
#[4,] 0 1 1 0   0   1   1
#[5,] 0 0 1 1   1   1   1

I would appreciate any help very much!

1 Answer

Try this:

cbind(df1, `colnames<-`(+apply(df2, 1, function(k) rowSums(df1[, k]) > 0), do.call(paste, c(as.data.frame(df2), sep = "_"))))

which gives

     w x y z w_z y_x y_w
[1,] 1 1 0 0   1   1   1
[2,] 0 1 0 0   0   1   0
[3,] 0 1 0 0   0   1   0
[4,] 0 1 1 0   0   1   1
[5,] 0 0 1 1   1   1   1
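
If the one-liner is hard to read, the same steps can be spelled out one at a time. This is only a sketch, assuming df1 and df2 are built exactly as in the question; the intermediate name `either` is just illustrative.

```r
# Rebuild the inputs from the question. Both are matrices, since cbind() is used.
df1 <- cbind(w = c(1,0,0,0,0), x = c(1,1,1,1,0), y = c(0,0,0,1,1), z = c(0,0,0,0,1))
df2 <- cbind(var1 = c("w","y","y"), var2 = c("z","x","w"))

# For each row of df2, pick the two named columns of df1 and flag
# the rows where at least one of them is 1. The result has one column per pair.
either <- apply(df2, 1, function(k) rowSums(df1[, k]) > 0)

# Unary plus turns the logical matrix into 0/1 values.
either <- +either

# Build column names such as "w_z" from the two columns of df2.
colnames(either) <- paste(df2[, "var1"], df2[, "var2"], sep = "_")

# Attach the new columns to the original matrix.
df1_New <- cbind(df1, either)
df1_New
```

The `rowSums(...) > 0` test works here because the variables are 0/1; for other encodings the condition would likely need adjusting.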