replace some column values from a data.frame based on another data.frame

Question

I have two data.frames (df1, df2), and I would like to replace the letters in columns P1-P10 of df2 with the corresponding values of df1$V2, while keeping the first two columns of df2.

df1 = data.frame(V1=LETTERS, V2=rnorm(26))

df2 <- data.frame(Name=sample(LETTERS, 6), bd=sample(1:6), P1=sample(LETTERS,6), P2=sample(LETTERS, 6), P3=sample(LETTERS, 6), P4=sample(LETTERS, 6), P5=sample(LETTERS, 6), P6=sample(LETTERS, 6), P7=sample(LETTERS, 6), P8=sample(LETTERS, 6), P9=sample(LETTERS, 6), P10=sample(LETTERS, 6))

My approach is the following:

df3 <- matrix(setNames(df1[,2], df1[,1])[as.character(unlist(df2[,3:12]))], nrow=6, ncol=10)
df4 <- data.frame(cbind(df2[,1:2], df3))

This gives me my desired output, but my real data has 10,000 columns. Is there any way to avoid the cbind step or make the process faster?

> df4
Name bd         X1          X2         X3         X4         X5         X6        X7         X8         X9        X10
1    V  6 -1.8991102  0.40269050 -0.1517500 -2.5297829  1.5315622  1.4897071  1.364071 -1.2443708 -1.3197276 -0.4917057
2    T  1 -2.5297829 -0.44614123 -0.1894970 -0.6693774 -0.1517500 -1.0650962 -0.151750 -0.4461412 -0.6693774 -1.1351770
3    R  5 -0.6693774  0.09059365 -2.5297829  0.3233827 -0.9383348 -0.4461412  1.281797  1.5315622  1.4897071 -0.4461412
4    B  4 -0.4461412 -0.93833476 -1.2443708 -0.4461412 -0.1894970 -0.9383348 -1.135177 -1.8991102 -0.1894970  0.4026905
5    K  2 -1.0180271 -1.06509624 -0.1939600 -0.1894970  1.4897071 -0.6693774 -1.899110 -1.3197276  1.5315622 -0.1517500
6    Y  3  1.5315622 -0.19396005 -0.4917057 -0.4664239 -1.8991102  0.4026905 -1.065096 -0.9383348 -1.2443708 -0.4664239

Thanks

In your example P1-P10 are factors. Is it really so in your dataset? — ECII

Sven Hohenstein · Accepted Answer · 2013-12-11T19:48:52

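You can use match to look up the values of df2[3:12] in df1[[1]]. For each letter, match returns the row number of its first occurrence in df1, and these row numbers are then used to extract the corresponding values from the second column of df1.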

df2[3:12] <- df1[match(as.character(unlist(df2[3:12])), 
                       as.character(df1[[1]])), 2]

The result (df2):

  Name bd         P1         P2         P3         P4         P5         P6         P7         P8         P9        P10
1    H  5  0.1199355  0.3752010 -0.3926061 -1.1039548 -0.1107821  0.9867373 -0.3360094 -0.7488000 -0.3926061  2.0667704
2    U  4  0.1168599  0.1168599  0.9867373  1.3521418  0.9867373 -0.3360094 -0.7724007 -0.3926061 -0.3360094 -1.2543480
3    R  3 -1.2337890 -0.1107821 -0.7724007  2.0667704  0.3752010  0.4645504  0.9867373  0.1168599 -0.0981773 -0.3926061
4    G  2 -0.3926061  0.3199261 -0.0981773 -0.1107821  2.0667704 -1.1039548 -1.2337890  0.3199261 -1.2337890 -2.1534678
5    C  6 -2.1534678 -1.1039548 -1.1039548 -0.7488000  0.4645504  0.3199261 -2.1534678 -0.3360094  0.9867373  0.8771467
6    I  1  0.6171634  0.6224091  1.8011711  0.7292998  0.8771467  2.0667704  0.3752010  0.4645504 -2.1534678 -0.7724007
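
To make the lookup easier to follow, here is a minimal sketch on a tiny deterministic input. The names lookup and dat are hypothetical and not part of the question. The key columns are deliberately created as factors, which is how data.frame() stored strings by default before R 4.0, to show why the as.character() calls are there.

```r
# Hypothetical toy data: a lookup table and a frame whose P columns hold keys.
lookup <- data.frame(V1 = c("A", "B", "C"), V2 = c(10, 20, 30))
dat <- data.frame(Name = c("x", "y"),
                  P1 = c("C", "A"),
                  P2 = c("B", "B"),
                  stringsAsFactors = TRUE)  # factors, as in older R defaults

# unlist() flattens the key columns column by column;
# as.character() makes the comparison work on the labels, not factor codes.
keys <- as.character(unlist(dat[2:3]))

# match() gives, for each key, its row position in lookup$V1.
idx <- match(keys, as.character(lookup$V1))

# Assigning one long vector to several columns fills them column by column,
# so every value lands in the cell its key came from.
dat[2:3] <- lookup$V2[idx]
print(dat)
```

A key that does not occur in the lookup table makes match() return NA, so the corresponding cell becomes NA rather than raising an error.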

If you don't want to replace the values inside df2, you can create a new data frame df4 with

df4 <- "[<-"(df2, 3:12, value = df1[match(as.character(unlist(df2[3:12])), 
                                          as.character(df1[[1]])), 2])
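
With thousands of columns, hard-coding 3:12 gets awkward. The following is only a sketch of the same idea with the columns picked by name. The seed, the reduced number of P columns, and the assumption that all key columns are named P followed by digits are mine, not the asker's.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

df1 <- data.frame(V1 = LETTERS, V2 = rnorm(26))
df2 <- data.frame(Name = sample(LETTERS, 6), bd = sample(1:6),
                  P1 = sample(LETTERS, 6), P2 = sample(LETTERS, 6))

# Assumption: the key columns are exactly those named P1, P2, ...
cols <- grep("^P[0-9]+$", names(df2))

# Look up every key once, in a single vectorised match() call.
vals <- df1$V2[match(as.character(unlist(df2[cols])), as.character(df1$V1))]

# Calling `[<-` as an ordinary function returns a modified copy,
# so df2 itself keeps its letters.
df4 <- `[<-`(df2, cols, value = vals)

is.numeric(df2$P1)  # FALSE: df2 is unchanged
is.numeric(df4$P1)  # TRUE: df4 holds the looked-up values
```

Because the replacement is a single match() over all key cells, no cbind step is needed and the column names of df2 are preserved in df4.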
