
Suppose I have the following data frames (df1 and df2):

df1

ID  y
4   12
2   65
3   7
5   878
1   1
7   122

df2

ID  z
2   90
5   16
1   22

Every ID in df2 also appears in df1. That is, df2 is a subset of df1 in terms of the ID column.

I want to create a new data frame (df3) that looks like this:

ID  y
4   12
2   90
3   7
5   16
1   22
7   122

That is, for the IDs common to both, the y values in df1 are replaced with the z values from df2.

How can I do that using R? I would be very glad for any help. Thanks a lot.

1
Here's a fun one: within(merge(df1, df2, all = TRUE), { y[!is.na(z)] <- na.omit(z); rm(z) }), but the row order will be different - Rich Scriven

1 Answer


Using data.table, we can join the two data.tables and update y by reference:

library(data.table)   ## version 1.9.6

## With your original data.frame objects you would instead use
# dt1 <- as.data.table(df1)
# dt2 <- as.data.table(df2)
## (the join column is then named ID, so use "ID" wherever "id" appears below)

dt1 <- data.table(id = c(4,2,3,5,1,7),
                  y = c(12, 65, 7, 878, 1, 122))

dt2 <- data.table(id = c(2,5,1),
                  z = c(90, 16, 22))


## Update join: for each id in dt2, overwrite y in dt1 with z (modifies dt1 in place)
dt1[dt2, on = "id", y := z]
dt1
#    id   y
# 1:  4  12
# 2:  2  90
# 3:  3   7
# 4:  5  16
# 5:  1  22
# 6:  7 122
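
One thing to keep in mind: := changes dt1 in place. If you want a separate df3 and need dt1 left as it was, a minimal sketch is to take a copy() first. This assumes data.table 1.9.6 or later for on=, and the names dt3 and df3 are only illustrative:

```r
library(data.table)

dt1 <- data.table(id = c(4, 2, 3, 5, 1, 7),
                  y  = c(12, 65, 7, 878, 1, 122))
dt2 <- data.table(id = c(2, 5, 1),
                  z  = c(90, 16, 22))

## copy() makes a real copy; a plain dt3 <- dt1 would still refer to the same data
dt3 <- copy(dt1)

## Update join on the copy only, so dt1 keeps its original y values
dt3[dt2, on = "id", y := z]

## Convert back to a plain data.frame if that is what you need
df3 <- as.data.frame(dt3)
```

The row order of dt3 is the same as in dt1, because an update join does not reorder the table.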

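You can also specify the join column by setting it as the key on both tables, which works in versions of data.table older than 1.9.6 (where the on= argument is not available). Note that setkey() sorts each table by id, so the rows of dt1 end up ordered by id rather than in their original order.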

setkey(dt1, id)
setkey(dt2, id)

## Keyed join: no on= needed because both tables are keyed by id
dt1[dt2, y := z]
dt1
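
If you would rather stay in base R, here is a minimal sketch using match(). It is not part of the data.table approach above, and it assumes what the question states: every ID in df2 appears in df1, and IDs are not duplicated within df1. The name idx is just a helper for illustration:

```r
df1 <- data.frame(ID = c(4, 2, 3, 5, 1, 7),
                  y  = c(12, 65, 7, 878, 1, 122))
df2 <- data.frame(ID = c(2, 5, 1),
                  z  = c(90, 16, 22))

## Work on a copy so df1 stays unchanged
df3 <- df1

## For each ID in df2, find the row of df3 holding that ID
idx <- match(df2$ID, df3$ID)

## Overwrite y in those rows with the matching z values
df3$y[idx] <- df2$z
df3
```

This keeps the original row order of df1. If some IDs in df2 might be missing from df1, idx would contain NA and the assignment would fail, so you would need to drop those rows first.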