Could you please tell me how I could transform the data frame like this:
tg qr loc a1 a2 a3 b1 b2 b3 c1 c2 c3
1 A 1 89 NA NA NA 1 2 3 1 2 3
2 A 1 61 1 2 3 NA NA NA 1 2 3
3 A 2 38 4 5 6 NA NA NA NA NA NA
4 B 1 40 4 5 6 NA NA NA NA NA NA
5 B 1 3 NA NA NA NA NA NA 4 5 6
into this:
tg qr loc a1 a2 a3 b1 b2 b3 c1 c2 c3
1 A 1 15 1 2 3 1 2 3 1 2 3
2 A 2 95 4 5 6 NA NA NA NA NA NA
3 B 1 42 4 5 6 NA NA NA 4 5 6
The function should:
- merge all rows with same values in columns 'tg' AND 'qr' into one row
- while merging, replace all "NAs" with existing values - never in the opposite direction
- Often there will be cases, when a variable is present in both rows being merged, but its value will be always equal (then it doesn't matter from which row it would be taken)
- 'loc' column values differ, but are not relevant, the column could be even deleted
Code for these sample dataframes is:
df = rbind(c("A","1",floor(runif(1,1,100)),c(NA,NA,NA),c(1,2,3),c(1,2,3)),
c("A","1",floor(runif(1,1,100)),c(1,2,3),c(NA,NA,NA),c(1,2,3)),
c("A","2",floor(runif(1,1,100)),c(4,5,6),c(NA,NA,NA),c(NA,NA,NA)),
c("B","1",floor(runif(1,1,100)),c(4,5,6),c(NA,NA,NA),c(NA,NA,NA)),
c("B","1",floor(runif(1,1,100)),c(NA,NA,NA),c(NA,NA,NA),c(4,5,6)))
df = as.data.frame(df)
colnames(df) = c("target","query","loc",c("a1","a2","a3"),c("b1","b2","b3"),c("c1","c2","c3"))
df2 = rbind(c("A","1",floor(runif(1,1,100)),c(1,2,3),c(1,2,3),c(1,2,3)),
c("A","2",floor(runif(1,1,100)),c(4,5,6),c(NA,NA,NA),c(NA,NA,NA)),
c("B","1",floor(runif(1,1,100)),c(4,5,6),c(NA,NA,NA),c(4,5,6)))
df2 = as.data.frame(df2)
colnames(df2) = c("target","query","loc",c("a1","a2","a3"),c("b1","b2","b3"),c("c1","c2","c3"))
Thank you for your support.
merge(df,df2,by=c("target","query"))? - Metricsdf2. I posted it as the sample output I'd like to obtain. - mjktfw