I have a dataframe with missing data for some rows. The correct data can be found in another set of columns. I want to replace the NAs with the correct data.
My data looks like this:
df <- data.frame(M_1=c(1,NA,3,NA,6),
M_2=c(5,NA,3,NA,1),
M_3=c(6,NA,2,NA,4),
M_C_1=c(NA,2,NA,6,NA),
M_C_2=c(NA,1,NA,4,NA),
M_C_3=c(NA,7,NA,3,NA))
df
# M_1 M_2 M_3 M_C_1 M_C_2 M_C_3
#1 1 5 6 NA NA NA
#2 NA NA NA 2 1 7
#3 3 3 2 NA NA NA
#4 NA NA NA 6 4 3
#5 6 1 4 NA NA NA
For all records, I either have a complete set of records for variables M_1, M_2, and M_3 or I have a complete set for variables M_C_1, M_C_2, and M_C_3.
For each row that has NAs in the first set of variables (M_1:M_3), I would like to replace with the values from the second set of values (M_C_1:M_C_2). I dont need to retain the second set of values. So my desired data frame would look like:
df
# M_1 M_2 M_3
#1 1 5 6
#2 2 1 7
#3 3 3 2
#4 6 4 3
#5 6 1 4
My real dataset contains many columns in this notation, so I need a general solutions (ie, I dont want to refer to each column individually).
I would like to do this with dplyr if possible.