I've got three data frames (Df1, Df2, Df3). These data frames have some variables in common, but each also contains some unique variables. I'd like to make sure that all variables are represented in all data frames. For example, material is present in Df2 but not in Df1, so I'd like to create a variable named material in Df1 and set it to NA. Thanks for any help.
Starting point (dfs):
Df1 <- data.frame("color"=c(1,1,1),"price"=c(1,1,1),"buyer"=c(1,1,1))
Df2 <- data.frame("color"=c(1,1,1),"material"=c(1,1,1),"size"=c(1,1,1))
Df3 <- data.frame("color"=c(1,1,1),"price"=c(1,1,1),"key"=c(1,1,1))
Desired outcome (dfs):
Df1 <- data.frame("color"=c(1,1,1),"price"=c(1,1,1),"material"=c(NA,NA,NA),"buyer"=c(1,1,1),"size"=c(NA,NA,NA),"key"=c(NA,NA,NA))
Df2 <- data.frame("color"=c(1,1,1),"price"=c(NA,NA,NA),"material"=c(1,1,1),"buyer"=c(NA,NA,NA),"size"=c(1,1,1),"key"=c(NA,NA,NA))
Df3 <- data.frame("color"=c(1,1,1),"price"=c(1,1,1),"material"=c(NA,NA,NA),"buyer"=c(NA,NA,NA),"size"=c(NA,NA,NA),"key"=c(1,1,1))
My code so far: I'm trying to compare the variable names in each individual data frame with the variable names across all three data frames, and use the names not present in the individual data frame to generate new variables set to NA. But I end up with the following error, and I don't know how to fix it: Error in VarDf1[, NewVariables] <- NA : incorrect number of subscripts on matrix
dfs <- list(Df1, Df2, Df3)
numdfs <- length(dfs)
for (i in 1:numdfs) {
  VarDf1 <- as.vector(names(Df1))
  VarDf2 <- as.vector(names(Df2))
  VarDf3 <- as.vector(names(Df3))
  VarAll <- c(VarDf1, VarDf2, VarDf3)
  NewVariables <- as.vector(setdiff(VarAll, dfs[i]))
  dfs[i][ , NewVariables] <- NA
}
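For comparison, here is a minimal sketch of how the intended logic could be written. The likely cause of the error is that dfs[i] returns a one-element list rather than the data frame itself, so neither names nor two-dimensional indexing applies to it; dfs[[i]] extracts the data frame. The names all_vars and new_vars are made up for this illustration, and the final column order follows first appearance across the data frames, which is an assumption rather than the exact order shown in the desired outcome.

```r
Df1 <- data.frame(color = c(1, 1, 1), price = c(1, 1, 1), buyer = c(1, 1, 1))
Df2 <- data.frame(color = c(1, 1, 1), material = c(1, 1, 1), size = c(1, 1, 1))
Df3 <- data.frame(color = c(1, 1, 1), price = c(1, 1, 1), key = c(1, 1, 1))

# Name the list elements so the results can be retrieved as dfs$Df1, etc.
dfs <- list(Df1 = Df1, Df2 = Df2, Df3 = Df3)

# Union of all column names, in order of first appearance (hypothetical name)
all_vars <- unique(unlist(lapply(dfs, names)))

for (i in seq_along(dfs)) {
  # Columns that exist in some data frame but are missing from this one
  new_vars <- setdiff(all_vars, names(dfs[[i]]))
  # [[i]] extracts the data frame; [i] would return a one-element list
  dfs[[i]][new_vars] <- NA
  # Put the columns in the same order in every data frame
  dfs[[i]] <- dfs[[i]][all_vars]
}
```

Note that the loop modifies the copies stored in dfs, not the original objects. The padded data frames are available as dfs$Df1, dfs$Df2 and dfs$Df3; if the originals should be overwritten, something like list2env(dfs, envir = .GlobalEnv) could be used.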