I am a novice R user and I have a large dataframe (1700 columns) that is organized by data and flag columns:
df <- data.frame( "100249 MERCURY TOTAL ug/L" = runif(10),
"100397 TRIHALOMETHANES ug/L" = runif(10),
"100397 TRIHALOMETHANES ug/L FLAG" = c("L", "L", NA, "L", "L", NA, "L", NA, NA, NA),
"100407 XYLENE ug/L" = runif(10),
"100407 XYLENE ug/L FLAG" = c("L", NA, "L", "L", "L", NA, "L", NA, "L", "L"),
check.names=FALSE )
There is no pattern to the data, not every parameter has an associated FLAG column.
I would like to merge the matching data and flags together, with the flag first and then the numeric data in each cell, and rename the merged columns '100397 TRIHALOMETHANES ug/_COMB' and repeat for every pair of data & flag columns.
So far I have got it to search for the FLAG columns using grepl, create a new header using gsub and paste0, but the lapply and merge functions to combine the columns are not working for me. I have also have been reading that people use 'paste' or tidyr's 'unite' to merge columns, but nothing has been successful so far.
A sample of my code:
lapply(df, function(x) if (grepl("*\\FLAG$", colnames(df(x)))) {
newcol = paste0("df$", gsub("*\\FLAG$", "", colnames(df(x))), "_COMB")
assign(newcol, merge(df[x], df[x-1], all= T))
})
Any advice would be greatly appreciated, thank you.