I have two reproducible data frames. I am trying to identify which columns in df1 contain values that also appear in a column of df2. Ideally, my code would randomly select one value from each column of df1 and check it against every column in df2.
df1 <- data.frame(fruit=c("Apple", "Orange", "Pear"), location = c("Japan", "China", "Nigeria"), price = c(32,53,12))
df2 <- data.frame(grocery = c("Durian", "Apple", "Watermelon"), place=c("Korea", "Japan", "Malaysia"), invoice = c("XD1", "XD2", "XD3"))
df1$source <- "DF1"
df2$source <- "DF2"
df1
fruit location price source
1 Apple Japan 32 DF1
2 Orange China 53 DF1
3 Pear Nigeria 12 DF1
df2
grocery place invoice source
1 Durian Korea XD1 DF2
2 Apple Japan XD2 DF2
3 Watermelon Malaysia XD3 DF2
This is the output I hope to obtain in a new data frame called df3.
df3
grocery place invoice source
1 fruit location NA DF1
The source column lets the user identify which data frame the matched columns (fruit/location) come from. The column names of df3 are the column names of df2, and the values in row 1 are the matching column names from df1.
The grocery column is matched with fruit because both contain the value "Apple". Likewise, place is matched with location because both contain "Japan". The invoice column shares no value with any column in df1, so its entry is NA.
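The matching rule described above could be sketched roughly as follows. This is only one possible approach, not a definitive answer: the helper `match_column` is a hypothetical name introduced here, and instead of sampling one random value per column it checks all values, because a single random draw (say "Orange") could miss a match that does exist. It also assumes the matching is run before the source columns are added to df1 and df2, and that the first matching df1 column wins if several match.

```r
# Rebuild the example data from the question (strings kept as character).
df1 <- data.frame(fruit = c("Apple", "Orange", "Pear"),
                  location = c("Japan", "China", "Nigeria"),
                  price = c(32, 53, 12),
                  stringsAsFactors = FALSE)
df2 <- data.frame(grocery = c("Durian", "Apple", "Watermelon"),
                  place = c("Korea", "Japan", "Malaysia"),
                  invoice = c("XD1", "XD2", "XD3"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: for one df2 column, return the name of the first
# df1 column that shares at least one value with it, or NA if none does.
match_column <- function(x, candidates) {
  shared <- vapply(candidates,
                   function(col) any(as.character(col) %in% as.character(x)),
                   logical(1))
  if (any(shared)) names(candidates)[which(shared)[1]] else NA_character_
}

# One row: df2's column names as headers, matched df1 column names as values.
matches <- vapply(df2, match_column, character(1), candidates = df1)
df3 <- data.frame(as.list(matches), source = "DF1", stringsAsFactors = FALSE)
df3
```

With the example data, grocery maps to fruit, place maps to location, and invoice is NA. Values are compared as character strings, so the numeric price column is compared by its text representation.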
Thank you!
df2 as column names for df3 and column names from df1 as 1st row? - Ronak Shah
setNames(data.frame(t(names(df1))), names(df2))? - Ronak Shah