I'm trying to match two columns of one data frame against a second data frame, and I want to get back the name of the first column in the second data frame where the two values agree.
For example, I want to take the following data frame:
Fasta<-c("X1","X1","X2","X2","X3","X3")
Species<-c("Kiwi","Chicken","Weta","Cricket","Tuatara","Gecko")
testdata<-as.data.frame(cbind(Fasta,Species))
testdata<-aggregate(Species ~ Fasta, testdata, I)
Fasta  Species1  Species2
X1     Kiwi      Chicken
X2     Weta      Cricket
X3     Tuatara   Gecko
The following is my second data frame:
Species<-c("Kiwi","Chicken","Weta","Cricket","Frog","Gecko")
Genus<-c("Orn","Norn","Genus2","Genus2","Spec","NoSpec")
Order<-c("Bird","Bird","Order2","Order2","Norder","Geckn")
Kingdom<-rep("Animal",each=6)
lookup<-data.frame(cbind(Species,Genus,Order,Kingdom))
Species  Genus   Order   Kingdom
Kiwi     Orn     Bird    Animal
Chicken  Norn    Bird    Animal
Weta     Genus2  Order2  Animal
Cricket  Genus2  Order2  Animal
Frog     Spec    Norder  Animal
Gecko    NoSpec  Geckn   Animal
I want to find the first column in the second data frame whose value is the same for both Species1 and Species2, and return that column's name. Ideally this would give me the following output:
Fasta  Species1  Species2  MatchLevel
X1     Kiwi      Chicken   Order
X2     Weta      Cricket   Genus
X3     Tuatara   Gecko     Kingdom
I'm open to having the data in a different format.
testdata$MatchLevel <- mapply(function(s1, s2) {
  # compare the lookup rows for the two species, column by column
  same <- unlist(lookup[s1 == lookup$Species, ]) == unlist(lookup[s2 == lookup$Species, ])
  # name of the first column where both rows agree
  names(lookup)[which(same)[1]]
}, testdata$Species1, testdata$Species2)
though I suspect there's a more elegant alternative. – alistaire
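A more explicit variant of the comment's approach, as a minimal sketch rather than a definitive answer. It assumes the two species sit in plain character columns named Species1 and Species2, as in the printed table (aggregate() with I stores them in a single matrix column instead, so the sketch rebuilds the example data directly). It also assumes that a species missing from lookup, such as Tuatara, should fall back to the last column; that fallback is a guess made only to reproduce the Kingdom result in the desired output. match_level is a hypothetical helper name.

```r
# Rebuild the example data with ordinary character columns (assumed layout).
testdata <- data.frame(
  Fasta    = c("X1", "X2", "X3"),
  Species1 = c("Kiwi", "Weta", "Tuatara"),
  Species2 = c("Chicken", "Cricket", "Gecko"),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  Species = c("Kiwi", "Chicken", "Weta", "Cricket", "Frog", "Gecko"),
  Genus   = c("Orn", "Norn", "Genus2", "Genus2", "Spec", "NoSpec"),
  Order   = c("Bird", "Bird", "Order2", "Order2", "Norder", "Geckn"),
  Kingdom = rep("Animal", 6),
  stringsAsFactors = FALSE
)

# match_level (hypothetical helper): name of the first column of `lookup`
# in which the rows for s1 and s2 hold the same value.
match_level <- function(s1, s2, lookup) {
  i <- match(s1, lookup$Species)
  j <- match(s2, lookup$Species)
  # Assumption: if either species is absent, fall back to the last column.
  if (is.na(i) || is.na(j)) return(names(lookup)[ncol(lookup)])
  same <- unlist(lookup[i, ]) == unlist(lookup[j, ])
  # NA is returned if no column agrees at all.
  names(lookup)[which(same)[1]]
}

testdata$MatchLevel <- mapply(match_level,
                              testdata$Species1, testdata$Species2,
                              MoreArgs = list(lookup = lookup),
                              USE.NAMES = FALSE)
```

Using match() instead of a logical comparison makes the missing-species case explicit: without the fallback, a species absent from lookup would yield NA rather than Kingdom.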