I have two data frames in data.table form. One holds data grouped by gene, and for each group I would like to extract the indices of its values from the second data.table. Sample data are below:
library(data.table)
snp_bygene <- data.table(V2 = c("SNP1", "SNP2", "SNP3", "SNP4", "SNP5", "SNP11", "SNP12", "SNP13", "SNP14", "SNP15"),
                         GENE = c(rep("GENE1", 5), rep("GENE2", 5)), START = c(rep(100, 5), rep(200, 5)), END = c(rep(190, 5), rep(290, 5)))
snp_data <- data.table(V2 = c("SNP1", "SNP2", "SNP3", "SNP4", "SNP5", "SNP11", "SNP12", "SNP13", "SNP14", "SNP15"),
                       BP = c(101, 102, 105, 110, 125, 201, 202, 205, 210, 225))
I would like to get the index of each V2 value in snp_bygene, matched against V2 in snp_data, so that I end up with the SNP positions (indices) for each gene.
setkey(snp_data, V2)
snp_bygene[snp_data]
How do I proceed?
The final output would look like:
finalindex_perGene <- list("GENE1" = c(1, 2, 3, 4, 5), "GENE2" = c(6, 7, 8, 9, 10))
Edit 1: there is no GENE group in snp_data
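For reference, here is a minimal sketch of one way the desired list could be built. It assumes the indices should refer to row positions in snp_data in its original, unkeyed order. Note that setkey(snp_data, V2) sorts the table by V2, which would put "SNP11" before "SNP2" and change those positions. The variable name idx is only illustrative.

```r
library(data.table)

snp_bygene <- data.table(
  V2   = c("SNP1", "SNP2", "SNP3", "SNP4", "SNP5",
           "SNP11", "SNP12", "SNP13", "SNP14", "SNP15"),
  GENE = c(rep("GENE1", 5), rep("GENE2", 5))
)
snp_data <- data.table(
  V2 = c("SNP1", "SNP2", "SNP3", "SNP4", "SNP5",
         "SNP11", "SNP12", "SNP13", "SNP14", "SNP15"),
  BP = c(101, 102, 105, 110, 125, 201, 202, 205, 210, 225)
)

# Row position in snp_data of every SNP listed in snp_bygene
# (NA for any SNP that is absent from snp_data).
idx <- match(snp_bygene$V2, snp_data$V2)

# Group those positions by gene; the list names come from GENE.
finalindex_perGene <- split(idx, snp_bygene$GENE)
```

With the sample data this gives integer vectors 1 to 5 for GENE1 and 6 to 10 for GENE2. Since snp_data has no GENE column, the grouping comes entirely from snp_bygene.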