I have two large data.frames:
DF1
AB2  CF34  FGH23  P53T
a    b     c      d
e    bv    sd     we
sa   s     qw     fd
fg   df    lk     po
DF2
AB2  CF34  FGH23  P53T
a    b     c      m
n    m     sd     we
sa   s     py     fd
fgq  df    lk     pq
I would "simply" like to compare the two data.frames column by column, pairing the columns that share the same name, and return the number of matched items from each pairwise comparison. In other words, something like:
merge(DF1, DF2, by = "AB2")
merge(DF1, DF2, by = "CF34")
and so on. The problem is that the two data.frames are too large for me to run this comparison manually, one merge call per column, as shown above.
Any ideas?
Thanks a lot!
E.
sapply(names(DF1), function(n) nrow(merge(DF1, DF2, by = n)))
? – Ben Bolker
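A minimal sketch of that suggestion on the sample data, assuming the flattened values in the question read row by row under the four column names (that layout is a guess). It restricts the loop to column names present in both data.frames, and also computes a second count, the number of distinct shared values per column, because nrow(merge(...)) counts every matching pair of rows and so grows when a value is repeated. The names common, merge_counts and shared_counts are only illustrative.

```r
# Sample data rebuilt from the question (row-wise layout is an assumption)
DF1 <- data.frame(
  AB2   = c("a", "e", "sa", "fg"),
  CF34  = c("b", "bv", "s", "df"),
  FGH23 = c("c", "sd", "qw", "lk"),
  P53T  = c("d", "we", "fd", "po"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  AB2   = c("a", "n", "sa", "fgq"),
  CF34  = c("b", "m", "s", "df"),
  FGH23 = c("c", "sd", "py", "lk"),
  P53T  = c("m", "we", "fd", "pq"),
  stringsAsFactors = FALSE
)

# Only loop over column names that exist in both data.frames
common <- intersect(names(DF1), names(DF2))

# Number of rows merge() returns when joining on each shared column
merge_counts <- sapply(common, function(n) nrow(merge(DF1, DF2, by = n)))

# Number of distinct values the two columns have in common
shared_counts <- sapply(common, function(n) {
  length(intersect(DF1[[n]], DF2[[n]]))
})

print(merge_counts)
print(shared_counts)
```

On data without repeated values the two counts agree. If a column contains duplicates, shared_counts is probably closer to "number of matched items", while merge_counts reflects the size of the joined table.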