Suppose I have two data frames:
A = data frame consisting of unique phone numbers with an additional factor column. Suppose nrow(A) = 20
B = data frame consisting of rows representing unique households and four columns for listed phone numbers, and fifth column for unique household ID. It is possible that the same number is repeated in multiple of the B columns. Suppose nrow(B) = 100
I want to return a table that has the the "A" unique phone numbers with the household ID from "B" after checking whether the A phone numbers are in either of the four columns.
So for example:
a <- data.frame(phone=c("12345","12346","12456"),
factor=c("OK","BAD","BAD"))
b <- data.frame(ph1 = c("12345","","12346","12347",""),
ph2 = c("","","12346","","12348"),
ph3 = c("","","","12456","67890"),
hhid = seq(1121,1125))
How can I return C that would look like the following:
c <- data.frame(phone = c("12345","12346","12456"),
factor = c("OK","BAD","BAD"),
hhid = c("1121","1123","1124"))
I'm sure it's possible to do this in a really elegant way or with minimal amount of code. I thought about using for loops or merge but think that is on the wrong track. Open to using whichever packages.