Scenario:
2-column dataframe_1 (300,000 rows)
head(dataframe_1):
CHR POS
1 2000
1 3000
2 1500
3 3000
3-column dataframe_2 (300 rows)
head(dataframe_2):
CHR POS_START POS_END
1 1500 2500
1 3200 4000
2 1200 1600
2 2000 2200
3 5000 5500
4 1000 1200
The goal is to compare the POS value in each row of dataframe_1 against the ranges in dataframe_2 (columns POS_START and POS_END) and return a logical vector (length = nrow(dataframe_1)) indicating, for each row of dataframe_1, whether its POS value falls within any range listed in dataframe_2 for the same CHR value. Note that each POS value is linked to a particular CHR value, so a position should only be compared against ranges on the same CHR.
Example return vector:
CHR POS EXAMPLE_RETURN_VECTOR
1 2000 TRUE
1 3000 FALSE
2 1500 TRUE
3 3000 FALSE
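For reference, here is a minimal base R sketch that reproduces the example return vector on the toy rows shown above. It is only one possible approach, not necessarily the best one. The helper name in_any_range is made up for illustration, and treating the range endpoints as inclusive is an assumption. It loops over the rows of dataframe_2 (the small table) and does a vectorized comparison against all rows of dataframe_1 on each pass, so with the real data that would be about 300 passes over 300,000 rows.

```r
# Toy data copied from the head() output above
dataframe_1 <- data.frame(
  CHR = c(1, 1, 2, 3),
  POS = c(2000, 3000, 1500, 3000)
)
dataframe_2 <- data.frame(
  CHR       = c(1, 1, 2, 2, 3, 4),
  POS_START = c(1500, 3200, 1200, 2000, 5000, 1000),
  POS_END   = c(2500, 4000, 1600, 2200, 5500, 1200)
)

# Hypothetical helper: TRUE where a point lies inside at least one
# range on the same CHR. Endpoints are assumed to be inclusive.
in_any_range <- function(points, ranges) {
  hit <- logical(nrow(points))  # starts as all FALSE
  for (i in seq_len(nrow(ranges))) {
    # Vectorized over every row of points for this single range
    hit <- hit |
      (points$CHR == ranges$CHR[i] &
         points$POS >= ranges$POS_START[i] &
         points$POS <= ranges$POS_END[i])
  }
  hit
}

EXAMPLE_RETURN_VECTOR <- in_any_range(dataframe_1, dataframe_2)
cbind(dataframe_1, EXAMPLE_RETURN_VECTOR)
```

On the four toy rows this gives TRUE, FALSE, TRUE, FALSE, matching the table above.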
What’s the best (ideally fast and vectorized) strategy for doing this on the full 300,000-row dataframe_1?
Thanks!