I have a dataset with a date variable (one date per individual).
I'd like to count all possible pairs of individuals that meet a pre-specified criterion. For example, I want the total number of pairs that could be "matched" because their dates are within, say, 15 days of each other:
df <- data.frame("ID"=c(1:5), "Date"=c("2005-01-05","2005-01-08","2005-01-21","2005-01-22","2005-02-04"))
df
ID1 matches with ID2, ID2 matches with ID3, ID2 matches with ID4, ID3 matches with ID4, ID3 matches with ID5, ID4 matches with ID5.
All I want for output is the total number of matching pairs (in this case, n=6 pairs).
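To make the criterion concrete, here is a minimal sketch of the count being described, using base R only. It assumes the Date column is first converted with as.Date() (the data.frame above stores it as text, not as a Date object) and that "within 15 days" is inclusive, i.e. a difference of exactly 15 days counts as a match. The names `window`, `gaps`, and `n_pairs` are made up for illustration.

```r
# Toy data from above, with Date converted from text to Date class
df <- data.frame(ID = 1:5,
                 Date = as.Date(c("2005-01-05", "2005-01-08", "2005-01-21",
                                  "2005-01-22", "2005-02-04")))

window <- 15  # assumed inclusive threshold, in days

# dist() returns the absolute difference for each unordered pair exactly once,
# so every pair of IDs is counted a single time
gaps <- dist(as.numeric(df$Date))

# Count the pairs whose dates are at most `window` days apart
n_pairs <- sum(as.vector(gaps) <= window)
n_pairs  # 6 for the toy data
```

Note that dist() stores all n*(n-1)/2 pairwise differences, so this brute-force approach is only practical for moderately sized data.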
I've played with the ddply and aggregate functions quite a bit for this problem, but I really can't nail down where I'm going wrong. I suspect it has something to do with the fact that I have date objects. I'll spare you all of my elementary, unsuccessful attempts.
And no, this is not homework. I'm somewhat new to R, and this is part of a larger cluster analysis project I am working on.
