I have data with repeated measurements on each subject (id) at a variable number of timepoints. I would like to retain two rows for each subject: the row with timepoint == 0 and the row with the timepoint closest to 4. When two candidate timepoints are equally distant from 4, e.g. (3, 5), I want to choose the lowest (3).
As shown in the 'choice' column of the image below, rows with "x" would not be retained.
dat <- structure(list(id = c(172507L, 172507L, 172507L, 172525L, 172525L,
172525L, 172526L, 172526L, 172526L, 172527L, 172527L, 172527L,
172527L, 172527L), timepoint = c(0L, 2L, 6L, 0L, 4L, 5L, 0L,
5L, 2L, 2L, 3L, 5L, 6L, 0L)), class = "data.frame", row.names = c(NA,
-14L))
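To make the selection rule concrete, here is a minimal base R sketch of what I am after. The function name `keep_rows` and its `target` argument are hypothetical names made up for illustration. It assumes that the timepoint == 0 row is never itself a candidate for "closest to 4", and that a subject with no non-zero timepoints simply keeps its baseline row. The data are the same values as in `dat` above, repeated so the block runs on its own.

```r
# Same values as the dput() output above.
dat <- data.frame(
  id = c(172507L, 172507L, 172507L, 172525L, 172525L, 172525L,
         172526L, 172526L, 172526L, 172527L, 172527L, 172527L,
         172527L, 172527L),
  timepoint = c(0L, 2L, 6L, 0L, 4L, 5L, 0L, 5L, 2L, 2L, 3L, 5L, 6L, 0L)
)

# Hypothetical helper: keep the baseline row plus the row closest to `target`.
keep_rows <- function(d, target = 4L) {
  # Baseline rows (timepoint == 0) are always retained.
  baseline <- d[d$timepoint == 0, , drop = FALSE]

  # Assumption: the baseline row is excluded from the "closest" candidates.
  cand <- d[d$timepoint != 0, , drop = FALSE]

  # Sort by id, then by distance from the target, then by timepoint,
  # so that the lower timepoint comes first when distances tie (3 before 5).
  cand <- cand[order(cand$id, abs(cand$timepoint - target), cand$timepoint), ,
               drop = FALSE]

  # After sorting, the first row per id is the closest candidate.
  closest <- cand[!duplicated(cand$id), , drop = FALSE]

  out <- rbind(baseline, closest)
  out <- out[order(out$id, out$timepoint), , drop = FALSE]
  rownames(out) <- NULL
  out
}

result <- keep_rows(dat)
print(result)
```

For id 172527 this keeps timepoints 0 and 3 (3 and 5 tie, the lower wins), and for id 172507 it keeps 0 and 2 (2 and 6 tie).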
timepoint = 3 for the single instance of id = 172528, but discarding timepoint = 5 for the single instance of 172529, or timepoint = 6 for the single instance of 172530? – neilfws
id = 172529 and id = 172530? – Maurits Evers