I am working with a data frame (lets call is MyData) looking like below. What I want to do is to group by PatientKey and create a new Id called NewID. Every time for the same PatientKey that the TimeBetweenTests is > 14, the new Id should increase by 1, and stay on that particular new value until a new PatientKey shows up OR for the same patientKey a new TimeBetweenTests > 14 shows up.
PatientKey TimeBetweenTests NewId
1 0 NewId should be 1 (first patient)
1 0 NewId should be 1
1 1 NewId should be 1
1 2 NewId should be 1
2 3 NewId should be 2 (new patient)
3 4 NewId should be 3 (new patient)
3 16 NewId should be 4 (same patient but TimeBetweenTests > 14)
3 80 NewId should be 5 (same patient but TimeBetweenTests > 14)
4 3 NewId should be 6 (new patient)
4 0 NewId should be 6 (new patient)
4 90 NewId should be 7 (same patient but TimeBetweenTests > 14)
4 110 NewId should be 8 (same patient but TimeBetweenTests > 14)
5 3 NewId should be 9 (new patient)
5 3 NewId should be 9
5 3 NewId should be 9
etc
I have tried using dplyr for this but the problem is that the subsequent values does not change when I try code similar to:
MyData <- MyData %>% group_by(PatientKey) %>% mutate(NewId = ifelse(TimeBetweenTests > 14, lag(NewId), NewId))
Anyone has a convenient dplyr or data.table solution for this, alternatively a for loop approach.