0
votes

Let's say that I have the following data.

structure(list(year_month = c("2016-10", "2016-11", "2016-12", 
"2017-01", "2017-02", "2017-05", "2017-08", "2017-09", "2016-10", 
"2016-11", "2016-12", "2017-01"), site_owner = c("Adam", 
"Adam", "Adam", "Adam", "Adam", "Adam", 
"Allison", "Allison", "Allison", "Allison", 
"Allison", "Allison"), N = c(4L, 10L, 4L, 11L, 8L, 
15L, 8L, 7L, 2L, 5L, 6L, 2L)), .Names = c("year_month", "site_owner", 
"N"), row.names = c(NA, -12L), class = c("data.table", "data.frame"
))

I want to find the number of consecutive months per each group/person.

To get this desired output, I need to find the difference between the current previous and previous month.

ddf$year_month = as.Date(paste(ddf$year_month, "01", sep="-"))
ddf
ddf[, diffa := year_month-shift(year_month), .(site_owner)]
ddf
ddf[, diffs := (year_month-shift(year_month))/(365.25/12), .(site_owner)]
ddf

This doesn't seem to work.

If I can get the difference, then I can get the count by doing.

dt[diffa==1, .N, by=.(site_owner)]

Here is the desired output.

name      conecutive months
adam      5 
allison   6
1
Could you please check your data. Is it '2016-10' or '2017-10' for row no. 9? Still, the desired output is not matchingakrun
Using the example data, I see 5 consecutive months for Adam (Oct 2016 - Feb 2017) and two sets for Allison: 4 months (Oct 2016 - Jan 2017) and 2 months (Aug 2017 - Sep 2017). Please clarify what the output should be.neilfws

1 Answers

1
votes

May be this would help

library(zoo)
library(data.table)
setDT(ddf)[, {
     v1 <- zoo::as.yearmon(year_month)
    .(consecutive_months = sum((v1 +1/12) == shift(v1, type = "lead"), na.rm = TRUE))}, 
       by = site_owner]
#   site_owner consecutive_months
#1:       Adam                  4
#2:    Allison                  5

NOTE: Assuming that the 'year_month' for rows 9:12 are '2017-10', '2017-11', '2017-12', '2018-01'

data