How can I use dplyr and lubridate packages to count active records by a series of given dates?

Question

Packages I'm working with:

dplyr
lubridate

Starting data set I'm working with:

Employee_ID      Start_Date      End_Date
1                2/1/2013        12/31/9999
2                5/14/2013       10/5/2017
3                9/7/2014        8/30/2017

Data set I'm looking to create from the starting data set:

Date             Active_Employee_Count
12/31/2013       2
12/31/2014       3
12/31/2015       3
12/31/2016       3
12/31/2017       1

Explanation

I'd like to be able to get the counts of active employees on December 31st of each year. I'm using the dplyr and lubridate packages to try to do this, but I'm struggling. I know of the filter, group_by, summarise, mutate, and distinct functions, but I'm not sure these are sufficient to do this task. The main challenge I'm facing is how do I somehow work into the process these December 31st dates? Do I need to create some sort of array of dates (12/31/2013, 12/31/2014, etc) and then somehow use that to do this?

Leonardo Siqueira Leonardo Siqueira · Accepted Answer · 2018-02-19T20:23:40

So, I changed the ID of the employees to actual names.

df = data.frame(Employee_ID = c("John", "Hugh", "Jack"),
                Start_Date = mdy(c(02012013, 05142013, 09072014)),
                End_Date = mdy(c(12319999, 10052017, 08302017)))  

Years_End2 = data.frame(Employee_ID = rep(df$Employee_ID, each = dim(Years_End)[1]),
                        Years_End = seq(from = mdy(12312013), to = mdy(12312017), by = "year"))

df %>% left_join(Years_End2, by = "Employee_ID") %>%
      mutate(Active = (End_Date > Years_End & Start_Date < Years_End)) %>%
      group_by(Years_End) %>%
      summarise(sum(Active))

  Years_End  `sum(Active)`
  <date>             <int>
1 2013-12-31             2
2 2014-12-31             3
3 2015-12-31             3
4 2016-12-31             3
5 2017-12-31             1

How can I use dplyr and lubridate packages to count active records by a series of given dates?

2 Answers