I have data on different persons (ID
), the dates they have worked (Date
), and how many hours they have worked each date (Hours
).
Monday - Friday are considered working days. For each ID, I want to group consecutive working days. Weekends and holidays should be omitted when deciding if working days are consecutive.
Some examples:
If a person worked Monday, Tuesday and Wednesday, then skipped Thursday, and worked again on Friday, then Monday - Wednesday is considered one group and Friday another group.
If a person worked Thursday & Friday one week and Monday & Tuesday the next week, then these four days should be in the same group. Thus, the days in the weekend are omitted when checking if working days are consecutive.
If Monday - Friday of one week and Monday - Friday of the next week, then I'd count those whole two weeks as consecutive working days.
I would also like to take into account common US Holidays (e.g. New Year), such that 12/31/2020 to 1/4/2021 would still count as consecutive workdays.
Once the grouping variable is created, I want to sum the working hours in each group.
Example data:
df1 = structure(list(ID = c(1, 1, 1, 1, 2, 2, 3, 3,
3, 3), Date = structure(c(18781, 18782, 18785, 18750, 18687,
18688, 18626, 18627, 18631, 18634), class = "Date"), Hours = c(8,
8, 8, 16, 8, 8, 8, 8, 8, 8)), row.names = c(NA, -10L), class = "data.frame")
ID Date Hours
1 1 2021-06-03 8
2 1 2021-06-04 8
3 1 2021-06-07 8
4 1 2021-05-03 16
5 2 2021-03-01 8
6 2 2021-03-02 8
7 3 2020-12-30 8
8 3 2020-12-31 8
9 3 2021-01-04 8
10 3 2021-01-07 8
I imagine my output to look something like this:
ID Date1 Date2 Hours
1 1 2021-06-03 2021-06-07 24
# the weekend, June 5-6, is omitted
# when the group of consecutive working days is created
2 1 2021-05-03 2021-05-03 16
3 2 2021-03-01 2021-03-02 16
4 3 2020-12-30 2021-01-04 24
# the public holiday (Jan 1) and the weekend (Jan 2-3) are omitted
5 3 2021-01-07 2021-01-07 8
My top priority is to at least get the consecutive work week figured out, the holiday portion would be an added bonus.