I am trying to calculate how long I spend reading emails. My emails are grouped by ID; in this example they all belong to group A, and I want the total read time for that group. The code I am currently using computes the difference, in seconds, between the last Endtime and the first Starttime in each group.

library(dplyr)

data <- rawdata %>%
    group_by(ID) %>%
    summarize(diff = difftime(last(as.POSIXct(Endtime, format = "%m/%d/%Y %I:%M:%S %p")),
            first(as.POSIXct(Starttime, format = "%m/%d/%Y %I:%M:%S %p")), units = "secs"))

However, I do not think this accurately reflects my email reads, because it spans the whole group rather than measuring each read. What I want is the time difference for each row. The desired output is shown below: with a diff value per row, I can then sum the diff column to get my total email duration in seconds.

        Starttime               Endtime                     ID         diff

        12/18/2019 4:06:59PM    12/18/2019 4:07:05 PM        A        6 secs
        12/18/2019 4:07:26PM    12/18/2019 4:07:28 PM        A        2 secs
        12/17/2019 6:48:06PM    12/17/2019 6:48:07PM         A        1 sec
        12/17/2019 6:25:16PM    12/17/2019 6:25:22PM         A        6 secs

Any help is appreciated. I will continue to research this!

1 Answer

If you want the difference between the start and end time of each email read, you can convert both columns to date-times and subtract them row by row:

library(dplyr)

rawdata %>%
  mutate_at(vars(ends_with('time')), lubridate::mdy_hms) %>%
  mutate(diff = difftime(Endtime, Starttime, units = "secs"))

#            Starttime             Endtime ID   diff
#1 2019-12-18 16:06:59 2019-12-18 16:07:05  A 6 secs
#2 2019-12-18 16:07:26 2019-12-18 16:07:28  A 2 secs
#3 2019-12-17 18:48:06 2019-12-17 18:48:07  A 1 secs
#4 2019-12-17 18:25:16 2019-12-17 18:25:22  A 6 secs

Or in base R:

transform(transform(rawdata, 
     Starttime = as.POSIXct(Starttime, format = "%m/%d/%Y %I:%M:%S %p"), 
     Endtime = as.POSIXct(Endtime, format = "%m/%d/%Y %I:%M:%S %p")), 
               diff = difftime(Endtime, Starttime, units = "secs"))
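
To get the total the question asks for, the per-row differences can then be summed per ID. The following is only a minimal base R sketch, not part of the original answer. It assumes the timestamps are stored as character strings in the format shown, that they are parsed as UTC, and that the session locale understands AM/PM for `%p`. The names `fmt`, `start_time`, `end_time` and `totals` are illustrative.

```r
# Sample rows mirroring the question (character columns instead of factors).
rawdata <- data.frame(
  Starttime = c("12/18/2019 4:06:59PM", "12/18/2019 4:07:26PM",
                "12/17/2019 6:48:06PM", "12/17/2019 6:25:16PM"),
  Endtime   = c("12/18/2019 4:07:05 PM", "12/18/2019 4:07:28 PM",
                "12/17/2019 6:48:07PM", "12/17/2019 6:25:22PM"),
  ID        = "A",
  stringsAsFactors = FALSE
)

# The space before %p also matches inputs with no space before AM/PM.
fmt <- "%m/%d/%Y %I:%M:%S %p"
start_time <- as.POSIXct(rawdata$Starttime, format = fmt, tz = "UTC")
end_time   <- as.POSIXct(rawdata$Endtime,   format = fmt, tz = "UTC")

# Per-row duration as plain numbers (seconds), so it is easy to sum.
rawdata$diff <- as.numeric(difftime(end_time, start_time, units = "secs"))

# Total read time per ID; for the sample rows this is 6 + 2 + 1 + 6 = 15.
totals <- aggregate(diff ~ ID, data = rawdata, FUN = sum)
totals
```

Converting the difftime to numeric before summing keeps the unit explicit in the code rather than relying on how difftime objects are combined.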

data

rawdata <- structure(list(Starttime = structure(c(3L, 4L, 2L, 1L), 
.Label = c("12/17/2019 6:25:16PM", "12/17/2019 6:48:06PM", "12/18/2019 4:06:59PM", 
"12/18/2019 4:07:26PM"), class = "factor"), Endtime = structure(c(3L, 4L, 2L, 1L), 
.Label = c("12/17/2019 6:25:22PM", "12/17/2019 6:48:07PM", "12/18/2019 4:07:05 PM", 
"12/18/2019 4:07:28 PM"), class = "factor"), ID = structure(c(1L, 1L, 1L, 1L), 
.Label = "A", class = "factor")), row.names = c(NA, -4L), class = "data.frame")