6
votes

I am not sure what is going wrong with selecting date times as a POSIXct format. I have read several comments on subsetting a dataframe based on as.Date and I can get that to work without an issue. I have also read many posts suggesting that filtering POSIXct formats should work, but for some reason I cannot get it to work.

An example dataframe:

library(lubridate)
library(dplyr)

date_test <- seq(ymd_hms('2016-07-01 00:00:00'),ymd_hms('2016-08-01 00:00:00'), by = '15 min')
date_test <- data.frame(date_test)
date_test$datetime <- date_test$date_test
date_test <- select(date_test, -date_test)

I checked that it is in POSIXct format and then tried several ways to subset the dataframe greater than 2016-07-01 01:15:00. However the output never shows the date times less than 2016-07-01 01:15:00 being removed. I am sorry if this has been asked somewhere and I cannot find it but I have looked and tried to get this to work. I am using UTC as the timezone to avoid daylight savings time issues so that is not the issue here - unless the filter requires it.

class(date_test$datetime)

date_test <- date_test %>% filter(datetime > '2016-07-01 01:15:00')

date_test <- date_test %>% 
  filter(datetime > as.POSIXct("2016-07-01 00:15"))

date_test <- subset(date_test, datetime > as.POSIXct('2016-07-01 01:15:00')) 

Now if I filter using:

date_test <- date_test %>% 
  filter(datetime > as.POSIXct("2016-07-10 01:15:00"))

the output is very strange with a day behind and the wrong time?

2016-07-09 13:30:00
2016-07-09 13:45:00
2016-07-09 14:00:00
2016-07-09 14:15:00
2016-07-09 14:30:00

If it helps I am using MAC OS Sierra with R Studio Version 1.0.143 and R You Stupid Darkness, DPLYR 0.5 and Lubridate 1.6

1
Try this: date_test[date_test$datetime > as.POSIXct("2016-07-01 01:15:00", tz="UTC"),]Dave2e
ymd_hms uses as.POSIXct times in "UTC" timezone by default - as.POSIXct uses the system timezone (e.g. - Australia for me, Illinois for you) - you need to consistently use ymd_hms or change to the "UTC" timezone as per Dave's suggestion.thelatemail
OK, so that solves a huge issue I have been having for a while. Right now I am in New Zealand not Illinois as the time suggests and explains why time functions have not been working. It was never clear to me that when you use dates without specifying UTC that it defaulted to the system clock. I was able to get the following to work. THANK YOU SO MUCH date_test <- date_test %>% filter(datetime > as.POSIXct('2016-07-01 01:15:00', tz="UTC")) and test <- date_test[date_test$datetime > as.POSIXct("2016-07-01 01:15:00", tz="UTC"),] test <- as.data.frame(test)Bill Perry
Also, if anyone runs into this in the future as @thelatemail suggested this also works `date_test <- date_test %>% filter(datetime > ymd_hms('2016-07-2 00:00:00'))'. Thank you again and I am not sure how to mark your comment as the answer but it was perfect!Bill Perry

1 Answers

11
votes

ymd_hms uses POSIXct times in "UTC" timezone by default - as.POSIXct uses the system timezone (e.g. - Australia for me) - you need to consistently use ymd_hms or change to the "UTC" timezone as per Dave's suggestion in the comments.

E.g.: these examples work:

date_test <- seq(ymd_hms('2016-07-01 00:30:00'),ymd_hms('2016-07-01 01:30:00'), by = '15 min')
date_test <- data.frame(datetime=date_test)
date_test

#             datetime
#1 2016-07-01 00:30:00
#2 2016-07-01 00:45:00
#3 2016-07-01 01:00:00
#4 2016-07-01 01:15:00
#5 2016-07-01 01:30:00

date_test %>% 
  filter(datetime > as.POSIXct("2016-07-01 01:00:00", tz="UTC"))

date_test %>% 
  filter(datetime > ymd_hms("2016-07-01 01:00:00"))

#             datetime
#1 2016-07-01 01:15:00
#2 2016-07-01 01:30:00