0
votes

I have several date variables in a data.frame.

They look for example like this:

[1] "10/14/18 17:55:28"        "10/15/18 19:27:56"       
  [3] "11/04/18 15:47:46"        "Thu Feb  7 14:51:55 2019"
  [5] "Thu Feb  7 17:14:15 2019" "Thu Feb  7 15:46:09 2019"
  [7] "Thu Feb  7 11:42:27 2019" "Thu Feb  7 13:24:16 2019"
  [9] "Thu Feb  7 18:02:29 2019" "Mon Oct 15 08:48:43 2018"
 [11] "10/17/18 17:08:38"        "12/08/18 08:08:11"       
 [13] "10/11/18 21:25:30"        "10/14/18 19:15:30"       
 [15] "10/16/18 11:18:01"        "10/16/18 18:19:27"       
 [17] "Tue Oct 16 19:49:24 2018" "Wed Oct 17 21:36:32 2018"
 [19] "Sat Oct 13 11:22:35 2018" "Fri Dec  7 17:12:33 2018"

At the moment this is a character variable. I want to convert it with as.Date so that I can subtract the variables from each other.

I already found this:

as.Date( DATE$Sess1, format = "%m/%d/%y")

I would prefer to keep not only the date but also the time. The real problem is that the values mix Apple and Windows formats, which makes it even more complicated.

I would prefer dplyr solutions ;)

3
Use strptime, with a condition on the format string (for instance, if the first character of the input is a letter) to distinguish both cases (assuming there are only two cases). There is a special place in hell for those who don't format dates consistently... - user13517564
The dates come from different computers. So you have to blame the developers of apple and windows for that. Actually, they should collaborate ;) - SDahm
Check out the anytime package. - David Arenburg

3 Answers

2
votes

To keep the time, use a date-time class such as POSIXlt or POSIXct instead of Date, which stores only the day. You also need to extend the format string to include the time (e.g. format = "%m/%d/%y %H:%M:%S") - see https://astrostatistics.psu.edu/su07/R/html/base/html/strptime.html for more details on these codes.

as.POSIXlt(DATE$Sess1, format = "%m/%d/%y %H:%M:%S")

As for handling the different formats: a string written in one of your formats will not parse under the other, so I suggest keeping a vector of possible formats and trying each in turn until one works.
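A minimal base R sketch of the "try each format in turn" idea, using a few values from the question. The helper name parse_mixed is made up for illustration, the time zone is assumed to be UTC, and %a/%b only match English day and month names when R runs in an English (or C) locale.

```r
# Sample values copied from the question
x <- c("10/14/18 17:55:28", "Thu Feb  7 14:51:55 2019",
       "Mon Oct 15 08:48:43 2018", "12/08/18 08:08:11")

# Candidate formats, tried in this order
formats <- c("%m/%d/%y %H:%M:%S", "%a %b %d %H:%M:%S %Y")

# parse_mixed() is a hypothetical helper, not part of base R.
# Assumption: all timestamps are treated as UTC.
parse_mixed <- function(x, formats, tz = "UTC") {
  # Start with an all-NA POSIXct vector of the right length
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  for (fmt in formats) {
    todo <- is.na(out)          # elements no earlier format could parse
    if (!any(todo)) break
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

parsed <- parse_mixed(x, formats)
parsed[2] - parsed[1]           # subtraction now yields a difftime
```

Anything that matches none of the formats stays NA, which makes leftover oddities easy to spot with is.na(parsed).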

2
votes

You can use lubridate's parse_date_time and pass it all the formats the values could be in.

x <- c("10/14/18 17:55:28" ,       "10/15/18 19:27:56" ,      
       "11/04/18 15:47:46" ,       "Thu Feb  7 14:51:55 2019",
       "Thu Feb  7 17:14:15 2019", "Thu Feb  7 15:46:09 2019")


lubridate::parse_date_time(x,c('mdyT', 'amdTY'))

#[1] "2018-10-14 17:55:28 UTC" "2018-10-15 19:27:56 UTC" "2018-11-04 15:47:46 UTC"
#[4] "2019-02-07 14:51:55 UTC" "2019-02-07 17:14:15 UTC" "2019-02-07 15:46:09 UTC"

See ?parse_date_time for details on how the format orders are specified.

To get the dates, you can wrap as.Date around it.

as.Date(lubridate::parse_date_time(x,c('mdyT', 'amdTY')))
#[1] "2018-10-14" "2018-10-15" "2018-11-04" "2019-02-07" "2019-02-07" "2019-02-07"
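
Since the question asks for dplyr and for subtracting the variables, here is a rough sketch of how the same call might be used inside mutate(). The column Sess2 and its values are invented for illustration (only Sess1 appears in the question), and across() assumes dplyr 1.0 or later.

```r
library(dplyr)
library(lubridate)

# Hypothetical data: Sess2 is an assumed second date column
DATE <- tibble(
  Sess1 = c("10/14/18 17:55:28", "Thu Feb  7 14:51:55 2019"),
  Sess2 = c("10/15/18 19:27:56", "Thu Feb  7 17:14:15 2019")
)

DATE <- DATE %>%
  mutate(
    # Parse every date column with the same pair of orders
    across(c(Sess1, Sess2),
           ~ parse_date_time(.x, orders = c("mdyT", "amdTY"))),
    # Difference between the two parsed columns, in hours
    diff_hours = as.numeric(difftime(Sess2, Sess1, units = "hours"))
  )
```

Keeping the columns as date-times rather than wrapping them in as.Date preserves the time of day, so differences shorter than a day are not lost.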

0
votes

If you're using the tidyverse, use {lubridate} to parse the strings. There are two different date/time formats in your example, so you'll need to parse them twice, once per format.

lubridate::as_datetime(DATE$Sess1, format = "%a %b %e %H:%M:%S %Y")

and then, for all the elements that came back as NA, parse with the other format:

lubridate::as_datetime(DATE$Sess1, format = "%m/%d/%y %H:%M:%S")
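
One way the two passes might be combined is dplyr::coalesce(), which takes the first non-NA value at each position. This is only a sketch: the sample rows are taken from the question, and the new column name Sess1_dt is an assumption made here.

```r
library(dplyr)
library(lubridate)

# A few values from the question, in both formats
DATE <- tibble(Sess1 = c("10/14/18 17:55:28",
                         "Thu Feb  7 14:51:55 2019",
                         "Sat Oct 13 11:22:35 2018"))

DATE <- DATE %>%
  mutate(
    # Sess1_dt is a hypothetical name for the parsed column.
    # Each call returns NA where its format does not match;
    # coalesce() fills those gaps from the other call.
    Sess1_dt = coalesce(
      as_datetime(Sess1, format = "%a %b %e %H:%M:%S %Y"),
      as_datetime(Sess1, format = "%m/%d/%y %H:%M:%S")
    )
  )
```

Rows that match neither format remain NA in Sess1_dt.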