0
votes

I am working with data from csv files that all have the same layout, so I am hoping to come up with code that can easily be applied to all of them. Sadly, I am failing at step one :-(.

The csv files have the date and time saved in one column, so when I import them with read.csv that column is read as chr. What is the easiest way to convert it into a date-time that I can then use for plotting and analysis?

Here is what I tried:

Load the data. This stores the date and time as chr in mydata$Date.Time (e.g. 1/1/15 0:00):

mydata<-read.csv(file.choose(), stringsAsFactors = FALSE,
              strip.white = TRUE,
              na.strings = c("NA",""), skip=16,
              header=TRUE)

Separate Date.Time into Date and Time:

new <- do.call( rbind , strsplit( as.character( mydata$Date.Time ) , " " ) )

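Add these two back to the df mydata:

# The date is the first piece of the split and the time the second;
# assign the result, otherwise cbind() only prints the combined df
mydata <- cbind(mydata, Date = new[,1], Time = new[,2])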

Convert Date into a date format via as.Date:

mydata$Date <- as.Date(new[,1], format="")

This works fine for the date, but I am stuck on the time. I tried this:

mydata$Time <- format(as.POSIXct(new[,2], format="%H:%M"))

This gives me the following error:

Error in as.POSIXlt.character(x, tz, ...) : character string is not in a standard unambiguous format

I wonder if there is a smarter way of doing this? Reading in dates and times seems to be one of the fundamental tasks, and I would like to understand it. Is there a way for R to recognize the date and time directly from the csv? Or is it generally smarter to generate a time vector on its own, and if so, how would I do that?

Thanks so much for your help. Sandra


2 Answers

1
votes

If you want to use time only, consider using the chron package:

library(chron)
mytime <- times("21:19:37")

or in your case

times(new[,2])

assuming that that's a character vector.
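
If you do not need the date and time in separate columns, you may not have to split the string at all: as.POSIXct can parse the combined value when given an explicit format. Below is a minimal base R sketch. It assumes the values look like 1/1/15 0:00 with the month first (the sample is ambiguous, so swap %m and %d if your files are day-first) and it assumes UTC as the time zone. The sample vector is made up and only stands in for mydata$Date.Time.

```r
# Made-up sample values standing in for mydata$Date.Time
date_time <- c("1/1/15 0:00", "1/1/15 1:00", "1/31/15 23:00")

# Parse date and time in one step. The format string is an assumption:
# month/day/two-digit year, then a 24-hour clock without seconds.
parsed <- as.POSIXct(date_time, format = "%m/%d/%y %H:%M", tz = "UTC")

# Derive separate pieces only if they are really needed
dates <- as.Date(parsed, tz = "UTC")  # Date class
clock <- format(parsed, "%H:%M")      # character vector such as "00:00"
```

The parsed column can be used directly for plotting and date arithmetic. As far as I know, times() in chron expects h:m:s by default, so values without seconds such as 0:00 may need ":00" appended before it accepts them.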

0
votes

I tried the chron approach but it wouldn't work for me :-(. So what I ended up doing is just creating a time vector for the period covered by the data I am loading:

date <- seq(as.POSIXct("2015/1/1 00:00"), as.POSIXct("2015/1/31 23:00"), by = "hours")

and then adding it back to the df. Not what I wanted but it will work until I find the ultimate solution :-)
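
Since the files all share one layout, the reading and parsing steps could be wrapped in a small function and reused. The sketch below is only an illustration: the name read_with_datetime is hypothetical, the read.csv arguments are copied from the question, and the month-first format and UTC time zone are assumptions that would need checking against the real files. The temporary file is made up so the example runs on its own.

```r
# Hypothetical helper: read one file and add a parsed date-time column
read_with_datetime <- function(path, skip = 16,
                               format = "%m/%d/%y %H:%M", tz = "UTC") {
  df <- read.csv(path, stringsAsFactors = FALSE, strip.white = TRUE,
                 na.strings = c("NA", ""), skip = skip, header = TRUE)
  df$DateTime <- as.POSIXct(df$Date.Time, format = format, tz = tz)
  # Warn when a non-missing string failed to parse (likely a wrong format)
  bad <- is.na(df$DateTime) & !is.na(df$Date.Time)
  if (any(bad)) warning(sum(bad), " value(s) in Date.Time did not parse")
  df
}

# Usage on a small made-up file with 16 preamble lines before the header
tmp <- tempfile(fileext = ".csv")
writeLines(c(rep("preamble", 16),
             "Date.Time,Value",
             "1/1/15 0:00,1.5",
             "1/1/15 1:00,2.5"), tmp)
mydata <- read_with_datetime(tmp)
str(mydata$DateTime)
```

Unlike a generated sequence, this keeps the timestamps that are actually in the file, so gaps or duplicated hours in the data stay visible.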