I am pulling data from a SQL Server database where each row uses an ISO timestamp as its key. In the database, the timestamps are stored as datetime2
datatype.
When I run the following statement:
data <- dbGetQuery(con, "SELECT timestamp from table1")
it returns a data frame where the timestamp column is a chr type. Here is what str() returns on the data frame:
$ timestamp: chr "2020-03-25 12:19:48.0000000" "2020-03-25 12:20:48.0000000"...
With this I can convert it back to a POSIXct object using the following code:
data$timestamp <- as.POSIXct(data$timestamp, format = '%Y-%m-%d %H:%M:%S')
However, the database is storing values every minute (and occasionally every second), so if I try to convert a few months or years worth of timestamps using the as.POSIXct() function, it can take a long time (in the range of minutes). The dbGetQuery function is relatively fast when pulling that much data (in the range of a few seconds).
Is there a parameter in the dbGetQuery or other method that will directly interpret the timestamps as datetime, rather than strings?
Here is my connection command:
con <- dbConnect(odbc(),
Driver = "SQL Server",
Server = "server1",
Database = "db1",
UID = "user",
PWD = "pwd")
DATETIME
types, then normallyodbc
recognizes it and will give youPOSIXt
. Unfortunately, I discouragedatetime
since it does not include TZ, preferringDATETIMEOFFSET
. Unfortunately,odbc
does not recognize that and will return it ascharacter
. You are therefore forced to process it post-fetch. – r2evansDBI::dbGetQuery(con, "select cast ( SYSDATETIMEOFFSET() as datetime ) as now")
returnsPOSIXt
, whereasDBI::dbGetQuery(con, "select cast ( SYSDATETIMEOFFSET() as datetimeoffset ) as now")
does not. – r2evans%OS
instead of%S
, as in"%Y-%m-%d %H:%M:%OS"
. There are several packages that deal with converting timestamps toPOSIXt
, some claim ease-of-use, some claim speed. I can't vouch for them, but you can certainly check them out (and usingmicrobenchmark
to confirm speed at-scale), includinglubridate
andanytime
. – r2evansdbGetQuery
means you're usingDBI
andodbc
. You should never need to dolibrary(odbc)
, usingDBI::dbConnect(odbc::odbc(), ...)
usually suffices. And for the most part, connection-specific drivers (includingodbc
) really augment'sDBI
's methods for most of the functions. ... So I'm not at all surprised that usingDBI
gave you the same results, it's likely using the same functions internally. – r2evans