2
votes

I am currently pulling in a data set with dbplyr using the following:

connectInfo <- dbConnect(
  odbc(),
  Driver = "SQL Server",
  Server = "myServerName",
  Database = "myDatabaseName",
  Trusted_Connection = "True"
)

tbl(connectInfo, "tableName") %>%
  summarise(
    nbDate = LEFT(nbDate, 5),
    book,
    rateFeeChg
  ) %>%
  mutate(
    rateFeeChg = rateFeeChg * 100
  )

With the following output:

   nbDate    book rateFeeChg
    <chr>   <chr>      <dbl>
 1  38348 Classic  0.0000000
 2  38744 Classic  2.1270990
 3  39640 Classic  2.8999999
 4  40423 Classic  0.0000000
# ... with more rows

What I want to do is convert these 5 digit date values into regular looking date values inside the mutate function. I know by using the janitor library I can easily convert it, but when when I try putting

mutate(
    rateFeeChg = rateFeeChg * 100,
    nbDate = janitor::excel_numeric_to_date(nbDate)
)

I receive the following error:

Error in janitor::excel_numeric_to_date(nbDate) : 
  object 'nbDate' not found
1
You have to understand that each function that you use inside a mutate call in a dbplyr chain either has an adhoc translation to SQL, or is tried as an SQL function, so you can't use the code from another package there, unless you use collect first of course.Moody_Mudskipper
I think as.Date is translated, so you may be able to use it, but I believe excel dates have a different zero than R dates so you may have to add or substract a constant. before transforming.Moody_Mudskipper
@Moody_Mudskipper thank you for the fast response. When I attempt to use as.Date I was receiving the error: Conversion failed when converting date and/or time from character string. So if I have to stick with collect then will I just transform the output into a dataframe (for example) and then use my function on the column as normal?jadki
try converting to numeric first, I hadn't seen your column was character. as.Date(as.numeric(nbDate)+whatever_constant_is_needed)Moody_Mudskipper
Take also a look at the accepted answer here, it might be used to format dates: stackoverflow.com/questions/47438567/…Moody_Mudskipper

1 Answers

2
votes

the main thing is that in MS SQL this conversion will first need to be into datetime, and then into date. Since in dbplyr there are currently no function that coerces into datetime, the best thing is to use sql() to pass the MS SQL command. This way, you can keep all of your process server side, and avoid collect()ing data into R memory. The default datetime conversion in SQL seems to have the same base date as Excel, so you should get the same dates back. Here's a proposed solution I just tested on my system:

tbl(connectInfo, "tableName") %>%
   summarise(
   nbDate = sql("CAST(LEFT(nbDate, 5) as datetime)"),
   book,
   rateFeeChg
 ) %>%
 mutate(
  nbDate = as.Date(nbDate),
  rateFeeChg = rateFeeChg * 100
 )