1
votes

I have a bunch of sql queries that worked fine but now, for some reason, do not work any more. The data has not changed. The code has not changed.

I keep getting this error message:

Error in rsqlite_send_query(conn@ptr, statement) : duplicate column name: Ret

These errors tend to happen with left joins. Find below an example:

 g.cper<-sqldf("select a.*, b.NAV_EUR, b.AUM_EUR
           from g2_c as a
            left join 
            nav_master as b
            on a.fund_id=b.fund_id and a.period = b.period")

None of the tables in question contain a variable named "Ret"

I recently updated all my packages.

This is legacy code. I tend to use dplyr::left_join when possible. But left_join will never do what a left join in SQL could achieve (inequalities as constraints, etc).

These are the packages I load:

packages <- c("ISLR","gam","biglm","dplyr","gtools","tidyr", "randomForest","splines", "tree", "pROC","lfe","lubridate", "stargazer", "scales", "ggplot2", "scales", "data.table", "zoo","PerformanceAnalytics", "stats","proto", "timeSeries","timeDate","gsubfn","fBasics","DBI","RSQLite","sqldf", "RODBC", "tcltk","reshape","xts", "data.table","parallel", "lfe", "readr", "purrr", "tibble", "hms", "stringr", "lubridate", "forcats")

This is my sessioninfo ():

sessionInfo() R version 3.3.3 (2017-03-06) Platform: x86_64-redhat-linux-gnu (64-bit) Running under: Red Hat Enterprise Linux Server 7.3 (Maipo)

locale: 1 LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8 [4] LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: 1 parallel tcltk splines stats
graphics grDevices utils datasets methods base

other attached packages: 1 forcats_0.2.0
stringr_1.2.0 hms_0.3 [4] tibble_1.2 purrr_0.2.2
readr_1.0.0 [7] reshape_0.8.6
RODBC_1.3-14 sqldf_0.4-10 [10] RSQLite_1.1-2 fBasics_3011.87
gsubfn_0.6-6 [13] timeSeries_3022.101.2
timeDate_3012.100 proto_1.0.0 [16] PerformanceAnalytics_1.4.3541 xts_0.9-7 zoo_1.7-14 [19] data.table_1.10.4 ggplot2_2.2.1
scales_0.4.1 [22] stargazer_5.2
lubridate_1.6.0 lfe_2.5-1998 [25] Matrix_1.2-8 pROC_1.9.1
tree_1.0-37 [28] randomForest_4.6-12
tidyr_0.6.1 gtools_3.5.0 [31] dplyr_0.5.0 biglm_0.9-1 DBI_0.5-1 [34] gam_1.14 foreach_1.4.3
ISLR_1.0

loaded via a namespace (and not attached): 1 reshape2_1.4.2
lattice_0.20-34 colorspace_1.3-2 chron_2.3-50 plyr_1.8.4
munsell_0.4.3 [7] gtable_0.2.0 codetools_0.2-15 memoise_1.0.0 labeling_0.3 Rcpp_0.12.9 xtable_1.8-2 [13] digest_0.6.12 stringi_1.1.2 grid_3.3.3 tools_3.3.3 sandwich_2.3-4
magrittr_1.5 [19] lazyeval_0.2.0 Formula_1.2-1 assertthat_0.1 iterators_1.0.8 R6_2.2.0

Not sure if this is related to this question Keep in mind I am using RSQLite_1.1-2 (earlier than 2.0)

I honestly do not know what's going on and have not found anything online...

UPDATE I: I have upgraded to sqldf_0.4-11 and RSQLite_2.0.... Still get this problem. I also tried loading sqldf (and dependencies).... Code still does not work

UPDATE II: First, I would like to thank G. Grothendieck for his help on this issue and for his R contributions all these years.

On this particular issue, I tried to run a test query using mtcars. This is the code:

b<- sqldf("select a.*, b.mpg as test from mtcars as a left join mtcars as b on a.mpg=b.mpg")

This query worked!!!. I then run the code that was not working even after updating to sqldf 0.4.11 and RSQLite 2.0 (see Update I). To my surprise it now works!!!....I do not know what happened, but all my sqldf queries are now working. FYI...I am working in AWS............ I sometimes get these unexplainable weird stuff happening....

UPDATE III The problem is back. So I run the test code in update II again. This works. And after having run that test code, all my sqldf joins work again....GO FIGURE

1
Could you provide a reproducible example?Scarabee
I was getting the same error message but it went away after upgrading RSQLite to 2.1.1jsta

1 Answers

0
votes

In this reproducible example (using sqldf 0.4.11 and RSQLiute 2.0) it works as expected. Please review https://stackoverflow.com/help/how-to-ask and https://stackoverflow.com/help/mcve .

library(sqldf)

g2_c <- nav_master <- data.frame(fund_id = 1:2, period = 1:2, NAV_EUR = 0, AUM_EUR = 0)
sqldf("select a.*, b.NAV_EUR, b.AUM_EUR
       from g2_c as a
       left join nav_master as b
         on a.fund_id=b.fund_id and a.period = b.period")

giving:

  fund_id period NAV_EUR AUM_EUR NAV_EUR AUM_EUR
1       1      1       0       0       0       0
2       2      2       0       0       0       0