sqldf is primarily intended to process data frames so it creates databases and database tables transparently and removes them after completing the sql. Thus your first statement would not be expected to work since sqldf would remove the database after the statement completes.
If the SQL creates the database or table rather than sqldf itself then sqldf won't know about it so it won't delete it. Here we create the database using attach
and the table using create table
to fool sqldf. In the last line it won't remove the database oir table because they were already there before that line started and it never removes objects it did not create:
library(sqldf)
read.csv.sql("jobs.csv", sql = c("attach 'test1.sqlite' as new",
"create table new.jobs2 as select * from file"))
sqldf("select * from jobs2", dbname = "test1.sqlite")
The other thing that might go wrong would be line endings. Typically sqldf can figure it out but if not you may have to specify the eol
character. The need to specify it might occur, for example, if you were trying to read a file created on one operating system in another operating system. See FAQ 11. Why am I having difficulty reading a data file using SQLite in the sqldf README.
Note: read.csv.sql
is normally used to just read in a portion of the data. For example, this skips the first 100 rows and then reads columns a
and b
from the next 1000 rows but the query can be arbitrarily complex since you have all of SQLite's SQL to use:
read.csv.sql("jobs.csv", sql = "select a, b from file limit 1000 offset 100")
The entire file is read into a temporary sqlite database but only the requested portion is ever read into R so the entire file could be larger than R can handle.
Typically if one is trying to achieve persistence one uses RSQLite directly rather than sqldf.