I have encountered an error when trying to insert thousands of rows with R/RJDBC and the dbSendUpdate
command on an Oracle database.
Problem can be reproduced by creating a test table with
CREATE TABLE mytest (ID NUMBER(10) not null);
and then executing the following R script
library(RJDBC)
drv<-JDBC("oracle.jdbc.OracleDriver","ojdbc-11.1.0.7.0.jar") # place your JDBC driver here
conn <- dbConnect(drv, "jdbc:oracle:thin:@MYSERVICE", "myuser", "mypasswd") # place your connection details here
for (i in 1:10000) {
dbSendUpdate(conn,"INSERT INTO mytest VALUES (?)",i))
}
Searching the Internet provided the information, that one should close result cursors, which is obvious (e.g. see java.sql.SQLException: - ORA-01000: maximum open cursors exceeded or Unable to resolve error - java.sql.SQLException: ORA-01000: maximum open cursors exceeded).
But the help file for ??dbSendUpdate
claims for not using result cursors at all:
.. that dbSendUpdate is used with DBML queries and thus doesn't return any result set.
Therefore this behavior doesn't make much sense to me :-(
Can anybody help?
Thanks, a lot!
PS: Found something interessting in the RJDBC documentation http://www.rforge.net/RJDBC/
Note that the life time of a connection, result set, driver etc. is determined by the lifetime of the corresponding R object. Once the R handle goes out of scope (or if removed explicitly by rm) and is garbage-collected in R, the corresponding connection or result set is closed and released. This is important for databases that have limited resources (like Oracle) - you may need to add gc() by hand to force garbage collection if there could be many open objects. The only exception are drivers which stay registered in the JDBC even after the corresponding R object is released as there is currently no way to unload a JDBC driver (in RJDBC).
But again, even inserting gc()
within the loop will produce the same beahvoir.
Roracle
package. In their blog, here Oracle did some benchmarks. – agstudy