6
votes

I need to make a large amount of inserts, i.e. double digit millions, into an Oracle-DB using JDBC. To do so I'm using something like the following class, inspired by Efficient way to do batch INSERTS with JDBC:

public class Inserter {
    private final int batchSize;
    private final Connection con; // with .setAutoCommit(false)
    private final PreparedStatement ps;
    private int currentSize = 0;

    public Inserter(Connection con, PreparedStatement ps, int batchSize) {
        this.con = con;
        this.ps = ps;
        this.batchSize = batchSize;
    }

    public void addInsert(Object[] vals) throws SQLException {
        ps.clearParameters(); // should be redundant, but better safe than sorry
        for (int i = 0; i < val.length; i++) {
            this.ps.setObject(i + 1, vals[i]);
        }
        ps.addBatch();
        currentSize++;

        if (currentSize >= batchSize) {
            ps.executeBatch();
            currentSize = 0;
        }
    }

    public void flush() {/** to flush leftovers */}
}

While this way of inserting works fine, it is very slow. JDBC batch insert performance describes how basically this exact issue can be dealt with for MySQL, as rewriteBatchedStatements does however not seem to exist on Oracle it is not of much help here.

To improve performance I have also tried to switch the Statement to one big INSERT ALL .../INSERT APPEND ... statement, as per Oracle 11g - most efficient way of inserting multiple rows, which made everything even slower.

My question would thus be if there is any way to optimize these inserts beyond simply using addBatch() and executeBatch()? Or is there maybe some gross, inefficient mistake in the Inserter class above? Any help would be very much appreciated.


More possibly useful information:

  • The tables to be inserted to are partitioned with around one to ten million rows per partition.

  • There is a unique constraint on the tables looking something like this unique(id1, id2, id3, id4) where all columns in it are of type NUMBER and further bound by foreign key constraints to primary keys in other tables.


EDIT:

Following the suggestions from the comments I switched the setObject(index, val) call to:

  1. setInt(index, val), setFloat(index, val), ..., setNull(index, type) calls where apprpropriate

  2. setObject(index, val, type) and setNull(index, typR)

Neither version improved performance significantly.

Furthermore I tried inserting the data into a staging table without any constraints on it, which also did not lead to better performance.

To compare, exporting the data to CSV and loading it in using SQL*Loader lead to a significant performance increase, i.e. ~4.5k => ~50k+ rows per second for the slowest table1.

This leads me to believe the bottleneck to lie with JDBC.

1Alas, the use of SQL*Loader is not a (desired) option in my particular case.

1
How large is batchSize? - Andreas
@Andreas I've tried {1000,10000,25000,100000} thus far, all were quite slow - abcalphabet
Have you tried loading your 10+ million records using the SQL*Loader, just to get a baseline idea of the database server performance of doing that many inserts into a table with unique indexes and foreign keys? - Andreas
@Andreas Yeah, have done that, it's roughly ten times faster with SQL*Loader - abcalphabet
I've used similar code in the past and successfully inserted about 100,000 rows a second into Oracle on the same machine. How many rows can you inserted in a second? My code didn't use setObject but more specific setInt and setString. And the target was a staging table – initially empty and without any indexes. Try a staging table as well to check if the bottleneck is with JDBC or with the target table. Batch size was 50 if I remember correctly. - Codo

1 Answers

-1
votes

you never commit the batch. If it is possible add commit after executeBatch if the query is not committed you will create large rollback segments that can slow down the database. Also remove ps.clearParameters() since you always overwrite all parameters or none. Better use Specialized version of setter instead of setObject