0
votes

I have a fixed length data file with 1.3 million records, using sql loader I am loading the data into a stage table. If the table does not have any identity column then direct path load is taking 9 secs and conventional path load is taking 17 seconds to load 1.3 million records. But If I include the identity column(named as ROW_NUM) in the table both the loads are taking more than 11 mins to complete the load. Please advise how to improve the performance in this case. My control file looks like below. Thanks in Advance for any help. ROW_NUM identity is present in table, we are not passing the value from sql loader.

OPTIONS (ROWS=200000,DIRECT=FALSE,ERRORS=0)
    LOAD DATA
    CHARACTERSET 'AL32UTF8'
    DISCARDMAX 100
    INFILE 'path/datafile.dat'
    BADFILE 'path/datafile.bad_rec.txt'
    DISCARDFILE 'path/datafile_dsc_rec.txt'
        REPLACE PRESERVE BLANKS INTO TABLE ABCD.STAGE_TABLE
        TRAILING NULLCOLS
(
A1  POSITION(1:11),
A2  POSITION(12:13),
A3  POSITION(14:19),
A4  POSITION(20:22),
A5  POSITION(23:23),
A6  POSITION(24:38),
A7  POSITION(39:53),
A8  POSITION(54:68),
A9  POSITION(69:83),
A10 POSITION(84:98)
)

command:
sqlldr "connection string" control=control_file.ctl

Please Note:

  1. ROW_NUM identity I am using to do MERGE from STAGE to BASE table and do a commit after every2000 rows. Built that logic in a SP.
  2. I can not use external table as my data file is in different server than the DB server.
1

1 Answers

0
votes

This is how I see it (which doesn't have to be correct).

"Identity column" means that you're on Oracle 12c or higher. Is that correct? Because, in previous Oracle database versions, we had to use combination of a sequence and a database trigger which would fire for each row, fetch next value from the sequence and insert it into the ID column.

I don't know how Oracle implemented the identity column, but I suppose that it looks more or less similar to what I previously described, but everything is done "behind the scene" which makes it easier for developers.

If that's so, then that "behind-the-scene-trigger" still has to fire 1.3 million times, and that's time consuming. As you've noticed, without the identity column, SQL*Loader works very fast.

What can you do? One option might be to

  • recreate the stage table so that it doesn't use identity column; let ID be an ordinary NUMBER datatype column
  • load data fast
  • update table using update stage set id = myseq.nextval;

Another option: load the sequence value during SQL*Loader session; control file would look like this (the ID column is also not an identity column!):

load data
infile ... 
into table stage
  (id      SEQUENCE(MAX, 1),            --> this
   a1,
   ...
  )

See if anything of this helps. If not, I'm afraid I don't have any other ideas at the moment.