0
votes

I'm loading in the following data in the control file and for some reason it is skipping the first character.

Initially the problem was skipping the first character, loading the second character and ignoring the rest! But I fixed this by using TERMINATED BY '\n'.

I've added the UTF8 characterset (because the file is in UTF-8 as stated by Notepad++), but the results are the same with or without it.

I can't find any other information regarding this - the nearest answer was that the CONTINUEIF was causing this issue, but I'm not using that. Perhaps SQL*Loader is reading first character of each record for something else?

Here is the table and sequence:

create table test_table
( id number, raw_data varchar2(100) )
/

create sequence test_seq
/

Here is the control file and data. The sequence bit is running fine and populating the TEST_TABLE.ID column as expected. The TEST_DATA.RAW_DATA column skips the first character of each record (i.e. A, C, E, etc).

LOAD DATA
CHARACTERSET UTF8
INFILE * 
TRUNCATE INTO TABLE test_table
(
 id "TEST_SEQ.NEXTVAL",
 raw_data TERMINATED BY '\n'
)
BEGINDATA
A123 B456
C789 D012
E345 F678
J901 K234
L567 M890

I'm expecting this to load each line into the TEST_TABLE.RAW_DATA column. Instead it is skipping the first character of each record and happily loading the rest; i.e. the 5 records in TEST_TABLE.RAW_DATA are:

123 B456
789 D012
345 F678
901 K234
567 M890

How can I stop it skipping the first character?

1

1 Answers

1
votes

Your field definition currently says that id is a field in your file, and since you haven't specified a data type, it defaults to a single character. Reading that consumes that first character, which is then discarded as the SQL expression overrides it. The raw_data field then starts at the second character.

You can declare that the ID is purely an expression, rather than a field in the data file, by using the EXPRESSION keyword:

...
(
 id EXPRESSION"TEST_SEQ.NEXTVAL",
 raw_data TERMINATED BY '\n'
)
...

It then won't look at (or consume) any data from the file when working out that ID value, and raw_data will start from the first character.

You need to terminate the raw_data column because you haven't specified a data type for that either, so that would also default to a single character. You could declare that with a size to match your table definition if you prefer:

...
(
 id expression "TEST_SEQ.NEXTVAL",
 raw_data CHAR(100)
)
...

You could also set a kind-of-dummy fields terminated by clause instead. So many options...