1
votes

I'm working through bulk loading information into Snowflake using Python. I use pandas to extract and load data from a source system into a csv. The issue occurs when copying data into Snowflake.

I receive the below error:

snowflake.connector.errors.ProgrammingError: 100065 (22000): Found character '4' instead of field delimiter ','

The field is hanging on the following text enclosed within a field.

"[**EXTERNAL**]: BID: ""4597290 - PERSONNE"

As you can see there are double quotes lined up just before the 4597290 that are causing an escape. Being unversed in how to escape out of this particular issue but what i've successfully used for other files is below:

'file_format = (type = "csv" SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"' ESCAPE = ' ')

Is there a file format that will allow me to accommodate the above example or is the best practice remove quotes on this field alone?

1
If TAB character is not occurring in your data, you could use TAB delimited files (FIELD_DELIMITER = '\t'). Please check this other question which I think provides solutions to your problem too: stackoverflow.com/questions/65986126/… - Francesco Quaratino

1 Answers

1
votes

There's a few options I can think of plus you can look through the forums for other ideas (i.e., how to process double quotes within double quotes fields).

  1. You can pre-process the data. This may include removing the double quotes, potentially escaping it with another pair of double quotes, or using a different set of field delimiters or field enclosure.

  2. You can load the data in with all the double quotes included by removing the FIELD_OPTIONALLY_ENCLOSED_BY field and post-processing later as needed. Example below using your existing file format and removing that field.

COPY INTO FORUMTEST from @TEST/forumtest.csv.gz file_format = (type = CSV ESCAPE = ' ');

  1. You can do a transformative SQL statement during the COPY and remove the FIELD_OPTIONALLY_ENCLOSED_BY field. SQL statement below supposing we had 3 columns with the double-quoted data, just as an example.

COPY into FORUMTEST from (select replace($1,'"',''), replace($2,'"',''), replace($3,'"','') from @TEST/forumtest.csv.gz file_format) file_format=(type = CSV ESCAPE = ' '));