0
votes

Problem Statement - While doing the data load using the copy command and by defining the escape property the statement is not eliminating the escape character from the data. Ex- I'm trying to load the data from the CSV file. The file is having the data in the following Format in one of the column ('EC F&G BREWER\'S INT\'L BEER'). The expectation on this is that the escape character in the data (which is backslash '\') should be removed from the data field after the load.

Following is the copy statement I'm using -

COPY INTO SANDBOX.TEST_SBX.TEST_20200512
                      FROM @STAGE_NAME/TEST_FILE/
                      FILE_FORMAT = (FIELD_DELIMITER = ','
                                     RECORD_DELIMITER = '\n'
                                     SKIP_HEADER = 0
                                     FIELD_OPTIONALLY_ENCLOSED_BY = '\047'
                                     TRIM_SPACE = FALSE
                                     ESCAPE ='\134'
                                     ESCAPE_UNENCLOSED_FIELD='\134'
                                     ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
                                     DATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO' NULL_IF = ('NULL', 'null', '')         )
                      PATTERN='.*.gz.*'
                      PURGE=FALSE
                      ON_ERROR = ABORT_STATEMENT
                      FORCE = FALSE
                      RETURN_FAILED_ONLY = FALSE
                      ; 

There are many columns in this file that have the () backslash in the character field, and all are getting ignored during the load.

I cannot switch to the manual column selection mode and use the Regexp to replace the escape character, I have to use the copy command without switching to the column selection mode.

I expected that the escape character configured in the file format is treated appropriately (escaped) without having to treat it as a transformation similar to how it is treated as escape characters in the other data processing/loading engines.

Please suggest what I can do here on this.

2
ESCAPE_UNENCLOSED_FIELD='\134' means the file format is using the backslash as an escape character. From your description it seems that you don't want to use it that way. You can remove that parameter entirely in that case.Greg Pavlik
Is the expected value "EC F&G BREWER'S INT'L BEER"? If so try using ESCAPE ='\\' which works, verified. As suggested above, you will not need ESCAPE_UNENCLOSED_FIELD='\134'.Abhi Reddy
Thank you Greg Pavlik and Abhi Reddy , I tried both the options but nothing seems to be working . Also \134 - Octal representation for (\) , I have tried directly putting ESCAPE='\\' Its not giving "EC F&G BREWER'S INT'L BEER". The property seems to be not working.Anshul Agrawal
There must be something else in the file that is causing this. I just tried this value and it works for me. I'd suggest you open a support case along with providing the sample file which will allow us to debug.Abhi Reddy
The support case is already logged with Snowflake. I have the file as well I can provide the file on the support case. Also please share the command use have used.Anshul Agrawal

2 Answers

0
votes

I tried replicating exactly the same still not getting the same result. Got the understanding there is problem with the file . Or something wrong at the account level parameters which is cause the problem.

CREATE OR REPLACE TABLE SANDBOX.AAGRA018_SBX.TEXT_FILE_TST 
 (COL1 VARCHAR(50)
 );

 TRUNCATE TABLE SANDBOX.AAGRA018_SBX.TEXT_FILE_TST;
 COPY INTO SANDBOX.AAGRA018_SBX.TEXT_FILE_TST  
 FROM @~/Test_File.txt.gz
 FILE_FORMAT = (FIELD_DELIMITER = ',' RECORD_DELIMITER = '\n' 
                SKIP_HEADER = 0 FIELD_OPTIONALLY_ENCLOSED_BY = '\047' 
                TRIM_SPACE = FALSE ESCAPE ='\\' --ESCAPE_UNENCLOSED_FIELD='\134' 
                NULL_IF = ('NULL', 'null', '') 
                 ) 
                force=true;


select * from SANDBOX.AAGRA018_SBX.TEXT_FILE_TST;

Data File -

YB MARTHA\'S VINEYARD LOUNGE
EC F&G BREWER\'S INT\'L BEER

This is not giving the desired result 
[Snowflake UI Image copy][1]

  [1]: https://i.stack.imgur.com/IBgNz.png
0
votes

This is the code I have implemented

--Data File

'EC F&G ANSHUL\'s Public COMPANY'

'YB MARTHA\'S VINEYARD LOUNGE'

'EC F&G BREWER\'S INT\'L BEER'

     COPY INTO test_so FROM @file_format_stage/Test_File.csv.gz 
FILE_FORMAT = (FIELD_DELIMITER = ',' RECORD_DELIMITER = '\n' 
               SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '\047'
               TRIM_SPACE = FALSE 
               ESCAPE ='\\' 
               --ESCAPE_UNENCLOSED_FIELD=NONE 
               NULL_IF = ('NULL', 'null', '')
              SKIP_BYTE_ORDER_MARK = False) 
               force=true
 ;

And the result Result