0
votes

I am trying to unload a CSV file from Snowflake where only non-null values are quoted and enclosed in double quotes.

Here is my table

|BizName        |BizOwner      |
|----------+-------------------|
| "NiceOne" LLC | Robert       |
| GoodRX        |              |
| MaxLift       | Brian        |
| P,N and B     | Jane         |
+----------+--------------------

Here is my desired output -

"""NiceOne"" LLC","Robert","GoodRX",,"Maxift","Brian","P,N and B","Jane"

Here are 2 examples of formats I have tried

I have tried using the below format, which wraps null values in quotes

create or replace file format formatA
    COMPRESSION = GZIP
    RECORD_DELIMITER = ','
    SKIP_HEADER = 1
    FIELD_DELIMITER =  NONE
    FILE_EXTENSION = 'csv'
    ESCAPE =  NONE
    ESCAPE_UNENCLOSED_FIELD = NONE 
    FIELD_OPTIONALLY_ENCLOSED_BY = '"'
    NULL_IF = ( '','NULL' , 'null' , '""' )
    EMPTY_FIELD_AS_NULL = TRUE
    copy into @mytablestage from mytable FILE_FORMAT = ( FORMAT_NAME = 'formatA' ) ON_ERROR = CONTINUE

I have tried the below format, which wraps nothing in quotes

create or replace file format formatB
    COMPRESSION = GZIP
    RECORD_DELIMITER = ','
    SKIP_HEADER = 1
    FIELD_DELIMITER =  NONE
    FILE_EXTENSION = 'csv'
    ESCAPE =  NONE
    ESCAPE_UNENCLOSED_FIELD = NONE 
    NULL_IF = ( '','NULL' , 'null' , '""' )
    EMPTY_FIELD_AS_NULL = TRUE
copy into @mytablestage from mytable FILE_FORMAT = ( FORMAT_NAME = 'formatB' ) ON_ERROR = CONTINUE
2

2 Answers

0
votes

Looking at the desired output, the field delimiter, and record delimiter can not be the same to differentiate between rows and fields. Choose a different record delimiter and FIELD_OPTIONALLY_ENCLOSED_BY = '"' with NULL_IF = ( '','NULL' , 'null' , '""' )should give you the desired output.

0
votes

You can use the second option specified in the Snowflake documentation:

https://docs.snowflake.com/en/user-guide/data-unload-considerations.html

You can specify FIELD_OPTIONALLY_ENCLOSED_BY=NONE and EMPTY_FIELD_AS_NULL = FALSE in which case you'd need to provide a value to be used for NULLs (NULL_IF=('NULL')

"Leave string fields unenclosed by setting the FIELD_OPTIONALLY_ENCLOSED_BY option to NONE (default), and set the EMPTY_FIELD_AS_NULL value to FALSE to unload empty strings as empty fields.

If you choose this option, make sure to specify a replacement string for NULL data using the NULL_IF option, to distinguish NULL values from empty strings in the output file. If you later choose to load data from the output files, you will specify the same NULL_IF value to identify the NULL values in the data files."

I did something similar and my query looked like the following:

COPY INTO @~/unload/table FROM (
SELECT * FROM table
)
FILE_FORMAT = (TYPE = 'CSV' COMPRESSION = 'GZIP' 
FIELD_DELIMITER = '\u0001' 
EMPTY_FIELD_AS_NULL = FALSE 
FIELD_OPTIONALLY_ENCLOSED_BY = NONE    
NULL_IF=('NULL')) 
OVERWRITE = TRUE;

This saved me the annoying part of having to process the file additionally to get rid of the enclosing quotes.