So, if you have this table in Snowflake:
create table t (x string, y string) as select '', null;
and you copy it to an external stage with file_format csv, you get this error if you don't set field_optionally_enclosed_by to something other than none:
Cannot unload empty string without file format option field_optionally_enclosed_by being specified.
so, let's say it's set to '"'.
create stage some_stg
url='s3://<some-bucket>/<some-dir>'
file_format = (type = csv field_optionally_enclosed_by='"' compression = none)
credentials = (aws_role = '<your-arn-for-snowflake>')
I'm sure this issue reproduces with an internal stage if you don't want to mess with getting snowflake to use your s3 bucket.
When you run a copy for table t above:
copy into @some_stg/t.csv from t overwrite = true;
you get a file (t_0_0_0.csv) that looks like this:
"","\N"
And after creating the equivalent table in postgres:
create table t (x varchar, y varchar);
When you load that into postgres with psql copy like this:
psql -h <host> -U <user> -c "copy t from stdin with csv null '\N'" < t_0_0_0.csv
The contents of t on postgres is:
x, y
"","\N"
Now this make sense because snowflake put the \N in double quotes, so the psql copy preserved it. If you edit t_0_0_0.csv and remove the double quotes around the \N:
"",\N
And run psql copy again then the \N is correctly converted to null
There does not appear to be a way to generate a csv file from snowflake that supports empty string and null that can be preserved loading into postgres. I messed with the snowflake configs EMPTY_FIELD_AS_NULL and NULL_IF which in snowflake's documentation it even speaks to this issue:
When unloading empty string data from tables, choose one of the following options:
Preferred: Enclose strings in quotes by setting the FIELD_OPTIONALLY_ENCLOSED_BY option, to distinguish empty strings from NULLs in output CSV files.
It does "distinguish" them but not in a way that psql copy can use without manipulating the file with sed beforehand.
Does anyone know how to generate a snowflake csv that preserves empty strings and nulls in a way that psql copy can reproduce?