3 votes

I've got a table defined in Snowflake as:

GLPCT

BATCH_KEY NUMBER(38,0) NULL
CTACCT VARCHAR(100) NULL
CTPAGE NUMBER(38,0) NULL

and a file that looks like this:

GLPCT.csv

CTACCT VARCHAR(100)
CTPAGE NUMBER(38,0)

example:

CTACCT,CTPAGE
"Test Account",100
"Second Account", 200

My copy into command looks like this:

copy into GLPCT_POC from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

Problem

Snowflake is throwing an error due to a column number mismatch. How can I get Snowflake to ignore the column that isn't present in the file and not throw an error? I can move BATCH_KEY to the end of the table if that will help.


3 Answers

5 votes

It turns out it's possible to specify the target columns in a COPY INTO statement, so ours becomes:

copy into GLPCT_POC (CTACCT, CTPAGE) from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT' credentials=(azure_sas_token='<SAS_TOKEN>') file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

We could not use a transformation, as suggested in another answer, because COPY with a transformation requires loading from a stage, and we were loading straight from an external location.
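
To sanity-check the result, a query like this should show BATCH_KEY as NULL for the loaded rows:

-- After the COPY, the column omitted from the file should come back as NULL
select BATCH_KEY, CTACCT, CTPAGE from GLPCT_POC limit 10;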

1 vote

Snowflake lets you set the ERROR_ON_COLUMN_COUNT_MISMATCH option in the file format.

ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE Boolean that specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input data file does not match the number of columns in the corresponding table.

If set to FALSE, an error is not generated and the load continues. If the file is successfully loaded:

If the input file contains records with more fields than columns in the table, the matching fields are loaded in order of occurrence in the file and the remaining fields are not loaded.

If the input file contains records with fewer fields than columns in the table, the non-matching columns in the table are loaded with NULL values.

https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#type-csv
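
Applied to the original command, that might look like the sketch below. Note that fields map to table columns positionally, so this only works if BATCH_KEY is moved to the end of the table, as the question offers; otherwise the first CSV field ("Test Account") would land in the NUMBER column and fail conversion.

-- Sketch: relax the column-count check; assumes BATCH_KEY is the table's last column
copy into GLPCT_POC from 'azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"',
             ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE);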

0 votes

You can add a "transformation" as you pull data in with a COPY INTO query. In this case, the transformation can be to supply NULL for the missing column.

However, in order to use this feature, you need to create a stage for your external source:

create or replace stage my_stage 
url='azure://ouraccount.blob.core.windows.net/landing/GLPCT'
credentials=(azure_sas_token='<SAS_TOKEN>')
file_format=(TYPE=CSV, SKIP_HEADER = 1, FIELD_OPTIONALLY_ENCLOSED_BY='"');

copy into GLPCT_POC 
from (SELECT NULL, $1, $2 FROM @my_stage);

The $1 and $2 refer to the columns in the file in order, and the order of the columns in the SELECT clause lines up with the columns of the table.
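
Before running the COPY, you can preview how the fields parse directly from the stage (an optional check; the column aliases here are just for readability):

-- Optional: confirm the stage contents and how $1/$2 parse
list @my_stage;
select $1 as ctacct, $2 as ctpage from @my_stage limit 5;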

An extra benefit of this is that if you reuse that COPY statement and/or stage, you don't need to repeat the credential and file format information.

See Data load with transformation syntax