I am trying to move parquet data from an AWS S3 stage into a table in Snowflake and keep getting data type errors. Specifically, this error keep popping up no matter how I tweak my columns
Numeric value '' is not recognized
The challenge is that I'm moving in 90-something columns and when I get the error Snowflake doesn't tell me which column or row is broken so it's very difficult to find and fix. I also can’t use the validate function to troubleshoot the problem because VALIDATE and VALIDATE_PIPE_LOAD do not support COPY with transform. When I isolate each individual column in a select statement, everything outputs fine - so i can’t even locate the problem cell itself!
Is there a way to have Snowflake ignore the errors and just move it in or to recast the columns all as something else without identifying the needle in a haystack cell? Some of the things that I’ve tried without resolution include:
- Removing the cast :: in my copy statement completely and trying to move over everything as a VARIANT - this gives me the Failed to cast variant value "" to FIXED error
- Changing everything to VARCHAR
- Changing the file format to csv
- Wrapping numbers and decimals in TRY_TO_NUMBER/DECIMAL()
- I’ve also tried all the suggestions listed here to no avail - "Numeric value '' is not recognized" - what column?
My code is below:
# create the file format for parquet files
CREATE FILE FORMAT MYPARQUETFORMAT
TYPE = PARQUET
COMPRESSION = snappy;
# create my table
create or replace table mytable (
ADM VARCHAR,
ADMDATER DATE default null,
admit_dateR DATE default null,
ADMTYPE NUMBER(38,0) default null,
BEGDATER VARCHAR,
BILL DECIMAL default null,
CLMCNT1 VARCHAR,
CLMCNT2 VARCHAR,
CLMCNT3 VARCHAR,
DAYS NUMBER(38,0) default null,
DISPSTAT VARCHAR,
DRG VARCHAR,
DX1 VARCHAR,
DX2 VARCHAR,
DX3 VARCHAR,
DX4 VARCHAR,
DX5 VARCHAR,
DX6 VARCHAR,
DX7 VARCHAR,
DX8 VARCHAR,
DX9 VARCHAR,
DX10 VARCHAR,
DX11 VARCHAR,
DX12 VARCHAR,
DX13 VARCHAR,
DX14 VARCHAR,
DX15 VARCHAR,
DX16 VARCHAR,
DX17 VARCHAR,
DX18 VARCHAR,
DX19 VARCHAR,
DX20 VARCHAR,
DX21 VARCHAR,
DX22 VARCHAR,
DX23 VARCHAR,
DX24 VARCHAR,
DX25 VARCHAR,
INSTTYPE VARCHAR,
Intake_BENCAT VARCHAR,
MDC NUMBER(38,0) default null,
PAID DECIMAL,
POA2 VARCHAR,
POA3 VARCHAR,
POA4 VARCHAR,
POA5 VARCHAR,
POA6 VARCHAR,
POA7 VARCHAR,
POA8 VARCHAR,
POA9 VARCHAR,
POA10 VARCHAR,
POA11 VARCHAR,
POA12 VARCHAR,
POA13 VARCHAR,
POA14 VARCHAR,
POA15 VARCHAR,
POA16 VARCHAR,
POA17 VARCHAR,
POA18 VARCHAR,
POA19 VARCHAR,
POA20 VARCHAR,
POA21 VARCHAR,
POA22 VARCHAR,
POA23 VARCHAR,
POA24 VARCHAR,
POA25 VARCHAR,
RandomID INT,
RWP NUMBER(38,0) default null,
TOTDAYS NUMBER(38,0) default null,
PROC2 VARCHAR,
PROC3 VARCHAR,
PROC4 VARCHAR,
PROC5 VARCHAR,
PROC6 VARCHAR,
PROC7 VARCHAR,
PROC8 VARCHAR,
PROC9 VARCHAR,
PROC10 VARCHAR,
PROC11 VARCHAR,
PROC12 VARCHAR,
PROC13 VARCHAR,
PROC14 VARCHAR,
PROC15 VARCHAR,
PROC16 VARCHAR,
PROC17 VARCHAR,
PROC18 VARCHAR,
PROC19 VARCHAR,
PROC20 VARCHAR,
PROC21 VARCHAR,
PROC22 VARCHAR,
PROC23 VARCHAR,
PROC24 VARCHAR,
PROC25 VARCHAR
);
# Copy data using copy and select statements
COPY INTO TEDI
FROM(SELECT
$1:ADM::VARCHAR,
$1:ADMDATER::DATE,
$1:admit_dateR::DATE,
$1:ADMTYPE::NUMBER(38,0),
$1:BEGDATER::VARCHAR,
$1:BILL::DECIMAL,
$1:CLMCNT1::VARCHAR,
$1:CLMCNT2::VARCHAR,
$1:CLMCNT3::VARCHAR,
$1:DAYS::NUMBER(38,0),
$1:DISPSTAT::VARCHAR,
$1:DRG::VARCHAR,
$1:DX1::VARCHAR,
$1:DX2::VARCHAR,
$1:DX3::VARCHAR,
$1:DX4::VARCHAR,
$1:DX5::VARCHAR,
$1:DX6::VARCHAR,
$1:DX7::VARCHAR,
$1:DX8::VARCHAR,
$1:DX9::VARCHAR,
$1:DX10::VARCHAR,
$1:DX11::VARCHAR,
$1:DX12::VARCHAR,
$1:DX13::VARCHAR,
$1:DX14::VARCHAR,
$1:DX15::VARCHAR,
$1:DX16::VARCHAR,
$1:DX17::VARCHAR,
$1:DX18::VARCHAR,
$1:DX19::VARCHAR,
$1:DX20::VARCHAR,
$1:DX21::VARCHAR,
$1:DX22::VARCHAR,
$1:DX23::VARCHAR,
$1:DX24::VARCHAR,
$1:DX25::VARCHAR,
$1:INSTTYPE::VARCHAR,
$1:Intake_BENCAT::VARCHAR,
$1:MDC::NUMBER(38,0),
$1:PAID::DECIMAL,
$1:POA10::VARCHAR,
$1:POA11::VARCHAR,
$1:POA12::VARCHAR,
$1:POA13::VARCHAR,
$1:POA14::VARCHAR,
$1:POA15::VARCHAR,
$1:POA16::VARCHAR,
$1:POA17::VARCHAR,
$1:POA18::VARCHAR,
$1:POA19::VARCHAR,
$1:POA2::VARCHAR,
$1:POA20::VARCHAR,
$1:POA21::VARCHAR,
$1:POA22::VARCHAR,
$1:POA23::VARCHAR,
$1:POA24::VARCHAR,
$1:POA25::VARCHAR,
$1:POA3::DATE,
$1:POA4::DATE,
$1:POA5::DATE,
$1:POA6::DATE,
$1:POA7::DATE,
$1:POA8::DATE,
$1:POA9::DATE,
$1:RandomID::INT,
$1:RWP::NUMBER(38,0),
$1:TOTDAYS::NUMBER(38,0),
$1:PROC2::VARCHAR,
$1:PROC3::VARCHAR,
$1:PROC4::VARCHAR,
$1:PROC5::VARCHAR,
$1:PROC6::VARCHAR,
$1:PROC7::VARCHAR,
$1:PROC8::VARCHAR,
$1:PROC9::VARCHAR,
$1:PROC10::VARCHAR,
$1:PROC11::VARCHAR,
$1:PROC12::VARCHAR,
$1:PROC13::VARCHAR,
$1:PROC14::VARCHAR,
$1:PROC15::VARCHAR,
$1:PROC16::VARCHAR,
$1:PROC17::VARCHAR,
$1:PROC18::VARCHAR,
$1:PROC19::VARCHAR,
$1:PROC20::VARCHAR,
$1:PROC21::VARCHAR,
$1:PROC22::VARCHAR,
$1:PROC23::VARCHAR,
$1:PROC24::VARCHAR,
$1:PROC25::VARCHAR
FROM @s3_stage/tedi.parquet
(file_format => MYPARQUETFORMAT));