0
votes

I have a JSON file that I am trying to load via SnowSQL into Snowflake. I have an SQL script that's partially working. Here is a sample of the JSON file:

    {"year":"2011","_113_cause_name":"Nephritis, nephrotic syndrome and nephrosis (N00-     N07,N17-N19,N25-N27)","cause_name":"Kidney disease","state":"South Dakota","deaths":"49","aadr":"4.5"}

I have a series of SQL statements in a file that I then run fin SnowSQL CLI - see below:

enter code here

    /* create a table with variant column */
    CREATE TABLE IF NOT EXISTS CDC_SCHEMA.CDC_VAR
    (
       TEST_DATA VARIANT);
    /*create file format */
    create or replace file format CDC_DB.CDC_SCHEMA.sf_tut_csv_format
    field_delimiter = none
    record_delimiter = '\\n';

   create or replace temporary stage CDC_DB.CDC_SCHEMA.sf_tut_stage
   file_format = CDC_DB.CDC_SCHEMA.sf_tut_csv_format; 

   put file:///Users/rfrenkel/Downloads/LeadingCausesofDeath_noOA.json
   @CDC_DB.CDC_SCHEMA.sf_tut_stage; 

   copy into CDC_VAR
   from @CDC_DB.CDC_SCHEMA.sf_tut_stage/LeadingCausesofDeath_noOA.json.jz
   file_format = CDC_DB.CDC_SCHEMA.sf_tut_csv_format;

   SELECT * from CDC_VAR;

When I run the SQL statements above - I get below output in the CLI:

| status | |----------------------------------------------| | CDC_VAR already exists, statement succeeded. | +----------------------------------------------+ 1 Row(s) produced. Time Elapsed: 0.192s +-----------------------------------------------------+
| status | |-----------------------------------------------------| | File format SF_TUT_CSV_FORMAT successfully created. | +-----------------------------------------------------+ 1 Row(s) produced. Time Elapsed: 0.190s +-----------------------------------------------+
| status | |-----------------------------------------------| | Stage area SF_TUT_STAGE successfully created. | +-----------------------------------------------+ 1 Row(s) produced. Time Elapsed: 0.202s LeadingCausesofDeath_noOA.json_c.gz(0.09MB): [##########] 100.00% Done (1.973s, 0.04MB/s). +--------------------------------+-----------------------------------+-------------+-------------+--------------------+--------------------+----------+---------+ | source | target | source_size | target_size | source_compression | target_compression | status | message | |--------------------------------+-----------------------------------+-------------+-------------+--------------------+--------------------+----------+---------| | LeadingCausesofDeath_noOA.json | LeadingCausesofDeath_noOA.json.gz | 1671572 | 91648 | NONE | GZIP | UPLOADED | | +--------------------------------+-----------------------------------+-------------+-------------+--------------------+--------------------+----------+---------+ 1 Row(s) produced. Time Elapsed: 4.028s +---------------------------------------+
| status | |---------------------------------------| | Copy executed with 0 files processed. | +---------------------------------------+ 1 Row(s) produced. Time Elapsed: 1.221s +-----------+
| TEST_DATA | |-----------|

It looks like COPY only produces one row in the CDC_VAR and I do not see any data when I look at it via the Graphical interface. I do not know what's wrong, and I am suspecting that the file format is incorrect. Any help is appreciated.

1

1 Answers

0
votes

Just in case you see this before you see the other post...

The file format on the stage table is delimited. You want to create a JSON file format and specify that instead.

Here is the code for the file format and the stage table.

create or replace file format CDC_DB.CDC_SCHEMA.JSON_FILE_FORMAT
TYPE = 'JSON'
COMPRESSION = 'AUTO' 
ENABLE_OCTAL = FALSE 
ALLOW_DUPLICATE = FALSE 
STRIP_OUTER_ARRAY = TRUE 
STRIP_NULL_VALUES = FALSE 
IGNORE_UTF8_ERRORS = FALSE;

create or replace stage CDC_DB.CDC_SCHEMA.sf_tut_stage
  file_format = CDC_DB.CDC_SCHEMA.JSON_FILE_FORMAT;