I'm creating a CSV file by exporting a table from SQL Server 2016 SP2 with the bulk copy utility (bcp.exe), setting the code page to 65001 (which Microsoft's documentation states is UTF-8). However, when I stage the file in Snowflake and run the COPY command to load it into a table, I get this error: "Invalid UTF8 detected in string '0xFF0xFE00x0000x0010x0010x0060x0000x0000x0000x0000x0010x00n0x0040x00M0x00M0x00c0x00A0x00A0x00M0x00'."
If I set the IGNORE_UTF8_ERRORS option instead, the data that lands in my table is unintelligible. Any suggestions about how to fix this would be gratefully received.
Here's my BCP call:
BCP "SELECT Id, Name FROM database_name.owner_name.table_name WHERE Id = '0011602001r4ddgAZA'" queryout C:\temp\test.csv "-t|" -w -T -S. -C 65001
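One thing I noticed: the error string begins with 0xFF 0xFE, which looks like a UTF-16 little-endian byte-order mark rather than UTF-8. To check what encoding the export actually produced, I inspected the first bytes of the file. Here is a minimal Python sketch of that check (the demo file name and sample content are made up; in my case the path would be C:\temp\test.csv):

```python
# Minimal sketch: guess a file's encoding from its byte-order mark (BOM).
import codecs

def detect_bom(path):
    """Return a human-readable encoding guess based on the file's BOM."""
    with open(path, "rb") as f:
        head = f.read(4)  # longest BOM of interest is 4 bytes
    if head.startswith(codecs.BOM_UTF8):       # EF BB BF
        return "UTF-8 with BOM"
    if head.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return "UTF-16 LE"
    if head.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return "UTF-16 BE"
    return "no BOM (possibly UTF-8 without BOM)"

# Demo with a throwaway file standing in for C:\temp\test.csv:
# UTF-16 LE with a BOM, which is what bcp's -w switch appears to emit.
sample = "bom_check_demo.csv"
with open(sample, "w", encoding="utf-16-le") as f:
    f.write("\ufeffId|Name\n")  # BOM followed by a header row

print(detect_bom(sample))  # prints: UTF-16 LE
```

On my exported file this reports UTF-16 LE, which matches the 0xFF 0xFE in the Snowflake error, so the file may not be UTF-8 at all despite the -C 65001 switch.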
Here's the code in Snowflake:
-- Create a file format
create or replace file format SFCI_Account
type = 'CSV'
field_delimiter = '|'
validate_utf8 = True
;
-- Create a Stage object
create or replace stage SFCI_Account_stage
file_format = SFCI_Account;
-- Check my file is there
list @SFCI_Account_stage;
-- Copy the file into the table
copy into Test
from @SFCI_Account_stage
file_format = (format_name = SFCI_Account)
pattern='.*.csv.gz'
on_error = 'skip_file';