1
votes

I am trying to copy from azure data lake gen2 to table in azure synapse warehouse using local ssms. The COPY INTO statement is neither throwing any errors and nor dumping the data. I am copying the pandas df from centos server to azure data lake gen2 using sep=',', encoding='utf-8'. Here is the COPY statement that I am using.

COPY INTO dbo.SALES_CUTOMER_D 
FROM 'https://acoount_name/test-file-system/SALES_CUSTOMER_D_0.csv'
WITH (
 FILE_TYPE = 'csv',
 CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET=''),
 FIELDQUOTE = '"',
 FIELDTERMINATOR=',',
 ROWTERMINATOR='\r\n',
 ENCODING = 'UTF8',
 FIRSTROW = 2
)
1
Try ERRORFILE option to see if it prints any errors to ERRORFILE. Remove the FIELDQUOTE, FIELDTERMINATOR, ROWTERMINATOR, you're using default values. It could have an effect. Beyond that you'll have to put the contents of the csv file and DDL for SALES_CUTOMER_D if you want someone to help you more. Also a good idea to specify the list of columns (COPY INTO dbo.SALES_CUTOMER_D (col1, col2, ...) FROM ...).Kashyap
@Kashyap yes i tried it for sample csv called employee.csv wirh 10 rows with schema id int and name varchar 100 even for taht also same thing is happeninguser8112786
also when am trying to write errors to error file again it is throwing the error that access is denied error code 5user8112786
For ERROR FILE, provide full URL and for ERROR FILE CREDENTIAL use sas token not account key. Also run unix2dos on this input CSV file and try.Kashyap

1 Answers

0
votes

Check if your file has Unix-style line endings (LF) instead of Windows-style (CRLF).

See Difference between CR LF, LF and CR line break types? if you're not clear on CRLF.

Easiest way I know of checking is to open file in vi in binary mode with set list:

vi -b -c 'set list' <file>

To verify if this is the problem or not, you can do one of the following:

  1. Tell COPY what line endings are in your file:

     COPY INTO dbo.SALES_CUTOMER_D 
     FROM 'https://acoount_name/test-file-system/SALES_CUSTOMER_D_0.csv'
     WITH (
        FILE_TYPE = 'csv',
        CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET=''),
        ROWTERMINATOR='0x0A',
        FIRSTROW = 2
     )
    
  2. Confirm that it's actually reading the file by making it parse the header. Remove FIRSTROW = 2.

  3. Change line-endings:

  • unix2dos <csv file>
  • upload to datalake and try COPY again, without ROWTERMINATOR='\r\n', (that's the default value).

A little gotcha:

COPY treats ‘\n’ as ‘\r\n’ internally. For more information, see the ROWTERMINATOR section.

In other words:

  • If we don’t specify ROWTERMINATOR option or specify ROWTERMINATOR=’\n’ or ROWTERMINATOR=’0x0D0A’, then the engine uses \r\n as terminator (Windows style).
  • If we specify ROWTERMINATOR=’0x0A’ then engine uses ‘\n’ as the terminator (Unix style)