0
votes

I am just starting to learn Data Factory and I have a very simple pipeline that is loading a CSV file from Azure Blob Storage into Azure SQL Data Warehouse.

The source CSV file column 3 includes dates in the format of MM/dd/yyyy and my target SQL DW table has this column typed as date.

My pipeline fails with the following error, presumably because I am not in US and so an implicit type conversion won't work:

Activity BlobToSQLDW failed: ErrorCode=FailedDbOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error happened when loading data into SQL Data Warehouse.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopSqlException: Error converting data type VARCHAR to DATETIME.,Source=.Net SqlClient Data Provider,SqlErrorNumber=107090,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=107090,State=1,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopSqlException: Error converting data type VARCHAR to DATETIME.,},],'

I have tried making edits to the schema for the blob source, including setting format and culture (see screenshot below) but to no avail yet.

Azure Blob Storage Schema screenshot:

Azure Blob Storage Schema screenshot

I did manage to workaround the issue by creating a staging table in SQL DW with all columns typed as varchar and changing the ADF pipeline to target that instead, and then I complete the task inside SQL DW using a T-SQL Insert (staging-to-target) to make the type conversions.

I'm sure there is a proper way to do this within ADF though. Can anyone please advise?

2

2 Answers

2
votes

It is because ADF is using Polybase for loading file data to SQL DW. According to official documentation Polybase default date format is yyyy-MM-dd. I haven't found any way to specify a date format in ADF Copy Activity properties nor source Dataset properties. The only option is disabling Polybase and It will work. You should uncheck "Allow Polybase" in "Sink" tab in Copy Activity properties.

0
votes

Date is only supported in SQL type data source (Azure SQL, SQL server). in other dataset, only DateTime is available. I didn't find there is a date option in blob dataset schema, can you check again if it is a blob dataset ?