2
votes

I have an Azure data factory pipeline which defines data imports from CSV files to SQL server database tables. Some of the tables have nullable datetime fields and the CSV files supply nulls as "null" (i.e. within quotes). However, when I run the pipeline, I'm getting several errors failing to convert 'null' to datetime.

I've checked the Azure documentation which states that you can define how null values are presented in the CSV file. Currently, I have the following configuration:

"type": "AzureBlob",
"linkedServiceName": "AzureStorageLinkedService",
"typeProperties": {
  "folderPath": "processingtransactions/",
  "format": {
    "type": "TextFormat",
    "columnDelimiter": ",",
    "quoteChar": "\"",
    "nullValue":  "null",
    "firstRowAsHeader": true
  }
},

However, I'm still getting the error:

"Message=Column 'DateOfBirth' contains an invalid value 'null'. Cannot convert 'null' to type 'DateTime'.,Source=Microsoft.DataTransfer.Common,''Type=System.FormatException,Message=The string was not recognized as a valid DateTime."

I've also tried changing the configuration to: "NullValue": "null" and "nullValue": "NULL"

But I'm still getting the same error. The only way I've managed to get the data to import is to replace all the "null" values in the CSV file with "" (empty string) but this isn't ideal.

Does anyone know what syntax I need to get the import to accept the "null" string?

1

1 Answers

2
votes

"null" in quotes won't work unfortunately. You have three choices as I see it:

  1. fix up what ever process generates the raw files to simply leave the field empty; this is the normal for .csvs:

    1,"29 February 2016",,,
    2,,,,
    

    Otherwise using the value null (not in quotes) would also work if you set the nullValue property to null, eg

    "format": {
        "type": "TextFormat",
        "columnDelimiter": ",",
        "nullValue": "null",
        "quoteChar": "\""
    }
    

This file will work:

1,"29 February 2016"
2,null

This file will NOT work:

1,"29 February 2016"
2,"null"

Other options:

  1. Load the data to a staging table. Make the target column a string / VARCHAR in the staging table. Load the data to there first and clean it / remove the "null" string before inserting into the main table.
  2. If your target / sink Azure SQL Database or normal SQL Server (on a VM or otherwise) use a stored procedure task with table-valued parameter as per here.