I am trying to copy data from an on-premises SQL Server to Azure Data Lake Storage (ADLS) via Azure Data Factory (ADF). Everything seems to be set up correctly, but when I run (debug or trigger) the pipeline, I get the following error:
{
    "errorCode": "2200",
    "message": "Failure happened on 'Sink' side. ErrorCode=UserErrorAdlsFileWriteFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Writing to 'AzureDataLakeStore' failed. Message: The remote server returned an error: (411) Length Required.. Response details: Length Required. HTTP Error 411. The request must be chunked or have a content length.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.WebException,Message=The remote server returned an error: (411) Length Required.,Source=System,'",
    "failureType": "UserError",
    "target": "CopyData1"
}
What is really odd is that the following pipelines DO work:
- SQL tbl1 -> SQL tbl2
- ADLS source.txt -> ADLS sink.txt
That is, read/write access works as expected. The latter pipeline is also able to create/overwrite the sink.txt file.
But when I run the pipeline
- SQL tbl1 -> sink.txt
I get the Length Required error. And if sink.txt exists, the pipeline even deletes it!
I'm using ADFv2 and ADLS Gen1; ADF and ADLS reside in the same subscription/resource group. The SQL source goes through a self-hosted integration runtime, and the ADLS sink through the Azure integration runtime. I have tested with a source statement as simple as "SELECT 1 Col", both without a dataset schema and with schemas + mappings.
Is this a bug, or am I missing something? Which “Length” is required (presumably the HTTP Content-Length of the underlying write request)?
EDIT 1: Minimal JSON scripts
pipeline1.json
{
    "name": "pipeline1",
    "properties": {
        "activities": [
            {
                "name": "CopyData1",
                "type": "Copy",
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "typeProperties": {
                    "source": {
                        "type": "SqlSource",
                        "sqlReaderQuery": "SELECT TOP 1 'x' AS col1 FROM sys.tables"
                    },
                    "sink": {
                        "type": "AzureDataLakeStoreSink"
                    },
                    "enableStaging": false,
                    "dataIntegrationUnits": 0
                },
                "inputs": [
                    {
                        "referenceName": "table1",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "sink1",
                        "type": "DatasetReference"
                    }
                ]
            }
        ]
    }
}
table1.json
{
    "name": "table1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "SqlServer1",
            "type": "LinkedServiceReference"
        },
        "type": "SqlServerTable",
        "typeProperties": {
            "tableName": "sys.tables"
        }
    }
}
sink1.json
{
    "name": "sink1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureDataLakeStore1",
            "type": "LinkedServiceReference"
        },
        "type": "AzureDataLakeStoreFile",
        "structure": [
            {
                "name": "col1",
                "type": "String"
            }
        ],
        "typeProperties": {
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "rowDelimiter": "",
                "nullValue": "\\N",
                "treatEmptyAsNull": true,
                "skipLineCount": 0,
                "firstRowAsHeader": true
            },
            "fileName": "sink1.txt",
            "folderPath": "myDir"
        }
    }
}
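For completeness, the two linked services referenced above look roughly like the sketches below. Server/account names, the self-hosted integration runtime name, and all credentials are redacted placeholders, not the exact definitions.
SqlServer1.json (sketch, connecting via the self-hosted integration runtime)
{
    "name": "SqlServer1",
    "properties": {
        "type": "SqlServer",
        "typeProperties": {
            "connectionString": {
                "type": "SecureString",
                "value": "Data Source=<my server>;Initial Catalog=<my db>;User ID=<user>;Password=**********"
            }
        },
        "connectVia": {
            "referenceName": "<my self-hosted IR>",
            "type": "IntegrationRuntimeReference"
        }
    }
}
AzureDataLakeStore1.json (sketch, using the default Azure integration runtime)
{
    "name": "AzureDataLakeStore1",
    "properties": {
        "type": "AzureDataLakeStore",
        "typeProperties": {
            "dataLakeStoreUri": "https://<my account>.azuredatalakestore.net/webhdfs/v1",
            "servicePrincipalId": "<application id>",
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": "**********"
            },
            "tenant": "<tenant id>",
            "subscriptionId": "<subscription id>",
            "resourceGroupName": "<resource group>"
        }
    }
}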
EDIT 2: Summary of conducted tests
- SQL -> ADLS: Error
- Oracle -> ADLS: Error
- SQL -> Blob: OK
- Oracle -> Blob: OK
- SQL -> SQL: OK
- ADLS -> ADLS: OK
- AzureSQLDB -> ADLS: OK
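Since SQL -> Blob works, one workaround I may try is a staged copy through Blob storage, i.e. switching enableStaging on in CopyData1 along these lines (AzureBlobStorage1 and the staging path are placeholder names, not resources I currently have defined):
{
    "enableStaging": true,
    "stagingSettings": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorage1",
            "type": "LinkedServiceReference"
        },
        "path": "adfstaging"
    }
}
But I would still like to understand why the direct SQL -> ADLS copy fails.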