I would like to use Azure Data Factory with Azure Data Lake Analytics as action, but without success.
This is my PIPELINE script
{
"name": "UsageStatistivsPipeline",
"properties": {
"description": "Standardize JSON data into CSV, with friendly column names & consistent output for all event types. Creates one output (standardized) file per day.",
"activities": [{
"name": "UsageStatisticsActivity",
"type": "DataLakeAnalyticsU-SQL",
"linkedServiceName": {
"referenceName": "DataLakeAnalytics",
"type": "LinkedServiceReference"
},
"typeProperties": {
"scriptLinkedService": {
"referenceName": "BlobStorage",
"type": "LinkedServiceReference"
},
"scriptPath": "adla-scripts/usage-statistics-adla-script.json",
"degreeOfParallelism": 30,
"priority": 100,
"parameters": {
"sourcefile": "wasb://nameofblob.blob.core.windows.net/$$Text.Format('{0:yyyy}/{0:MM}/{0:dd}/0_647de4764587459ea9e0ce6a73e9ace7_2.json', SliceStart)",
"destinationfile": "$$Text.Format('wasb://nameofblob.blob.core.windows.net/{0:yyyy}/{0:MM}/{0:dd}/DailyResult.csv', SliceStart)"
}
},
"inputs": [{
"type": "DatasetReference",
"referenceName": "DirectionsData"
}
],
"outputs": [{
"type": "DatasetReference",
"referenceName": "OutputData"
}
],
"policy": {
"timeout": "06:00:00",
"concurrency": 10,
"executionPriorityOrder": "NewestFirst"
}
}
],
"start": "2018-01-08T00:00:00Z",
"end": "2017-01-09T00:00:00Z",
"isPaused": false,
"pipelineMode": "Scheduled"
}}
I have two parameters variables sourcefile
and destinationfile
, which are dynamic (path is from Date).
Then I have this ADLA script for execution.
REFERENCE ASSEMBLY master.[Newtonsoft.Json];
REFERENCE ASSEMBLY master.[Microsoft.Analytics.Samples.Formats];
USING Microsoft.Analytics.Samples.Formats.Json;
@Data =
EXTRACT
jsonstring string
FROM @sourcefile
USING Extractors.Tsv(quoting:false);
@CreateJSONTuple =
SELECT
JsonFunctions.JsonTuple(jsonstring) AS EventData
FROM
@Data;
@records =
SELECT
JsonFunctions.JsonTuple(EventData["records"], "[*].*") AS record
FROM
@CreateJSONTuple;
@properties =
SELECT
JsonFunctions.JsonTuple(record["[0].properties"]) AS prop,
record["[0].time"] AS time
FROM
@records;
@result =
SELECT
...
FROM @properties;
OUTPUT @result
TO @destinationfile
USING Outputters.Csv(outputHeader:false,quoting:true);
Job execution fails and the error is :
EDIT:
It seems, that Text.Format is not executed and passed into script like string ... Then in Data Lake Analytics Job detail is this :
DECLARE @sourcefile string = "$$Text.Format('wasb://nameofblob.blob.core.windows.net/{0:yyyy}/{0:MM}/{0:dd}/0_647de4764587459ea9e0ce6a73e9ace7_2.json', SliceStart)";