4 votes

I'm having trouble using parameterized data sets when referenced through a Data Factory Mapping Data Flow.

I'm trying to write a generic ELT pipeline that will read from a CSV file in blob storage and write it to a table in Azure SQL Database.

When using a Copy Activity, everything works well (refer to pipeline "Import CSV file into generic SQL table using Copy Activity").

A CSV file (file name passed into the Data Factory Pipeline as a parameter) is read from blob storage. The structure of the CSV file is determined using a Get Metadata activity.

The structure of the CSV file is passed to a SQL Server stored procedure, which locates a view with the same structure as the CSV file. If a view with the same structure doesn't exist, a table and view are created. In either case the schema name and view name are passed back to the pipeline. The contents of the CSV file are then read and written to the SQL Server view.
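
For reference, the step that returns the schema name and view name to the pipeline is typically a Lookup activity that calls the stored procedure with the structure captured by Get Metadata. A rough sketch of what that activity JSON could look like is below; the activity, dataset, stored procedure, and parameter names are placeholders, not the ones used in the repo:

    {
        "name": "Find or create matching view",
        "type": "Lookup",
        "dependsOn": [
            {
                "activity": "Get CSV file structure",
                "dependencyConditions": [ "Succeeded" ]
            }
        ],
        "typeProperties": {
            "source": {
                "type": "AzureSqlSource",
                "sqlReaderStoredProcedureName": "[dbo].[GetOrCreateViewForStructure]",
                "storedProcedureParameters": {
                    "Structure": {
                        "value": "@string(activity('Get CSV file structure').output.structure)",
                        "type": "String"
                    }
                }
            },
            "dataset": {
                "referenceName": "GenericAzureSqlDataset",
                "type": "DatasetReference"
            },
            "firstRowOnly": true
        }
    }

Downstream activities can then read the returned names with expressions such as activity('Find or create matching view').output.firstRow.SchemaName.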

This Copy Activity pipeline works well, but it doesn't give me the opportunity to add metadata to each written row. So I created a Mapping Data Flow that adds metadata (Data Factory name, Data Factory pipeline name, Data Factory pipeline run ID) using a derived column transform.
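
For context, the idea is that the data flow declares the metadata values as data flow parameters and applies them in the derived column transform. A rough sketch of what the data flow definition could look like follows; the dataset references and transformation names are illustrative, and depending on the factory version the script may be serialized as a single script string rather than scriptLines:

    {
        "name": "CopyGenericCsvSourceToGenericSqlSink",
        "properties": {
            "type": "MappingDataFlow",
            "typeProperties": {
                "sources": [
                    {
                        "dataset": { "referenceName": "GenericCsvFile", "type": "DatasetReference" },
                        "name": "SourceCsv"
                    }
                ],
                "sinks": [
                    {
                        "dataset": { "referenceName": "GenericSqlView", "type": "DatasetReference" },
                        "name": "SinkSql"
                    }
                ],
                "transformations": [
                    { "name": "AddRunMetadata" }
                ],
                "scriptLines": [
                    "parameters{",
                    "     DataFactoryName as string,",
                    "     PipelineName as string,",
                    "     PipelineRunId as string",
                    "}",
                    "source(allowSchemaDrift: true, validateSchema: false) ~> SourceCsv",
                    "SourceCsv derive(DataFactoryName = $DataFactoryName,",
                    "     PipelineName = $PipelineName,",
                    "     PipelineRunId = $PipelineRunId) ~> AddRunMetadata",
                    "AddRunMetadata sink(allowSchemaDrift: true, validateSchema: false) ~> SinkSql"
                ]
            }
        }
    }

The $DataFactoryName, $PipelineName, and $PipelineRunId references pick up whatever values the calling pipeline passes in.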

When I attempt to debug the data flow, I get the error "An error occurred, please view notification for more details." I'm unable to locate any notification that gives more information about the error.

The data flow works when I use data sets with the file name hard-coded. When I switch to the parameterized data sets (passing in values like pipeline().DataFactory, etc.), that's when I get the error.

I've tried hard-coding the mapping data flow parameter values, but still no joy.

The template passed validation, but when I trigger the pipeline "Import CSV file into generic SQL table using Data Flow", it fails almost immediately with a cryptic error message:

ErrorCode=InvalidTemplate, ErrorMessage=Unable to parse expression 'body('Copy Generic CSV Source to Generic SQL SinkComposeRuntimeVariables')?.Copy Generic CSV Source to Generic SQL Sinkd7ea532482e64afc88501b46924214b3?.ReadFromSourceCSVFileInBlobStorage.FileName'

Any Azure Data Factory experts out there willing to lend a hand?

All source code is available at https://github.com/marc-jellinek/AzureDataFactoryDemo_GenericSqlSink

Thanks in advance!


2 Answers

5 votes

The problem is that there are spaces in the name of the data flow.

The issue is that parameter expressions are not supported for a data flow that has whitespace in its name. Could you try renaming your data flow "Copy Generic CSV Source to Generic SQL SinkComposeRuntimeVariables" and then previewing again? Meanwhile, we will add more validation on the UX side to handle the naming restriction for data flows properly.

I removed the spaces from the name of the data flow and it now tests successfully.

1 vote

You can pass those values directly into the data flow activity from the pipeline using expressions in the data flow parameters; you don't need to use dataset parameters.
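
As a rough sketch (the activity, data flow, and parameter names below are illustrative, not taken from the question's factory), the Execute Data Flow activity in the pipeline could set those parameters from pipeline expressions. String values are wrapped in single quotes because data flow parameters are evaluated as data flow expressions:

    {
        "name": "Copy CSV to SQL with run metadata",
        "type": "ExecuteDataFlow",
        "typeProperties": {
            "dataflow": {
                "referenceName": "CopyGenericCsvSourceToGenericSqlSink",
                "type": "DataFlowReference",
                "parameters": {
                    "DataFactoryName": {
                        "value": "'@{pipeline().DataFactory}'",
                        "type": "Expression"
                    },
                    "PipelineName": {
                        "value": "'@{pipeline().Pipeline}'",
                        "type": "Expression"
                    },
                    "PipelineRunId": {
                        "value": "'@{pipeline().RunId}'",
                        "type": "Expression"
                    }
                }
            }
        }
    }

Note that the referenced data flow name contains no spaces, which also avoids the naming problem described in the other answer.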

To debug, turn on the Debug switch and then, in the Data Flow designer, go to the Data Preview tab on your transformations. You can test with default values in the data flow parameters, or set them at debug time using the debug settings.