1
votes

I am trying to use the GetMetadata activity and a CopyData activity together. My setup looks like this.

enter image description here

I am trying to get files names (using GetMetadata) and load these into a field in a SQL Server table (in conjunction with the CopyData). The CopyData works perfectly fine, but I don't see any way to have the GetMetadata get file names and pass those into a field in a table. In my example, I have 4 fields in the source data which match 4 fields in the destination table. The 5th field, presumably, will be the file name. Apparently, it doesn't really work like this. I read through the documentation below and I still can't figure it out.

https://docs.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity

1

1 Answers

2
votes

Update July 2020

A new feature has been added to the Copy activity recently to allow you add columns, $$FILEPATH is currently the only supported variable. See here for more detail:

https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview#add-additional-columns-during-copy


Original Answer

Adding an extra column to a dataset might be considered Transform and the Azure Data Factory v2 (ADF v2) Copy Task does not lend itself easily to Transform. It can do a couple of things like convert from one format (eg csv) to other formats (eg JSON) but it is limited. Maybe at some point in the future they add something to the mapping which allows adding string literals or something similar to the SSIS Derived Column feature, but these type of features are getting added to Mapping Data Flows at the moment it seems.

One way to achieve this however, is to use a stored procedure target with a parameter for the filename and a table-type parameter for the main dataset. It looks a bit like this: ADF Copy Task with Stored Proc target

The downside is you now have to create a supporting table-type in your database (CREATE TYPE) and a stored proc to handle it, something like this:

CREATE TYPE dbo.typ_multiFile AS TABLE (
    col1    CHAR(1) NOT NULL,
    col2    CHAR(1) NOT NULL,
    col3    CHAR(1) NOT NULL
)
GO


CREATE OR ALTER PROC dbo.usp_ins_myTable (
    @fileName       AS VARCHAR (100),
    @typ            AS dbo.typ_multiFile READONLY
    )
AS
SET NOCOUNT ON

INSERT INTO dbo.myTable ( [fileName], col1, col2, col3 )
SELECT @fileName, col1, col2, col3
FROM @typ 

RETURN
GO

Note the Copy Task is inside a ForEach task, as per this diagram:

ADF Logical Diagram