
I am trying to use the Get Metadata activity in Azure Data Factory to get blob file names and copy them to an Azure SQL Database table. I added a Stored Procedure activity after the Get Metadata activity. Here is my new Stored Procedure activity; I added the parameter as suggested, but I changed its name to JsonData because that is the parameter my stored procedure requires.

storedprocedure1

This is my stored procedure.

/****** Object:  StoredProcedure [dbo].[InsertDataJSON]    Script Date: 10/14/2020 11:01:30 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
/*USE datafactorypoc1*/

ALTER PROCEDURE [dbo].[InsertDataJSON] (
    @JsonData NVARCHAR (MAX)
)
AS
BEGIN
    DECLARE @err int

    INSERT INTO extractFileNameTest1 (ItemName, ItemType, EffIntegrationRunTieme, ExecutionDuration, DurationInQueue)
    SELECT ItemName, ItemType, EffIntegrationRunTieme, ExecutionDuration, DurationInQueue
    FROM OPENJSON (@JsonData, N'$') WITH (
        ItemName VARCHAR(255) N'$.ItemName',
        ItemType VARCHAR(255) N'$.ItemType',
        EffIntegrationRunTieme VARCHAR(255) N'$.EffIntegrationRunTieme',
        ExecutionDuration INT N'$.ExecutionDuration',
        DurationInQueue INT N'$.DurationInQueue'
    )    

    SELECT @err = @@ERROR
    RETURN (@err)
END
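For reference, the procedure itself can be exercised directly in SSMS with a hand-written payload whose keys match the WITH clause paths (a sketch with made-up values):

-- Hypothetical test call; the keys follow the $.ItemName / $.ItemType /
-- $.EffIntegrationRunTieme / $.ExecutionDuration / $.DurationInQueue paths above.
EXEC [dbo].[InsertDataJSON] @JsonData = N'[
    {"ItemName":"test_file.json","ItemType":"File",
     "EffIntegrationRunTieme":"DefaultIntegrationRuntime",
     "ExecutionDuration":0,"DurationInQueue":0}
]';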

storedprocedure2

I get this error at the stored procedure:

{
    "errorCode": "2402",
    "message": "Execution fail against sql server. Sql error number: 13609. Error Message: JSON text is not properly formatted. Unexpected character 'S' is found at position 0.",
    "failureType": "UserError",
    "target": "Stored procedure1",
    "details": []
}

storedprocedure3

But when I check the input, it seems the JSON string with the item names is already read successfully.

storedprocedure4

But when I check the output, it's not there.

storedprocedure5

Could you please help me check what I did wrong here? Is it my stored procedure? Thank you very much in advance.

Update 15/10/2020: I created a new pipeline and moved the Stored Procedure inside the ForEach activity to connect it to Get_Filename_2 instead of the first one, as follows:

ADFPipelineTwo1

Here I changed the parameter value to use Get_Filename_2 and its itemName output instead of childItems (using childItems gave me an error, because childItems comes from Get_Filename_1 and not from Get_Filename_2).
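The JsonData parameter value in this version is roughly the following expression (a sketch reconstructed from the description; the exact setting is in the screenshot below):

@activity('Get_Filename_2').output.itemName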

ADFpipelinetwo2

After executing the pipeline (which failed), the input of the last Stored Procedure activity is:

{
    "storedProcedureName": "[dbo].[InsertDataJSON]",
    "storedProcedureParameters": {
        "JsonData": {
            "value": "FRRNSC84FIN1_A2276801_20200103-152534.json",
            "type": "String"
        }
    }
}

ADFpipelinetwo3

And the output is:

{
    "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (West Europe)",
    "executionDuration": 0,
    "durationInQueue": {
        "integrationRuntimeQueue": 0
    },
    "billingReference": {
        "activityType": "ExternalActivity",
        "billableDuration": [
            {
                "meterType": "AzureIR",
                "duration": 0.016666666666666666,
                "unit": "Hours"
            }
        ]
    }
}

ADFpipelinetwo4

This pipeline failed with the following error message.

{
    "errorCode": "2402",
    "message": "Execution fail against sql server. Sql error number: 13609. Error Message: JSON text is not properly formatted. Unexpected character 'F' is found at position 0.\r\nFRRNSC84FIN1_A2276801_20200103-152534.json",
    "failureType": "UserError",
    "target": "Stored procedure1",
    "details": []
}

ADFpipelinetwo4

In my old pipeline, where the Stored Procedure is outside the ForEach loop, the pipeline did not fail:

ADFpipeline1

Here is the input of the last stored procedure activity:

{
    "storedProcedureName": "[dbo].[InsertDataJSON]",
    "storedProcedureParameters": {
        "JsonData": {
            "value": "[{\"name\":\"FRRNSC84FIN1_A2274001_20200103-143748_back_camera_0.jpeg\",\"type\":\"File\"},{\"name\":\"FRRNSC84FIN1_A2274001_20200103-143748_right_camera_0.jpeg\",\"type\":\"File\"},{\"name\":\"FRRNSC84FIN1_A2274801_20200103-144811_right_camera_0.jpeg\",\"type\":\"File\"},{\"name\":\"FRRNSC84FIN1_A2275201_20200103-145229_right_camera_0.jpeg\",\"type\":\"File\"},{\"name\":\"FRRNSC84FIN1_A2276801_20200103-152534.json\",\"type\":\"File\"}]"
        }
    }
}
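For context, a JsonData value of this shape is what you get when the whole childItems array is converted to a string, roughly with an expression like this (a sketch; the activity name is assumed):

@string(activity('Get_Filename_1').output.childItems)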

ADFpipieline2

And here is the output:

{
    "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (West Europe)",
    "executionDuration": 0,
    "durationInQueue": {
        "integrationRuntimeQueue": 0
    },
    "billingReference": {
        "activityType": "ExternalActivity",
        "billableDuration": [
            {
                "meterType": "AzureIR",
                "duration": 0.016666666666666666,
                "unit": "Hours"
            }
        ]
    }
}

ADFpipeline3

This pipeline ran successfully, however the result in SQL is not really what I expected: there are many NULL values before the first JSON string, and it inserts all the item names into the same location. I think this is because the Stored Procedure is outside the ForEach loop, but why does it insert so many NULLs?

SQLresult

Here is the stored procedure that I modified:

/****** Object:  StoredProcedure [dbo].[InsertDataJSON]    Script Date: 15/10/2020 10:31:51 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
/*USE datafactorypoc1*/

ALTER PROCEDURE [dbo].[InsertDataJSON] (
    @JsonData NVARCHAR (MAX)
)
AS
BEGIN
    PRINT @JsonData

    /*INSERT INTO Logs values (DEFAULT, @JsonData)*/
    INSERT INTO extractFileNameTest1 values (@JsonData, DEFAULT)
    SELECT * FROM
        OPENJSON (@JsonData)
    WITH (
        ItemName VARCHAR(255) N'$.ItemName',
        ItemType VARCHAR(255) N'$.ItemType'
    )
END

I deleted all the data in my test SQL table to see the output of the second (failed) pipeline. I noticed that the file names did end up correctly in my table, all of them (I only have 5 files in blob storage), but the other columns are NULL.
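As a side note, the same behaviour can be reproduced directly in SSMS by calling the procedure with the plain file name that the second pipeline passes (a sketch using the file name from the error message above):

-- Hypothetical local repro: with a plain file name instead of a JSON array,
-- OPENJSON raises error 13609 ("JSON text is not properly formatted").
EXEC [dbo].[InsertDataJSON]
    @JsonData = N'FRRNSC84FIN1_A2276801_20200103-152534.json';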

ADFpipelinetwo6

To conclude, the result in the SQL table from the new pipeline is much better, with each file name in a different cell of the table, but the pipeline run failed. Could you please help me check what I did wrong here? Is it the stored procedure, or the expression in the last Stored Procedure activity inside the ForEach loop?

Thank you very much in advance.

Please show us the settings and output of "Get_File_Name1". You also have a Get Metadata activity "Get_File_Name2" inside the ForEach container activity. I guess you are using the ForEach to loop over the file names (Get_File_Name1.output.childItems) and "Get_File_Name2" to get each file's item name. Then the Stored Procedure should connect to "Get_File_Name2", not come after the ForEach. – Leon Yue
Hello, thank you very much for your reply @Leon Yue. I did more tests in a new pipeline to compare the results. I updated my question above, as I cannot place screenshots here. Could you please help me check? Thank you very much in advance. – Shuti
I'm trying to reproduce the error. I created the stored procedure in SSMS and, strangely, I also can't get ItemName and ItemType; they are NULL on my side too. I still need some time to figure it out. – Leon Yue
Thank you very much for your help on this @Leon Yue, it's really strange. – Shuti
Yeah, it makes me have a headache and lose my hair! – Leon Yue

2 Answers

0
votes

For this scenario, I use only 1 Get Metadata activity with:
Dataset of the Get Metadata activity = Binary files from Blob storage
Field list of the Get Metadata activity = Child items

This Get Metadata activity has its output connected to a ForEach activity:
ForEach activity settings => Items => @activity('Get_FileName_1').output.childItems

Inside the ForEach activity there is a single activity, a Stored Procedure: its settings use Linked service = Azure SQL Database with the stored procedure selected. In the stored procedure I wrote, I defined the columns that I read from the blob file names. For the Stored Procedure activity I then define exactly the same number of parameters as there are columns of @JsonData in the stored procedure, and for each parameter value I use the Azure Data Factory @substring function to extract a different part of the blob file name, which is then inserted into the correct column of the Azure SQL table (see the sketch below the link).

Source for Substring function: https://docs.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions#substring
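For example, with file names like FRRNSC84FIN1_A2276801_20200103-152534.json, the parameter value expressions could look roughly like this (a sketch; the offsets are illustrative, assuming the ForEach iterates over the childItems array):

@substring(item().name, 0, 12)   => FRRNSC84FIN1
@substring(item().name, 13, 8)   => A2276801
@substring(item().name, 22, 15)  => 20200103-152534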

0
votes

I also faced the same issue while loading the output of the Get Metadata activity with the help of a stored procedure.

The simple answer is that when feeding the output of the Get Metadata activity into the stored procedure parameter, you need to convert it to a string, like below:

@string(activity('get_metadata').output.childItems)