0
votes

I have an Azure Data Factory v2 pipeline that's pulling data from a Cosmos DB collection. This collection has a property that's an array.

I want to, at the least, be able to dump that entire property's value into a column in SQL Azure. I don't need it parsed (although that would be great too), but ADF lists this column as "Unsupported Type" in the dataset definition and listed it in the Excluded Columns section.

Here is an example of the JSON I'm working with. The property I want is "MyArrayProperty":

{
    "id": "c4e2012e-af82-4c48-8960-11e0436e6d3f",
    "Created": "2019-06-14T16:04:13.9572567Z",
    "Updated": "2019-06-14T16:04:14.1920988Z",
    "IsActive": true,
    "MyArrayProperty": [
                {
                    "SomeId": "a4427015-ca69-4958-90d3-0918fd5dcac1",
                    "SomeName": "BlahBlah"
                }
            ]
    }
}

I've tried manually specifying a column in the ADF data source like "MyArrayProperty" and using a string data type, but the value always comes across as null.

2

2 Answers

0
votes

please check this document about schema mapping example between MongoDB and Azure SQL. Basically you should define your collectionReference that will iterate through your nested array of objects and do cross apply.

0
votes

There may be a better way to solve this problem, but I ended up creating a second copy activity which uses a query against Cosmos rather than a collection based capture. The query flattened the array like so:

SELECT m.id, c.SomeId, c.SomeName
FROM myCollection m join c in m.MyArrayProperty

I then took this data set and dumped it into a table in SQL then did my other work inside SQL Azure itself. You could also use the new Join pipeline task to do this in memory before it gets to the destination.