5
votes

Does anyone know which connection and Data Flow component to use for ADLS (Azure Data Lake Storage) Gen2?

I've managed to use the blob connector in the connection manager and successfully connect to ADLS Gen2, but when I try to use the blob source component I get a 400 Bad Request. It works fine against a plain blob storage account without HNS (hierarchical namespace).

The ADLS components state that they are only for ADLS Gen1.

So how do I read from and write to ADLS Gen2?

5
"Blob storage APIs aren't yet available to Azure Data Lake Storage Gen2 accounts." Source: Known issues with Azure Data Lake Storage Gen2. The preferred way is to Copy data to or from Azure Data Lake Storage Gen2 using Azure Data Factory - rickvdbosch

5 Answers

2
votes

The current version of the SSIS Azure Feature Pack supports ADLS Gen2. It can be used as a data source or destination in a data flow:

enter image description here

The screenshot shows it as a destination, but ADLS Gen2 works just as well as a source, via the corresponding "Flexible File Destination" and "Flexible File Source" components.

0
votes

First of all, based on the great link provided by @rickvdbosch, it looks like Azure Data Lake Storage Gen2 currently has many temporary limitations concerning the Blob storage API. This means it is not a limitation of the component itself, and you may have to wait until Gen2 support is integrated into SSIS.

Microsoft SQL Server Feature Pack for Azure

If you meant these components when you mentioned that:

The ADLS components state that they are only for ADLS Gen1.

Then ignore this part.

I am not sure whether they support Gen2, but I think you can use the Azure Data Lake Store components, which are part of the Microsoft SQL Server Feature Pack for Azure. For more information, refer to:

Download Link


Other methods

If the suggestion above doesn't work, use Azure Data Factory, or work from the command line by installing the AWS CLI and using AzCopy v10.

0
votes

I got the following info: "At the moment Gen 2 don’t support BLOB API (but it will in a short time) and hence, SSIS is not able to connect."

So for SSIS, it's currently either ADLS Gen1 or blob storage.

0
votes

I used a Script Task to write files or System.Object variables (converted to CSV in memory) to Azure Storage Gen2 (hierarchical namespace enabled) using the REST API. I did this as a demo until the SSIS components are released.
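
For anyone wanting to try the same idea, here is a rough sketch of the REST call sequence, written in Python rather than the C# a Script Task would use, so it is not the code from my demo. It assumes the ADLS Gen2 path operations on the `dfs.core.windows.net` endpoint (create the file, append the bytes, then flush) and an OAuth bearer token obtained elsewhere. The account, filesystem, path and token values, the helper names, and the choice of `x-ms-version` are all placeholders for illustration.

```python
import csv
import io
import urllib.request
from urllib.parse import quote


def rows_to_csv_bytes(rows):
    """Serialize an iterable of rows to CSV entirely in memory."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue().encode("utf-8")


def build_upload_requests(account, filesystem, path, data, token):
    """Build the create / append / flush requests for one file.

    Nothing is sent here; the requests are only constructed.
    """
    base = f"https://{account}.dfs.core.windows.net/{quote(filesystem)}/{quote(path)}"
    headers = {
        "Authorization": f"Bearer {token}",  # assumption: OAuth bearer token auth
        "x-ms-version": "2018-11-09",        # assumption: any Gen2-capable API version
    }
    # 1. Create an empty file at the target path.
    create = urllib.request.Request(f"{base}?resource=file", method="PUT", headers=headers)
    # 2. Append the payload starting at offset 0.
    append = urllib.request.Request(
        f"{base}?action=append&position=0", data=data, method="PATCH", headers=headers
    )
    # 3. Flush (commit) the file at its final length.
    flush = urllib.request.Request(
        f"{base}?action=flush&position={len(data)}", method="PATCH", headers=headers
    )
    return [create, append, flush]


def upload(requests):
    """Send the requests in order; urlopen raises on any HTTP error status."""
    for req in requests:
        with urllib.request.urlopen(req) as resp:
            resp.read()


if __name__ == "__main__":
    payload = rows_to_csv_bytes([["id", "name"], [1, "a"]])
    # Placeholder values; replace with a real account, filesystem, path and token.
    reqs = build_upload_requests("myaccount", "myfs", "out/demo.csv", payload, "TOKEN")
    for r in reqs:
        print(r.get_method(), r.full_url)
    # upload(reqs)  # uncomment once real credentials are in place
```

The append and flush steps are separate because appended data only becomes visible after it is flushed at the final file length. For larger files the append call would be repeated with increasing `position` values before a single flush.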

0
votes

You can't write to ADLS Gen2 using the old components from the Azure Feature Pack, but you can connect to a Gen2 blob storage account (non-hierarchical) using the Azure Blob Destination component.
