1
votes

What's the best (and quickest) way to copy data from Azure Blob storage into Azure Data Lake storage? This copy is a one-time job. The data set is about 50 GB and includes about 10 files. Each file has about 20 columns.

I have looked at Azure Data Factory, and it requires creating a dataset for each file in Azure Data Factory. This is very tedious.

2 Answers

2
votes

You can use AdlCopy to copy Azure Storage blobs into Azure Data Lake Store.
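As a rough illustration of what an AdlCopy call might look like, here is a minimal sketch that only assembles the command line. It assumes the `/source`, `/dest` and `/sourcekey` options and the `swebhdfs://` destination scheme as I remember them from the AdlCopy documentation for Data Lake Store (Gen1), so check them against your installed version. The helper name `build_adlcopy_command` and the angle-bracket placeholders are made up for this example.

```python
from typing import List


def build_adlcopy_command(source_url: str, dest_url: str, source_key: str) -> List[str]:
    """Assemble an AdlCopy argument list (hypothetical helper, not part of any SDK).

    Assumption: AdlCopy accepts /source, /dest and /sourcekey as described
    in its documentation for Data Lake Store Gen1.
    """
    return [
        "AdlCopy",
        "/source", source_url,      # blob or container URL in Azure Storage
        "/dest", dest_url,          # target folder in Data Lake Store
        "/sourcekey", source_key,   # access key of the source storage account
    ]


# Placeholder values; replace them with your own account names and key.
cmd = build_adlcopy_command(
    "https://<storage-account>.blob.core.windows.net/<container>/",
    "swebhdfs://<adls-account>.azuredatalakestore.net/<folder>/",
    "<storage-account-key>",
)
print(" ".join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` on a Windows machine where AdlCopy is installed. The sketch deliberately stops short of running anything.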

1
votes

Here is a more up-to-date answer for those coming across this question.

It depends on whether you mean quick in terms of "speed to develop" or "speed to transfer".

I suspect you mean speed to develop, given your concerns about Data Factory.

Data Factory now has a "Copy Data Wizard", which makes it quick to set these things up.

Judging by the age of the question, it was likely asked in the Data Factory V1 timeframe. Data Factory V2 is out now and is far easier to use.

Copy Data Wizard

https://docs.microsoft.com/en-gb/azure/data-factory/quickstart-create-data-factory-copy-data-tool

With regard to speed to transfer, Data Factory publishes the following stats, which suggest a transfer speed of around 105 MB/s, and I suspect it could be even faster.

(Image: transfer speeds for Data Factory between Blob storage and ADLS)
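To put that figure in context for the 50 GB data set in the question, here is a back-of-the-envelope estimate. It assumes a sustained 105 MB/s and takes 1 GB as 1000 MB. The function name is just illustrative, and real throughput will vary with file sizes, regions and parallelism.

```python
def estimate_transfer_seconds(size_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate transfer time in seconds, taking 1 GB as 1000 MB."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb * 1000 / throughput_mb_per_s


# 50 GB from the question, at the roughly 105 MB/s quoted above.
seconds = estimate_transfer_seconds(50, 105)
print(f"{seconds / 60:.1f} minutes")  # prints "7.9 minutes"
```

So at that rate the copy itself would take roughly eight minutes, which means the setup effort matters more than raw transfer speed for a one-time job of this size.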

For Azure Data Lake Gen2, AdlCopy is no longer mentioned in the documentation; AzCopy is mentioned instead.

I realise you already have your question answered, but just in case people come across this in the future.