We are modelling a directory structure in Azure Blob storage. I would like to be able to copy all files from a folder to a local directory. Is there any way to copy multiple files from blob storage at once that match a pattern or do I have to get these files individually?
2 Answers
As you may already know, blob storage only support 1 level hierarchy: You have blob containers (folder) and each container contains blobs (files). There's no concept of folder hierarchy there. The way you create an illusion of folder hierarchy is via something called blob prefix
. For example, look at the screenshot below:
In the picture above, you see two folders under images
blob container: 16x16
and 24x24
. In cloud, the blob names include these folder names. So the name of AddCertificate.png
file in folder 24x24 in blob storage is 24x24/AddCertificate.png
.
Now coming to your question, you would still need to download individual files but what storage client library allows you to do is fetch a list of blobs by blob prefix. So if you want to download all files in folder 24x24
(or in other words you want to download all blobs with prefix 24x24
), you would first list the blobs with prefix 24x24
and then download each blob individually. On the local computer, you could create a folder by the name of the prefix.
You can refer below code as a sample reference ((its written in javascript but you can easily map the logic to any language). This code is maintained by Microsoft.
https://github.com/WindowsAzure/azure-sdk-tools-xplat/blob/master/lib/commands/storage.blob._js#L144 https://github.com/WindowsAzure/azure-sdk-tools-xplat/blob/master/lib/commands/storage.blob._js#L689
The second link explains how to parse blob prefixes and get the folder hierarchy out of it. It also shows how to download blob using multiple threads and ensure the integrity of the blob using MD5.
Just including the high level code that process blob path containing prefixes. Please refer above link for full implementation, I cannot copy paste entire implementation here.
if (!fileName) {
var structure = StorageUtil.getStructureFromBlobName(specifiedBlobName);
fileName = structure.fileName;
fileName = utils.escapeFilePath(fileName);
structure.dirName = StorageUtil.recursiveMkdir(dirName, structure.dirName);
fileName = path.join(structure.dirName, fileName);
dirName = '.'; //FileName already contain the dirname
}