1
votes

Scenario:
- Backups located in Azure storage
- Folder containing >100,000 folders
- Inside of each folder is a file in the following format: MM-DD-YYYY_randomnumber.wav.gz

We need to pull all files from two months (unknown number, likely 30,000-40,000).

Looks like AzCopy is the utility we need to accomplish this.

The following command is how I'd imagine it would work, but I can't get it to do so:

AzCopy /Source:https://path.to.files/path/to/files /Dest:C:\test /SourceKey:key /Pattern:11-*-2016_*.wav.gz /S

The following grabs the parent folders (each is named the same as the file inside it, just without the .wav.gz extension), which wouldn't be ideal, but would be workable. However, this would grab files from multiple years:

AzCopy /Source:https://path.to.files/path/to/files /Dest:C:\test /SourceKey:key /Pattern:11 /S

I've read the documentation, and it mentions that wildcards can be used in some circumstances but not others, but I'm not entirely sure what it means.

Thanks!


3 Answers

1
votes

I know this question was asked a long time ago, but there is now a --include-pattern flag in azcopy that lets you write more specific wildcard patterns, like the one suggested in the original post.
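For anyone scripting this, here is a rough sketch in Python of how the newer azcopy could be called with that flag. Treat the details as assumptions: the source URL is the placeholder from the question with a hypothetical `<SAS-token>` appended (newer azcopy versions authenticate with a SAS token or `azcopy login` rather than /SourceKey), and `build_azcopy_args` and `run_azcopy` are just illustrative helper names. As far as I know, --include-pattern is matched against the file name only, not the folder path, which suits the layout in the question.

```python
import shutil
import subprocess


def build_azcopy_args(source_url, dest_dir, pattern):
    """Build the argument list for a recursive, pattern-filtered azcopy copy."""
    return [
        "azcopy", "copy", source_url, dest_dir,
        "--recursive",                  # descend into the per-file folders
        "--include-pattern", pattern,   # wildcard applied to file names
    ]


def run_azcopy(args):
    """Run azcopy, failing early if the binary is not on PATH."""
    if shutil.which(args[0]) is None:
        raise RuntimeError("azcopy not found on PATH")
    subprocess.run(args, check=True)


# Placeholder URL from the question; the SAS token part is an assumption.
SOURCE = "https://path.to.files/path/to/files?<SAS-token>"
args = build_azcopy_args(SOURCE, r"C:\test", "11-*-2016_*.wav.gz")

# Print the command for review instead of running it straight away.
print(" ".join(args))
```

Calling `run_azcopy(args)` would start the actual transfer. With 30,000-40,000 files it is probably worth checking the printed command against a small test folder first.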

0
votes

No, this isn't possible with the /Pattern option. Per the AzCopy documentation on /Pattern:

If the specified source is a blob container or virtual directory, then wildcards are not applied. If option /S is specified, then AzCopy interprets the specified file pattern as a blob prefix. If option /S is not specified, then AzCopy matches the file pattern against exact blob names.
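To make the quoted rule concrete, here is a small Python model of it, run against made-up blob names that follow the layout in the question. This is not AzCopy's code, only a sketch of the documented behaviour; the blob names and function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical blob names, relative to the source virtual directory.
BLOBS = [
    "11-03-2015_1234/11-03-2015_1234.wav.gz",
    "11-03-2016_5678/11-03-2016_5678.wav.gz",
    "11-21-2016_9012/11-21-2016_9012.wav.gz",
    "12-01-2016_3456/12-01-2016_3456.wav.gz",
]


def legacy_pattern_match(blobs, pattern, recursive):
    """Model of the quoted /Pattern rule for blob sources: no wildcards."""
    if recursive:
        # With /S the pattern is treated as a literal blob prefix.
        return [b for b in blobs if b.startswith(pattern)]
    # Without /S the pattern must equal the blob name exactly.
    return [b for b in blobs if b == pattern]


def wildcard_match(blobs, pattern):
    """What the question wanted: a glob match on the file name part."""
    return [b for b in blobs if fnmatchcase(b.rsplit("/", 1)[-1], pattern)]


# /Pattern:11 /S matches every blob starting with "11", across all years.
prefix_hits = legacy_pattern_match(BLOBS, "11", recursive=True)

# /Pattern:11-*-2016_*.wav.gz /S looks for a literal "*" in the prefix.
literal_hits = legacy_pattern_match(BLOBS, "11-*-2016_*.wav.gz", recursive=True)

# A real wildcard match would select only the November 2016 files.
wanted = wildcard_match(BLOBS, "11-*-2016_*.wav.gz")

print(len(prefix_hits), len(literal_hits), len(wanted))
```

This mirrors what the question describes: the prefix `11` pulls in November files from every year, while the wildcard version matches nothing because the `*` is taken literally.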

0
votes

You don't have to use AzCopy; a few lines of PowerShell using the AzureRM module should be able to download the files, although I have not tested it with a large number of files.

With PowerShell, you can easily edit the $blobnames variable and pick the folders and files you want using wildcards. The folders from the Azure storage account you choose will also be created on your local drive.

$blobnames = "somebackups/2016.*/11-*-2016_*.wav.gz"

The code below is pretty self-explanatory.

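# Storage account details (placeholders; replace with your own values)
$storageAccountName = "storage"
$storageAccountKey = "SuperLongKey"
$containerName = "backup"
$localDirectory = "c:/temp/stackoverflow"
# Wildcard pattern matched against blob names
$blobnames = "11-*-2016_*.wav.gz"

# Build a storage context from the account name and key
$ctx = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey

# List the blobs in the container whose names match the pattern
$blobsToDownload = Get-AzureStorageBlob -Context $ctx -Blob $blobnames -Container $containerName

# Download each matching blob into the local directory
$blobsToDownload | Get-AzureStorageBlobContent -Destination $localDirectory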