I need to copy one storage account into another. I have created a Runbook
and scheduled it to run daily. This is an incremental copy.
What I am doing is:
- List the blobs in the source storage container
- Check the blobs in the destination storage container
- If a blob doesn't exist in the destination container, copy it (see the sketch below) with
Start-AzureStorageBlobCopy
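Roughly, the Runbook does something like this (a minimal sketch assuming the classic Azure.Storage cmdlets; the account names, keys and container names are placeholders):

    # Sketch of the current incremental copy (classic Azure.Storage cmdlets;
    # account/container names and keys are placeholders)
    $srcCtx  = New-AzureStorageContext -StorageAccountName "sourceaccount" -StorageAccountKey $srcKey
    $destCtx = New-AzureStorageContext -StorageAccountName "destaccount" -StorageAccountKey $destKey

    # Enumerating both containers is the expensive part with ~10 million blobs
    $srcBlobs  = Get-AzureStorageBlob -Container "source" -Context $srcCtx
    $destNames = (Get-AzureStorageBlob -Container "dest" -Context $destCtx).Name

    foreach ($blob in $srcBlobs) {
        if ($destNames -notcontains $blob.Name) {
            # Server-side asynchronous copy of the missing blob
            Start-AzureStorageBlobCopy -SrcContainer "source" -SrcBlob $blob.Name -Context $srcCtx `
                -DestContainer "dest" -DestBlob $blob.Name -DestContext $destCtx
        }
    }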
While this works for containers with a small number of blobs, it takes a very long time and is certainly cost-inefficient for containers with, say, 10 million block blobs, because every time I run the task I have to go through all 10 million blobs.
I don't see it in the documentation, but is there any way I can use conditional headers like DateModifiedSince, something like
Get-AzureStorageBlob -DateModifiedSince date
in PowerShell?
I have not tried it, but I can see it is possible to use DateModifiedSince in the Node.js library.
Is there any way I can do this with PowerShell so that I can keep using Runbooks?
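The closest workaround I can see in PowerShell is to filter client-side on each blob's LastModified property, but as far as I can tell that still enumerates every blob, so it does not avoid the listing cost (another sketch with placeholder names):

    # Client-side filter on LastModified (placeholder container/context names).
    # This still lists all blobs server-side, so it does not solve the cost problem.
    $since = (Get-Date).AddDays(-1).ToUniversalTime()
    Get-AzureStorageBlob -Container "source" -Context $srcCtx |
        Where-Object { $_.LastModified.UtcDateTime -gt $since }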
EDIT:
Using AzCopy, I made a copy of a storage account that contains 7 million blobs, then uploaded a few new blobs and started AzCopy again. It still takes a significant amount of time just to copy the few newly uploaded files.
AzCopy /Source:$sourceUri /Dest:$destUri /SourceKey:$sourceStorageKey /DestKey:$destStorageAccountKey /S /XO /XN /Y
It is possible to filter for a blob by name in no time. For example,
Get-AzureStorageBlob -Blob
returns the blob immediately out of 7 million records (see the example below). It should have been possible to filter blobs by other properties too.
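For instance (placeholder container and blob names):

    # An exact-name lookup is a point query and comes back almost immediately
    Get-AzureStorageBlob -Container "source" -Blob "images/photo-001.jpg" -Context $srcCtx

    # Filtering by name prefix is also done server-side and stays fast
    Get-AzureStorageBlob -Container "source" -Prefix "images/2016/" -Context $srcCtx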
I am using the /XO /XN options. Since it is my first copy, it's taking time; my container is > 100 GB. Once that is done I will test whether AzCopy still takes the same amount of time just to copy one new blob. Still, if AzCopy works (fingers crossed) I will have to move out of Automation (probably use a VM instead). – Sami