I need to transfer files from Google Cloud Storage to Azure Blob Storage.
Google gives a code snippet that downloads a file into an in-memory bytes buffer, like so:
    # Get Payload Data
    req = client.objects().get_media(
        bucket=bucket_name,
        object=object_name,
        generation=generation)  # optional
    # The BytesIO object may be replaced with any io.Base instance.
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, req, chunksize=1024*1024)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        if status:
            print 'Download %d%%.' % int(status.progress() * 100)
    print 'Download Complete!'
    print fh.getvalue()
I was able to modify this to write to a file instead by changing the type of the fh object, like so:
    fh = open(object_name, 'wb')
Then I can upload to Azure Blob Storage using blob_service.put_block_blob_from_path.
However, I want to avoid writing to a local file on the machine doing the transfer.
I gather Google's snippet loads the data into the io.BytesIO() object a chunk at a time. I reckon I should probably use this to write to blob storage a chunk at a time.
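A rough sketch of that chunk-at-a-time idea, assuming MediaIoBaseDownload only ever calls write() on the object it is given (worth verifying against the installed version). BlockSink, put_block and commit are hypothetical names introduced here for illustration, not part of either SDK. The two callables would presumably wrap whatever block-level calls the Azure client exposes (for example put_block and put_block_list, if the installed azure-storage-python provides them), and are replaced by in-memory stand-ins below so the sketch runs on its own:

```python
class BlockSink(object):
    """Write-only, file-like object that forwards every write() as one block.

    put_block(block_id, data) and commit(block_ids) are hypothetical
    callables supplied by the caller; they are NOT real SDK functions.
    """

    def __init__(self, put_block, commit):
        self._put_block = put_block
        self._commit = commit
        self.block_ids = []

    def write(self, data):
        # Skip empty writes so no zero-length block is ever uploaded.
        if not data:
            return 0
        # Fixed-width IDs: Azure expects all block IDs of a blob to have
        # the same length. Any further encoding (e.g. base64) is left to
        # the put_block adapter, since that depends on the SDK version.
        block_id = '%08d' % len(self.block_ids)
        self._put_block(block_id, data)
        self.block_ids.append(block_id)
        return len(data)

    def close(self):
        # Commit the uploaded blocks, in order, as the final blob.
        self._commit(self.block_ids)


# In-memory stand-ins for the Azure calls, for demonstration only.
uploaded = {}
committed = []


def fake_put_block(block_id, data):
    uploaded[block_id] = data


sink = BlockSink(fake_put_block, committed.extend)
for chunk in (b'abc', b'', b'defg'):
    sink.write(chunk)
sink.close()
```

With something like this, fh = BlockSink(...) would take the place of fh = io.BytesIO() in the snippet above, and fh.close() would be called once the download loop finishes. Only one chunk of at most chunksize bytes is held in memory at a time, so the chunksize has to stay within whatever per-block size limit the Azure service enforces.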
I experimented with reading the whole thing into memory and then uploading it using put_block_blob_from_bytes, but I got a memory error (the file is probably too big, at ~600 MB).
Any suggestions?
azure-storage-python doesn't seem to support it yet. - minghan