I have a bunch of storage buckets that publish notifications to a Pub/Sub topic when files are uploaded. A Cloud Function subscribed to that Pub/Sub topic then copies the files to their final destination buckets. This all works just fine for most files, but files larger than about 1 GB fail to copy. The source buckets are multi-regional and the destination buckets are regional and Nearline.
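For context, the function entry point looks roughly like this (the handler name, the copy_to_destination helper, and the exact payload fields are simplified stand-ins for my actual code):

import base64
import json

def handle_upload(event, context):
    # Pub/Sub delivers the GCS notification payload as base64-encoded JSON.
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    src_bucket_name = payload["bucket"]
    src_filename = payload["name"]
    # Work out the destination and perform the copy (copy logic shown below).
    copy_to_destination(src_bucket_name, src_filename)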
My code is essentially:
from google.cloud import storage

client = storage.Client()
# Look up the source object and the destination bucket.
src_bucket = client.get_bucket(src_bucket_name)
src_blob = src_bucket.get_blob(src_filename)
dst_bucket = client.get_bucket(dst_bucket_name)
# Create a blob handle at the destination and rewrite the source into it.
dst_blob = dst_bucket.blob(dst_filename)
dst_blob.rewrite(src_blob)
Initially, the Cloud Function was timing out at 60 seconds, so I assumed that was the issue and raised the function timeout to 540 seconds, but the function still times out. The function retries for 20 minutes, so I can see the failure is repeatable. Since raising the function timeout didn't help, I read the blob docs and saw that blob.rewrite also has a default timeout of 60 seconds, so I raised that to 540 seconds as well, but it still times out.
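Concretely, the rewrite call now looks like this (I'm assuming the timeout keyword argument is the right way to override the 60-second default):

# Override the per-request timeout on the rewrite call (default is 60 seconds).
dst_blob.rewrite(src_blob, timeout=540)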
At this point, I am not sure what I am missing. Is this still a timeout issue? Or does this have something to do with Pub/Sub publishing multiple messages, so that multiple invocations of the function could be trying to make the same copy simultaneously? Or is there a better way to move large files between buckets automatically?